TCP Offloading (TOE) – Dell Emulex Family of Adapters User Manual
Page 657

Emulex Drivers for Windows User Manual
P010077-01A Rev. A
3. Configuration
NIC Driver Configuration
RSS may help performance. If the same four-processor server uses RSS, there are four
parallel executing DPCs, one on each processor. The total CPU usage that is available
for networking processing is increased from 25 percent to 100 percent.
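The arithmetic behind these figures can be sketched as follows. This is a simplified model, assuming receive processing is limited to the processors that run a DPC; the function name network_cpu_share is illustrative and is not part of the driver or of any Windows API.

```python
def network_cpu_share(total_cpus: int, rss_cpus: int = 1) -> float:
    """Percentage of total CPU capacity available for receive processing.

    Simplified model (an assumption, not a driver formula): each processor
    that runs a receive DPC contributes an equal share of the total.
    """
    if total_cpus < 1:
        raise ValueError("total_cpus must be at least 1")
    if not 1 <= rss_cpus <= total_cpus:
        raise ValueError("rss_cpus must be between 1 and total_cpus")
    return 100.0 * rss_cpus / total_cpus


# Four-processor server without RSS: a single DPC on one processor.
print(network_cpu_share(total_cpus=4, rss_cpus=1))
# Same server with RSS: four DPCs, one on each processor.
print(network_cpu_share(total_cpus=4, rss_cpus=4))
```

The model ignores memory bandwidth, which can become the limit before the CPU does.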
Some server machines and some network traffic profiles do not benefit from RSS.
Because the non-offloaded TCP stack copies data during receive processing, memory
bandwidth can limit performance before CPU capacity does. In this situation, CPU
usage is very high while all processors wait for memory accesses. To overcome this
issue, reduce the number of RSS CPUs or disable RSS entirely.
Poor RSS behavior is typical only in network performance testing applications that
receive data, but perform no other processing. For other applications, RSS allows the
application to scale other processing tasks across all CPUs, thereby improving overall
performance. RSS offers the most benefit for applications that create numerous,
short-lived connections. These applications are typically CPU-limited rather than
network-bandwidth-limited.
For information on modifying the RSS Queues parameter, see “Configuring NIC Driver
Note: Microsoft currently does not schedule RSS processing on all hyper-threaded
CPUs. For example, on a dual-core, hyper-threaded CPU, only CPUs 1 and 3 have
RSS queues.
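The example in this note can be modeled with a short sketch. It assumes that logical CPUs are numbered from 1, that the hyper-threads of a core are numbered consecutively, and that only the first thread of each core receives an RSS queue. These are assumptions made for illustration, and the function name rss_logical_cpus is hypothetical.

```python
from typing import List


def rss_logical_cpus(cores: int, threads_per_core: int = 2) -> List[int]:
    """Return the 1-based logical CPU numbers assumed to receive an RSS queue.

    Assumption: sibling hyper-threads are numbered consecutively, and only
    the first logical CPU of each physical core gets a queue.
    """
    if cores < 1 or threads_per_core < 1:
        raise ValueError("cores and threads_per_core must be at least 1")
    return [core * threads_per_core + 1 for core in range(cores)]


# Dual-core, hyper-threaded CPU: four logical CPUs, two RSS queues.
print(rss_logical_cpus(cores=2, threads_per_core=2))
```

Under these assumptions, the number of RSS queues equals the number of physical cores, not the number of logical CPUs.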
Enabling Windows to Use Up to Eight Processors
Windows Server 2008 uses only four processors for RSS by default. Adapters can use
up to eight processors, but for the driver to do so, the registry must be changed
and the system restarted.
For Windows Server 2008, set the registry keyword MaxNumRssCpus (a DWORD
type) to 8 at the location:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Ndis\Parameters
Note: Do not set the registry keyword to a value greater than the number of
processors in the system or 16, whichever is smaller.
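The limit in this note, and the registry change it applies to, can be sketched as follows. The helper names safe_max_num_rss_cpus and reg_file_text are illustrative only; the sketch just computes a value and builds the text of a .reg file, and it does not touch the registry itself.

```python
NDIS_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Ndis\Parameters"
)
MAX_RSS_CPUS_LIMIT = 16  # upper bound stated in the note above


def safe_max_num_rss_cpus(requested: int, processors: int) -> int:
    """Clamp a requested MaxNumRssCpus value to the documented limits.

    The result never exceeds the number of processors in the system
    or 16, whichever is smaller.
    """
    if requested < 1 or processors < 1:
        raise ValueError("requested and processors must be at least 1")
    return min(requested, processors, MAX_RSS_CPUS_LIMIT)


def reg_file_text(value: int) -> str:
    """Build .reg file text that sets MaxNumRssCpus as a DWORD."""
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{NDIS_PARAMETERS_KEY}]\n"
        f'"MaxNumRssCpus"=dword:{value:08x}\n'
    )


# An eight-processor server requesting eight RSS CPUs.
value = safe_max_num_rss_cpus(requested=8, processors=8)
print(reg_file_text(value))
```

As stated above, the system must be restarted before the new value takes effect.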
For Windows Server 2008 R2 and Windows Server 2012, the operating system uses all
available CPU cores for RSS without manual configuration.
TCP Offloading (TOE)
Note: TCP Offloading (TOE) is not supported by OCe14000-series adapters.
The adapter and drivers support TCP offload, which improves performance
significantly in the following ways:
• A zero-copy receive data path exists. In contrast, all non-offloaded TCP packets
  are copied in the network stack. This copy dramatically increases the memory
  bandwidth and CPU requirements for receive data.
• Sending and receiving of ACK packets is handled entirely in hardware,
  reducing PCIe bus usage and interrupts.
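The effect of the receive copy on memory bandwidth can be estimated with a rough sketch. It assumes that the adapter writes each received byte to memory once, and that every copy in the network stack reads and writes that byte once more. Cache effects and protocol headers are ignored, and the 10 Gb/s rate is an example value only, not a figure from this manual.

```python
def receive_memory_traffic_gbytes(line_rate_gbps: float, copies: int) -> float:
    """Estimate memory traffic in GB/s for received data.

    Rough model (an assumption): one DMA write per byte by the adapter,
    plus one read and one write per byte for each copy in the stack.
    """
    if line_rate_gbps < 0 or copies < 0:
        raise ValueError("line_rate_gbps and copies must not be negative")
    payload = line_rate_gbps / 8        # Gb/s to GB/s
    copy_traffic = copies * 2 * payload  # each copy reads and writes the data
    return payload + copy_traffic


# Zero-copy (offloaded) path compared with a single copy (non-offloaded) path.
print(receive_memory_traffic_gbytes(10, copies=0))
print(receive_memory_traffic_gbytes(10, copies=1))
```

In this model, a single copy triples the memory traffic generated by received data.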