Dell Emulex Family of Adapters User Manual
Page 656

Emulex Drivers for Windows User Manual
P010077-01A Rev. A
3. Configuration
NIC Driver Configuration
To improve network and CPU performance for heavy network loads under these
conditions, you may have to make an appropriate NUMA CPU selection. For example,
in Windows Server 2012 R2, you can use the Task Manager to adjust the “Set Affinity”
property to bind the application to a specific NUMA node for maximum network
performance and CPU efficiency.
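As a rough illustration of what binding an application to a NUMA node amounts to, the sketch below builds the kind of CPU affinity bitmask that a "Set Affinity" selection represents. The topology shown (two nodes with four logical processors each) and the name `affinity_mask` are assumptions made for this example only; the real layout depends on the server.

```python
def affinity_mask(cpus):
    """Return a bitmask with one bit set per logical processor index."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


# Hypothetical topology: node 0 owns CPUs 0-3, node 1 owns CPUs 4-7.
NUMA_NODES = {0: range(0, 4), 1: range(4, 8)}

node0_mask = affinity_mask(NUMA_NODES[0])
node1_mask = affinity_mask(NUMA_NODES[1])
print(hex(node0_mask), hex(node1_mask))  # prints: 0xf 0xf0
```

A mask of this form is what the Win32 `SetProcessAffinityMask` function takes as its affinity argument.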
Checksum Offloading and Large Send Offloading (LSO)
The adapter supports IP, TCP, and UDP checksum offloading, and offloading is enabled by
default for all three protocols. You can disable offloading through the Windows Device
Manager Advanced Properties. Disabling checksum offloading is useful only for packet
sniffing applications, such as Wireshark (formerly Ethereal) or Microsoft Network Monitor,
running on the local system where the adapter is installed and monitored. When packets are
sniffed with offloading enabled, transmit packets may appear to have incorrect checksums
because the hardware has not yet calculated them.
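To make concrete what is being offloaded, the following minimal sketch computes the standard 16-bit Internet checksum (RFC 1071), the one's-complement sum used by IP, TCP, and UDP. It is not the adapter's implementation, and it omits the pseudo-header that TCP and UDP include in the sum; the function name `internet_checksum` is chosen for illustration. With offloading enabled, the host stack leaves this field for the hardware to fill in, which is why a local sniffer sees a value that does not match.

```python
def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit one's-complement checksum (RFC 1071) over data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))  # prints: 0x220d
```

Appending the computed checksum to the data and running the function again yields 0, which is how a receiver verifies a packet.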
The adapter supports transmit LSO, which allows the TCP stack to hand one large block of
data to the adapter; the hardware then segments it into multiple TCP packets. LSO is
recommended for performance, but it can be disabled for packet sniffing applications. LSO
sends appear as giant packets in the packet sniffer, because the hardware has not yet
segmented them.
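The segmentation step can be sketched as simple arithmetic. The example below splits one large send into payload sizes no larger than the maximum segment size (MSS). The 64 KB send size and the MSS of 1460 bytes (typical for IPv4 over a 1500-byte Ethernet MTU) are assumptions for illustration, and `segment_lengths` is a hypothetical helper, not part of any driver interface.

```python
def segment_lengths(payload_len, mss):
    """Return the payload size of each TCP segment produced from one large send."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    if payload_len < 0:
        raise ValueError("payload_len must be non-negative")
    full, rest = divmod(payload_len, mss)
    # 'full' segments carry exactly mss bytes; a final short segment carries the rest.
    return [mss] * full + ([rest] if rest else [])


segments = segment_lengths(64 * 1024, 1460)
print(len(segments), segments[-1])  # prints: 45 1296
```

A sniffer on the sending host sees the single 65536-byte send, while the wire carries the 45 smaller packets.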
Note: On Windows Server 2012, Recv Segment Coalescing is enabled by default. You
must disable Recv Segment Coalescing if you want to set the Checksum Offload
setting to anything other than enabled.
For information on modifying the Checksum Offload or Large Send Offload parameter,
see “Configuring NIC Driver Options” on page 589.
Receive Side Scaling (RSS) for Non-Offloaded IP/TCP Network Traffic
The adapter can process TCP receive packets on multiple processors in parallel. This is
ideal for applications that are CPU limited. Typically, these applications have
numerous client TCP connections that may be short-lived. Web servers and database
servers are prime examples. RSS typically increases the number of transactions per
second for these applications.
Understanding RSS
To better understand RSS, it helps to understand the interrupt mechanism used in the
network driver. Without RSS, a network driver receives an interrupt when a network
packet arrives. This interrupt may occur on any CPU, or it may be limited to a set of
CPUs for a given device, depending on the server architecture. The network driver
launches one deferred procedure call (DPC) that runs on the same CPU as the interrupt.
Only one DPC runs at a time. In contrast, with RSS enabled, the network driver launches
multiple parallel DPCs on different CPUs.
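The idea behind spreading receive work across CPUs can be sketched as follows: a hash of each connection's address and port 4-tuple selects the CPU, so all packets of one connection stay on one processor while different connections land on different processors. This is a simplified illustration only. RSS as specified for Windows uses a Toeplitz hash and an indirection table, whereas this sketch substitutes `zlib.crc32` and a plain modulo; the function name and the addresses are hypothetical.

```python
import zlib


def rss_cpu(src_ip, src_port, dst_ip, dst_port, num_cpus):
    """Pick a CPU index for a TCP flow from a hash of its 4-tuple."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    # zlib.crc32 is a stand-in hash; real RSS uses a Toeplitz hash
    # plus an indirection table to map hash values to CPUs.
    return zlib.crc32(key) % num_cpus


# Eight hypothetical client connections to one server port.
flows = [("10.0.0.%d" % i, 40000 + i, "10.0.1.1", 80) for i in range(1, 9)]
for flow in flows:
    print(flow, "-> CPU", rss_cpu(*flow, num_cpus=4))
```

Because the mapping depends only on the 4-tuple, repeated packets from the same connection always select the same CPU, which preserves in-order processing within a connection.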
For example, on a four-processor server where interrupts can arrive on any processor, without RSS the
DPC jumps from CPU to CPU, but it only runs on one CPU at a time. Each processor is
busy only 25 percent of the time. The total reported CPU usage of the system is about 25
percent (perhaps more if other applications are also using the CPU). This is a sign that