Receive side scaling (rss), Analyzing performance issues – Dell Emulex Family of Adapters User Manual
Page 872

Emulex Drivers Version 10.2 for Linux User Manual
P010081-01A Rev. A
3. Configuration
Network Performance Tuning
For receive interrupts, disable AIC (since it is enabled by default) and set the interrupt
delay duration using ethtool. For example, to disable AIC and set the constant RX
interrupt delay to 8 microseconds, run
ethtool -C eth
where eth
If your application requires low or predictable latency, Emulex recommends that you
turn off AIC and set rx-usecs to 0.
For transmit interrupts, the default interrupt delay duration is 96 microseconds. You
can change this value using ethtool. For example, to set the TX interrupt delay to 64
microseconds, run
ethtool -C eth
where eth
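The RX and TX settings described above can be sketched as a small dry-run helper. It
only prints the ethtool commands instead of running them. The interface name eth0 is
a placeholder, and adaptive-rx and tx-usecs are the standard ethtool coalescing
keywords, which this sketch assumes the driver accepts alongside rx-usecs.

```shell
#!/bin/sh
# Dry-run sketch: print the ethtool coalescing commands, do not execute them.
# "eth0" below is a placeholder interface name, not taken from the manual.

# rx_coalesce_cmd IFACE USECS
# Disable adaptive RX interrupt coalescing (AIC) and set a constant RX delay.
rx_coalesce_cmd() {
    printf 'ethtool -C %s adaptive-rx off rx-usecs %s\n' "$1" "$2"
}

# tx_coalesce_cmd IFACE USECS
# Set a constant TX interrupt delay.
tx_coalesce_cmd() {
    printf 'ethtool -C %s tx-usecs %s\n' "$1" "$2"
}

rx_coalesce_cmd eth0 8    # constant RX delay of 8 microseconds
rx_coalesce_cmd eth0 0    # low-latency setting: AIC off, rx-usecs 0
tx_coalesce_cmd eth0 64   # TX delay of 64 microseconds
```

To apply a printed command, run it as root. Running ethtool -c on the same interface
shows the coalescing values currently in effect.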
Receive Side Scaling (RSS)
Distributing incoming traffic across several receive rings, each with its own interrupt
vector, spreads the receive processing across several CPU cores. This can reduce
packet drops and improve the packet rate in certain applications. RSS is
enabled in non-SR-IOV and non-multichannel configurations. In multichannel
configurations, RSS is enabled in the first section of each port.
Analyzing Performance Issues
MSI-X interrupts are required for RSS to work. If your motherboard and operating
system version support MSI-X, the Ethernet driver automatically uses MSI-X
interrupts. If there are not enough MSI-X vectors available, the Ethernet driver uses
INTx interrupts, which may decrease performance. The proc node /proc/interrupts
shows the interrupts and their types.
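As a rough illustration of reading the interrupt type from /proc/interrupts, the
sketch below lists the vectors that belong to one interface. It rests on several
assumptions: the interface is called eth0, the driver names its vectors either
exactly after the interface or as the interface name plus a "-" suffix, and the
kernel labels message-signaled vectors with a string containing "MSI". The label
format varies between kernel versions, and many kernels do not distinguish MSI from
MSI-X in this file.

```shell
#!/bin/sh
# irq_types IFACE [FILE]
# Print "<irq> <kind> <name>" for each interrupt line that belongs to IFACE.
# FILE defaults to /proc/interrupts; pass another file to inspect a saved copy.
irq_types() {
    awk -v ifc="$1" '
        # Match a vector named exactly IFACE or IFACE-<suffix> (assumed naming).
        $NF == ifc || index($NF, ifc "-") == 1 {
            irq = $1
            sub(/:$/, "", irq)                       # strip the trailing colon
            kind = ($0 ~ /MSI/) ? "MSI/MSI-X" : "INTx"
            print irq, kind, $NF
        }
    ' "${2:-/proc/interrupts}"
}

# Example (placeholder interface name):
#   irq_types eth0
```

Several MSI/MSI-X lines for one interface suggest that multiple vectors are in use. A
single INTx line suggests that the driver fell back to legacy interrupts.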
The Linux performance utility “top” can monitor CPU utilization while
troubleshooting performance issues. A low idle percentage on any CPU core
indicates an excessive processing load on that CPU. The proc node /proc/interrupts
shows the distribution of the interrupts across the CPU cores. If you see too many
interrupts per second directed to one CPU, check whether the irqbalance program is
running. The irqbalance program is normally started at system boot. In some cases, you
can get better performance by disabling irqbalance and manually distributing
interrupts. To distribute the interrupt load manually across the available CPU
cores, set the CPU affinity for any interrupt vector through the mask in the proc
node /proc/irq/
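The following sketch illustrates both steps: summing an interface's interrupt counts
per CPU, and building a hexadecimal affinity mask from a list of CPU numbers. The
interface name eth0 and IRQ number 45 are placeholders. The vector naming (interface
name, optionally followed by "-" and a suffix) is an assumption, and the mask helper
relies on shell integer arithmetic, so it only covers systems with fewer than 63
CPUs. On Linux the mask is written to the smp_affinity file in the IRQ's /proc/irq
directory.

```shell
#!/bin/sh
# irq_totals_per_cpu IFACE [FILE]
# Sum the interrupt counts of all vectors belonging to IFACE, per CPU column.
# Counts in /proc/interrupts are cumulative since boot, so compare two samples
# taken a known interval apart to estimate interrupts per second.
irq_totals_per_cpu() {
    awk -v ifc="$1" '
        NR == 1 {                                  # header row: CPU0 CPU1 ...
            ncpu = NF
            for (i = 1; i <= NF; i++) name[i] = $i
            next
        }
        $NF == ifc || index($NF, ifc "-") == 1 {   # assumed vector naming
            for (i = 1; i <= ncpu; i++) total[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++) printf "%s %d\n", name[i], total[i]
        }
    ' "${2:-/proc/interrupts}"
}

# cpus_to_mask CPU...
# Build the hexadecimal affinity bitmask for the listed CPU numbers.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Example (as root, with irqbalance stopped; IRQ 45 is a placeholder):
#   cpus_to_mask 2 3 > /proc/irq/45/smp_affinity
```

If irqbalance is still running, it may overwrite a manually written mask, which is
why it is normally stopped before pinning vectors by hand.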
Use the netstat command to look for excessive TCP retransmits or packet drops in the
network stack.
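One way to narrow the netstat output down to the relevant counters is a simple
filter, sketched below. The helper name and the keyword list are assumptions. The
exact counter wording differs between netstat versions, so the pattern matches on
word fragments rather than full phrases.

```shell
#!/bin/sh
# net_trouble_counters
# Read "netstat -s" style output on stdin and keep only the lines that mention
# retransmissions or drops (case-insensitive match on word fragments).
net_trouble_counters() {
    grep -i -E 'retrans|drop'
}

# Example:
#   netstat -s | net_trouble_counters
```

Because these counters are cumulative, running the filter twice with a pause in
between and comparing the values shows whether retransmits or drops are still
increasing.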
In systems having more than one NUMA node, you can get better performance by
pinning interrupts to the NUMA node local to the PCIe device.
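To find the CPUs that are local to the adapter, the NUMA node can be read from sysfs,
as in the sketch below. It assumes a PCIe adapter whose interface is called eth0 (a
placeholder) and a kernel that exposes the numa_node and cpulist attributes. The
SYSFS_ROOT variable is a hypothetical override used only to make the helpers
testable against a copy of the tree.

```shell
#!/bin/sh
# nic_numa_node IFACE
# Print the NUMA node of the PCIe device behind IFACE (-1 means unknown).
nic_numa_node() {
    cat "${SYSFS_ROOT:-/sys}/class/net/$1/device/numa_node"
}

# numa_node_cpus NODE
# Print the CPU list of a NUMA node, for example "8-15".
numa_node_cpus() {
    cat "${SYSFS_ROOT:-/sys}/devices/system/node/node$1/cpulist"
}

# nic_local_cpus IFACE
# Print the CPUs on the NUMA node local to IFACE, or fail if it is unknown.
nic_local_cpus() {
    node=$(nic_numa_node "$1") || return 1
    if [ "$node" -lt 0 ]; then
        echo "no NUMA node reported for $1" >&2
        return 1
    fi
    numa_node_cpus "$node"
}

# Example (placeholder interface name):
#   nic_local_cpus eth0
```

The resulting CPU list identifies the cores to which the adapter's interrupt vectors
can be pinned.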
Use the -S option of ethtool to see all statistics counters maintained by the Ethernet
driver. Excessive drop or error counters are an indication of a bad link or defective
hardware. See Table E-1, Ethtool -S Option Statistics, on page 975, and Table E-2,