
Emulex Drivers Version 10.2 for Linux User Manual
P010081-01A Rev. A
3. Configuration
Network Performance Tuning
Memory Bandwidth Considerations
Higher memory bandwidth leads to better network performance. The following
sections describe how memory bandwidth can be increased.
Enabling Optimal Bandwidth Options
Most computers offer multiple distinct memory channels, or memory interleaves,
which may not be enabled by default. Check the manufacturer's documentation and
BIOS parameters for details on enabling optimal memory bandwidth options.
Populate DIMM Slots
Typically, all the dual in-line memory module (DIMM) slots must be populated to
make use of all the memory channels. As a general rule, using more DIMMs provides
better performance by allowing a higher degree of memory-access interleaving to
occur.
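On Linux, one way to see which DIMM slots are populated is to read the DMI
tables. This is an illustrative check only, not part of the adapter software:
the dmidecode utility must be installed and normally requires root.

```shell
# List each DIMM slot with its fitted size; empty slots report
# "No Module Installed". Requires root and the dmidecode utility.
dmidecode -t memory 2>/dev/null | grep -E 'Locator:|Size:' || \
    echo "run as root, or check slot population in the system BIOS"
```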
Disabling Memory Mirroring
Some servers may allow memory mirroring, in which the total memory is divided in
half and each location is stored twice. Mirroring allows fault recovery if one
memory location detects an error, but it greatly reduces the effective memory
bandwidth of the system. Consider disabling memory mirroring if it is not
needed.
Using a Fast Clock Speed for the Front Side Bus (FSB)
Nearly any desktop or low-end server has enough memory bandwidth for OneConnect
adapters and LPe16202 CFAs in NIC mode to support DMA at 20 Gb/s of data (10
Gb/s read, 10 Gb/s write). However, most of the memory demands come from the
processor accessing the data, either for packet copies in the non-offloaded
networking stack or for application accesses. All processor memory accesses use
the FSB. The clock speed of this bus is critical to efficient memory bandwidth,
so a system with a faster processor FSB clock speed performs better than one
with a slower FSB clock speed.
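The memory clock a system is actually running at can be read from the same DMI
data. This is a hedged sketch (requires root and the dmidecode utility; the
exact field names vary between dmidecode versions): a configured speed lower
than the rated speed indicates the modules are clocked below their maximum.

```shell
# Show rated vs. configured memory clock for each module.
# Requires root and the dmidecode utility.
dmidecode -t memory 2>/dev/null | grep -E 'Speed' || \
    echo "run as root to read DMI memory speed data"
```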
Network Memory Limits
The default values of the tunable parameters in the Linux network stack are
optimal for most network applications involving several TCP/UDP streams. The
optimal size for the network queues and buffers depends on several factors,
such as the protocol, the number of streams (connections), the request size,
and application behavior. The following network configuration settings are a
good combination for the best unidirectional transmit and receive performance
with six or more TCP connections/UDP streams:
echo 4096 87380 4194304 > /proc/sys/net/ipv4/tcp_rmem
echo 4096 16384 4194304 > /proc/sys/net/ipv4/tcp_wmem
echo 64000000 > /proc/sys/net/core/rmem_default
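The echo commands above take effect immediately but are lost at reboot. One
common way to make them persistent is a sysctl drop-in file; this is a sketch,
not part of the Emulex driver package, and the file name below is an arbitrary
choice. The keys are the standard sysctl names for the /proc paths above.

```shell
# Persist the settings across reboots (run as root). The file name
# under /etc/sysctl.d/ is an arbitrary choice.
cat <<'EOF' > /etc/sysctl.d/90-net-tuning.conf
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.core.rmem_default = 64000000
EOF
# Apply immediately without a reboot:
sysctl -p /etc/sysctl.d/90-net-tuning.conf
```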