Category: Linux
2013-11-28 18:15:56
There are many aspects of the Linux kernel configuration that can affect performance. This HOWTO attempts to gather some of these.
When working with a Linux system, there are kernel parameters that limit the maximum send and receive socket buffer sizes.
To see your system's settings, run this command:
The default values are:
(Where "rmem" is for the receive socket, "wmem" is for the send socket.)
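As a minimal sketch, the four limits discussed here can be read directly from /proc/sys/net/core, which needs no root privileges. The helper name show_socket_buffers is ours (hypothetical), and the loop assumes the standard Linux /proc layout; querying the same names with sysctl should report the same numbers.

```shell
# Hypothetical helper: print the default and maximum socket buffer sizes.
show_socket_buffers() {
    for name in rmem_default rmem_max wmem_default wmem_max; do
        file="/proc/sys/net/core/$name"
        if [ -r "$file" ]; then
            printf 'net.core.%s = %s\n' "$name" "$(cat "$file")"
        else
            # Some containers do not expose these files.
            printf 'net.core.%s = (not readable here)\n' "$name"
        fi
    done
}

show_socket_buffers
```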
For improved DDS performance, we suggest increasing the maximum send and receive socket buffer sizes. For example:
To temporarily change any of these parameters you can use sysctl or the /proc filesystem. Either way, the changes apply only to the running operating system and will not survive a reboot.
For example, to set a new value of 65536 for net.core.rmem_default, become the root user and use the sysctl command shown below:
The same change can also be made via the /proc filesystem with the following command:
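The two temporary methods can be sketched as follows. The function names are hypothetical wrappers of ours; the underlying calls are the standard `sysctl -w` and a plain write into /proc/sys. Only the path helper is executed here, so the sketch is safe to run without root.

```shell
# Hypothetical helper: map a dotted sysctl name to its file under /proc/sys.
sysctl_proc_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Temporary change through sysctl (run as root).
set_with_sysctl() {
    sysctl -w "$1=$2"
}

# The same temporary change through the /proc filesystem (run as root).
set_with_proc() {
    echo "$2" > "$(sysctl_proc_path "$1")"
}

# Safe to run as any user: only shows which file would be written.
sysctl_proc_path net.core.rmem_default
```

As root, `set_with_sysctl net.core.rmem_default 65536` amounts to `sysctl -w net.core.rmem_default=65536`, and `set_with_proc net.core.rmem_default 65536` amounts to `echo 65536 > /proc/sys/net/core/rmem_default`.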
Alternatively, to make changes to these parameters permanent across reboots, edit the /etc/sysctl.conf file and add or edit the corresponding variables. For example, to set net.core.rmem_default, edit the /etc/sysctl.conf file and add the line:
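For the value used in the example above, the line would read `net.core.rmem_default = 65536`. If you prefer to script the edit, the following is one possible sketch; set_conf_entry is a hypothetical helper of ours, and the demonstration works on a scratch file rather than the real /etc/sysctl.conf.

```shell
# Hypothetical helper: set "key = value" in a sysctl.conf-style file,
# replacing any earlier assignment of the same key.
set_conf_entry() {
    conf_file=$1
    key=$2
    value=$3
    # Escape the dots so they match literally in the pattern below.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    [ -f "$conf_file" ] || : > "$conf_file"
    grep -v -E "^[[:space:]]*${key_re}[[:space:]]*=" "$conf_file" > "$tmp" || true
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf_file"
    rm -f "$tmp"
}

# Demonstration on a scratch file rather than the real /etc/sysctl.conf.
scratch=$(mktemp)
echo 'net.core.rmem_default = 1024' > "$scratch"
set_conf_entry "$scratch" net.core.rmem_default 65536
cat "$scratch"
rm -f "$scratch"
```

After editing the real file as root, `sysctl -p` reloads /etc/sysctl.conf so the new value takes effect without waiting for a reboot.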
The network device backlog queue (net.core.netdev_max_backlog) holds packets when the interface receives them faster than the kernel can process them. The default setting is around 300. For fast networks (1 GigE and beyond) it is recommended to increase it so that packet bursts do not immediately result in some packets being dropped.
One potential cause of poor performance on a Linux system is the amount of buffer space the kernel uses to reassemble IP fragments. The description here applies to the configuration of Red Hat Enterprise Linux 4.4, but similar considerations apply to other kernel configurations.
The parameters in /proc/sys/net/ipv4 control various aspects of the network, including a parameter that controls the reassembly buffer size.
ipfrag_high_thresh specifies the maximum amount of memory used to reassemble IP fragments. When the memory used by fragments reaches ipfrag_high_thresh, old entries are removed until the memory used declines to ipfrag_low_thresh. If the output of netstat shows an increasing number of failed IP fragment reassemblies, we recommend increasing ipfrag_high_thresh. The impact can be significant. In some use cases, we have seen that increasing this buffer space improved throughput from 32 MB/sec to 80 MB/sec on 1 Gbit Ethernet. To temporarily change the value of ipfrag_high_thresh, use this command as root:
To make this change permanent across reboots, edit the /etc/sysctl.conf file and add the line:
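To judge whether raising the threshold is warranted, the reassembly failure counter can be read from /proc/net/snmp, which is the source of the figure that `netstat -s` reports. The sketch below assumes the usual two-line "Ip:" header and value layout of that file; reasm_fails is a hypothetical helper name. The threshold itself lives in /proc/sys/net/ipv4/ipfrag_high_thresh (sysctl name net.ipv4.ipfrag_high_thresh); the value to write there is left to the reader, since it depends on the workload.

```shell
# Hypothetical helper: print the IPv4 ReasmFails counter.
# The first "Ip:" line holds the column names, the second the values.
reasm_fails() {
    awk '$1 == "Ip:" && !seen {
             for (i = 2; i <= NF; i++) if ($i == "ReasmFails") col = i
             seen = 1
             next
         }
         $1 == "Ip:" && col { print $col; exit }' "${1:-/proc/net/snmp}"
}

if [ -r /proc/net/snmp ]; then
    printf 'IP reassembly failures so far: %s\n' "$(reasm_fails)"
fi
```

Running it twice a few seconds apart under load shows whether the counter is still growing.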
The configuration of the NIC can have a big impact on throughput and latency. This depends heavily on the kind of NIC you use. In our experience the Intel 1 Gbit NICs are very sensitive to this setting, so this section applies mostly to those. Other NICs may be less sensitive or may not even offer the option to configure the setting.
Interrupt coalescing refers to the ability of the NIC to not interrupt the CPU immediately whenever a packet is received, but rather to wait a little in the hope that more packets arrive. That way a single interrupt can be used to process multiple packets. This decision represents a tradeoff between latency and throughput. Coalescing interrupts introduces a wait, which improves throughput but degrades latency. Disabling interrupt coalescing, and thus forcing the NIC to interrupt the CPU for each packet, provides minimal latency but lower throughput.
Out of the box most NICs are configured with an "adaptive" setting (sometimes called "dynamic") which in our experience tends to favor throughput over latency. This depends on the actual NIC used and sometimes also on the Linux distribution.
Depending on the Linux distribution and NIC driver you can use tools such as "ethtool" and "modprobe" to find out and modify the NIC settings. Note that these commands must be executed with "root" privileges.
The first step is to identify the NIC adapter and driver you are using. The "Adapter and Driver ID Guide" provides a good reference on how to do this for different systems.
Step 1 is to identify the list of Ethernet ports and their names. In our system:
Step 2 is to identify the name of the network adapter:
01:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 22)
Subsystem: Giga-byte Technology Marvell 88E8053 Gigabit Ethernet Controller (Gigabyte)
And finally, Step 3 is to find the driver and its version. Note that we use the Ethernet port name "eth1" obtained in the first step:
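The three steps can be approximated as in the sketch below. It assumes that interfaces are exposed under /sys/class/net and that physical NICs carry a device/driver link there; list_ifaces and nic_driver are hypothetical helper names of ours. The driver version itself is reported by `ethtool -i eth1` where ethtool is installed.

```shell
# Assumption: interfaces are exposed under /sys/class/net.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Step 1: list the interface names (eth0, eth1, ...).
list_ifaces() {
    ls "$SYSFS_NET" 2>/dev/null
}

# Step 3: name of the kernel driver bound to an interface, e.g. e1000.
nic_driver() {
    link="$SYSFS_NET/$1/device/driver"
    # Virtual interfaces such as lo have no driver link.
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}

# Step 2 needs lspci (from pciutils); run it only where it is installed.
if command -v lspci >/dev/null 2>&1; then
    lspci | grep -i ethernet || true
fi

for iface in $(list_ifaces); do
    printf '%s: %s\n' "$iface" "$(nic_driver "$iface" || echo 'no driver link')"
done
```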
We can check the current settings by looking at the /etc/modprobe.conf file. Note the use of the driver name "e1000" obtained in the previous step:
This setting of "1" corresponds to an adaptive setting which tries to balance latency and throughput, so it will not provide minimal latency. To achieve minimal latency we must disable Interrupt Throttling completely by setting InterruptThrottleRate=0. We can do that either using ethtool (if the system supports that), or alternatively manually editing the /etc/modprobe.conf file and then rebooting the system. In our experience this second mechanism is the more robust way to do it.
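As a rough sketch of the check described above, the helper below prints the "options" lines for one driver. It assumes the file contains a line of the form `options e1000 InterruptThrottleRate=1`; the scratch file contents are our assumption, not a copy of the original system's file, and driver_options is a hypothetical name.

```shell
# Hypothetical helper: show the "options" lines for one driver in a modprobe file.
driver_options() {
    grep -E "^[[:space:]]*options[[:space:]]+$1([[:space:]]|\$)" "${2:-/etc/modprobe.conf}"
}

# Demonstration on a scratch file with assumed contents.
scratch=$(mktemp)
printf 'alias eth1 e1000\noptions e1000 InterruptThrottleRate=1\n' > "$scratch"
driver_options e1000 "$scratch"
rm -f "$scratch"
```

Changing that line to `options e1000 InterruptThrottleRate=0` and rebooting is the manual route. On drivers that support it, `ethtool -c eth1` shows the current coalescing parameters and `ethtool -C eth1 rx-usecs 0` asks the NIC not to delay receive interrupts.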
Additional information and suggestions for improving the latency on Intel Ethernet controllers can be found at: http://www.kernel.org/doc/Documentation/networking/e1000.txt
The MTU (maximum transmission unit) controls the largest packet size (in bytes) that the interface can transmit to the network. Larger packets will be fragmented, sent in separate network packets, and reassembled on the receiving side. This fragmentation and reassembly diminishes the throughput and increases the latency of the communication.
Out of the box, most Linux systems are configured with an MTU of 1500 bytes. This value derives from old Ethernet NICs and switches that could not handle larger packets. Modern Ethernet hardware (e.g. 1 Gbit/sec NICs) can handle an MTU of at least 9000 bytes. Therefore it is recommended that you reconfigure your operating system network settings to match what the hardware can do.
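To make the cost of a small MTU concrete, the following sketch counts the IPv4 fragments needed for a single UDP datagram. It assumes a 20-byte IPv4 header without options and an 8-byte UDP header; the 8192-byte payload is only an illustrative figure, and fragments_needed is a hypothetical helper name.

```shell
# Number of IPv4 fragments needed for one UDP datagram.
# Assumes a 20-byte IPv4 header (no options) and an 8-byte UDP header.
fragments_needed() {
    payload=$1
    mtu=$2
    # Fragment data must be a multiple of 8 bytes (except in the last fragment).
    per_fragment=$(( (mtu - 20) / 8 * 8 ))
    total=$(( payload + 8 ))
    echo $(( (total + per_fragment - 1) / per_fragment ))
}

fragments_needed 8192 1500   # prints 6
fragments_needed 8192 9000   # prints 1
```

With the default MTU the datagram is split into six packets, and losing any one of them discards the whole datagram; with a 9000-byte MTU it fits in a single packet.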
To see your system's settings, run this command as user root:
In the above example you can see that "eth0", which represents the external interface, has the MTU set to 1500 bytes.
For improved DDS performance, increase the MTU to 9000 bytes. You can do this in two ways. The first is from the command line, which takes effect immediately but lasts only until the next reboot. The second changes the boot settings so that the MTU is set to the new 9000-byte value the next time you reboot.
To do it from the command line, type the following as root (note that "eth0" should be replaced by the name of your interface as shown by the ifconfig command):
To change it permanently in Red Hat Linux, edit the file /etc/sysconfig/network-scripts/ifcfg-eth0 and add the following line to it:
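A minimal sketch of reading and temporarily changing the MTU follows. It assumes the interface is visible under /sys/class/net and that ifconfig is installed; iface_mtu and set_mtu_temp are hypothetical helper names. Only the read-only part is executed here.

```shell
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Current MTU of an interface, read from sysfs (no root needed).
iface_mtu() {
    cat "$SYSFS_NET/$1/mtu"
}

# Temporary change, lost at the next reboot (run as root).
set_mtu_temp() {
    ifconfig "$1" mtu "$2"
}

# lo exists on practically every Linux system, so it makes a safe read-only demo.
if [ -r "$SYSFS_NET/lo/mtu" ]; then
    printf 'lo MTU: %s\n' "$(iface_mtu lo)"
fi
```

As root, `set_mtu_temp eth0 9000` runs `ifconfig eth0 mtu 9000`. On systems without ifconfig, `ip link set dev eth0 mtu 9000` does the same job.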
This is my file after I added the line:
After the file is changed, re-start the network service with the following command:
Now the change will be preserved even if the system is rebooted.
Note that the actual mechanism to set the MTU depends on the Linux distribution. The instructions here are for Red Hat. Ubuntu and Debian offer similar functionality but use different commands and files to configure the MTU.