Tips on choosing and using a Gigabit adapter:
Using the latest driver features
The in-kernel driver often misses the latest features. For example, as of 7 Mar 2011, the CONFIG_E1000E_SEPARATE_TX_HANDLER option exists only in the out-of-tree e1000e driver.
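To check which driver and version an interface is currently using, and which module parameters the installed driver actually exposes (the interface name eth0 is just an assumption here):
ethtool -i eth0        # driver name, version, firmware and bus address for the interface
modinfo -p e1000e      # parameters supported by the currently installed e1000e module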
Bus connection
- Try not to use PCI 33 MHz.
- PCI 66 MHz is OK for one card; PCI-X 66 MHz (and higher) is fine for 1-2 cards. The PCI (and PCI-X?) bus is shared, so another "bandwidth heavy" device on the same bus can cause problems.
- PCI Express is just great for 1 Gbps. A single PCI Express x1 lane may not be enough for 10 Gb cards, but that is another story.
To see how your network card is connected:
dns1 log # lspci -t -v
-[0000:00]-+-00.0 Intel Corporation E7520 Memory Controller Hub
+-00.1 Intel Corporation E7525/E7520 Error Reporting Registers
+-01.0 Intel Corporation E7520 DMA Controller
+-02.0-[0000:01-03]--+-00.0-[0000:02]--+-05.0 LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
| | \-05.1 LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
| \-00.2-[0000:03]--+-04.0 Intel Corporation 82546GB Gigabit Ethernet Controller
| \-04.1 Intel Corporation 82546GB Gigabit Ethernet Controller
+-1d.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1
+-1d.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2
+-1d.2 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3
+-1d.7 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller
+-1e.0-[0000:04]----0c.0 ATI Technologies Inc Rage XL
+-1f.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge
+-1f.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller
\-1f.3 Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus Controller
dns1 log # lspci
00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev 0c)
00:00.1 Class ff00: Intel Corporation E7525/E7520 Error Reporting Registers (rev 0c)
00:01.0 System peripheral: Intel Corporation E7520 DMA Controller (rev 0c)
00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A (rev 0c)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
01:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09)
01:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B (rev 09)
02:05.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)
02:05.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)
03:04.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
03:04.1 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
04:0c.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
You can see that the network card is connected through 02.0, which is "00:02.0 PCI
bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A (rev
0c)".
lspci -vvv will also show the network adapter details:
03:04.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
Subsystem: Intel Corporation PRO/1000 MT Dual Port Network Connection
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR-
Latency: 64 (63750ns min), Cache Line Size 10
Interrupt: pin A routed to IRQ 177
Region 0: Memory at fcfa0000 (64-bit, non-prefetchable) [size=128K]
Region 4: I/O ports at dc00 [size=64]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [e4] PCI-X non-bridge device
Command: DPERE- ERO+ RBC=512 OST=1
Status: Dev=03:04.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
Interrupts
Types
What kinds of interrupts are possible?
- Edge-triggered
- Level-triggered (ugly, especially if the interrupt is shared; on a high-bandwidth device it is death)
- MSI (on PCI 2.3 or later, and on PCI Express). Instead of playing with pins to trigger an interrupt, the device just writes a "message" to memory.
Due to the PCI Express design, it is important to use MSI on PCI Express devices for better performance.
Examples
Important points:
- Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
- Capabilities: [e4] PCI-X non-bridge device
This means the card is PCI-X capable and has MSI interrupts (some sources
say MSI is much better than "old style" interrupts). You must make sure
they are actually used, because some motherboards have issues with MSI
interrupts and disable them.
As a result:
dns1 ~ # cat /proc/interrupts
CPU0 CPU1
0: 1363520287 5788 IO-APIC-edge timer
8: 1 1 IO-APIC-edge rtc
9: 0 0 IO-APIC-level acpi
169: 0 0 IO-APIC-level uhci_hcd:usb1
177: 1835909945 1 IO-APIC-level eth0
193: 158579743 56 IO-APIC-level ioc0
201: 26 20 IO-APIC-level ioc1
217: 0 0 IO-APIC-level uhci_hcd:usb2
225: 0 0 IO-APIC-level uhci_hcd:usb3
233: 0 0 IO-APIC-level ehci_hcd:usb4
NMI: 0 0
LOC: 1358270159 1358270158
ERR: 0
MIS: 0
Here is a strange thing: eth0 supports MSI, but its interrupts are IO-APIC-level.
On another server, where MSI is working properly, the interrupts are
"PCI-MSI-edge" triggered. Recent PCI and PCI Express devices have MSI
(edge-triggered) interrupts; old PCI has only level-triggered interrupts.
212: 1254103 1253084 1253067 1253362 PCI-MSI-edge eth1
213: 2544156492 2544159196 2544163210 2544160813 PCI-MSI-edge eth0
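A quick way to confirm whether MSI is actually enabled for a card is to look at the Enable bit of the MSI capability and at /proc/interrupts (the PCI address and interface names are taken from the examples above; adjust them for your system):
lspci -vv -s 03:04.0 | grep -iE 'MSI|Message Signal'   # "Enable+" means MSI is active
grep -E 'eth|MSI' /proc/interrupts                     # MSI interrupts show up as PCI-MSI-edge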
For high performance applications it is unacceptable for the network card's IRQ to be shared with other devices.
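To see whether the card's IRQ is shared, and optionally pin it to one CPU, something like the following can be used (IRQ 177 is taken from the eth0 example above; irqbalance, if running, may overwrite the affinity later):
grep '177:' /proc/interrupts            # more than one device name on the line means the IRQ is shared
echo 1 > /proc/irq/177/smp_affinity     # CPU bitmask: handle all eth0 interrupts on CPU0 only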
Card internals
FIFO buffer
Most Intel gigabit cards have an internal FIFO buffer, which collects
packets for/from DMA transfers (not sure if that is exactly correct). If
you plan high-bandwidth and high-PPS applications, it is important to have
a card with a large enough buffer. I used to think that the differences
between Intel gigabit adapters were minor, and that any of them was
acceptable for high-load applications.
But that was a mistake. For example, the 82566DC gigabit adapter embedded
in ICH8 has, because of a bug or by design, an internal FIFO of only 16K!
That means 8K for RX and 8K for TX by default. Maybe that is why it does
not support jumbo frames: for jumbo transmit you need to hold two packets
in the TX FIFO, which for jumbo frames means 16110 bytes would take the
whole FIFO and leave nothing for RX. Because of the small FIFO buffer, and
probably PCI Express latency, at a high packet rate (around 100 Kpps) you
will hit a wall. First of all, flow control will not work properly;
second, even without flow control you can get packet loss.
So you always have to look for an adapter with a large enough internal
"packet buffer", sometimes also called FIFO. You can look at the PBA value
in the driver sources; it usually shows half of the PBS size. Sometimes RX
is larger than TX, for example PBS is 64K and PBA is 48K. But it is more
important to have a big enough PBA (RX buffer).
How to find the PBA value (it usually defines the RX FIFO buffer size) in the source code:
void
e1000_reset(struct e1000_adapter *adapter)
{
    u32 pba = 0, tx_space, min_tx_space, min_rx_space;
    u16 fc_high_water_mark = E1000_FC_HIGH_DIFF;
    bool legacy_pba_adjust = false;

    /* Repartition Pba for greater than 9k mtu
     * To take effect CTRL.RST is required.
     */

    switch (adapter->hw.mac_type) {
    case e1000_82542_rev2_0:
    case e1000_82542_rev2_1:
    case e1000_82543:
    case e1000_82544:
    case e1000_82540:
    case e1000_82541:
    case e1000_82541_rev_2:
        legacy_pba_adjust = true;
        pba = E1000_PBA_48K;
        break;
    case e1000_82545:
    case e1000_82545_rev_3:
    case e1000_82546:
    case e1000_82546_rev_3:
        pba = E1000_PBA_48K;
        break;
    case e1000_82547:
    case e1000_82547_rev_2:
        legacy_pba_adjust = true;
        pba = E1000_PBA_30K;
        break;
    case e1000_82571:
    case e1000_82572:
    case e1000_80003es2lan:
        pba = E1000_PBA_38K;
        break;
    case e1000_82573:
        pba = E1000_PBA_20K;
        break;
    case e1000_ich8lan:
        pba = E1000_PBA_8K;
First you can see "case e1000_ich8lan:", which is the card model. Then you see the PBA size: pba = E1000_PBA_8K;.
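If you have the driver sources at hand, a quicker way to get the same information is to grep for the PBA constants. The paths below assume the old in-tree e1000 layout and differ between kernel versions:
grep -n 'E1000_PBA_' drivers/net/e1000/e1000_main.c   # which PBA value each MAC type gets
grep -n 'E1000_PBA_' drivers/net/e1000/e1000_hw.h     # the constant definitions themselves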
Live diagnostics
Errors
If you have packet loss, you should check ethtool -S ethX.
MegaRouterXeon-KARAM ~ # ethtool -S eth0
NIC statistics:
rx_packets: 3282740738
tx_packets: 3279620759
rx_bytes: 1890275803477
tx_bytes: 1885940915317
rx_broadcast: 243768
tx_broadcast: 26050
rx_multicast: 1894634
tx_multicast: 67842
rx_errors: 0
tx_errors: 0
tx_dropped: 0
multicast: 1894634
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 113102
rx_missed_errors: 10308
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_deferred_ok: 30151971
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
tx_timeout_count: 0
tx_restart_queue: 4967999
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_good: 102
tx_tcp_seg_failed: 0
rx_flow_control_xon: 30545078
rx_flow_control_xoff: 30730827
tx_flow_control_xon: 2799
tx_flow_control_xoff: 5653
rx_long_byte_count: 1890275803477
rx_csum_offload_good: 3081285221
rx_csum_offload_errors: 140979
rx_header_split: 0
alloc_rx_buff_failed: 0
tx_smbus: 0
rx_smbus: 3
dropped_smbus: 0
- rx_no_buffer_count: 113102
If you see this error, most probably you need to increase the "ring" size.
Example
MegaRouterXeon-KARAM ~ # ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 1024
RX Mini: 0
RX Jumbo: 0
TX: 256
Increasing the ring to 2048 packets:
ethtool -G eth0 rx 2048
- rx_missed_errors: 10308
This error can mean many things, including not enough bus bandwidth, a host that is too busy (try to enable flow control, as shown below), or a PBA buffer that is too small.
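If you want to try enabling flow control as suggested, ethtool can show and change the pause settings (interface name assumed; the link partner must also honour pause frames):
ethtool -a eth0                 # show current pause (flow-control) settings
ethtool -A eth0 rx on tx on     # enable receiving and sending pause frames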
- rx_flow_control_xon / rx_flow_control_xoff
These counters mean the host received "flow-control" (pause) frames and delayed its transmission accordingly; this is not actually an error.
- tx_restart_queue: 4967999
No idea. Researching.