Troubleshooting Ethernet bridging on Linux

Objective

To diagnose problems arising from use of the Linux bridge module.

Background

An Ethernet bridge (or switch) is a device for forwarding packets between two or more Ethernets so that they behave in most respects as if they were a single network. It could be a physical device, but it is also possible for a bridge to be implemented entirely in software. The Linux kernel has the ability to perform bridging by means of the bridge module.

Symptoms

The most likely symptoms of a bridging problem are that:

the bridge does not forward traffic,
the bridge forwards traffic intermittently,
the bridge causes a storm of duplicate traffic, or
the machine hosting the bridge appears to freeze.

Investigation

Strategy

If the bridge is not forwarding traffic then there are at least six possibilities to consider:

The bridge has not been created.
The appropriate interfaces have not been attached to the bridge.
The bridge or the attached interfaces are not in the ‘up’ state.
The bridge ports are not in the ‘forwarding’ state.
The traffic to be bridged is not reaching the relevant interface.
The traffic is being filtered by ebtables.

Intermittent forwarding usually has some form of intermittent connectivity as its root cause. However, there are two ways in which the use of bridging can exacerbate what might otherwise have been a less serious problem:

If STP is enabled then the spanning tree may become unstable due to the topology changing faster than the tree can converge.
Even without STP, the bridge forwarding delay typically adds 15 seconds to the recovery time for even the briefest of outages.

If the problem is likely to recur frequently then it may be possible to tune the bridge parameters so that the network is more resilient to outages of this nature.

A storm of duplicate traffic almost certainly indicates that the network contains one or more loops. You then have a choice between:

finding the loops and breaking them manually, or
enabling STP (the Spanning Tree Protocol) or an equivalent, which automatically disables any link that would cause a loop.

(Be aware that loops are sometimes created deliberately in order to provide redundancy. It is then necessary to have either some form of failover or load balancing mechanism. STP can be used to provide failover, whereas load balancing requires use of a protocol such as LACP.)

If the machine appears to freeze after adding a network interface to a bridge then this could be because:

you are administering it remotely via that interface (for example using SSH), or
the machine depends on that interface for vital services (for example NFS or LDAP).

Removing the interface from the bridge will solve the immediate problem. The underlying issue is that when an interface is attached to a bridge then any network addresses need to be bound to the bridge, not to the interface.
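As an illustration of binding the address to the bridge rather than to the attached interface, the following sketch prints (but deliberately does not run) a pair of ifconfig commands for review. The function name is hypothetical and the address and netmask in the usage note are placeholders. Removing the address from the interface will normally also remove any routes that depend on it, so a default route may need to be re-added afterwards; that step is not shown.

```shell
#!/bin/sh
# Sketch only: print_move_address is an illustrative helper, not a standard tool.
# It prints the commands instead of executing them, because running them over
# the very connection that depends on the address could cut that connection.
# $1 = bridge, $2 = attached interface, $3 = IPv4 address, $4 = netmask
print_move_address() {
    # Clear the IPv4 address from the attached interface.
    echo "ifconfig $2 0.0.0.0"
    # Bind the address to the bridge and bring the bridge up.
    echo "ifconfig $1 $3 netmask $4 up"
}
```

For example, print_move_address br0 eth0 192.0.2.10 255.255.255.0 prints the two commands, which could then be run from the console once they have been checked.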

Remember that changes made using the brctl or ifconfig commands are not persistent. Most GNU/Linux distributions provide a mechanism for creating a persistent bridge, but the configuration method varies from one distribution to another.

Check that the bridge has been created and the appropriate interfaces attached to it

A list of bridges can be displayed using the brctl show command:

brctl show

the output from which should be of the form:

bridge name     bridge id               STP enabled     interfaces
br0             8000.0200c0a80091       no              eth0
                                                        eth1

Verify that the bridge exists, has the name you expect, and is attached to the appropriate interfaces.
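The same check can be scripted without parsing the brctl output, because the kernel exposes bridge membership under /sys/class/net. The following is a minimal sketch under that assumption. The function names and the SYSFS_ROOT variable are illustrative and not part of bridge-utils; SYSFS_ROOT exists only so that the functions can be pointed at a mock directory tree.

```shell
#!/bin/sh
# Sketch only: helper names and SYSFS_ROOT are illustrative assumptions.
# SYSFS_ROOT defaults to /sys; override it to test against a mock tree.
: "${SYSFS_ROOT:=/sys}"

# Succeed if the named device exists and is a bridge
# (bridge devices have a 'bridge' subdirectory in sysfs).
is_bridge() {
    [ -d "$SYSFS_ROOT/class/net/$1/bridge" ]
}

# Print the interfaces attached to the named bridge, one per line.
bridge_ports() {
    for port in "$SYSFS_ROOT/class/net/$1/brif"/*; do
        if [ -e "$port" ]; then
            basename "$port"
        fi
    done
}

# Succeed if interface $2 is attached to bridge $1.
is_attached() {
    [ -e "$SYSFS_ROOT/class/net/$1/brif/$2" ]
}
```

For the bridge shown above, is_bridge br0 should succeed and bridge_ports br0 should list eth0 and eth1.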

Check whether the bridge and attached interfaces are up or down

Bridges, like network interfaces, have an ‘up’ state and a ‘down’ state and they will not pass any traffic unless they are up. You can check whether a bridge is up or down using the ifconfig command:

ifconfig br0

Here is an example of the output from this command for an interface that is down. The relevant line is the one listing the interface flags (BROADCAST MULTICAST), which does not include UP:

br0       Link encap:Ethernet  HWaddr 36:0a:79:b5:4e:66
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

and for the same interface when up:

br0       Link encap:Ethernet  HWaddr 36:0a:79:b5:4e:66
          inet6 addr: fe80::340a:79ff:feb5:4e66/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:168 (168.0 B)

If the bridge needs to be brought up then this can be done using the ifconfig command:

ifconfig br0 up

The same considerations apply to each of the attached Ethernet interfaces: these can be brought up or down independently of the bridge, and they will only pass traffic if they are up.
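To check the bridge and all of its attached interfaces in one pass, the interface flags can be read from sysfs instead of from the ifconfig output. This is a sketch, assuming the kernel's /sys/class/net/<device>/flags file, in which bit 0x1 corresponds to the UP flag. The function names and the SYSFS_ROOT variable are illustrative.

```shell
#!/bin/sh
# Sketch only: helper names and SYSFS_ROOT are illustrative assumptions.
: "${SYSFS_ROOT:=/sys}"

# Succeed if the named interface (or bridge) is administratively up.
# The sysfs 'flags' file holds the interface flags in hexadecimal;
# bit 0x1 is the UP flag. Returns 2 if the device cannot be read.
iface_is_up() {
    flags=$(cat "$SYSFS_ROOT/class/net/$1/flags" 2>/dev/null) || return 2
    [ $(( $flags & 1 )) -ne 0 ]
}

# Print "<device> up" or "<device> down" for a bridge and each of its ports.
report_link_state() {
    for path in "$1" "$SYSFS_ROOT/class/net/$1/brif"/*; do
        dev=$(basename "$path")
        # Skip anything that is not a readable network device.
        [ -e "$SYSFS_ROOT/class/net/$dev/flags" ] || continue
        if iface_is_up "$dev"; then
            echo "$dev up"
        else
            echo "$dev down"
        fi
    done
}
```

Running report_link_state br0 prints one line for the bridge followed by one line per attached interface. Any device reported as down can be brought up with ifconfig as described above.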

Check whether the bridge ports are in the ‘forwarding’ state

At any given time, a Linux bridge port will be in one of five possible states: ‘disabled’, ‘listening’, ‘learning’, ‘forwarding’ or ‘blocking’. You can find out which using the brctl showstp command:

brctl showstp br0

the output from which should be of the form:

br0
 bridge id              8000.e0699577868f
 designated root        8000.e0699577868f
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.01
 hello timer               0.64                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  15.64
 flags


eth0 (1)
 port id                8001                    state                forwarding
 designated root        8000.e0699577868f       path cost                  4
 designated bridge      8000.e0699577868f       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

eth1 (2)
 port id                8002                    state                forwarding
 designated root        8000.e0699577868f       path cost                100
 designated bridge      8000.e0699577868f       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

The relevant field is the state reported for each port. ‘forwarding’ is the state you want the port to be in so that it can carry traffic. In this state you can be reasonably confident that the bridge and interface are up, that there is a physical connection, and that forwarding has not been blocked by STP.

‘blocking’ indicates that the port has been prevented from forwarding traffic by STP or an equivalent in order to prevent a bridge loop from forming. This should only happen when there is another path that the network traffic can take (excepting the brief loss of connectivity that occurs when the spanning tree changes).

If you want a particular network segment to be used in preference to any other paths that might be available then there are two ways to achieve that safely: either change the network topology manually so that it becomes the only path, or adjust the STP path costs so that it becomes the cheapest path. Otherwise, be assured that the ‘blocking’ state is a normal part of the operation of STP and does not by itself indicate that there is a problem.

‘listening’ indicates that the STP implementation has not yet decided whether the port should enter the ‘forwarding’ or ‘blocking’ state. ‘learning’ indicates that it is about to enter the ‘forwarding’ state, but is attempting to populate its MAC address table first to avoid an immediate burst of packets echoed to all ports. These should be transient states that last for a few tens of seconds at most. If you find that a link is spending an excessive amount of time in one or other of these states then that could indicate that there is a link flapping up and down (not necessarily a local one), or that the size and complexity of the network has caused the spanning tree to become unstable.

‘disabled’ indicates that the port is non-operational for some other reason, for example because the corresponding network interface is down or has been physically disconnected from the network. This is clearly a problem if you want the port to carry traffic, but it is not a problem with the bridge as such.
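The port states can also be read directly from sysfs, which is convenient when you want to poll them repeatedly while watching for a port that fails to settle. The sketch below assumes the kernel's numeric encoding of the five states (0 to 4, in the order shown in the case statement) as exposed in the per-port state file. The function names and the SYSFS_ROOT variable are illustrative.

```shell
#!/bin/sh
# Sketch only: helper names and SYSFS_ROOT are illustrative assumptions.
: "${SYSFS_ROOT:=/sys}"

# Translate the numeric port state used by the kernel into its name.
port_state_name() {
    case "$1" in
        0) echo disabled ;;
        1) echo listening ;;
        2) echo learning ;;
        3) echo forwarding ;;
        4) echo blocking ;;
        *) echo unknown ;;
    esac
}

# Print "<port> <state>" for every port of the named bridge.
port_states() {
    for port in "$SYSFS_ROOT/class/net/$1/brif"/*; do
        if [ -r "$port/state" ]; then
            echo "$(basename "$port") $(port_state_name "$(cat "$port/state")")"
        fi
    done
}
```

For example, port_states br0 | grep -v ' forwarding$' lists only those ports that are not currently forwarding.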

Check which MAC addresses have been seen by the bridge

In the course of its operation a bridge must attempt to determine which MAC addresses are reachable through each of its attached interfaces. It does this by inspecting the source address of each packet that arrives at the bridge and recording it in a table. In the case of the Linux bridging module it is possible to inspect the content of this table using the brctl showmacs command:

brctl showmacs br0

The output is typically of the form:

port no mac addr                is local?       ageing timer
  1     02:54:65:73:74:31       no                 1.42
  1     02:54:65:73:74:32       no                 3.34
  1     02:54:65:73:74:33       no                 2.46
  1     02:54:65:73:74:34       yes                0.00
  2     02:54:65:73:74:35       no                 1.42
  2     02:54:65:73:74:36       no                 3.34
  2     02:54:65:73:74:37       no                 2.46

The value of this information for troubleshooting is that it tells you whether any packets from a given machine are being processed by the bridge. Possible explanations for the non-appearance of a MAC address are that:

packets from the machine in question are not reaching the bridge for some reason;
the receiving interface is not attached to the bridge or is down (see above);
the bridge port is disabled (see above); or
the address was in the table but has since expired.

Addresses typically expire after 5 minutes, so this is unlikely to be an issue if packets are being actively sent at the time you check the table, but it is a point to bear in mind if there has been any substantial delay between sending and checking.
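When the table is long it can be easier to search it for one address than to read it by eye. The following sketch filters the output of brctl showmacs for a given MAC address; the function name is hypothetical and the column layout is assumed to match the example above.

```shell
#!/bin/sh
# Sketch only: mac_port is an illustrative helper, not part of bridge-utils.
# Read `brctl showmacs` output on stdin and print the port number on which
# the given MAC address ($1) was seen. The comparison is case-insensitive.
# The exit status is non-zero if the address does not appear in the table.
mac_port() {
    wanted=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    awk -v mac="$wanted" '
        NR > 1 && tolower($2) == mac { print $1; found = 1 }
        END { exit !found }
    '
}
```

Typical usage would be brctl showmacs br0 | mac_port 02:54:65:73:74:35, which for the example table above reports port 2.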

Check ebtables

Ebtables is a packet filter that is similar in concept to iptables, except that it operates at the link layer rather than the network layer (acting on Ethernet frames as they are bridged as opposed to IP datagrams as they are routed). Ebtables is transparent by default, and this is the state you are likely to find it in on most machines, but it is worth checking because ebtables rules are capable of blocking or altering bridge traffic in an almost arbitrary manner.

If the ebtables command is available then you can view the rulebase using the -L option. For example, for the filter table:

ebtables -t filter -L

Normally you would expect this to be empty:

Bridge table: filter

Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: FORWARD, entries: 0, policy: ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

The same applies to the nat and broute tables.

If the ebtables command is not installed then that strongly suggests ebtables is not being used, although it is conceivable that rules could have been added by some other means.

If the rulebase is non-empty then you can obtain some insight into what effect it might be having by inspecting the counters associated with each rule:

ebtables -t filter -L --Lc

Each rule has two counters: pcnt (the number of packets) and bcnt (the number of bytes). As with iptables, it can be helpful to insert additional rules for the purpose of monitoring.
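The two steps above can be sketched as a pair of small helpers: one that lists all three tables with their counters, and one that inserts a monitoring rule. Both function names are hypothetical. The monitoring rule uses the CONTINUE target, which passes frames on to the next rule, so it should only count matching frames and not change how they are handled. Both helpers need root privileges, and the broute table may not be available with every ebtables implementation.

```shell
#!/bin/sh
# Sketch only: dump_ebtables and count_frames_from are illustrative helpers.

# List the rules and counters in each of the three ebtables tables.
dump_ebtables() {
    for table in filter nat broute; do
        ebtables -t "$table" -L --Lc
    done
}

# Insert a rule at the head of the FORWARD chain that matches bridged frames
# from source MAC address $1 and continues to the next rule, so that its
# pcnt and bcnt counters record how much traffic from that address is seen.
count_frames_from() {
    ebtables -t filter -I FORWARD -s "$1" -j CONTINUE
}
```

After count_frames_from 02:54:65:73:74:31 has been in place for a while, dump_ebtables shows whether the counters for that rule are increasing. The rule can be removed again by repeating the ebtables command with -D in place of -I.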

Finding bridge loops

Bridge loops are by their nature difficult to track down because the resulting packet storms will propagate throughout the entire network unless stopped. The packet source addresses are unlikely to be helpful because they say only where the traffic was originally sent from and not where it was replicated. Tracing the packet flow with a tool like tcpdump is not straightforward when all copies of a given packet are identical.

A more effective method is to partition the network until the symptoms disappear, then cautiously reconnect it one link at a time until they reappear. This is not something you would normally want to do to a network of any importance, but if it has already been incapacitated by a bridge loop then the potential for further harm is likely to be limited.

A packet capture tool such as tcpdump or Wireshark can be used for monitoring. It does not matter greatly where this is attached, provided that you are not using bridges that have active protection against packet storms (see below). You should disable reverse DNS lookup of captured IP addresses (the -n option in the case of tcpdump) because the DNS is unlikely to work reliably if there is a bridge loop. If you do not do this then there is likely to be a delay between when packets are captured and when they are displayed, which will prevent you from obtaining a timely view of what is happening on the network.

You should also ensure that there is a source of broadcast traffic on the network, so that a packet storm will occur promptly whenever a loop is created. An ongoing attempt to ping a non-existent IP address on a local subnet will have the required effect. (If a responsive IP address were chosen then the traffic would largely consist of unicast ICMP echo requests which would not necessarily be amplified. Choosing an address that does not exist will instead result in broadcast ARP requests.)
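As a lightweight complement to a packet capture, the kernel's per-interface receive counter can be sampled to give a rough indication of whether a storm has started. This is a sketch, not a substitute for the capture described above. The function name and the SYSFS_ROOT variable are illustrative, and what counts as an abnormal rate depends on the network in question.

```shell
#!/bin/sh
# Sketch only: rx_delta and SYSFS_ROOT are illustrative assumptions.
: "${SYSFS_ROOT:=/sys}"

# Print how many packets the interface received during an interval.
# $1 = interface, $2 = interval in seconds (default 1).
rx_delta() {
    counter="$SYSFS_ROOT/class/net/$1/statistics/rx_packets"
    before=$(cat "$counter") || return 1
    sleep "${2:-1}"
    after=$(cat "$counter") || return 1
    echo $(( after - before ))
}
```

With a ping to an unused local address running in the background, rx_delta eth0 5 should report only a modest number of packets on a quiet, loop-free network, and a very much larger number once a loop has been reintroduced. A capture limited to ARP, for example tcpdump -n -i eth0 arp, can then be used to confirm that the same request is being repeated.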

If the act of reconnecting a network segment causes the symptoms to reappear then there are two possibilities to consider:

The segment may form part of the loop that you are investigating, in which case reconnecting will have caused the loop to be reestablished.
The loop may lie beyond that segment, in which case it existed throughout the test but was unreachable from the monitoring point while the segment was disconnected.

A characteristic of the first condition is that there will be connectivity between the two parts of the network even when the segment under test is disconnected. (This connectivity might be unidirectional, but for a loop to form there must be a return path of some description.) By sending a stream of test packets from one side of the disconnected segment to the other it should be a relatively straightforward matter to trace the path they are taking.

If the loop lies beyond the disconnected segment then you can reconnect the remainder of the network then repeat this diagnostic procedure for the problematic region in isolation.

A complicating factor is that some networking equipment attempts to detect packet storms and actively protect against them, typically by disabling the port receiving the traffic. In the best case, where the loop is located at the edge of the network, this can both contain the effects of the loop and greatly simplify diagnosis (as the cause may be obvious once you know which port has been disabled). In other circumstances it can hinder troubleshooting by making the network more stateful, and there may be a case for temporarily turning this feature off despite the immediate negative consequences.

Using STP

Rather than attempting to find and break loops manually you can use the Spanning Tree Protocol (STP) to achieve the same result automatically:

brctl stp br0 yes

Ideally STP should be enabled on all bridges throughout the network. Failing that, any bridges that form part of a loop and are not STP-aware must be transparent to it. For obvious reasons there must be at least one STP-aware bridge in each loop.

For small networks STP should just work without further configuration. If the network is larger, or has frequent topology changes, then some tuning may be necessary to achieve acceptable results.
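To confirm that STP is actually enabled on a bridge, its state can be read back from sysfs as well as from the brctl show output. The sketch below assumes the kernel's stp_state file, in which 0 means off, 1 means the in-kernel implementation and 2 means a user-space implementation. The function name and the SYSFS_ROOT variable are illustrative.

```shell
#!/bin/sh
# Sketch only: stp_status and SYSFS_ROOT are illustrative assumptions.
: "${SYSFS_ROOT:=/sys}"

# Print whether STP is enabled on the named bridge.
# Fails if the device does not exist or is not a bridge.
stp_status() {
    state=$(cat "$SYSFS_ROOT/class/net/$1/bridge/stp_state" 2>/dev/null) || return 1
    case "$state" in
        0) echo "off" ;;
        1) echo "kernel STP" ;;
        2) echo "user-space STP" ;;
        *) echo "unknown ($state)" ;;
    esac
}
```

Running stp_status br0 after brctl stp br0 yes should no longer report "off".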

The machine appears to freeze

As noted above, adding an interface to a bridge causes it to stop acting as an Internet Protocol endpoint. This could result in the machine appearing to freeze if:

you are administering it remotely via that network interface, for example using SSH, or
the machine depends on the network for vital services, for example NFS or LDAP.

The solution is to remove the interface from the bridge by the most graceful means possible. In order of preference:

Log on using the console and issue a brctl delif command, for example brctl delif br0 eth0.
Reboot the machine gracefully, for example by sending control-alt-delete to the console. Note that, depending on what has been disabled, this may take considerably longer than it would do if the network were available.
Forcibly reboot the machine, for example by power-cycling it.

If the bridging commands have been inserted into the startup scripts then you will need to remove them. You may be able to do this by booting into a recovery mode or from a live CD. For a remotely hosted machine, however, you may have to resort to reimaging it (with loss of all data).