Techniques for P2P Communication over middleboxes-biti-leaf-ChinaUnix博客

大学之道，在明明德，在亲民，在止于至善。werewolf.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

biti-leaf

博客访问： 630542
博文数量： 127
博客积分： 6136
博客等级：准将
技术积分： 1461
用户组：普通用户
注册时间： 2009-10-24 00:32

文章分类

全部博文（127）

三菱（1）
Go（0）
AngularJS（0）
网络技术（1）
python学习（0）
shell学习（0）
《编程之美》读书（0）
/proc文件系统（0）
数据库（0）
数据结构与算法（0）
常用资源（0）
编程工具简介（2）
编程专栏（3）
linux C常用函数（0）
日记（0）
windows编程（0）
windows驱动开发（0）
linux命令使用（0）
社会生活（0）
车队骑行（0）
游戏相关（0）
linux学习（0）
未分配的博文（120）

推荐博文

相关博文

Techniques for P2P Communication over middleboxes

分类：系统运维

2010-10-27 17:03:49

1. Relaying（转发）

   The most reliable, but least efficient, method of implementing peer-
   to-peer communication in the presence of a middlebox is to make the
   peer-to-peer communication look to the network like client/server
   communication through relaying. For example, suppose two client
   hosts, A and B, have each initiated TCP or UDP connections with a
   well-known server S having a permanent IP address. The clients
   reside on separate private networks, however, and their respective
   middleboxes prevent either client from directly initiating a
   connection to the other.

                                Server S
                                   |
                                   |
            +----------------------+----------------------+
            |                                             |
          NAT A                                         NAT B
            |                                             |
            |                                             |
         Client A                                      Client B

   Instead of attempting a direct connection, the two clients can simply
   use the server S to relay messages between them. For example, to
   send a message to client B, client A simply sends the message to
   server S along its already-established client/server connection, and
   server S then sends the message on to client B using its existing
   client/server connection with B.

   This method has the advantage that it will always work as long as

   both clients have connectivity to the server. Its obvious
   disadvantages are that it consumes the server's processing power and
   network bandwidth unnecessarily, and communication latency between
   the two clients is likely to be increased even if the server is well-
   connected. The TURN protocol [TURN] defines a method of implementing
   relaying in a relatively secure fashion.

2. Connection reversal（反向连接）

   The second technique works if only one of the clients is behind a
   middlebox. For example, suppose client A is behind a NAT but client
   B has a globally routable IP address, as in the following diagram:

                                Server S
                            18.181.0.31:1235
                                   |
                                   |
            +----------------------+----------------------+
            |                                             |
          NAT A                                           |
    155.99.25.11:62000                                    |
            |                                             |
            |                                             |
         Client A                                      Client B
      10.0.0.1:1234                               138.76.29.7:1234

   Client A has private IP address 10.0.0.1, and the application is
   using TCP port 1234. This client has established a connection with
   server S at public IP address 18.181.0.31 and port 1235. NAT A has
   assigned TCP port 62000, at its own public IP address 155.99.25.11,
   to serve as the temporary public endpoint address for A's session
   with S: therefore, server S believes that client A is at IP address
   155.99.25.11 using port 62000. Client B, however, has its own
   permanent IP address, 138.76.29.7, and the peer-to-peer application
   on B is accepting TCP connections at port 1234.

   Now suppose client B would like to initiate a peer-to-peer
   communication session with client A. B might first attempt to
   contact client A either at the address client A believes itself to
   have, namely 10.0.0.1:1234, or at the address of A as observed by
   server S, namely 155.99.25.11:62000. In either case, however, the
   connection will fail. In the first case, traffic directed to IP
   address 10.0.0.1 will simply be dropped by the network because
   10.0.0.1 is not a publicly routable IP address. In the second case,
   the TCP SYN request from B will arrive at NAT A directed to port
   62000, but NAT A will reject the connection request because only
   outgoing connections are allowed.

   After attempting and failing to establish a direct connection to A,
   client B can use server S to relay a request to client A to initiate
   a "reversed" connection to client B. Client A, upon receiving this
   relayed request through S, opens a TCP connection to client B at B's
   public IP address and port number. NAT A allows the connection to
   proceed because it is originating inside the firewall, and client B
   can receive the connection because it is not behind a middlebox.

   A variety of current peer-to-peer systems implement this technique.
   Its main limitation, of course, is that it only works as long as only
   one of the communicating peers is behind a NAT: in the increasingly
   common case where both peers are behind NATs, the method fails.
   Because connection reversal is not a general solution to the problem,
   it is NOT recommended as a primary strategy. Applications may choose
   to attempt connection reversal, but should be able to fall back
   automatically on another mechanism such as relaying if neither a
   "forward" nor a "reverse" connection can be established.

3.UDP hole punching

   The third technique, and the one of primary interest in this
   document, is widely known as "UDP Hole Punching." UDP hole punching
   relies on the properties of common firewalls and cone NATs to allow
   appropriately designed peer-to-peer applications to "punch holes"
   through the middlebox and establish direct connectivity with each
   other, even when both communicating hosts may lie behind middleboxes.
   This technique was mentioned briefly in section 5.1 of RFC 3027 [NAT-
   PROT], and has been informally described elsewhere on the Internet
   [KEGEL] and used in some recent protocols [TEREDO, ICE]. As the name
   implies, unfortunately, this technique works reliably only with UDP.

   We will consider two specific scenarios, and how applications can be
   designed to handle both of them gracefully. In the first situation,
   representing the common case, two clients desiring direct peer-to-
   peer communication reside behind two different NATs. In the second,
   the two clients actually reside behind the same NAT, but do not
   necessarily know that they do.

3.1. Peers behind different NATs

   Suppose clients A and B both have private IP addresses and lie behind
   different network address translators. The peer-to-peer application
   running on clients A and B and on server S each use UDP port 1234. A
   and B have each initiated UDP communication sessions with server S,
   causing NAT A to assign its own public UDP port 62000 for A's session
   with S, and causing NAT B to assign its port 31000 to B's session
   with S, respectively.

                                Server S
                            18.181.0.31:1234
                                   |
                                   |
            +----------------------+----------------------+
            |                                             |
          NAT A                                         NAT B
    155.99.25.11:62000                            138.76.29.7:31000
            |                                             |
            |                                             |
         Client A                                      Client B
      10.0.0.1:1234                                 10.1.1.3:1234

   Now suppose that client A wants to establish a UDP communication
   session directly with client B. If A simply starts sending UDP
   messages to B's public address, 138.76.29.7:31000, then NAT B will
   typically discard these incoming messages (unless it is a full cone
   NAT), because the source address and port number does not match those
   of S, with which the original outgoing session was established.
   Similarly, if B simply starts sending UDP messages to A's public
   address, then NAT A will typically discard these messages.

   Suppose A starts sending UDP messages to B's public address, however,
   and simultaneously relays a request through server S to B, asking B
   to start sending UDP messages to A's public address. A's outgoing
   messages directed to B's public address (138.76.29.7:31000) cause NAT
   A to open up a new communication session between A's private address
   and B's public address. At the same time, B's messages to A's public
   address (155.99.25.11:62000) cause NAT B to open up a new
   communication session between B's private address and A's public
   address. Once the new UDP sessions have been opened up in each
   direction, client A and B can communicate with each other directly
   without further burden on the "introduction" server S.

   The UDP hole punching technique has several useful properties. Once
   a direct peer-to-peer UDP connection has been established between two
   clients behind middleboxes, either party on that connection can in
   turn take over the role of "introducer" and help the other party
   establish peer-to-peer connections with additional peers, minimizing
   the load on the initial introduction server S. The application does
   not need to attempt to detect explicitly what kind of middlebox it is
   behind, if any [STUN], since the procedure above will establish peer-
   to-peer communication channels equally well if either or both clients
   do not happen to be behind a middlebox. The hole punching technique
   even works automatically with multiple NATs, where one or both
   clients are removed from the public Internet via two or more levels
   of address translation.

3.2. Peers behind the same NAT

   Now consider the scenario in which the two clients (probably
   unknowingly) happen to reside behind the same NAT, and are therefore
   located in the same private IP address space. Client A has
   established a UDP session with server S, to which the common NAT has
   assigned public port number 62000. Client B has similarly

   established a session with S, to which the NAT has assigned public
   port number 62001.

                                Server S
                            18.181.0.31:1234
                                   |
                                   |
                                  NAT
                         A-S 155.99.25.11:62000
                         B-S 155.99.25.11:62001
                                   |
            +----------------------+----------------------+
            |                                             |
         Client A                                      Client B
      10.0.0.1:1234                                 10.1.1.3:1234

   Suppose that A and B use the UDP hole punching technique as outlined
   above to establish a communication channel using server S as an
   introducer. Then A and B will learn each other's public IP addresses
   and port numbers as observed by server S, and start sending each
   other messages at those public addresses. The two clients will be
   able to communicate with each other this way as long as the NAT
   allows hosts on the internal network to open translated UDP sessions
   with other internal hosts and not just with external hosts. We refer
   to this situation as "loopback translation," because packets arriving
   at the NAT from the private network are translated and then "looped
   back" to the private network rather than being passed through to the
   public network. For example, when A sends a UDP packet to B's public
   address, the packet initially has a source IP address and port number
   of 10.0.0.1:124 and a destination of 155.99.25.11:62001. The NAT
   receives this packet, translates it to have a source of
   155.99.25.11:62000 (A's public address) and a destination of
   10.1.1.3:1234, and then forwards it on to B. Even if loopback
   translation is supported by the NAT, this translation and forwarding
   step is obviously unnecessary in this situation, and is likely to add
   latency to the dialog between A and B as well as burdening the NAT.

   The solution to this problem is straightforward, however. When A and
   B initially exchange address information through server S, they
   should include their own IP addresses and port numbers as "observed"
   by themselves, as well as their addresses as observed by S. The
   clients then simultaneously start sending packets to each other at
   each of the alternative addresses they know about, and use the first
   address that leads to successful communication. If the two clients
   are behind the same NAT, then the packets directed to their private
   addresses are likely to arrive first, resulting in a direct
   communication channel not involving the NAT. If the two clients are
   behind different NATs, then the packets directed to their private

   addresses will fail to reach each other at all, but the clients will
   hopefully establish connectivity using their respective public
   addresses. It is important that these packets be authenticated in
   some way, however, since in the case of different NATs it is entirely
   possible for A's messages directed at B's private address to reach
   some other, unrelated node on A's private network, or vice versa.

3.3. Peers separated by multiple NATs

   In some topologies involving multiple NAT devices, it is not
   possible for two clients to establish an "optimal" P2P route between
   them without specific knowledge of the topology. Consider for
   example the following situation.

                                Server S
                            18.181.0.31:1234
                                   |
                                   |
                                 NAT X
                         A-S 155.99.25.11:62000
                         B-S 155.99.25.11:62001
                                   |
                                   |
            +----------------------+----------------------+
            |                                             |
          NAT A                                         NAT B
    192.168.1.1:30000                             192.168.1.2:31000
            |                                             |
            |                                             |
         Client A                                      Client B
      10.0.0.1:1234                                 10.1.1.3:1234

   Suppose NAT X is a large industrial NAT deployed by an internet
   service provider (ISP) to multiplex many customers onto a few public
   IP addresses, and NATs A and B are small consumer NAT gateways
   deployed independently by two of the ISP's customers to multiplex
   their private home networks onto their respective ISP-provided IP
   addresses. Only server S and NAT X have globally routable IP
   addresses; the "public" IP addresses used by NAT A and NAT B are
   actually private to the ISP's addressing realm, while client A's and
   B's addresses in turn are private to the addressing realms of NAT A
   and B, respectively. Each client initiates an outgoing connection to
   server S as before, causing NATs A and B each to create a single
   public/private translation, and causing NAT X to establish a
   public/private translation for each session.

   Now suppose clients A and B attempt to establish a direct peer-to-

   peer UDP connection. The optimal method would be for client A to
   send messages to client B's public address at NAT B,
   192.168.1.2:31000 in the ISP's addressing realm, and for client B to
   send messages to A's public address at NAT B, namely
   192.168.1.1:30000. Unfortunately, A and B have no way to learn these
   addresses, because server S only sees the "global" public addresses
   of the clients, 155.99.25.11:62000 and 155.99.25.11:62001. Even if A
   and B had some way to learn these addresses, there is still no
   guarantee that they would be usable because the address assignments
   in the ISP's private addressing realm might conflict with unrelated
   address assignments in the clients' private realms. The clients
   therefore have no choice but to use their global public addresses as
   seen by S for their P2P communication, and rely on NAT X to provide
   loopback translation.

3.4. Consistent port bindings

   The hole punching technique has one main caveat: it works only if
   both NATs are cone NATs (or non-NAT firewalls), which maintain a
   consistent port binding between a given (private IP, private UDP)
   pair and a (public IP, public UDP) pair for as long as that UDP port
   is in use. Assigning a new public port for each new session, as a
   symmetric NAT does, makes it impossible for a UDP application to
   reuse an already-established translation for communication with
   different external destinations. Since cone NATs are the most
   widespread, the UDP hole punching technique is fairly broadly
   applicable; nevertheless a substantial fraction of deployed NATs are
   symmetric and do not support the technique.

4. UDP port number prediction

   A variant of the UDP hole punching technique discussed above exists
   that allows peer-to-peer UDP sessions to be created in the presence
   of some symmetric NATs. This method is sometimes called the "N+1"
   technique [BIDIR] and is explored in detail by Takeda [SYM-STUN].
   The method works by analyzing the behavior of the NAT and attempting
   to predict the public port numbers it will assign to future sessions.
   Consider again the situation in which two clients, A and B, each
   behind a separate NAT, have each established UDP connections with a
   permanently addressable server S:

                                  Server S
                              18.181.0.31:1234
                                     |
                                     |
              +----------------------+----------------------+
              |                                             |
       Symmetric NAT A                               Symmetric NAT B
   A-S 155.99.25.11:62000                        B-S 138.76.29.7:31000
              |                                             |
              |                                             |
           Client A                                      Client B
        10.0.0.1:1234                                 10.1.1.3:1234

   NAT A has assigned its own UDP port 62000 to the communication
   session between A and S, and NAT B has assigned its port 31000 to the
   session between B and S. By communicating through server S, A and B
   learn each other's public IP addresses and port numbers as observed
   by S. Client A now starts sending UDP messages to port 31001 at
   address 138.76.29.7 (note the port number increment), and client B
   simultaneously starts sending messages to port 62001 at address
   155.99.25.11. If NATs A and B assign port numbers to new sessions
   sequentially, and if not much time has passed since the A-S and B-S
   sessions were initiated, then a working bi-directional communication
   channel between A and B should result. A's messages to B cause NAT A
   to open up a new session, to which NAT A will (hopefully) assign
   public port number 62001, because 62001 is next in sequence after the
   port number 62000 it previously assigned to the session between A and
   S. Similarly, B's messages to A will cause NAT B to open a new
   session, to which it will (hopefully) assign port number 31001. If
   both clients have correctly guessed the port numbers each NAT assigns
   to the new sessions, then a bi-directional UDP communication channel
   will have been established as shown below.

                                  Server S
                              18.181.0.31:1234
                                     |
                                     |
              +----------------------+----------------------+
              |                                             |
            NAT A                                         NAT B
   A-S 155.99.25.11:62000                        B-S 138.76.29.7:31000
   A-B 155.99.25.11:62001                        B-A 138.76.29.7:31001
              |                                             |
              |                                             |
           Client A                                      Client B
        10.0.0.1:1234                                 10.1.1.3:1234

   Obviously there are many things that can cause this trick to fail.
   If the predicted port number at either NAT already happens to be in
   use by an unrelated session, then the NAT will skip over that port
   number and the connection attempt will fail. If either NAT sometimes
   or always chooses port numbers non-sequentially, then the trick will
   fail. If a different client behind NAT A (or B respectively) opens
   up a new outgoing UDP connection to any external destination after A
   (B) establishes its connection with S but before sending its first

   message to B (A), then the unrelated client will inadvertently
   "steal" the desired port number. This trick is therefore much less
   likely to work when either NAT involved is under load.

   Since in practice a P2P application implementing this trick would
   still need to work if the NATs are cone NATs, or if one is a cone NAT
   and the other is a symmetric NAT, the application would need to
   detect beforehand what kind of NAT is involved on either end [STUN]
   and modify its behavior accordingly, increasing the complexity of the
   algorithm and the general brittleness of the network. Finally, port
   number prediction has no chance of working if either client is behind
   two or more levels of NAT and the NAT(s) closest to the client are
   symmetric. For all of these reasons, it is NOT recommended that new
   applications implement this trick; it is mentioned here for
   historical and informational purposes.

5. Simultaneous TCP open

   There is a method that can be used in some cases to establish direct
   peer-to-peer TCP connections between a pair of nodes that are both
   behind existing middleboxes. Most TCP sessions start with one
   endpoint sending a SYN packet, to which the other party responds with
   a SYN-ACK packet. It is possible and legal, however, for two
   endpoints to start a TCP session by simultaneously sending each other
   SYN packets, to which each party subsequently responds with a
   separate ACK. This procedure is known as a "simultaneous open."

   If a middlebox receives a TCP SYN packet from outside the private
   network attempting to initiate an incoming TCP connection, the
   middlebox will normally reject the connection attempt by either
   dropping the SYN packet or sending back a TCP RST (connection reset)
   packet. If, however, the SYN packet arrives with source and
   destination addresses and port numbers that correspond to a TCP
   session that the middlebox believes is already active, then the
   middlebox will allow the packet to pass through. In particular, if
   the middlebox has just recently seen and transmitted an outgoing SYN
   packet with the same addresses and port numbers, then it will
   consider the session active and allow the incoming SYN through. If
   clients A and B can each correctly predict the public port number
   that its respective middlebox will assign the next outgoing TCP
   connection, and if each client initiates an outgoing TCP connection
   with the other client timed so that each client's outgoing SYN passes
   through its local middlebox before either SYN reaches the opposite
   middlebox, then a working peer-to-peer TCP connection will result.

   Unfortunately, this trick may be even more fragile and timing-
   sensitive than the UDP port number prediction trick described above.
   First, unless both middleboxes are simple firewalls or implement cone

   NAT behavior on their TCP traffic, all the same things can go wrong
   with each side's attempt to predict the public port numbers that the
   respective NATs will assign to the new sessions. In addition, if
   either client's SYN arrives at the opposite middlebox too quickly,
   then the remote middlebox may reject the SYN with a RST packet,
   causing the local middlebox in turn to close the new session and make
   future SYN retransmission attempts using the same port numbers
   futile. Finally, even though support for simultaneous open is
   technically a mandatory part of the TCP specification [TCP], it is
   not implemented correctly in some common operating systems. For this
   reason, this trick is likewise mentioned here only for historical
   reasons; it is NOT recommended for use by applications. Applications
   that require efficient, direct peer-to-peer communication over
   existing NATs should use UDP.

阅读(1444) | 评论(0) | 转发(0) |

上一篇：BitTorrent协议规范(二)

下一篇：TCP连接的建立和结束

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6