分类: BSD
2007-03-15 12:36:54
如果你自己不处理,过一段时间如果数据发不完,套接字也会被自动关闭。 具体参考(unp1 4.9节) |
假设最后一个ACK丢失了,服务器会重发它发送的最后一个FIN,所以客户端必须维持一个状态信息,以便能够重发ACK;如果不维持这种状态,客户端在接收到FIN后将会响应一个RST,服务器端接收到RST后会认为这是一个错误。如果TCP协议能够正常完成必要的操作而终止双方的数据流传输,就必须完全正确的传输四次握手的四个节,不能有任何的丢失。这就是为什么socket在关闭后,仍然处于 TIME_WAIT状态,因为他要等待以便重发ACK。
如果目前连接的通信双方都已经调用了close(),假定双方都到达CLOSED状态,而没有TIME_WAIT状态时,就会出现如下的情况。现在有一个新的连接被建立起来,使用的IP地址与端口与先前的完全相同,后建立的连接又称作是原先连接的一个化身。还假定原先的连接中有数据报残存于网络之中,这样新的连接收到的数据报中有可能是先前连接的数据报。为了防止这一点,TCP不允许从处于TIME_WAIT状态的socket建立一个连接。处于TIME_WAIT状态的socket在等待两倍的MSL时间以后(之所以是两倍的MSL,是由于MSL是一个数据报在网络中单向发出到认定丢失的时间,一个数据报有可能在发送图中或是其响应过程中成为残余数据报,确认一个数据报及其响应的丢弃的需要两倍的MSL),将会转变为CLOSED状态。这就意味着,一个成功建立的连接,必然使得先前网络中残余的数据报都丢失了。
由于TIME_WAIT状态所带来的相关问题,我们可以通过设置SO_LINGER标志来避免socket进入TIME_WAIT状态,这可以通过发送RST而取代正常的TCP四次握手的终止方式。但这并不是一个很好的主意,TIME_WAIT对于我们来说往往是有利的。
socket-faq中的这一段讲的也很好,摘录如下:
2.7. Please explain the TIME_WAIT state.
that TCP guarantees all data transmitted will be delivered,
if at all possible. When you close a socket, the server goes into a
TIME_WAIT state, just to be really really sure that all the data has
gone through. When a socket is closed, both sides agree by sending
messages to each other that they will send no more data. This, it
seemed to me was good enough, and after the handshaking is done, the
socket should be closed. The problem is two-fold. First, there is no
way to be sure that the last ack was communicated successfully.
Second, there may be "wandering duplicates" left on the net that must
be dealt with if they are delivered.
Andrew Gierth (andrew@erlenstar.demon.co.uk) helped to explain the
closing sequence in the following usenet posting:
Assume that a connection is in ESTABLISHED state, and the client is
about to do an orderly release. The client's sequence no. is Sc, and
the server's is Ss. Client Server
====== ======
ESTABLISHED ESTABLISHED
(client closes)
ESTABLISHED ESTABLISHED
------->>
FIN_WAIT_1
<<--------
FIN_WAIT_2 CLOSE_WAIT
<<-------- (server closes)
LAST_ACK
, ------->>
TIME_WAIT CLOSED
(2*msl elapses...)
CLOSED
Note: the +1 on the sequence numbers is because the FIN counts as one
byte of data. (The above diagram is equivalent to fig. 13 from RFC
793).
Now consider what happens if the last of those packets is dropped in
the network. The client has done with the connection; it has no more
data or control info to send, and never will have. But the server does
not know whether the client received all the data correctly; that's
what the last ACK segment is for. Now the server may or may not care
whether the client got the data, but that is not an issue for TCP; TCP
is a reliable rotocol, and must distinguish between an orderly
connection close where all data is transferred, and a connection abort
where data may or may not have been lost.
So, if that last packet is dropped, the server will retransmit it (it
is, after all, an unacknowledged segment) and will expect to see a
suitable ACK segment in reply. If the client went straight to CLOSED,
the only possible response to that retransmit would be a RST, which
would indicate to the server that data had been lost, when in fact it
had not been.
(Bear in mind that the server's FIN segment may, additionally, contain
data.)
DISCLAIMER: This is my interpretation of the RFCs (I have read all the
TCP-related ones I could find), but I have not attempted to examine
implementation source code or trace actual connections in order to
verify it. I am satisfied that the logic is correct, though.
More commentarty from Vic:
The second issue was addressed by Richard Stevens (rstevens@noao.edu,
author of "Unix Network Programming", see ``1.5 Where can I get source
code for the book [book title]?''). I have put together quotes from
some of his postings and email which explain this. I have brought
together paragraphs from different postings, and have made as few
changes as possible.
From Richard Stevens (rstevens@noao.edu):
If the duration of the TIME_WAIT state were just to handle TCP's full-
duplex close, then the time would be much smaller, and it would be
some function of the current RTO (retransmission timeout), not the MSL
(the packet lifetime).
A couple of points about the TIME_WAIT state.
o The end that sends the first FIN goes into the TIME_WAIT state,
because that is the end that sends the final ACK. If the other
end's FIN is lost, or if the final ACK is lost, having the end that
sends the first FIN maintain state about the connection guarantees
that it has enough information to retransmit the final ACK.
o Realize that TCP sequence numbers wrap around after 2**32 bytes
have been transferred. Assume a connection between A.1500 (host A,
port 1500) and B.2000. During the connection one segment is lost
and retransmitted. But the segment is not really lost, it is held
by some intermediate router and then re-injected into the network.
(This is called a "wandering duplicate".) But in the time between
the packet being lost & retransmitted, and then reappearing, the
connection is closed (without any problems) and then another
connection is established between the same host, same port (that
is, A.1500 and B.2000; this is called another "incarnation" of the
connection). But the sequence numbers chosen for the new
incarnation just happen to overlap with the sequence number of the
wandering duplicate that is about to reappear. (This is indeed
possible, given the way sequence numbers are chosen for TCP
connections.) Bingo, you are about to deliver the data from the
wandering duplicate (the previous incarnation of the connection) to
the new incarnation of the connection. To avoid this, you do not
allow the same incarnation of the connection to be reestablished
until the TIME_WAIT state terminates.
Even the TIME_WAIT state doesn't complete solve the second problem,
given what is called TIME_WAIT assassination. RFC 1337 has more
details.
o The reason that the duration of the TIME_WAIT state is 2*MSL is
that the maximum amount of time a packet can wander around a
network is assumed to be MSL seconds. The factor of 2 is for the
round-trip. The recommended value for MSL is 120 seconds, but
Berkeley-derived implementations normally use 30 seconds instead.
This means a TIME_WAIT delay between 1 and 4 minutes. Solaris 2.x
does indeed use the recommended MSL of 120 seconds.
A wandering duplicate is a packet that appeared to be lost and was
retransmitted. But it wasn't really lost ... some router had
problems, held on to the packet for a while (order of seconds, could
be a minute if the TTL is large enough) and then re-injects the packet
back into the network. But by the time it reappears, the application
that sent it originally has already retransmitted the data contained
in that packet.
Because of these potential problems with TIME_WAIT assassinations, one
should not avoid the TIME_WAIT state by setting the SO_LINGER option
to send an RST instead of the normal TCP connection termination
(FIN/ACK/FIN/ACK). The TIME_WAIT state is there for a reason; it's
your friend and it's there to help you :-)
I have a long discussion of just this topic in my just-released
"TCP/IP Illustrated, Volume 3". The TIME_WAIT state is indeed, one of
the most misunderstood features of TCP.
I'm currently rewriting "Unix Network Programming" (see ``1.5 Where
can I get source code for the book [book title]?''). and will include
lots more on this topic, as it is often confusing and misunderstood.
An additional note from Andrew:
Closing a socket: if SO_LINGER has not been called on a socket, then
close() is not supposed to discard data. This is true on SVR4.2 (and,
apparently, on all non-SVR4 systems) but apparently not on SVR4; the
use of either shutdown() or SO_LINGER seems to be required to
guarantee delivery of all data.
来源:http://blog.chinaunix.net/space.php?uid=20196318&do=blog&id=171096
1. 重用已使用的地址
问题描述:在刚刚关闭了测试程序后,再启动服务器时提示bind失败,返回错误EADDRINUSE。
原因分析:套接字(主动关闭一端)在关闭套接字后会停留在TIME_WAIT状态一端时间,由于我在同一机器上同时运行客户端与服务器,故服务器在重新启动执行bind时,可能上次关闭连接还没有完成,连接依然存在,故bind失败。通过设置套接口的SO_REUSEADDR可重用已绑定的地址,通常所有的TCP服务器都应该指定本套接口选项。具体方法为:
int flag = 1;
setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &flag, sizeof(flag));
2.IO地址复用
直接调用read/write读写套接口和先调用select/poll在调用read/write都属于阻塞IO,只不过前者阻塞在读写系统调用上,而者阻塞在select/poll上。由于select需要两个系统调用,IO复用还稍有劣势,使用select/poll的优势在于我们可以等待多个描述字就绪。
IO复用的编程模型通常为:(以poll为例,应用实例请参考UNP第158页)
1. 创建一个pollfd结构数组,数组长度为进程可能打开的最大描述符个数,可简单的使用OPEN_MAX
2. 置数组的第一个元素为监听套接字的就绪条件,并将其它的元素都清空。
3. 调用poll,等待poll返回。
4. 对于每一个已就绪的描述字:
l 如果是监听描述字,则调用accept,得到连接描述字,并在pollfd数组第一个空位中加入连接描述字的就绪条件,并将就绪描述字数目减1,当减到0时转到3。
l 如果是连接描述字,则接受来自该描述字的请求信息,并发送响应信息,将该描述字从pollfd数组中移除,并将就绪描述字数目减1,当减到0时转到3。
3. 同一地址启动TCP与UDP服务
1. 创建TCP套接字,并绑定地址。
2. 创建UDP套接字,并绑定地址。
3. 调用select/poll检查TCP、UDP套接字是否就绪。
l 如果TCP套接字可读,则调用accept获取连接套接字,读取并响应请求。
l 如果UDP套接字可读,则直接读取请求,并发送响应。
具体应用实例参见UNP第223页。