tcp_tw_recycle and NAT - zzjlzx - ChinaUnix blog

2012-10-19 16:08:48

2012-10-15 17:25:15 | Category: rhel_apache

RFC 1323, co-authored by Van Jacobson, contains the following passage:

An additional mechanism could be added to the TCP, a per-host cache of the last timestamp received from any connection. This value could then be used in the PAWS mechanism to reject old duplicate segments from earlier incarnations of the connection, if the timestamp clock can be guaranteed to have ticked at least once since the old connection was open. This would require that the TIME-WAIT delay plus the RTT together must be at least one tick of the sender's timestamp clock. Such an extension is not part of the proposal of this RFC.

Linux implements this mechanism, but it only takes effect when both TCP timestamps and tcp_tw_recycle are enabled. The implementation is in the tcp_v4_conn_request function in net/ipv4/tcp_ipv4.c:

    if (tmp_opt.saw_tstamp &&
        tcp_death_row.sysctl_tw_recycle &&
        (dst = inet_csk_route_req(sk, req)) != NULL &&
        (peer = rt_get_peer((struct rtable *)dst)) != NULL &&
        peer->v4daddr == saddr) {
            if ((u32)get_seconds() - peer->tcp_ts_stamp < TCP_PAWS_MSL &&
                (s32)(peer->tcp_ts - req->ts_recent) >
                                            TCP_PAWS_WINDOW) {
                    NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
                    goto drop_and_release;
            }
    }

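This mechanism relies on the timestamps coming from a given source address increasing monotonically. If the server sits behind a load balancer that performs NAT and does not rewrite the timestamps in the packets, a SYN sent by some client may be dropped, and the connection request times out. The reason is that all clients now reach the server from the same source address, while each timestamp value comes from the jiffies of the machine that originally sent the packet, and different machines rarely boot at exactly the same moment. Besides the client-side timeouts, you can also watch the value on the "passive connections rejected because of time stamp" line of netstat -s grow on the server.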
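To make the failure mode concrete, here is a minimal user-space sketch of the same test. It is not kernel code: struct peer_cache and syn_rejected are hypothetical names introduced for illustration, and the constants mirror the kernel's TCP_PAWS_MSL (60 seconds) and TCP_PAWS_WINDOW (1 tick). The cast to int32_t assumes the usual two's-complement conversion.

```c
#include <stdint.h>

#define PAWS_MSL    60  /* seconds; mirrors TCP_PAWS_MSL */
#define PAWS_WINDOW 1   /* timestamp ticks; mirrors TCP_PAWS_WINDOW */

/* Hypothetical stand-in for the kernel's per-peer cache entry. */
struct peer_cache {
    uint32_t tcp_ts;        /* last timestamp value seen from this source address */
    uint32_t tcp_ts_stamp;  /* wall-clock second at which tcp_ts was recorded */
};

/*
 * Returns 1 if a SYN arriving at second `now` with timestamp `ts_recent`
 * would be dropped by the check, 0 if it would be let through.
 */
static int syn_rejected(const struct peer_cache *peer,
                        uint32_t now, uint32_t ts_recent)
{
    return (uint32_t)(now - peer->tcp_ts_stamp) < PAWS_MSL &&
           (int32_t)(peer->tcp_ts - ts_recent) > PAWS_WINDOW;
}
```

Suppose the cache entry for the NAT address holds tcp_ts = 5000000, recorded at second 1000 from a client with a long uptime. A second client behind the same NAT that booted recently and sends a SYN with timestamp 20000 at second 1010 is rejected, because 5000000 - 20000 is far above the window. The first client reconnecting with timestamp 5010000 is accepted, and so is the second client once more than 60 seconds have passed since the entry was recorded.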

So machines behind NAT should not enable tcp_tw_recycle.
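One way to check whether a host is exposed is to read the two settings from /proc. The sketch below is only an illustration: the helper names are hypothetical, and treating a missing tcp_tw_recycle file as "disabled" is an assumption (the option was removed in Linux 4.12, so the file does not exist on newer kernels).

```c
#include <stdio.h>

/* Reads one integer from a sysctl file; returns 0 on success, -1 otherwise. */
static int read_sysctl_long(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* The per-host check only runs when both settings are non-zero. */
static int per_host_paws_active(long tcp_timestamps, long tcp_tw_recycle)
{
    return tcp_timestamps != 0 && tcp_tw_recycle != 0;
}

/* Returns 1 if the check is active, 0 if not, -1 if tcp_timestamps is unreadable. */
static int check_host(void)
{
    long ts = 0, recycle = 0;

    if (read_sysctl_long("/proc/sys/net/ipv4/tcp_timestamps", &ts) != 0)
        return -1;
    /* Assumption: a missing tcp_tw_recycle file means the feature is off. */
    if (read_sysctl_long("/proc/sys/net/ipv4/tcp_tw_recycle", &recycle) != 0)
        recycle = 0;
    return per_host_paws_active(ts, recycle);
}
```

If check_host() returns 1 on a server that receives connections through a NAT device, setting net.ipv4.tcp_tw_recycle to 0 avoids the dropped SYNs described above.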

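There was one more twist. Look at the expression below. Both operands are unsigned 32-bit integers, so the subtraction can underflow: when the first value is smaller than the second by more than 2^31, the result comes out positive. That is exactly the case I ran into during my analysis, and I very nearly failed to make my explanation hold together.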

(s32)(peer->tcp_ts - req->ts_recent)

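I wrote a patch to eliminate this case, but with my approach wrap-around was no longer handled correctly, and the expression was written that way in the first place precisely so that wrap-around is handled correctly. So apart from adding a warning, I am afraid there is not much else that can be done.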
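The trade-off can be seen with a few concrete values. The sketch below is a user-space illustration with hypothetical helper names. plain_greater is not the author's patch (which the post does not show); it is just the most obvious alternative, included to show why the signed difference is used. The cast assumes the usual two's-complement conversion.

```c
#include <stdint.h>

/* Signed view of the modulo-2^32 difference a - b, as in the kernel expression. */
static int32_t ts_diff(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b);
}

/* Hypothetical alternative: compare the raw unsigned values directly. */
static int plain_greater(uint32_t a, uint32_t b)
{
    return a > b;
}
```

With a cached value of 0xFFFFFFF0 and an incoming timestamp of 0x10 (the sender's clock has wrapped and moved on by 0x20 ticks), ts_diff gives -32, so the SYN is accepted, whereas plain_greater reports the cached value as newer and would reject it. With a cached value of 100 and an incoming timestamp of 3000000000, the two are more than 2^31 apart and ts_diff gives 1294967396: a positive number, so the SYN is rejected even though the cached value is numerically smaller. No 32-bit comparison can tell these two situations apart in every case, which is why the ambiguity cannot simply be patched away.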
