<?xml version="1.0" encoding="gb2312"?>
	<rss version="2.0">
		<channel>
		<title><![CDATA[CUDev]]></title>
		<description><![CDATA[Hello,all like Unix Linux and CU !
 ^_^
]]></description>
		<link>http://www.cublog.cn/u/12592/</link>
		<language>zh-cn</language>
		<generator>www.cublog.cn</generator>
		<copyright>Copyright 2010 ChinaUnix.Net All Rights Reserved</copyright>
		<pubDate>Fri, 03 Sep 2010 02:14:47 GMT</pubDate>
	
		<item>
			<title><![CDATA[Using GeoIP to Look Up the Geographic Location of an IP Address]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2239884]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Mon, 24 May 2010 07:54:20 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[<div>I am about to start writing my graduation thesis and have begun processing the logs of some programs. The network 4-tuple appears constantly in the logs of network programs, and in P2P networks one often needs statistics on the geographic location of IP addresses. I previously wrote a program that used the QQWry (纯真) IP database, but that did not feel rigorous enough. While browsing the source of some open-source P2P software I came across GeoIP.c, googled it, and found the open-source GeoIP project.</div>
<div><br></div>
<div>GeoIP provides several databases; the main ones map an IP address to Country, City, or Org information.</div>
<div><br></div>
<div>The API is very convenient. Taking the common Country lookup as an example, a result comes in one of three forms: a 2-letter country code, a 3-letter country code, or a variable-length country name.</div>
<div><br></div>
<pre>const char *GeoIP_country_code_by_name(GeoIP* gi, const char *name);
const char *GeoIP_country_code_by_addr(GeoIP* gi, const char *addr);
const char *GeoIP_country_code3_by_name(GeoIP* gi, const char *name);
const char *GeoIP_country_code3_by_addr(GeoIP* gi, const char *addr);
const char *GeoIP_country_code_by_ipnum(GeoIP* gi, unsigned long ipnum);
const char *GeoIP_country_name_by_ipnum(GeoIP* gi, unsigned long ipnum);</pre>
<div><br></div>
<div>Notes:</div>
<div>In GeoIP_country_xxx_by_name(), name may be a host name or an IP address string;</div>
<div>in GeoIP_country_xxx_by_addr(), addr is an IP address string;</div>
<div>in GeoIP_country_xxx_by_ipnum(), ipnum is the IP address as an integer in host byte order.</div>
<div>Each lookup returns NULL when no record is found, so real code should check the result before printing it.</div>
<div><br></div>
<div>Sample code:</div>
<div><br></div>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;arpa/inet.h&gt;   /* inet_aton, ntohl, struct in_addr */
#include &lt;GeoIP.h&gt;

#define DATAFILE "/home/wangyao/GeoIP.dat"

int main(void)
{
  //char ipAddress[30]={0};
  const char *ipAddress = "202.118.224.241";
  //const char *ipAddress = "4.2.2.1";
  const char *returnedCountry = NULL;
  GeoIP *gi = NULL;

  /* Read from filesystem, check for updated file */
  //gi = GeoIP_open(DATAFILE, GEOIP_STANDARD | GEOIP_CHECK_CACHE);
  /* Read from memory, faster but takes up more memory */
  gi = GeoIP_open(DATAFILE, GEOIP_MEMORY_CACHE);

  if (gi == NULL)
  {
      fprintf(stderr, "Error opening database\n");
      exit(1);
  }

  /* Lookup by host name or address string: 2-letter code */
  returnedCountry = GeoIP_country_code_by_name(gi, ipAddress);
  printf("%s: %s\n", ipAddress, returnedCountry);

  /* Lookup by address string: 2-letter code, 3-letter code, name */
  returnedCountry = GeoIP_country_code_by_addr(gi, ipAddress);
  printf("%s: %s\n", ipAddress, returnedCountry);

  returnedCountry = GeoIP_country_code3_by_addr(gi, ipAddress);
  printf("%s: %s\n", ipAddress, returnedCountry);

  returnedCountry = GeoIP_country_name_by_addr(gi, ipAddress);
  printf("%s: %s\n", ipAddress, returnedCountry);

  /* Lookup by integer address; ntohl() converts to host byte order */
  struct in_addr addr;
  inet_aton(ipAddress, &amp;addr);
  returnedCountry = GeoIP_country_code_by_ipnum(gi, ntohl(addr.s_addr));
  printf("%s: %s\n", ipAddress, returnedCountry);

  returnedCountry = GeoIP_country_name_by_ipnum(gi, ntohl(addr.s_addr));
  printf("%s: %s\n", ipAddress, returnedCountry);

  GeoIP_delete(gi);
  return 0;
}</pre>
  ]]></description>
		</item>	
			<item>
			<title><![CDATA[The Spurious Wakeup Problem with pthread_cond_wait]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2213910]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Wed, 14 Apr 2010 02:55:21 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
		最近在温习pthread的时候，忽然发现以前对pthread_cond_wait的了解太肤浅了。昨晚在看《Programming With POSIX Threads》的时候，看到了<span class="Apple-style-span" style="font-size: 13px; ">pthread_cond_wait的通常使用方法：</span><div>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><table style="border:1px solid #999;width:80%;font-size:12px;" align="center"><tbody><tr><td><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; font-size: 12px; "><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">pthread_mutex_lock();</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">while(condition_is_false)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp;pthread_cond_wait();</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">pthread_mutex_unlock();</p></span></td></tr></tbody></table></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">为什么在pthread_cond_wait()前要加一个while循环来判断条件是否为假呢？</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">APUE中写道:</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">传递给pthread_cond_wait的互斥量对条件进行保护，调用者把锁住的互斥量传给函数。函数把调用线程放到等待条件的线程列表上，然后对互斥量解锁，这两个操作是原子操作。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; 
text-indent:0px;">线程释放互斥量，等待其他线程发给该条件变量的信号（唤醒一个等待者）或广播该条件变量（唤醒所有等待者）。当等待条件变量时，互斥量必须始终为释放的，这样其他线程才有机会锁住互斥量，修改条件变量。当线程从条件变量等待中醒来时，它重新继续锁住互斥量，对临界资源进行处理。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">条件变量的作用是发信号，而不是互斥。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><font class="Apple-style-span" face="黑体"><span class="Apple-style-span" style="font-size: large;">wait前检查</span></font></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">对于多线程程序，不能够用常规串行的思路来思考它们，因为它们是完全异步的，会出现很多临界情况。比如：pthread_cond_signal的时间早于pthread_cond_wait的时间，这样pthread_cond_wait就会一直等下去，漏掉了之前的条件变化。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">对于这种情况，解决的方法是在锁住互斥量之后和等待条件变量之前，检查条件变量是否已经发生变化。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><table style="border:1px solid #999;width:80%;font-size:12px;" align="center"><tbody><tr><td><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; font-size: 12px; "><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">if(condition_is_false)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp;pthread_cond_wait();</p></span></td></tr></tbody></table></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; 
-qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">这样在等待条件变量前检查一下条件变量的值，如果条件变量已经发生了变化，那么就没有必要进行等待了，可以直接进行处理。这种方法在并发系统中比较常见，例如之前<span class="Apple-style-span" style="font-size: 12px; "><a href="http://blog.chinaunix.net/u/12592/showart_2207904.html" target="_blank">PACKET_MMAP中poll的竞争条件的解决方法</a></span>。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">-----------------------------------------------------------------------</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">忽然想起了设计模式中的单件模式的"双重检查加锁"：</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><table style="border:1px solid #999;width:80%;font-size:12px;" align="center"><tbody><tr><td><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; font-size: 12px; "><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">Singleton *getInstance()</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">{</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp;if(ptr==NULL)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp;{</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;LOCK();</p><p style="font-size: 
10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;if(ptr==NULL)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;{</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ptr = new Singleton();</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;}</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;UNLOCK();</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp;}</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp;return ptr;</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">}</p></span></td></tr></tbody></table></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">这样只有在第一次的时候会进行锁(应该是第一轮，如果刚开始有多个线程进入了最上层的ptr==NULL代码块，就会有多次锁，只不过之后就不会锁了)，之后就不会锁了。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">pthread_once()的实现也是基于单件模式的。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; 
text-indent:0px;">pthread_once函数首先检查控制变量，以判断是否已经完成初始化。如果完成，pthread_once简单的返回；否则，pthread_once调用初始化函数(没有参数)，并记录下初始化被完成。如果在一个线程初始化时，另外的线程调用pthread_once，则调用线程将等待，直到那个线程完成初始化后返回。换句话，当调用pthread_once成功返回时，调用者能够肯定所有的状态已经初始化完毕。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px;"><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><table style="border:1px solid #999;width:80%;font-size:12px;" align="center"><tbody><tr><td><meta http-equiv="content-type" content="text/html; charset=utf-8"><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">int</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">__pthread_once (once_control, init_routine)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; pthread_once_t *once_control;</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; void (*init_routine) (void);</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">{</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp;/* XXX Depending on whether the LOCK_IN_ONCE_T is defined use a</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; global lock variable or one which is part of the pthread_once_t</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; 
margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; object. &nbsp;*/</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp;if (*once_control == PTHREAD_ONCE_INIT)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp;{</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp;lll_lock (once_lock, LLL_PRIVATE);</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><br style="font-size: 10pt; "></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp;/* XXX This implementation is not complete. &nbsp;It doesn't take</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span class="Apple-tab-span" style="white-space: pre; ">	</span>&nbsp;cancelation and fork into account. 
&nbsp;*/</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp;if (*once_control == PTHREAD_ONCE_INIT)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span class="Apple-tab-span" style="white-space: pre; ">	</span>{</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span class="Apple-tab-span" style="white-space: pre; ">	</span>&nbsp;&nbsp;init_routine ();</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><br style="font-size: 10pt; "></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span class="Apple-tab-span" style="white-space: pre; ">	</span>&nbsp;&nbsp;*once_control = !PTHREAD_ONCE_INIT;</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span class="Apple-tab-span" style="white-space: pre; ">	</span>}</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><br style="font-size: 10pt; "></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp;lll_unlock (once_lock, LLL_PRIVATE);</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp; &nbsp;}</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><br style="font-size: 10pt; "></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">&nbsp;&nbsp;return 0;</p><p style="font-size: 
10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">}</p></td></tr></tbody></table></p></span></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">-----------------------------------------------------------------------</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><font class="Apple-style-span" face="黑体"></font></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">pthread_cond_wait中的while()不仅仅在等待条件变量前检查条件变量，实际上在等待条件变量后也检查条件变量。pthread_cond_wait返回后，还需要检查条件变量，这是为什么呢？难道pthread_cond_wait不是pthread_cond_signal触发了某个condition导致的吗？</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">这个地方有些迷惑人，实际上pthread_cond_wait的返回不仅仅是pthread_cond_signal和pthread_cond_broadcast导致的，还会有一些假唤醒，也就是spurious wakeup。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">何为假唤醒？顾名思义就是虚假的唤醒，与pthread_cond_signal和pthread_cond_broadcast的唤醒相对。那么什么情况下会导致假唤醒呢？可以阅读参考1。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span class="Apple-style-span" style="font-size: medium; "><font class="Apple-style-span" face="黑体">signal</font></span></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">大致意思是：</p><p 
style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">在linux中，pthread_cond_wait底层是futex系统调用。在linux中，任何慢速的阻塞的系统调用当接收到信号的时候，就会返回-1，并且设置errno为EINTR。在系统调用返回前，用户程序注册的信号处理函数会被调用处理。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span class="Apple-style-span" style="font-size: 12px; "></span></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><i><br></i></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><i>注:什么有样的系统调用会出现接收信号后发挥EINTR呢？</i></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">慢速阻塞的系统调用，有可能会永远阻塞下去的那种。当接收到信号的时候，认为是一个返回并执行其他代码的一个时机。</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><br></p><p></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span class="Apple-style-span" style="font-size: 12px; "></span></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span class="Apple-style-span" style="font-family: song, Verdana; font-size: 12px; border-collapse: collapse; ">信号的处理也不简单，因为有些慢系统调用被信号中断后是会自动重启的，所以我们通常需要用siginterrupt(signo, 1)来关闭重启或者在用sigaction安装信号处理函数的时候取消SA_RESTART标志，之后就可以通过判断信号的返回值是否是-1和errno是否为EINTR来判断是否有信号抵达。</span></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span class="Apple-style-span" style="font-family: song, Verdana; font-size: 12px; border-collapse: collapse; "><br></span></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span 
class="Apple-style-span" style="font-family: song, Verdana; font-size: 12px; border-collapse: collapse; "><span class="Apple-style-span" style="border-collapse: separate; font-family: 'Courier New', Courier, 宋体; "></span></span></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; ">如果关闭了SA_RESTART的一些使用慢速系统调用的应用，一般都采用while()循环，检测到EINTR后就重新调用。</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><table align="center" style="border-top-width: 1px; border-right-width: 1px; border-bottom-width: 1px; border-left-width: 1px; border-top-style: solid; border-right-style: solid; border-bottom-style: solid; border-left-style: solid; border-top-color: rgb(153, 153, 153); border-right-color: rgb(153, 153, 153); border-bottom-color: rgb(153, 153, 153); border-left-color: rgb(153, 153, 153); width: 466px; font-size: 12px; "><tbody><tr style="font-size: 10pt; "><td style="font-size: 10pt; "><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">while(1)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">{</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">&nbsp;&nbsp; int ret = syscall();</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">&nbsp;&nbsp; if(ret&lt;0 &amp;&amp; errno==EINTR)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; 
margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp; continue;</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">&nbsp;&nbsp; else</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp; break;</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">}</p></td></tr></tbody></table></p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; "><span class="Apple-style-span" style="font-size: 12px; "></span></p><p></p><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><p></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">但是，对于futex这种方法不行，因为futex结束后，再重新运行的过程中，会出现一个时间窗口，其他线程可能会在这个时间窗口中进行pthread_cond_signal，这样，再进行pthread_cond_wait的时候就丢失了一次条件变量的变化。解决方法就是在pthread_cond_wait前检查条件变量，也就是</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><table align="center" style="border-top-width: 1px; border-right-width: 1px; border-bottom-width: 1px; border-left-width: 1px; border-top-style: solid; border-right-style: solid; border-bottom-style: solid; border-left-style: solid; border-top-color: rgb(153, 153, 153); 
border-right-color: rgb(153, 153, 153); border-bottom-color: rgb(153, 153, 153); border-left-color: rgb(153, 153, 153); width: 466px; font-size: 12px; "><tbody><tr style="font-size: 10pt; "><td style="font-size: 10pt; "><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">pthread_mutex_lock();</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">while(condition_is_false)</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">&nbsp;&nbsp; &nbsp;pthread_cond_wait();</p><p style="font-size: 10pt; margin-top: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; text-indent: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; ">pthread_mutex_unlock();</p></td></tr></tbody></table></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span class="Apple-style-span" style="font-size: medium;"><font class="Apple-style-span" face="黑体">pthread_cond_broadcast</font></span></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">实际上，不仅仅信号会导致假唤醒，pthread_cond_broadcast也会导致假唤醒。加入条件变量上有多个线程在等待，pthread_cond_broadcast会唤醒所有的等待线程，而pthread_cond_signal只会唤醒其中一个等待线程。这样，pthread_cond_broadcast的情况也许要在pthread_cond_wait前使用while循环来检查条件变量。</p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; 
text-indent:0px;"><br></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span class="Apple-style-span" style="font-size: large;"><font class="Apple-style-span" face="黑体">References:</font></span></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><a href="http://vladimir_prus.blogspot.com/2005/07/spurious-wakeups.html">http://vladimir_prus.blogspot.com/2005/07/spurious-wakeups.html</a></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><a href="http://www.lambdacs.com/cpt/FAQ.html#Q94">http://www.lambdacs.com/cpt/FAQ.html#Q94</a></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><a href="http://groups.google.de/group/comp.programming.threads/msg/bb8299804652fdd7">http://groups.google.de/group/comp.programming.threads/msg/bb8299804652fdd7</a></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><a href="http://www.win.tue.nl/~aeb/linux/lk/lk-4.html#ss4.5">http://www.win.tue.nl/~aeb/linux/lk/lk-4.html#ss4.5</a></p><p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><a href="http://blog.chinaunix.net/u/5251/showart_309061.html">http://blog.chinaunix.net/u/5251/showart_309061.html</a></p></div>
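<p>The while-loop check described in this post can be sketched as a small C program. This is only a minimal illustration, not code from the original article: the names <code>ready</code>, <code>woken</code>, <code>waiter</code> and <code>run_wakeup_demo</code> are made up for the example. Several threads wait on one condition variable, a single pthread_cond_broadcast wakes them, and each thread re-checks the predicate under the mutex before it continues.</p>

```c
#include <pthread.h>

#define MAX_WAITERS 16

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;   /* the predicate, protected by lock */
static int woken = 0;   /* number of waiters that saw ready != 0 */

/* Hypothetical waiter thread: it never trusts a wakeup by itself. */
static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!ready)                        /* re-check after every wakeup */
        pthread_cond_wait(&cond, &lock);  /* may also return spuriously */
    woken++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts n_waiters threads, broadcasts once, and returns how many
 * threads observed the predicate as true (or -1 on a bad argument). */
int run_wakeup_demo(int n_waiters)
{
    pthread_t tid[MAX_WAITERS];
    int i, started = 0, result;

    if (n_waiters < 0 || n_waiters > MAX_WAITERS)
        return -1;

    pthread_mutex_lock(&lock);
    ready = 0;
    woken = 0;
    pthread_mutex_unlock(&lock);

    for (i = 0; i < n_waiters; i++) {
        if (pthread_create(&tid[i], NULL, waiter, NULL) != 0)
            break;
        started++;
    }

    /* Change the predicate under the mutex, then wake every waiter. */
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);

    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_lock(&lock);
    result = woken;
    pthread_mutex_unlock(&lock);
    return result;
}
```

<p>Because the predicate is tested in a loop while holding the mutex, a waiter that starts only after the broadcast still sees <code>ready</code> set and does not block, and a waiter that wakes up without the predicate being true simply waits again. On older glibc versions the program may need to be linked with <code>-pthread</code>.</p>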
		
		
		  ]]></description>
		</item>	
			<item>
			<title><![CDATA[connection socket pool]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2211958]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Sun, 11 Apr 2010 05:36:29 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[<span class="Apple-style-span" style="font-family: song, Verdana; border-collapse: collapse; "><div>Reposter's note: the test code and the tcpdump output have been modified slightly.</div><div><br></div><div>Linux connect(2) has a little-known property: if the address family is AF_UNSPEC, a TCP connection is closed by sending a RST packet, and a UDP socket has its destination address association removed. Either way the result is an unconnected socket, just like one freshly created by socket(2). This lets us connect(2) to an address whose family is AF_UNSPEC instead of calling close(2), and then put the socket into a pool for later reuse, which saves the cost of one socket(2) call.</div></span><div><font class="Apple-style-span" face="song, Verdana"><span class="Apple-style-span" style="border-collapse: collapse;"><br></span></font></div><div><span class="Apple-style-span" style="border-collapse: collapse;"><font class="Apple-style-span" face="黑体"><b><span class="Apple-style-span" style="font-size: small;">The test code is as follows:</span></b></font></span></div><div><span class="Apple-style-span" style="font-family: song, Verdana; font-size: 13px; border-collapse: collapse; ">#include &lt;stdio.h&gt;</span></div><div><font class="Apple-style-span" face="song, Verdana" size="3"><span class="Apple-style-span" style="border-collapse: collapse; font-size: 12px;"><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;stdlib.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;string.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;errno.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;sys/types.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;sys/socket.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;arpa/inet.h&gt; /* inet_addr() */</span></font></div><div><font 
class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;netinet/in.h&gt;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#define xconnect(fd, addr, len) \</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;do { \</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;if (connect(fd, addr, len) &lt; 0) { \</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;fprintf(stderr, "connect error with: %s\n", \</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;strerror(errno)); \</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;exit(EXIT_FAILURE); \</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;} \</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;} while (0)</span></font></div><div><br></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">int main(int argc, char *argv[])</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; 
&nbsp; &nbsp;int retval;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;struct sockaddr_in addr;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;socklen_t len;</span></font></div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">if(argc!=2)</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">{</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">printf("Usage: xconnect xx.xx.xx.xx\n");</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">return 0;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">}</span></font></div><div><br></div><div><span class="Apple-tab-span" 
style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">char *addr_str = argv[1];</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;retval = socket(AF_INET, SOCK_STREAM, 0);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;memset(&amp;addr, 0, sizeof(addr));</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;addr.sin_family = AF_INET;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;addr.sin_addr.s_addr = inet_addr(addr_str);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;addr.sin_port = htons(22);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;len = sizeof(addr);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;xconnect(retval, (struct sockaddr*)&amp;addr, len);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;getsockname(retval, (struct sockaddr*)&amp;addr, &amp;len);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;printf("Old Port: %d\n", 
ntohs(addr.sin_port));</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;addr.sin_family = AF_UNSPEC;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;xconnect(retval, (struct sockaddr*)&amp;addr, len);</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;addr.sin_family = AF_INET;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;addr.sin_addr.s_addr = inet_addr(addr_str);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;addr.sin_port = htons(22);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;xconnect(retval, (struct sockaddr*)&amp;addr, len);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;getsockname(retval, (struct sockaddr*)&amp;addr, &amp;len);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;printf("New Port: %d\n", ntohs(addr.sin_port));</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;return 0;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">}</span></font></div><div><br></div><div><span class="Apple-style-span" 
style="font-size: small;"><font class="Apple-style-span" face="黑体"><b>The tcpdump capture is as follows:</b></font></span></div><div>debian-wangyao:/home/wangyao# tcpdump -i lo -nvv tcp</div></span></font><div><div style="font-family: song, Verdana; border-collapse: collapse; ">tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.919082 IP (tos 0x0, ttl 64, id 52187, offset 0, flags [DF], proto TCP (6), length 60)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.39206 &gt; 127.0.0.1.22: Flags [S], cksum 0x0338 (correct), seq 629688948, win 32792, options [mss 16396,sackOK,TS val 24019375 ecr 0,nop,wscale 6], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.919108 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.22 &gt; 127.0.0.1.39206: Flags [S.], cksum 0x51bd (correct), seq 637536356, ack 629688949, win 32768, options [mss 16396,sackOK,TS val 24019375 ecr 24019375,nop,wscale 6], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.919135 IP (tos 0x0, ttl 64, id 52188, offset 0, flags [DF], proto TCP (6), length 52)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.39206 &gt; 127.0.0.1.22: Flags [.], cksum 0x38e0 (correct), seq 1, ack 1, win 513, options [nop,nop,TS val 24019375 ecr 24019375], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">--------------------------------------------------------------</div><div style="font-family: song, Verdana; border-collapse: collapse; ">First three-way handshake</div><div style="font-family: song, Verdana; border-collapse: collapse; ">--------------------------------------------------------------</div><div 
style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.919661 IP (tos 0x0, ttl 64, id 52189, offset 0, flags [DF], proto TCP (6), length 52)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.39206 &gt; 127.0.0.1.22: Flags [R.], cksum 0x38dc (correct), seq 1, ack 1, win 513, options [nop,nop,TS val 24019375 ecr 24019375], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">--------------------------------------------------------------</div><div style="font-family: song, Verdana; border-collapse: collapse; ">RST tears down the first connection</div><div style="font-family: song, Verdana; border-collapse: collapse; ">--------------------------------------------------------------</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.919687 IP (tos 0x0, ttl 64, id 19416, offset 0, flags [DF], proto TCP (6), length 56)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.39207 &gt; 127.0.0.1.22: Flags [S], cksum 0x9740 (correct), seq 629721719, win 32792, options [mss 16396,sackOK,TS val 24019375 ecr 0], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.919704 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 56)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.22 &gt; 127.0.0.1.39207: Flags [S.], cksum 0xc014 (correct), seq 631516785, ack 629721720, win 32768, options [mss 16396,sackOK,TS val 24019375 ecr 24019375], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.919723 IP (tos 0x0, ttl 64, id 19417, offset 0, flags [DF], proto TCP (6), length 52)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.39207 &gt; 127.0.0.1.22: Flags [.], cksum 0x1513 (correct), seq 1, ack 1, win 32792, options [nop,nop,TS val 
24019375 ecr 24019375], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">--------------------------------------------------------------</div><div style="font-family: song, Verdana; border-collapse: collapse; ">Second three-way handshake</div><div style="font-family: song, Verdana; border-collapse: collapse; ">--------------------------------------------------------------</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.920877 IP (tos 0x0, ttl 64, id 19418, offset 0, flags [DF], proto TCP (6), length 52)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.39207 &gt; 127.0.0.1.22: Flags [F.], cksum 0x1512 (correct), seq 1, ack 1, win 32792, options [nop,nop,TS val 24019375 ecr 24019375], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.924008 IP (tos 0x0, ttl 64, id 45021, offset 0, flags [DF], proto TCP (6), length 52)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.22 &gt; 127.0.0.1.39207: Flags [.], cksum 0x1529 (correct), seq 1, ack 2, win 32768, options [nop,nop,TS val 24019376 ecr 24019375], length 0</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.933517 IP (tos 0x0, ttl 64, id 45022, offset 0, flags [DF], proto TCP (6), length 84)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.22 &gt; 127.0.0.1.39207: Flags [P.], seq 1:33, ack 2, win 32768, options [nop,nop,TS val 24019378 ecr 24019375], length 32</div><div style="font-family: song, Verdana; border-collapse: collapse; ">13:27:48.933539 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)</div><div style="font-family: song, Verdana; border-collapse: collapse; ">&nbsp;&nbsp; &nbsp;127.0.0.1.39207 &gt; 127.0.0.1.22: Flags [R], cksum 0x289f (correct), seq 629721721, win 0, length 0</div><div 
style="font-family: song, Verdana; border-collapse: collapse; ">--------------------------------------------------------------</div><div style="font-family: song, Verdana; border-collapse: collapse; ">Finally, the program simply exits, so the connection is closed with a FIN. When the server's response then arrives, the program has already exited, so the protocol stack answers with a RST.</div><div style="font-family: song, Verdana; border-collapse: collapse; ">--------------------------------------------------------------</div><div style="font-family: song, Verdana; border-collapse: collapse; "><br></div><div style="font-family: song, Verdana; border-collapse: collapse; "><span class="Apple-style-span" style="font-size: medium;"><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">To avoid confusing the peer, it is best to call connect(AF_UNSPEC) to close a connection only after the peer's FIN has been received, that is, once read(2) or a similar call has returned 0.</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">A possible application: in a reverse HTTP proxy, the connections from the proxy server to the HTTP servers.</span></font></div></span></div></div></div>  ]]></description>
		</item>	
			<item>
			<title><![CDATA[A PACKET_MMAP usage example]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2208072]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Mon, 05 Apr 2010 03:28:07 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
								Based on the "Using PACKET_MMAP" section of the previous article, <a href="http://blog.chinaunix.net/u/12592/showart_2207904.html" target="_blank">An Analysis of the PACKET_MMAP Implementation</a>, I wrote a simple demo program.<div><br></div><div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;stdio.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;unistd.h&gt; /* close() */</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;sys/types.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;sys/socket.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;sys/mman.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;arpa/inet.h&gt; /* htons() */</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;linux/if_packet.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;poll.h&gt;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#include &lt;net/ethernet.h&gt; /* the L2 protocols */</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">void CallBackPacket(char *data)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;printf("Recv A Packet.\n");</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">}</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">int main()</span></font></div><div><font class="Apple-style-span" 
size="3"><span class="Apple-style-span" style="font-size: 13px;">{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;int fd = 0, ret = 0;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;char *buff = NULL;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;//fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;// ARP traffic can be used for a quick test</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;fd = socket(PF_PACKET, SOCK_DGRAM, htons (ETH_P_ARP));</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;if(fd&lt;0)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;perror("socket");</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;goto failed_2;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;}</span></font></div><div><br></div><div>// Setting PACKET_VERSION and <span class="Apple-style-span" style="font-size: 13px; ">SO_BINDTODEVICE is optional and can be omitted</span></div><div><font class="Apple-style-span" size="3"><span 
class="Apple-style-span" style="font-size: 13px;">#if 0</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;const int tpacket_version = TPACKET_V1;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;/* set tpacket hdr version. */</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;ret = setsockopt(fd, SOL_PACKET, PACKET_VERSION, &amp;tpacket_version, sizeof (int));</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;if(ret&lt;0)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;perror("setsockopt");</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;goto failed_2;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;}</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">//#define NETDEV_NAME "wlan0"</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#define NETDEV_NAME "eth0"</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;/* bind to device. 
*/</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;ret = setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, NETDEV_NAME, sizeof (NETDEV_NAME));</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;if(ret&lt;0)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;perror("setsockopt");</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;goto failed_2;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;}</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#endif</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;struct tpacket_req req;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#define PER_PACKET_SIZE 2048</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;const int BUFFER_SIZE = 1024*1024*16; // 16 MB ring buffer</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;req.tp_block_size = 4096;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;req.tp_block_nr = 
BUFFER_SIZE/req.tp_block_size;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;req.tp_frame_size = PER_PACKET_SIZE;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;req.tp_frame_nr = BUFFER_SIZE/req.tp_frame_size;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;ret = setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *)&amp;req, sizeof(req));</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;if(ret&lt;0)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;perror("setsockopt");</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;goto failed_2;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;}</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;buff = (char *)mmap(0, BUFFER_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;if(buff == MAP_FAILED)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" 
size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;perror("mmap");</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;goto failed_2;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;}</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;int nIndex=0, i=0;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;while(1)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;// Before calling poll, check whether a packet has already been captured</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;struct tpacket_hdr* pHead = (struct tpacket_hdr*)(buff+ nIndex*PER_PACKET_SIZE);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;// If the TP_STATUS_USER bit is already set, a packet was captured before poll was called. If no further packet arrived after poll, that packet would never be processed; this is the so-called race condition. tp_status is a bit mask, so test the bit instead of comparing for equality.</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;if(pHead-&gt;tp_status &amp; TP_STATUS_USER)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;goto process_packet;</span></font></div><div><br></div><div><font class="Apple-style-span" 
size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;// Use poll to wait until a packet has been captured</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;struct pollfd pfd;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;pfd.fd = fd;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;//pfd.events = POLLIN|POLLRDNORM|POLLERR;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;pfd.events = POLLIN;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;pfd.revents = 0;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;ret = poll(&amp;pfd, 1, -1);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;if(ret&lt;0)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;perror("poll");</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;goto failed_1;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; 
&nbsp; &nbsp;}</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">process_packet:</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;//尽力的去处理环形缓冲区中的数据frame，直到没有数据frame了</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;for(i=0; i&lt;req.tp_frame_nr; i++)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;struct tpacket_hdr* pHead = (struct tpacket_hdr*)(buff+ nIndex*PER_PACKET_SIZE);</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;//XXX: 由于frame都在一个环形缓冲区中，因此如果下一个frame中没有数据了，后面的frame也就没有frame了</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;if(pHead-&gt;tp_status == TP_STATUS_KERNEL)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;break;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;//处理数据frame</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
&nbsp;CallBackPacket((char*)pHead+pHead-&gt;tp_net);</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;//hand the frame back to the kernel by resetting its status to TP_STATUS_KERNEL</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;pHead-&gt;tp_len = 0;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;pHead-&gt;tp_status = TP_STATUS_KERNEL;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;//advance the ring index to the next frame</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;nIndex++;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;nIndex%=req.tp_frame_nr;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;}</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;}</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">success:</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;close(fd);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;munmap(buff, 
BUFFER_SIZE);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;return 0;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">failed_1:</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;munmap(buff, BUFFER_SIZE);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">failed_2:</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;close(fd);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;return -1;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">}</span></font></div></div></div><div><br></div><div><br></div><div><br></div><div><font class="Apple-style-span" color="#FF0102"><i>Note</i></font>: no filter is installed here. Using BPF directly felt very cumbersome; in practice you can simply use libpcap's BPF API, as reference 1 does.</div><div><br></div><div>References:</div><div><a href="http://hi.baidu.com/ah__fu/blog/item/8aadf895fad570007af48000.html">http://hi.baidu.com/ah__fu/blog/item/8aadf895fad570007af48000.html</a></div><div><a href="http://linux.chinaunix.net/bbs/viewthread.php?tid=1134588">http://linux.chinaunix.net/bbs/viewthread.php?tid=1134588</a></div>
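<div><br></div><div>The ownership handshake used by the capture loop above can be tried without a packet socket or root privileges. The following is a minimal sketch under stated assumptions: <code>struct sim_frame</code>, <code>sim_kernel_deliver</code> and <code>drain_ring</code> are hypothetical names invented for this illustration, and an ordinary array stands in for the mmap'ed ring. Only the two status values (0 for kernel-owned, 1 for user-owned) are taken from the text.</div>

```c
#include <assert.h>
#include <stddef.h>

/* Status values as listed for TP_STATUS_KERNEL / TP_STATUS_USER. */
#define SIM_STATUS_KERNEL 0UL
#define SIM_STATUS_USER   1UL

/* Hypothetical stand-in for a ring frame: a status word plus a payload. */
struct sim_frame {
    unsigned long status;
    int payload;
};

/*
 * Simulated kernel side: store one packet in the frame at *head if the
 * kernel owns it, then hand the frame to user space.
 * Returns 1 on success, 0 if the ring is full (the packet is dropped).
 */
static int sim_kernel_deliver(struct sim_frame *ring, unsigned frame_nr,
                              unsigned *head, int payload)
{
    struct sim_frame *f = &ring[*head];

    assert(frame_nr > 0);
    if (f->status != SIM_STATUS_KERNEL)
        return 0;
    f->payload = payload;
    f->status = SIM_STATUS_USER;
    *head = (*head + 1) % frame_nr;
    return 1;
}

/*
 * User side: starting at *index, consume frames until one that is still
 * owned by the kernel is found. Each consumed frame is handed back by
 * resetting its status. Returns the number of frames consumed and adds
 * their payloads to *sum.
 */
static unsigned drain_ring(struct sim_frame *ring, unsigned frame_nr,
                           unsigned *index, long *sum)
{
    unsigned consumed = 0;

    assert(frame_nr > 0);
    while (consumed < frame_nr) {
        struct sim_frame *f = &ring[*index];

        if (f->status == SIM_STATUS_KERNEL)
            break;                      /* nothing more to read */
        *sum += f->payload;             /* "process" the packet */
        f->status = SIM_STATUS_KERNEL;  /* give the frame back */
        *index = (*index + 1) % frame_nr;
        consumed++;
    }
    return consumed;
}
```

<div>In a real capture program the call to <code>drain_ring</code> would sit where the <code>process_packet</code> label is, and <code>poll</code> would be entered only when the frame at the current index is still kernel-owned.</div>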
		  ]]></description>
		</item>	
			<item>
			<title><![CDATA[An Analysis of How PACKET_MMAP Is Implemented]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2207904]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Thu, 08 Apr 2010 06:43:22 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
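The previous post mentioned that libpcap-1.0.0 added PACKET_MMAP support on some platforms, and I have wanted to write about how PACKET_MMAP is implemented ever since.<div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;">The PACKET_MMAP implementation lives entirely in net/packet/af_packet.c; the related macros and structures are defined in include/linux/if_packet.h.</span></font></div><div><br></div><div><div><span class="Apple-style-span" style="font-size: medium;"><font class="Apple-style-span" face="黑体">How PACKET_MMAP works</font></span></div><div>PACKET_MMAP allocates a buffer in kernel space, which the user-space program then maps into its own address space with mmap. The kernel copies each received skb into that buffer, so the user-space program can read captured packets directly.</div><div>Without PACKET_MMAP, plain AF_PACKET capture is very inefficient. Its buffering is limited, and every captured packet costs one system call, or two if you also want the packet's timestamp (fetching the timestamp takes a separate system call, which is what libpcap does).</div><div>PACKET_MMAP is far more efficient. It provides a ring buffer of configurable size that is mapped into user space. Reading packets then only requires waiting for them to arrive, and in most cases no system call is needed (although poll is itself a system call). Sharing the buffer between kernel and user space also reduces data copying.</div><div>PACKET_MMAP is not the only way to improve capture performance. If you are capturing on a high-speed network, check whether the NIC supports some form of interrupt load mitigation or NAPI, and make sure it is enabled.</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div>PACKET_MMAP cuts down on system calls: captured packets can be read without recvmsg. Compared with a raw socket plus recvfrom, it saves one copy and one system call.</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div><span class="Apple-style-span" style="font-size: medium;"><font class="Apple-style-span" face="黑体">Using PACKET_MMAP</font></span>:</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div>From the system-call point of view, the way PACKET_MMAP is used can be seen in the strace output in <a href="http://blog.chinaunix.net/u/12592/showart_2207614.html" target="_blank">An Analysis of the Changes in libpcap's Underlying Implementation</a>:</div><div>[setup]:</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>socket()<span class="Apple-tab-span" style="white-space: pre;">	</span>------&gt; 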
create the capture socket</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>setsockopt()<span class="Apple-tab-span" style="white-space: pre;">	</span>------&gt; allocate the ring buffer</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>mmap()<span class="Apple-tab-span" style="white-space: pre;">	</span>------&gt; map the allocated buffer into user space</div><div>[capture]</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>poll()<span class="Apple-tab-span" style="white-space: pre;">	</span>------&gt; wait for incoming packets</div><div>[shutdown]</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>close()<span class="Apple-tab-span" style="white-space: pre;">	</span>------&gt; destroy the capture socket and release all associated resources</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div>What follows is translated from Documentation/networking/packet_mmap.txt, abridged where appropriate.</div><div><br></div><div>1. Creating and destroying the socket works the same as without PACKET_MMAP:</div><div>int fd;</div><div>fd = socket(PF_PACKET, mode, htons(ETH_P_ALL));</div><div><br></div><div>With mode set to SOCK_RAW, link-level information is captured as well; with mode set to SOCK_DGRAM, capturing the interface's link-level information is not supported, and the kernel supplies a link-level pseudo-header instead.</div><div>To destroy the socket and release its resources, a simple close() call is all that is needed.</div><div><br></div><div>2. 
PACKET_MMAP setup</div><div>From user space, PACKET_MMAP is set up with a single call:</div><div>setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *)&amp;req, sizeof(req));</div><div><br></div><div>The most important argument to this call is req, which is defined as follows:</div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;struct tpacket_req</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;unsigned int &nbsp; &nbsp;tp_block_size; &nbsp;/* Minimal size of contiguous block */</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;unsigned int &nbsp; &nbsp;tp_block_nr; &nbsp; &nbsp;/* Number of blocks */</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;unsigned int &nbsp; &nbsp;tp_frame_size; &nbsp;/* Size of frame */</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;unsigned int &nbsp; &nbsp;tp_frame_nr; &nbsp; &nbsp;/* Total number of frames */</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;};</span></font></div></div><div><br></div><div>This structure is defined in include/linux/if_packet.h. The call establishes a ring buffer of unswappable memory that is mapped into the capture process. Through the mapped memory, the capture process can access captured packets and their metadata, such as timestamps, without any system call.</div><div><br></div><div>Captured frames are grouped into blocks. Each block is a physically contiguous region of memory holding tp_block_size/tp_frame_size frames, and the total number of blocks is tp_block_nr. The tp_frame_nr field is actually redundant, because it can be computed:</div><div><div>&nbsp;&nbsp; &nbsp;frames_per_block = 
tp_block_size/tp_frame_size</div></div><div>In fact, packet_set_ring checks that the following condition holds:</div><div><div>&nbsp;&nbsp; &nbsp;frames_per_block * tp_block_nr == tp_frame_nr</div></div><div><br></div><div>Here is an example:</div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; tp_block_size= 4096</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; tp_frame_size= 2048</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; tp_block_nr &nbsp;= 4</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; tp_frame_nr &nbsp;= 8</span></font></div><div><br></div><div>This produces the following buffer layout:</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;block #1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; block #2 &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">+---------+---------+ &nbsp; &nbsp;+---------+---------+ &nbsp; &nbsp;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">| frame 1 | frame 2 | &nbsp; &nbsp;| frame 3 | frame 4 | &nbsp; &nbsp;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">+---------+---------+ &nbsp; &nbsp;+---------+---------+ &nbsp; &nbsp;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;block #3 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; block #4</span></font></div><div><font 
class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">+---------+---------+ &nbsp; &nbsp;+---------+---------+</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">| frame 5 | frame 6 | &nbsp; &nbsp;| frame 7 | frame 8 |</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">+---------+---------+ &nbsp; &nbsp;+---------+---------+</span></font></div></div><div><br></div><div>A frame must fit inside a block, and each block holds an integral number of frames; in other words, a frame cannot span two blocks.</div><div><br></div><div>3. Mapping and using the ring buffer</div><div>Mapping the buffer into user space is done with the conventional mmap() function. Although the buffer consists of several separate blocks in the kernel, after mapping it appears contiguous in user space.</div><div><div>&nbsp;&nbsp; &nbsp;mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);</div></div><div><br></div><div>If tp_frame_size is a divisor of tp_block_size, the frames are laid out contiguously, tp_frame_size bytes apart; if it is not, there is a gap after every tp_block_size/tp_frame_size frames, because a frame cannot span two blocks.</div><div><br></div><div>Each frame begins with a status field (see struct tpacket_hdr). The status values are defined in include/linux/if_packet.h:</div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#define TP_STATUS_KERNEL</span></font><span class="Apple-tab-span" style="white-space: pre;"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">0</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#define TP_STATUS_USER</span></font><span class="Apple-tab-span" style="white-space: pre;"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">1</span></font></div><div><font class="Apple-style-span" 
size="3"><span class="Apple-style-span" style="font-size: 13px;">#define TP_STATUS_COPY</span></font><span class="Apple-tab-span" style="white-space: pre;"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">2</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#define TP_STATUS_LOSING</span></font><span class="Apple-tab-span" style="white-space: pre;"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">4</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#define TP_STATUS_CSUMNOTREADY</span></font><span class="Apple-tab-span" style="white-space: pre;"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">8</span></font></div></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><br></span></font></div><div>Only the first two matter here: TP_STATUS_KERNEL and TP_STATUS_USER. A status of TP_STATUS_KERNEL means the frame is available to the kernel, which may store captured data in it. A status of TP_STATUS_USER means the frame belongs to user space: it holds captured data that should be read out.</div><div><br></div><div>The kernel initializes the status of every frame to TP_STATUS_KERNEL. When it receives a packet, it picks a frame, stores the packet in it, and updates the status to TP_STATUS_USER (assuming nothing else goes wrong, that is, ignoring the other flags). The user program then reads the packet; once it has done so, it must set the frame's status back to 0, i.e. TP_STATUS_KERNEL, so that the kernel can use the frame again.</div><div><br></div><div>The user can use poll, or any other mechanism, to detect new packets in the ring:</div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; 
&nbsp;struct pollfd pfd;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;pfd.fd = fd;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;pfd.revents = 0;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;pfd.events = POLLIN|POLLRDNORM|POLLERR;</span></font></div><div><br></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp;if (status == TP_STATUS_KERNEL)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;retval = poll(&amp;pfd, 1, timeout);</span></font></div></div><div><br></div><div>Checking the status value first and only then polling avoids a race condition. If the status is already TP_STATUS_USER, a packet arrived before poll was called; calling poll at that point would leave that packet unread for as long as no new packet arrives. That is the race in question.</div><div>libpcap-1.0.0 is designed this way:</div><div>pcap_read_linux_mmap in pcap-linux.c:</div><div>//<span class="Apple-style-span" style="font-family: song,Verdana; font-size: 13px; border-collapse: collapse;">If, <span class="Apple-style-span" style="border-collapse: separate; font-family: 'Courier New',Courier,宋体; font-size: 12px;">before poll, <span class="Apple-style-span" style="font-family: song,Verdana; font-size: 13px; border-collapse: collapse;">the frame's status is already TP_STATUS_USER, a packet has already been captured; if no further packet is captured after poll, that packet is never processed. This is the so-called race condition.</span></span></span></div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px; white-space: pre;">if ((handle-&gt;md.timeout &gt;= 0) &amp;&amp;
    !<font class="Apple-style-span" color="#ff0102">pcap_get_ring_frame</font>(handle, TP_STATUS_USER)) {
	struct pollfd pollinfo;
	int ret;

	pollinfo.fd = handle-&gt;fd;
	pollinfo.events = POLLIN;

	do {
		/* poll() requires a negative timeout to wait forever */
		ret = <font class="Apple-style-span" color="#ff0102">poll</font>(&amp;pollinfo, 1, (handle-&gt;md.timeout &gt; 0)?
		 			handle-&gt;md.timeout: -1);
		if ((ret &lt; 0) &amp;&amp; (errno != EINTR)) {
			return -1;
		}
		......
	} while (ret &lt; 0);
}</span></font>
</div></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px; white-space: pre;">//process the captured packets one by one</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px; white-space: pre;">while ((pkts &lt; max_packets) || (max_packets &lt;= 0)) {

        ......
        //read the frame if its status is TP_STATUS_USER, otherwise leave the loop; note that this is a ring buffer
	h.raw = <font class="Apple-style-span" color="#ff0102">pcap_get_ring_frame</font>(handle, TP_STATUS_USER);
	if (!h.raw)
		break;

        ......

	/* pass the packet to the user */
	pkts++;
	callback(user, &amp;pcaphdr, bp);
	handle-&gt;md.packets_read++;
skip:
	/* next packet */
	switch (handle-&gt;md.tp_version) {
	case TPACKET_V1:</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px; white-space: pre;">                //reset the frame's status to TP_STATUS_KERNEL
<font class="Apple-style-span" color="#ff0102">		h.h1-&gt;tp_status = TP_STATUS_KERNEL;
</font>		break;
	......
        }

}</span></font></div><div><br></div><div><span class="Apple-style-span" style="font-size: medium;"><font class="Apple-style-span" face="黑体">PACKET_MMAP source code walkthrough</font></span></div><div></div><div>Unlike the previous post, I will not paste large chunks of code here; an outline of the flow is enough, and interested readers can follow along in the source ;-)</div><div>When a packet arrives at the NIC, an skb is created for it. Softirq processing then calls netif_receive_skb, which in turn calls the handlers registered through dev_add_pack. In af_packet.c, tpacket_rcv and packet_rcv are clearly the ones we are looking for.</div><div>tpacket_rcv implements PACKET_MMAP; packet_rcv implements plain AF_PACKET.</div><div><br></div><div>tpacket_rcv:</div><div>1. Perform some necessary checks</div><div>2. Call run_filter to select, via BPF, the packets that match the configured conditions, and obtain the capture length snaplen</div><div>3. Look up a frame in the ring buffer whose status is TP_STATUS_KERNEL</div><div>4. Compute macoff, netoff and related offsets</div><div>5. If snaplen+macoff&gt;frame_size and the skb is shared, copy the skb<span class="Apple-tab-span" style="white-space: pre;">	</span>&lt;usually no copy happens&gt;</div><div>if(skb_shared(skb))</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>skb_clone()</div><div>6. Copy the data from the skb into the kernel buffer<span class="Apple-tab-span" style="white-space: pre;">	</span>&lt;copy&gt;</div><div>skb_copy_bits(skb, 0, &nbsp;h.raw+macoff, snaplen);</div><div>7. Fill in the header of the packet copied into the frame: timestamp, lengths, status and so on</div><div>8. flush_dcache_page() writes a page's contents in the data cache back to memory.</div><div>x86 should not need this; it matters mostly on RISC architectures</div><div>9. Call sk_data_ready to wake the process sleeping in poll</div><div>10. Once poll returns, the application calls pcap_get_ring_frame to obtain a frame and process it. No copy and no system call is involved here.</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div>Cost: 1 copy + 1 system call (poll)</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div>packet_rcv:</div><div>1. Perform some necessary checks</div><div>2. Call run_filter to select, via BPF, the packets that match the configured conditions, and obtain the capture length snaplen</div><div>3. If the skb is shared, copy the skb<span class="Apple-tab-span" style="white-space: pre;">	</span>&lt;a copy usually happens&gt;</div><div>if(skb_shared(skb))</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>skb_clone()</div><div>4. 
Fill in the packet's header information: timestamp, lengths, status and so on</div><div>5. Append the skb to the socket's sk_receive_queue</div><div>6. Call sk_data_ready to notify the sleeping process that data has arrived</div><div>7. The application sleeps in recvfrom; when data arrives and the socket becomes readable, packet_recvmsg is called, which copies the data to user space.<span class="Apple-tab-span" style="white-space: pre;">	</span>&lt;copy&gt;</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>skb_recv_datagram() takes the skb from sk_receive_queue</div><div><span class="Apple-tab-span" style="white-space: pre;">	</span>skb_copy_datagram_iovec() copies the data to user space</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div>Cost: 2 copies + 1 system call (recvfrom)</div><div><br></div><div><i><font class="Apple-style-span" color="#ff0102">Note:</font></i> strictly speaking, there is one more copy before packet processing even starts: the NIC driver allocates an skb, and the NIC then DMAs the data into the skb's data area.</div><div>Some other zero-copy implementations (for example <a href="http://code.google.com/p/ntzc/" target="_blank">ntz</a>) do not pass NIC data to the protocol stack at all, so skb_shared is no longer a concern: the NIC driver DMAs the data straight into a designated memory region, which is then mmap'ed into user space. That leaves a single DMA transfer, though of course DMA is a kind of copy too ;-)</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><br></span></font></div><div>As for how a packet gets from the NIC driver to packet_rcv/tpacket_rcv: after interrupt and softirq processing it reaches netif_receive_skb, which dispatches the skb by calling the packet_type-&gt;func handlers registered through dev_add_pack.</div><div>For the packet receive path, see material on NAPI and related topics:</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">http://linux.chinaunix.net/bbs/thread-1133017-1-2.html</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><br></span></font></div><div>For how sk_data_ready leads to the user process's poll, see:</div><div>http://simohayha.javaeye.com/blog/559506</div><div>http://donghao.org/2009/08/linuxiapolliepollaueouaeaeeio.html</div></div><div><br></div>
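<div><br></div><div>The ring geometry rule from section 2 (frames_per_block * tp_block_nr == tp_frame_nr) and the "a frame cannot span two blocks" layout from section 3 can be worked through in a few lines of C. This is a minimal sketch, not the kernel's code: <code>struct ring_req</code>, <code>ring_geometry_ok</code> and <code>frame_offset</code> are hypothetical names made up for illustration, and the checks cover only the conditions described in this article (the kernel's packet_set_ring performs additional validation that is not reproduced here).</div>

```c
#include <assert.h>

/* Hypothetical mirror of the four tpacket_req fields, in the same order. */
struct ring_req {
    unsigned block_size;  /* tp_block_size */
    unsigned block_nr;    /* tp_block_nr   */
    unsigned frame_size;  /* tp_frame_size */
    unsigned frame_nr;    /* tp_frame_nr   */
};

/* Number of whole frames that fit in one block (0 if frame_size is 0). */
static unsigned frames_per_block(const struct ring_req *r)
{
    return r->frame_size ? r->block_size / r->frame_size : 0;
}

/* The consistency condition: frames_per_block * block_nr == frame_nr. */
static int ring_geometry_ok(const struct ring_req *r)
{
    unsigned fpb = frames_per_block(r);

    return fpb > 0 && fpb * r->block_nr == r->frame_nr;
}

/*
 * Byte offset of frame i (zero-based) inside the mapped area. Blocks are
 * contiguous after mmap, but any space left over at the end of a block
 * is skipped, because a frame never spans two blocks.
 */
static unsigned long frame_offset(const struct ring_req *r, unsigned i)
{
    unsigned fpb = frames_per_block(r);

    assert(fpb > 0 && i < r->frame_nr);
    return (unsigned long)(i / fpb) * r->block_size
         + (unsigned long)(i % fpb) * r->frame_size;
}
```

<div>With the example values from section 2 (4096, 4, 2048, 8) the frames are evenly spaced, so frame index 3 starts at byte 6144. With a frame size of 1536 instead, only two frames fit per block and frame index 2 starts at byte 4096 rather than 3072, which is why indexing the ring as base + index * frame_size is only safe when tp_frame_size divides tp_block_size.</div>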
		  ]]></description>
		</item>	
			<item>
			<title><![CDATA[An Analysis of the Changes in libpcap's Underlying Implementation]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2207614]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Fri, 02 Apr 2010 11:12:44 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
Quite by chance, I came across a slide deck about the Monkey family of libraries, "<a href="http://monkey.org/~jose/presentations/recon05.d/packet%20mastering%20-%20recon.pdf" target="_blank">Packet Mastering the Monkey Way</a>". It covers using libevent together with libpcap. Both have their own loop, so to combine them you have to drop libpcap's loop and feed the readiness events of libpcap's fd into libevent, letting libevent's loop drive everything.<div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px; "><br></span></font></div><div>Here is a snippet taken straight from jscan:</div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px;"><table style="border:1px solid #999;width:80%;font-size:12px;" align="center"><tbody><tr><td><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; font-size: 12px; "><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px; "><div>ctx.p = pcap_open_live(intf, 1500,&nbsp;(ctx.flags == SCAN_FLAGS_PASSIVE), 500,&nbsp;ebuff);</div></span></font></div><div><font class="Apple-style-span" size="3"><div style="font-size: 12px; ">event_init();</div><div style="font-size: 12px; ">ctx.tv.tv_sec = 0;</div><div style="font-size: 12px; ">ctx.tv.tv_usec = 500;</div><div style="font-size: 12px; "><i>p_fd = pcap_fileno(ctx.p);</i></div><div style="font-size: 12px; "><i>event_set(&amp;ctx.recv_ev, p_fd, EV_READ,&nbsp;_recv, (void *) &amp;ctx);</i></div><div style="font-size: 12px; "><i>event_add(&amp;ctx.recv_ev, &amp;ctx.tv);</i></div></font></div></span></span></span></td></tr></tbody></table></span></font></div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 
12px;"><div>After seeing this, I wanted to understand how libpcap is actually implemented. My guess was raw sockets, so I took a look with strace.</div><div><br></div><div><span class="Apple-style-span" style="font-size: large;"><font class="Apple-style-span" face="黑体">strace analysis</font></span></div><div>I ran strace on the C program written earlier for "<a href="http://blog.chinaunix.net/u/12592/showart_2079182.html" target="_blank">Trying out pypcap</a>":</div><div><div>execve("./t-1.1.0", ["./t-1.1.0", "eth0", "172.16.11.11", "./DHT_nodes.sav"], [/* 52 vars */]) = 0</div><div>brk(0) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0x8a2e000</div><div>access("/etc/ld.so.nohwcap", F_OK) &nbsp; &nbsp; &nbsp;= -1 ENOENT (No such file or directory)</div><div>mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fdf000</div><div>access("/etc/ld.so.preload", R_OK) &nbsp; &nbsp; &nbsp;= -1 ENOENT (No such file or directory)</div><div>open("/etc/ld.so.cache", O_RDONLY) &nbsp; &nbsp; &nbsp;= 3</div><div>fstat64(3, {st_mode=S_IFREG|0644, st_size=112216, ...}) = 0</div><div>mmap2(NULL, 112216, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7fc3000</div><div>close(3) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>access("/etc/ld.so.nohwcap", F_OK) &nbsp; &nbsp; &nbsp;= -1 ENOENT (No such file or directory)</div><div>open("/usr/lib/libpcap.so.0.8", O_RDONLY) = 3</div><div>read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000-\0\0004\0\0\0"..., 512) = 512</div><div>fstat64(3, {st_mode=S_IFREG|0644, st_size=182240, ...}) = 0</div><div>mmap2(NULL, 187136, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7f95000</div><div>mmap2(0xb7fc1000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2b) = 0xb7fc1000</div><div>close(3) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>access("/etc/ld.so.nohwcap", F_OK) &nbsp; &nbsp; &nbsp;= -1 
ENOENT (No such file or directory)</div><div>open("/lib/i686/cmov/libc.so.6", O_RDONLY) = 3</div><div>read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260l\1\0004\0\0\0"..., 512) = 512</div><div>fstat64(3, {st_mode=S_IFREG|0755, st_size=1331684, ...}) = 0</div><div>mmap2(NULL, 1337704, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7e4e000</div><div>mmap2(0xb7f8f000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x141) = 0xb7f8f000</div><div>mmap2(0xb7f92000, 10600, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7f92000</div><div>close(3) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e4d000</div><div>mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e4c000</div><div>set_thread_area({entry_number:-1 -&gt; 6, base_addr:0xb7e4db10, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0</div><div>mprotect(0xb7f8f000, 8192, PROT_READ) &nbsp; = 0</div><div>mprotect(0xb7ffe000, 4096, PROT_READ) &nbsp; = 0</div><div>munmap(0xb7fc3000, 112216) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>brk(0) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0x8a2e000</div><div>brk(0x8a4f000) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0x8a4f000</div><div>open("./DHT_nodes.sav", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3</div><div>socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4</div><div>ioctl(4, SIOCGIFADDR, {ifr_name="eth0", ???}) = -1 EADDRNOTAVAIL (Cannot assign requested address)</div><div>close(4) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>fstat64(1, {st_mode=S_IFCHR|0620, 
st_rdev=makedev(136, 4), ...}) = 0</div><div>mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fde000</div><div>write(1, "Device: eth0\n", 13Device: eth0</div><div>) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 13</div><div>write(1, "Filter: ip dst 172.16.11.11 and "..., 36Filter: ip dst 172.16.11.11 and udp</div><div>) = 36</div><div><font class="Apple-style-span" color="#FF0102">socket(PF_PACKET, SOCK_RAW, 768) &nbsp; &nbsp; &nbsp; &nbsp;= 4</font></div><div>ioctl(4, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0</div><div>ioctl(4, SIOCGIFHWADDR, {ifr_name="eth0", ifr_hwaddr=00:e0:60:b0:a3:f6}) = 0</div><div>ioctl(4, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=2}) = 0</div><div>bind(4, {sa_family=AF_PACKET, proto=0x03, if2, pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0</div><div>getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0</div><div>setsockopt(4, SOL_PACKET, PACKET_ADD_MEMBERSHIP, "\2\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16) = 0</div><div>setsockopt(4, SOL_PACKET, 0x8 /* PACKET_??? */, [1], 4) = 0</div><div>getsockopt(4, SOL_PACKET, 0xb /* PACKET_??? */, [28], [4]) = 0</div><div>setsockopt(4, SOL_PACKET, 0xa /* PACKET_??? */, [1], 4) = 0</div><div>setsockopt(4, SOL_PACKET, 0xc /* PACKET_??? 
*/, [4], 4) = 0</div><div><font class="Apple-style-span" color="#FF0102">setsockopt(4, SOL_PACKET, PACKET_RX_RING, "\0@\0\0\376\0\0\0@ \0\0\376\0\0\0", 16) = 0</font></div><div><font class="Apple-style-span" color="#FF0102">mmap2(NULL, 4161536, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0xb7a54000</font></div><div>setsockopt(4, SOL_SOCKET, SO_ATTACH_FILTER, "\1\0\0\0\204!\374\267", 8) = 0</div><div>fcntl64(4, F_GETFL) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = 0x2 (flags O_RDWR)</div><div><font class="Apple-style-span" color="#000102">fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) &nbsp;= 0</font></div><div>recv(4, 0xbfb8183f, 1, MSG_TRUNC) &nbsp; &nbsp; &nbsp; = -1 EAGAIN (Resource temporarily unavailable)</div><div>fcntl64(4, F_SETFL, O_RDWR) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = 0</div><div>setsockopt(4, SOL_SOCKET, SO_ATTACH_FILTER, "\16\0\374\267\240\350\242\10", 8) = 0</div><div><font class="Apple-style-span" color="#FF0102">poll([{fd=4, events=POLLIN}], 1, -1^C &lt;unfinished ...&gt;</font></div></div><div><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px;"><br></span></span></div><div><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px;"><span class="Apple-style-span" style="font-family: 黑体; font-size: large; ">Tracing the libpcap-0.9.8 source</span></span></span></div><div>I happened to have a copy of the libpcap-0.9.8 source at hand, so I decided to follow the code and check whether it matches what strace shows. Disappointingly, although I am fairly confident in my ability to read source code, I could not find any call to poll ;-(</div><div><font class="Apple-style-span" face="黑体"><i>pcap_loop in pcap.c:</i></font></div><div><div>/*</div><div>&nbsp;* XXX keep reading until we get something</div><div>&nbsp;* (or an error 
occurs)</div><div>&nbsp;*/</div><div>do {</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>n = p-&gt;read_op(p, cnt, callback, user);</div><div>} while (n == 0);</div></div><div><br></div><div><font class="Apple-style-span" face="黑体"><i>pcap_open_live in pcap-linux.c:</i></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">handle-&gt;read_op = pcap_read_linux;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><br></span></font></div><div><font class="Apple-style-span" face="黑体"><i>pcap_read_linux in pcap-linux.c:</i></font></div><div><div>static int</div><div>pcap_read_linux(pcap_t *handle, int max_packets, pcap_handler callback, u_char *user)</div><div>{</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>/*</div><div><span class="Apple-tab-span" style="white-space:pre">	</span> * Currently, on Linux only one packet is delivered per read,</div><div><span class="Apple-tab-span" style="white-space:pre">	</span> * so we don't loop.</div><div><span class="Apple-tab-span" style="white-space:pre">	</span> */</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>return pcap_read_packet(handle, callback, user);</div><div>}</div></div><div><br></div><div><i><font class="Apple-style-span" face="黑体">pcap_read_packet in pcap-linux.c:</font></i></div><div><div>do {</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>/*</div><div><span class="Apple-tab-span" style="white-space:pre">	</span> * Has "pcap_breakloop()" been called?</div><div><span class="Apple-tab-span" style="white-space:pre">	</span> */</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>if (handle-&gt;break_loop) {</div><div><span class="Apple-tab-span" style="white-space:pre">		</span>/*</div><div><span class="Apple-tab-span" style="white-space:pre">		</span> * Yes - clear the flag that 
indicates that it</div><div><span class="Apple-tab-span" style="white-space:pre">		</span> * has, and return -2 as an indication that we</div><div><span class="Apple-tab-span" style="white-space:pre">		</span> * were told to break out of the loop.</div><div><span class="Apple-tab-span" style="white-space:pre">		</span> */</div><div><span class="Apple-tab-span" style="white-space:pre">		</span>handle-&gt;break_loop = 0;</div><div><span class="Apple-tab-span" style="white-space:pre">		</span>return -2;</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>}</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>fromlen = sizeof(from);</div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" color="#FF0102">	</font></span><font class="Apple-style-span" color="#FF0102">packet_len = recvfrom(</font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" color="#FF0102">		</font></span><font class="Apple-style-span" color="#FF0102">handle-&gt;fd, bp + offset,</font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" color="#FF0102">		</font></span><font class="Apple-style-span" color="#FF0102">handle-&gt;bufsize - offset, MSG_TRUNC,</font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" color="#FF0102">		</font></span><font class="Apple-style-span" color="#FF0102">(struct sockaddr *) &amp;from, &amp;fromlen);</font></div><div>} while (packet_len == -1 &amp;&amp; errno == EINTR);</div></div><div><br></div><div><br></div><div>So libpcap-0.9.8 does not call poll as I had expected. I then checked the libpcap installed on the system and found that it is libpcap-1.0.0. Here is the strace output of the same program built against libpcap-0.9.8:</div><div><div>execve("./t-0.9.8", ["./t-0.9.8", "eth0", "172.16.11.11", "./DHT_nodes.sav"], [/* 52 vars */]) = 0</div><div>brk(0) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 
0x84e7000</div><div>access("/etc/ld.so.nohwcap", F_OK) &nbsp; &nbsp; &nbsp;= -1 ENOENT (No such file or directory)</div><div>mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7ef9000</div><div>access("/etc/ld.so.preload", R_OK) &nbsp; &nbsp; &nbsp;= -1 ENOENT (No such file or directory)</div><div>open("/etc/ld.so.cache", O_RDONLY) &nbsp; &nbsp; &nbsp;= 3</div><div>fstat64(3, {st_mode=S_IFREG|0644, st_size=112216, ...}) = 0</div><div>mmap2(NULL, 112216, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7edd000</div><div>close(3) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>access("/etc/ld.so.nohwcap", F_OK) &nbsp; &nbsp; &nbsp;= -1 ENOENT (No such file or directory)</div><div>open("/lib/i686/cmov/libc.so.6", O_RDONLY) = 3</div><div>read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260l\1\0004\0\0\0"..., 512) = 512</div><div>fstat64(3, {st_mode=S_IFREG|0755, st_size=1331684, ...}) = 0</div><div>mmap2(NULL, 1337704, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7d96000</div><div>mmap2(0xb7ed7000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x141) = 0xb7ed7000</div><div>mmap2(0xb7eda000, 10600, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7eda000</div><div>close(3) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d95000</div><div>set_thread_area({entry_number:-1 -&gt; 6, base_addr:0xb7d958d0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0</div><div>mprotect(0xb7ed7000, 8192, PROT_READ) &nbsp; = 0</div><div>mprotect(0xb7f18000, 4096, PROT_READ) &nbsp; = 0</div><div>munmap(0xb7edd000, 112216) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>brk(0) &nbsp; &nbsp; &nbsp; &nbsp; 
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0x84e7000</div><div>brk(0x8508000) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0x8508000</div><div>open("./DHT_nodes.sav", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3</div><div>socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4</div><div>ioctl(4, SIOCGIFADDR, {ifr_name="eth0", ???}) = -1 EADDRNOTAVAIL (Cannot assign requested address)</div><div>close(4) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 0</div><div>fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0</div><div>mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7ef8000</div><div>write(1, "Device: eth0\n", 13Device: eth0</div><div>) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= 13</div><div>write(1, "Filter: ip dst 172.16.11.11 and "..., 36Filter: ip dst 172.16.11.11 and udp</div><div>) = 36</div><div><font class="Apple-style-span" color="#FF0102">socket(PF_PACKET, SOCK_RAW, 768) &nbsp; &nbsp; &nbsp; &nbsp;= 4</font></div><div>ioctl(4, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0</div><div>ioctl(4, SIOCGIFHWADDR, {ifr_name="eth0", ifr_hwaddr=00:e0:60:b0:a3:f6}) = 0</div><div>ioctl(4, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=2}) = 0</div><div>bind(4, {sa_family=AF_PACKET, proto=0x03, if2, pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0</div><div>getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0</div><div>setsockopt(4, SOL_PACKET, PACKET_ADD_MEMBERSHIP, "\2\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16) = 0</div><div>setsockopt(4, SOL_SOCKET, SO_ATTACH_FILTER, "\1\0\0\0\250\265\6\10", 8) = 0</div><div>fcntl64(4, F_GETFL) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = 0x2 (flags O_RDWR)</div><div>fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) &nbsp;= 0</div><div>recv(4, 0xbfebc55b, 1, MSG_TRUNC) &nbsp; &nbsp; &nbsp; = -1 EAGAIN (Resource temporarily 
unavailable)</div><div>fcntl64(4, F_SETFL, O_RDWR) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = 0</div><div>setsockopt(4, SOL_SOCKET, SO_ATTACH_FILTER, "\n\0\4\10H\224N\10", 8) = 0</div><div><font class="Apple-style-span" color="#FF0102">recvfrom(4, ^C &lt;unfinished ...&gt;</font></div></div><div><br></div><div><i>Note: the libpcap changelog contains this entry:</i></div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">Mon. &nbsp; &nbsp;October 27, 2008. &nbsp;ken@netfunctional.ca. &nbsp;Summary for 1.0.0 libpcap release</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">Support for zerocopy BPF on platforms that support it</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><span class="Apple-style-span" style="font-size: 12px; "><br></span></span></font></div><div>So zerocopy BPF was introduced in libpcap-1.0.0. But what is this zerocopy BPF?</div><div><br></div><div><span class="Apple-style-span" style="font-size: medium;"><font class="Apple-style-span" face="黑体">PACKET_MMAP</font></span></div><div>Comparing the strace output of the programs built against the two libpcap versions, there is one more difference besides poll, in the setsockopt calls:</div><div><div><font class="Apple-style-span" color="#FF0102">setsockopt(4, SOL_PACKET, PACKET_RX_RING, "\0@\0\0\376\0\0\0@ \0\0\376\0\0\0", 16) = 0</font></div><div><font class="Apple-style-span" color="#FF0102">mmap2(NULL, 4161536, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0xb7a54000</font></div></div><div>Let's guess from the names:</div><div>setsockopt sets the PACKET_RX_RING option on the socket. From the name alone we can only guess that it relates to a receive ring buffer; the details depend on the remaining arguments.</div><div>mmap2 maps a region of kernel memory into user space, so the user-space program can work directly on the data in the kernel buffer. How the data gets into that kernel buffer is what the underlying zero-copy implementation takes care of.</div>
<div><br></div><div>After some reading, it turns out that on Linux this zero-copy capture mechanism is called PACKET_MMAP, formerly also known as PACKET_RING. In the kernel config file it appears as:</div><div>CONFIG_PACKET_MMAP=y</div><div><br></div><div>There used to be a separate PACKET_MMAP version of libpcap, but libpcap-1.0.0 added PACKET_MMAP/PACKET_RING support for some platforms. The earlier PACKET_MMAP version of libpcap is at:</div><div><a href="http://public.lanl.gov/cpw/">http://public.lanl.gov/cpw/</a></div><div><br></div><div><span class="Apple-style-span" style="font-size: medium; "><font class="Apple-style-span" face="黑体">Tracing the libpcap-1.0.0 source</font></span></div><div><div><i><font class="Apple-style-span" face="黑体">pcap_loop in pcap.c:</font></i></div><div><div></div><div><div>/*</div><div>&nbsp;* XXX keep reading until we get something</div><div>&nbsp;* (or an error occurs)</div><div>&nbsp;*/</div><div>do {</div><div><span class="Apple-tab-span" style="white-space: pre; "><font class="Apple-style-span" color="#FF0102">	</font></span><font class="Apple-style-span" color="#FF0102">n = p-&gt;read_op(p, cnt, callback, user);</font></div><div>} while (n == 0);</div></div></div></div><div><br></div><div><font class="Apple-style-span" face="黑体"><i>pcap_open_live in pcap.c:</i></font></div><div><div><span class="Apple-tab-span" style="white-space:pre">	</span>p = pcap_create(source, errbuf);</div></div><div>&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;......</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>status = pcap_activate(p);</div><div><br></div><div><font class="Apple-style-span" face="黑体"><i>pcap_create in pcap-linux.c:</i></font></div><div><div><span class="Apple-tab-span" style="white-space:pre">	</span>handle = pcap_create_common(device, ebuf);</div><div><span class="Apple-tab-span" style="white-space:pre">	</span>if (handle == NULL)</div><div><span class="Apple-tab-span" style="white-space:pre">		</span>return NULL;</div></div><div><div><span class="Apple-tab-span" style="white-space:pre">	</span>handle-&gt;activate_op = pcap_activate_linux;</div></div><div><br></div><div><font 
class="Apple-style-span" face="黑体"><i>pcap_create_common in pcap.c:</i></font></div><div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" color="#FF0102">	</font></span><font class="Apple-style-span" color="#FF0102">p-&gt;read_op = (read_op_t)pcap_not_initialized;</font></div></div><div><div><br></div><div><font class="Apple-style-span" face="黑体"><i>pcap_activate in pcap.c:</i></font></div></div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">int</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">pcap_activate(pcap_t *p)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">{</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">int status;</span></font></div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">status = p-&gt;activate_op(p);</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">if (status &gt;= 0)</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" 
size="3"><span class="Apple-style-span" style="font-size: 13px;">p-&gt;activated = 1;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">return (status);</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">}</span></font></div></div><div><br></div><div>At this point the call actually goes to the activate_op set in pcap_create, namely pcap_activate_linux.</div><div><br></div><div><font class="Apple-style-span" face="黑体"><i>pcap_activate_linux in pcap-linux.c:</i></font></div><div><div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#FF0102">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#FF0102">handle-&gt;read_op = pcap_read_linux;</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">......</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">/*</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span 
class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902"> * Current Linux kernels use the protocol family PF_PACKET to</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902"> * allow direct access to all packets on the network while</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902"> * older kernels had a special socket type SOCK_PACKET to</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902"> * implement this feature.</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" 
style="font-size: 13px;"><font class="Apple-style-span" color="#009902"> * While this old implementation is kind of obsolete we need</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902"> * to be compatible with older kernels for a while so we are</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902"> * trying both methods with the newer method preferred.</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#009902"> */</font></span></font></div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">if ((status = activate_new(handle)) == 1) {</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		
</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">activate_ok = 1;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">/*</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"> * Try to use memory-mapped access.</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"> */</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#FF0102">		</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#FF0102">if (activate_mmap(handle) == 1)</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">			</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">return 0;</span></font><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font 
class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">/* we succeeded; nothing more to do */</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">}</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">else if (status == 0) {</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">/* Non-fatal error; try old way */</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">if ((status = activate_old(handle)) == 1)</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">			</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">activate_ok = 1;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 
13px;">}</span></font></div></div></div><div><br></div><div>I won't analyze activate_new in detail; it simply creates a socket that uses PF_PACKET.</div><div><div><font class="Apple-style-span" face="黑体"><i>activate_new in pcap-linux.c:</i></font></div><div>/*</div><div>&nbsp;* Try to open a packet socket using the new kernel PF_PACKET interface.</div><div>&nbsp;* Returns 1 on success, 0 on an error that means the new interface isn't</div><div>&nbsp;* present (so the old SOCK_PACKET interface should be tried), and a</div><div>&nbsp;* PCAP_ERROR_ value on an error that means that the old mechanism won't</div><div>&nbsp;* work either (so it shouldn't be tried).</div><div>&nbsp;*/</div><div>static int</div><div>activate_new(pcap_t *handle)</div></div><div><br></div><div><font class="Apple-style-span" face="黑体"><i>activate_mmap in pcap-linux.c:</i></font></div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">static int&nbsp;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">activate_mmap(pcap_t *handle)</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">{</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#ifdef HAVE_PACKET_RING</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">int ret;</span></font></div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">if 
(handle-&gt;opt.buffer_size == 0) {</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">/* by default request 2M for the ring buffer */</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">handle-&gt;opt.buffer_size = 2*1024*1024;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">}</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">ret = prepare_tpacket_socket(handle);</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">if (ret == 0)</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">return ret;</span></font></div><div><span class="Apple-tab-span" 
style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">ret = create_ring(handle);</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">if (ret == 0)</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">		</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">return ret;</span></font></div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">/* override some defaults and inherit the other fields from</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"> * activate_new</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"> * handle-&gt;offset is used to get the current position into the rx ring&nbsp;</span></font></div><div><span class="Apple-tab-span" 
style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"> * handle-&gt;cc is used to store the ring size */</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#FF0102">	</font></span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><font class="Apple-style-span" color="#FF0102">handle-&gt;read_op = pcap_read_linux_mmap;</font></span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">handle-&gt;cleanup_op = pcap_cleanup_linux_mmap;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">handle-&gt;setfilter_op = pcap_setfilter_linux_mmap;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">handle-&gt;setnonblock_op = pcap_setnonblock_mmap;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span 
class="Apple-style-span" style="font-size: 13px;">handle-&gt;getnonblock_op = pcap_getnonblock_mmap;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">handle-&gt;selectable_fd = handle-&gt;fd;</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">return 1;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#else /* HAVE_PACKET_RING */</span></font></div><div><span class="Apple-tab-span" style="white-space:pre"><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">	</span></font></span><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">return 0;</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">#endif /* HAVE_PACKET_RING */</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">}</span></font></div><div><br></div></div><div>At this point we have finally found the read_op that pcap_loop runs ;-)</div><div><br></div><div><font class="Apple-style-span" face="黑体"><i>pcap_read_linux_mmap in pcap-linux.c:</i></font></div><div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;">ret = poll(&amp;pollinfo, 1, (handle-&gt;md.timeout &gt; 0)?&nbsp;handle-&gt;md.timeout: -1);</span></font></div></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 
13px;">......</span></font></div><div><div>h.raw = pcap_get_ring_frame(handle, TP_STATUS_USER);</div></div><div>......</div><div><div>callback(user, &amp;pcaphdr, bp);</div></div><div><br></div><div><br></div><div>When poll reports that the socket is readable, meaning the ring buffer holds data, pcap_get_ring_frame fetches the next frame, the frame header is processed, and callback is invoked to handle the packet.</div><div><br></div><div>That completes the walk through the libpcap-1.0.0 call flow. The core logic lives in activate_mmap and the functions it calls; for example, <span class="Apple-style-span" style="font-family: song, Verdana; font-size: 13px; border-collapse: collapse; ">prepare_tpacket_socket and create_ring issue the various setsockopt calls seen in the strace output. <span class="Apple-style-span" style="border-collapse: separate; font-family: 'Courier New', Courier, 宋体; font-size: 12px; ">Anyone reinventing this wheel should use that part as a reference.</span></span></div></div><div><br></div><div><br></div><div><span class="Apple-style-span" style="font-size: medium;"><font class="Apple-style-span" face="黑体">References</font></span>:</div><div><a href="http://www.tcpdump.org/libpcap-changes.txt">http://www.tcpdump.org/libpcap-changes.txt</a></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px; ">linux-2.6.30/Documentation/networking/packet_mmap.txt</span></font></div><div><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 13px;"><a href="http://hi.baidu.com/ah__fu/blog/item/8aadf895fad570007af48000.html">http://hi.baidu.com/ah__fu/blog/item/8aadf895fad570007af48000.html</a></span></font></div><div><font class="Apple-style-span" size="3"><a href="http://public.lanl.gov/cpw/">http://public.lanl.gov/cpw/</a></font></div></span></font></div></div>
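<div>The read path described in this post (wait until the ring has data, take every frame marked TP_STATUS_USER, run the callback, hand the slot back to the kernel) can be sketched as a toy model in Python. This is only an illustration of the control flow, not libpcap's code: the ring here is a plain list of dicts rather than mmap'ed kernel memory, and the names read_ring, ring and offset are made up for this sketch.</div>
<pre>
```python
# Toy model of the consumer side of a packet ring (assumed layout, not libpcap's).
TP_STATUS_KERNEL = 0  # slot is owned by the kernel, nothing to read
TP_STATUS_USER = 1    # slot holds a packet ready for user space


def read_ring(ring, offset, callback):
    """Hand every ready frame to callback, starting at offset.

    Returns the new offset and the number of frames consumed.
    """
    handled = 0
    # Stop at the first slot that the kernel still owns.
    while ring[offset]["status"] == TP_STATUS_USER:
        callback(ring[offset]["data"])
        ring[offset]["status"] = TP_STATUS_KERNEL  # give the slot back
        offset = (offset + 1) % len(ring)          # wrap around the ring
        handled += 1
    return offset, handled


# Four slots, two of them filled by the (imaginary) kernel.
ring = [{"status": TP_STATUS_KERNEL, "data": b""} for _ in range(4)]
ring[0] = {"status": TP_STATUS_USER, "data": b"pkt-a"}
ring[1] = {"status": TP_STATUS_USER, "data": b"pkt-b"}
seen = []
print(read_ring(ring, 0, seen.append))  # (2, 2)
```
</pre>
<div>Resetting the status after the callback is what lets the producer reuse the slot, so the loop always ends once it has gone around a full ring.</div>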
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		  ]]></description>
		</item>	
			<item>
			<title><![CDATA[Notes on Gearman]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2207481]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Tue, 03 Aug 2010 06:26:53 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
<div>Since 2008, so-called cloud computing has been in fashion, bringing a wave of new things: distributed computing models, distributed message queues, distributed storage systems.</div>
<div><br></div>
<div>Gearman, going by its name, is a "gear man": it meshes different components together like gears. Integrating multiple languages and systems is usually one of the more painful parts of a project. The common approach is an RPC-style or REST-style web service, but that always feels cumbersome. Gearman was created to fill this gap: as a task-distribution framework, it makes it easy to hand front-end tasks to back-end Workers through a Job Server. Handling a Gearman request involves three roles: Client -&gt; Job Server -&gt; Worker.</div>
<div><br></div>
<div>Client: the originator of a request; it can be written in C, PHP, Perl, a MySQL UDF, and so on.</div>
<div>Job Server: the dispatcher of requests; it forwards each request from a Client to a suitable Worker.</div>
<div>Worker: the processor of a request; it can be written in C, PHP, Perl, and so on.</div>
<div><br></div>
<div><strong>How it works:</strong></div>
<div align="center"><img src="http://blogimg.chinaunix.net/blog/upfile2/100402115539.png" onload="javascript:if(this.width>500)this.width=500;" border="0"></div>
<div><br></div>
<div><strong>Workflow:</strong></div>
<div align="center"><img src="http://blogimg.chinaunix.net/blog/upfile2/100402115652.png" onload="javascript:if(this.width>500)this.width=500;" border="0"></div>
<div><br></div>
<div>Because the Client and the Worker need not be written in the same language, Gearman helps integrate multi-language, multi-system setups.</div>
<div>By adding more Workers we can even build a distributed, load-balanced architecture for an application with little effort.</div>
<div><strong>Cluster architecture:</strong></div>
<div align="center"><img src="http://blogimg.chinaunix.net/blog/upfile2/100402115747.png" onload="javascript:if(this.width>500)this.width=500;" border="0"></div>
<div><br></div>
<div>Some notes on distributed task processing with Gearman:</div>
<div>1. The processing time of each individual task does not go down; it goes up slightly, mainly because of the time spent moving data over the network.</div>
<div>2. The load on the front-end Client (usually a web server) drops, but it moves to the back-end Workers.</div>
<div>Computation cannot vanish into thin air; it merely moves from one machine to another.</div>
<div><br></div>
<div>3. In synchronous mode, how long the front-end Client (usually a web server) waits depends on the number of back-end Workers and the number of tasks currently outstanding. If the number of tasks &lt;= the number of Workers, the Client waits roughly the time it takes to process one task.</div>
<div>Once the number of tasks &gt; the number of Workers, some Clients have to wait: a Client's task is handed over only when a Worker becomes idle.</div>
<div><br></div>
<div>Consider the case where the number of tasks &lt;= the number of Workers: here Gearman does shorten the response time. On a single machine, all N tasks still run on that one machine.</div>
<div><br></div>
<div>Suppose there are N tasks (Clients) and M Workers, and each task takes time t. Without Gearman, with the tasks run one after another, the total time is N*t and the average wait is about N*t/2 (exactly (N+1)*t/2).</div>
<div>With Gearman and N &lt;= M, the total time is t and the average wait is t; with N &gt; M, the total time is (N/M)*t when M divides N and (N/M+1)*t otherwise, using integer division.</div>
<div><br></div>
<div>test.sh</div>
<div>#!/bin/sh</div>
<div>sleep 10</div>
<div>echo "TEST"</div>
<div><br></div>
<div>Start 2 Workers, each with:</div>
<div>./gearman -w -f test /home/wangyao/test.sh</div>
<div><br></div>
<div>Start 3 Clients, each with:</div>
<div>date;./gearman -f test;date</div>
<div>The three outputs are:</div>
<div>--------------------------------------------</div>
<div>Wed Mar 17 14:41:31 CST 2010</div>
<div>TEST&nbsp;</div>
<div>Wed Mar 17 14:41:41 CST 2010</div>
<div>--------------------------------------------</div>
<div>Wed Mar 17 14:41:32 CST 2010</div>
<div>TEST&nbsp;</div>
<div>Wed Mar 17 14:41:42 CST 2010</div>
<div>--------------------------------------------</div>
<div>Wed Mar 17 14:41:34 CST 2010</div>
<div>TEST&nbsp;</div>
<div>Wed Mar 17 14:41:51 CST 2010</div>
<div>--------------------------------------------</div>
<div>The first two Clients' tasks each took 10 s, but the third took 17 s. When the third task was submitted at 14:41:34, no Worker was idle, so it had to wait for one. The first Worker became free at 14:41:41; by then 7 s had already passed, and the task itself needed another 10 s, so the third task took 17 s in total.</div>
<div><br></div>
<div>4. In asynchronous mode, the front-end Client returns as soon as the task has been handed off to the back end, which gives a better user experience. The open issues are failed tasks and how results get reported back.</div>
<div>The usual approach is to use a database as storage: the task's status, result and so on are written to the database, and the front-end Client checks it periodically and then decides what to do next, either presenting the result to the user or re-running the task.</div>
<div><br></div>
<div>A practical example:</div>
<div>The Client sends its log to dedicated log servers:</div>
<div>tail -f access_log | gearman -h host -f logger</div>
<div><br></div>
<div>There may be several log servers; each writes the log data it receives to a file:</div>
<div>gearman -w -h host -f logger &gt; log_file</div>
<div><br></div>
<div>For a distributed grep, register one function on each log server:</div>
<div>gearman -w -h host -f logger1 ./dgrep.sh</div>
<div><br></div>
<div>dgrep.sh</div>
<div>#!/bin/sh</div>
<div>read -r pattern</div>
<div>grep -- "$pattern" log_file</div>
<div><br></div>
<div>Likewise, create logger2, logger3, and so on, on the other log servers.</div>
<div><br></div>
<div>The logs now sit on the log servers. To run a distributed grep over them from another machine, act as a client and call the worker functions logger1, logger2, logger3, and so on.</div>
<div>gearman -h host -f logger1 -f logger2 -f logger3 KEYWORD&nbsp;</div>
<div>This greps for KEYWORD on all the machines.</div>
<div><br></div>
<div>This approach is not very flexible: every machine needs its own function, so the layout is not transparent to the caller.</div>
<div><br></div>
<div>Overall, Gearman suits applications where the number of tasks far exceeds the number of workers. In theory, moving the computation to the Workers lets tasks run concurrently, which shows up as a lighter compute load on the client and shorter waits for users.</div>
<div>What the Client calls a task, the Worker calls a job.</div>
<div><br></div>
<div><i>Postscript</i>:</div>
<div>The Gearman source is worth reading and helps when designing high-concurrency network programs. Its architecture is much like the one in my current project, but one aspect stands out: a single logical thread maintains the mapping table, and system calls are kept to a minimum.<br>Gearman's messaging is one-to-one; one-to-many is not possible, since a client's job reaches exactly one worker through the job server. For one-to-many you must define a separate function per worker and send to each in turn, which is very inconvenient. ZeroMQ does better here: it supports many messaging patterns and can meet all sorts of message-queue needs.<br></div>
<div><br></div>
<div>References:</div>
<div>http://hqlong.com/2010/01/1222.html#more-1222</div>
<div>http://timyang.net/linux/gearman-monitor/</div>
<div>http://blog.s135.com/dips/</div>
<div>http://oddments.org/notes/GearmanOSCONTutorial2009.pdf</div>
<div><br></div>
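<div>The timing experiment in this post (two Workers, three Clients, a 10 s task) can be reproduced with a small simulation. This is only a sketch under simple assumptions: every task takes the same time, a task goes to whichever Worker becomes idle first, and network overhead is ignored. The function name finish_times is made up for this example and is not part of Gearman.</div>
<pre>
```python
import heapq


def finish_times(arrivals, workers, task_time):
    """Return the completion time of each task, in arrival order."""
    # Min-heap of the times at which each worker is next idle.
    idle_at = [0] * workers
    heapq.heapify(idle_at)
    done = []
    for arrival in sorted(arrivals):
        # A task starts when it has arrived and a worker is free.
        start = max(arrival, heapq.heappop(idle_at))
        end = start + task_time
        heapq.heappush(idle_at, end)
        done.append(end)
    return done


# Clients submitted at 14:41:31, :32 and :34, i.e. at 0, 1 and 3 seconds.
arrivals = [0, 1, 3]
ends = finish_times(arrivals, workers=2, task_time=10)
print([end - start for start, end in zip(arrivals, ends)])  # [10, 10, 17]
```
</pre>
<div>With all tasks arriving at once, the last completion time matches the (N/M)*t or (N/M+1)*t estimate: five tasks on two Workers finish after 30 s.</div>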
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		  ]]></description>
		</item>	
			<item>
			<title><![CDATA[Setting Up a Cacti Monitoring System on RHEL5]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2207391]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Fri, 02 Apr 2010 03:25:57 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
Installing Cacti monitoring on our server cluster was last month's work; today, to kill some idle time, I am writing up what I did.<div>I am used to Debian-based systems, but the project platform is RHEL5. The main reason is that several organizations share it and the others are all in the Fedora camp, so RHEL5 was chosen.</div>
<div><br></div>
<div>This article has three parts: installing and configuring yum, installing and configuring SNMP, and installing and configuring Cacti.</div>
<div><br></div>
<div><span class="Apple-style-span" style="font-size: large;"><font class="Apple-style-span" face="黑体">Installing and configuring yum</font></span></div>
<div>Yum can be configured by following the steps in reference 1, using the CentOS repositories. The idea is much the same as on Debian: set up the source list and import the key.</div>
<div>I will not go into detail here.</div>
<div><br></div>
<div><span class="Apple-style-span" style="font-size: large;"><font class="Apple-style-span" face="黑体">Installing and configuring SNMP</font></span></div>
<div>With yum configured, you can enjoy installing software with yum to your heart's content ;-)</div>
<div>Install the SNMP packages:</div>
<div><table style="border:1px solid #999;width:80%;font-size:12px;" align="center"><tbody><tr><td>yum install net-snmp net-snmp-utils net-snmp-devel -y</td></tr></tbody></table></div>
<div><br></div>
<div><font class="Apple-style-span" face="黑体">Configuring SNMP</font></div>
<div>Configuring SNMP mostly comes down to setting the access permissions for the different SNMP protocol versions.</div>
<div>Open the default /etc/snmp/snmpd.conf and make the following changes:</div>
<div><br></div>
<div>1. Find the following entry:</div>
<div># sec.name source community</div>
<div>com2sec notConfigUser default public</div>
<div>Change the "community" field to the password you want, for example "public".</div>
<div>Change "default" to the machine that should be allowed to read your SNMP data, for example 192.168.1.254.</div>
<div><br></div>
<div>2. Find the following entry:</div>
<div>####</div>
<div># Finally, grant the group read-only access to the systemview view.</div>
<div># group context sec.model sec.level prefix read write notif</div>
<div>access notConfigGroup "" any noauth exact all none none</div>
<div>Here the "read" field has been changed to all. The original line, now commented out, was:</div>
<div><br></div>
<div>#access notConfigGroup "" any noauth exact systemview none none</div>
<div><br></div>
<div>3. Find the following entry:</div>
<div>## incl/excl subtree mask</div>
<div>#view all included .1 80</div>
<div>Remove the "#" at the start of that line.</div>
<div><br></div>
<div>For security, you can create a read-only account, that is, a read-only community, by adding the following to snmpd.conf:</div>
<div>rocommunity public 192.168.1.254</div>
<div>The middle field acts as the password; by convention it is usually set to public.</div>
<div><br></div>
<div>Test it from the monitoring host 192.168.1.254:</div>
<div><table style="border:1px solid #999;width:80%;font-size:12px;" align="center"><tbody><tr><td>snmpwalk -v 1 -c public 192.168.1.13 system</td></tr></tbody></table></div>
<div><i>Note: the version here can be set to 1 or 2c; for version 3, see the settings in the 监控宝 (Jiankongbao) wiki in reference 6.</i></div>
<div><br></div>
<div><br></div>
<div><span class="Apple-style-span" style="font-size: large;"><font class="Apple-style-span" face="黑体">Installing and configuring Cacti</font></span></div>
<div>Once SNMP is configured, Cacti can be installed. Cacti is a network traffic monitoring and graphing tool built on PHP, MySQL, SNMP and RRDTool. It collects data with snmpget, stores it in rrd files, and uses RRDtool to read those files and draw the graphs.</div>
<div><br></div>
<div>1. Install Cacti</div>
<div># Unpack the tarball and move it into your web directory</div>
<div>tar xvf cacti-0.8.7e.tar.gz</div>
<div>mv cacti-0.8.7e /var/www/html/cacti</div>
<div><br></div>
<div>2. Create the database, grant privileges, and import the schema</div>
<div># Mind the path to cacti.sql when importing it</div>
<div>mysql -p</div>
<div>mysql&gt; create database cacti;</div>
<div>mysql&gt; grant all privileges on cacti.* to cacti@localhost identified by 'cacti' with grant option;</div>
<div>mysql&gt; grant all privileges on cacti.* to cacti@127.0.0.1 identified by 'cacti' with grant option;</div>
<div>mysql&gt; use cacti;</div>
<div>mysql&gt; source /var/www/html/cacti/cacti.sql;</div>
<div># Configure Cacti's database connection</div>
<div>vim /var/www/html/cacti/include/config.php</div>
<div><br></div>
<div>3. Configure in the browser</div>
<div># Open http://192.168.1.254/cacti in a browser; the Cacti installation guide appears, and it will not show again once setup is complete.</div>
<div># Click "Next"</div>
<div># Select "New Install" and click "Next"</div>
<div># Specify the binary paths for rrdtool, php and the snmp tools; make sure every path shows "FOUND" and none shows "NOT FOUND", then click Finish to complete the installation.</div>
<div><br></div>
<div>Note: Cacti's default user name and password are both admin. Enter them and click login.</div>
<div>For security, after the first successful login Cacti forces you to choose a new password. Enter and confirm the new password, then click save to reach the Cacti console.</div>
<div><br></div>
<div>4. Add a cron job so that Cacti generates the monitoring graphs every five minutes.</div>
<div>crontab -e</div>
<div># Add the following line. Mind the path to poller.php</div>
<div>*/5 * * * * php /var/www/html/cacti/poller.php &gt; /dev/null 2&gt;&amp;1</div>
<div># Make sure the directory /var/www/html/cacti/rra/ exists</div>
<div># If no graphs show up yet, run the poller by hand to generate them</div>
<div>#php /var/www/html/cacti/poller.php &gt; /dev/null 2&gt;&amp;1</div>
<div><br></div>
<div><br></div>
<div>For some usage problems with Cacti, see:</div>
<div><a href="http://blog.sina.com.cn/s/blog_4e424e2101000b6o.html">http://blog.sina.com.cn/s/blog_4e424e2101000b6o.html</a></div>
<div>Cacti timing out while collecting netstat information:</div>
<div><a href="http://polygun2000.spaces.live.com/blog/cns!182B490BAC7D9686!404.entry">http://polygun2000.spaces.live.com/blog/cns!182B490BAC7D9686!404.entry</a></div>
<div>Fixing gaps in Cacti traffic graphs</div>
<div><a href="http://hi.baidu.com/qu6zhi/blog/item/3dab3d03c21cb47e3912bbb3.html">http://hi.baidu.com/qu6zhi/blog/item/3dab3d03c21cb47e3912bbb3.html</a></div>
<div>Common Cacti debugging methods</div>
<div><a href="http://guanhongbo2006.spaces.live.com/blog/cns!C2C4DE09B46D38D9!317.entry">http://guanhongbo2006.spaces.live.com/blog/cns!C2C4DE09B46D38D9!317.entry</a></div>
<div><br></div>
<div><font class="Apple-style-span" face="黑体"><i>Postscript</i></font>: As monitoring software goes, I do not think Cacti is necessarily the best, because some features are a struggle to implement with it. I once wanted to count the TCP connections on each machine and found a few templates, but unfortunately they all fetch the complete connection table over SNMP and do the counting on the monitoring host. With many connections (our application routinely has tens of thousands) this often times out. If each monitored machine computed its own connection statistics first and only the summary were sent to the monitoring host, bandwidth would be saved.</div>
<div>There is another class of monitoring systems that are not based on SNMP but implement their own client and server; to collect a given kind of data you only need to write the corresponding plugin. Munin is one example.</div>
<div>Other well-known systems include Nagios and cfengine; if you only monitor one or two machines, 监控宝 (Jiankongbao) is a decent choice ;-)</div>
<div><br></div>
<div><span class="Apple-style-span" style="font-size: large; "><font class="Apple-style-span" face="黑体">References</font></span></div>
<div>http://www.blogjava.net/Skynet/archive/2009/04/29/268105.html</div>
<div>http://purpen.javaeye.com/blog/380645</div>
<div>http://forums.solidhost.com/showthread.php?p=1795</div>
<div>http://kbase.redhat.com/faq/docs/DOC-8797</div>
<div>http://www.joecen.com/article/cacti/setsnmp/</div>
<div>http://wiki.jiankongbao.com/doku.php/%E6%96%87%E6%A1%A3:%E5%AE%89%E5%85%A8%E6%8C%87%E5%BC%95#linux_snmp</div>
<div><a href="http://www.21andy.com/blog/20100204/1615.html">http://www.21andy.com/blog/20100204/1615.html</a></div>
<div><a href="http://133402.blog.51cto.com/123402/172213/">http://133402.blog.51cto.com/123402/172213/</a></div>
<div><br></div>
<div><br></div>
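<div>The idea from the postscript, counting TCP connections on the monitored machine itself and sending back only the totals, might look roughly like the Python sketch below. It assumes a Linux host where /proc/net/tcp lists IPv4 connections with the state as a hexadecimal code in the fourth column; the function names are made up for this example, and how the totals reach Cacti (for instance through a script hooked into snmpd) is left out.</div>
<pre>
```python
from collections import Counter

# A few of the state codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED",
    "06": "TIME_WAIT",
    "08": "CLOSE_WAIT",
    "0A": "LISTEN",
}


def count_tcp_states(lines):
    """Count connections per TCP state from /proc/net/tcp-style lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header row, whose first field is "sl".
        if not fields or fields[0] == "sl":
            continue
        state = fields[3].upper()
        # Unknown codes are kept as-is rather than dropped.
        counts[TCP_STATES.get(state, state)] += 1
    return counts


def local_summary(path="/proc/net/tcp"):
    """Summarize this host's IPv4 TCP connections (Linux only)."""
    with open(path) as handle:
        return count_tcp_states(handle)
```
</pre>
<div>Only a handful of small integers then has to cross the network, instead of tens of thousands of connection rows.</div>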
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		
		  ]]></description>
		</item>	
			<item>
			<title><![CDATA[Hit by an Intrusion]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2199839]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Mon, 22 Mar 2010 05:13:11 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
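				I changed some code yesterday and planned to deploy it for testing today, only to find that I could not connect to one of the servers with my key. My gut reaction was that the machine had been attacked.<br><br>Since root could not log in with the key, I tried the root password instead; that failed too.<br><br>Logging in with an ordinary user account I had set up earlier worked.<br>Once in, su - could not switch to root, confirming that the root password had been changed.<br>last showed several foreign IPs, one from the Netherlands and one from Thailand.<br><br>So the next step was to take back root.<br>Asking the data-center admin to boot into single-user mode and reset the root password would be the simple, brute-force way.<br>But I am thin-skinned: I did not want to bother the admin, nor waste a phone call……  ]]></description>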
		</item>	
			<item>
			<title><![CDATA[Thoughts on 《把时间当作朋友》 (Treat Time as a Friend)]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2185784]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Sat, 27 Feb 2010 08:54:17 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
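										During the few days at home over the Lunar New Year I read 《把时间当作朋友》 (Treat Time as a Friend) by Li Xiaolai. The e-book had been sitting on my hard disk for a long time, but I kept finding excuses not to read it, so I used the holiday to get through it. My strongest feeling afterwards was that I had stopped thinking for myself and had become hopelessly lazy.<br><br>I keep wondering whether a person's ability to think is inversely related to how deeply they specialize and how old they get. As an undergraduate I would have free-wheeling technical discussions with Cowoo; now and then one of us would slap his forehead and say that applying some technology to some problem in some field would make a great startup idea, then google it at once, find that someone had already done it, and be left sighing.……  ]]></description>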
		</item>	
			<item>
			<title><![CDATA[Implementing Asynchronous DNS Lookups]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2167536]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Fri, 02 Apr 2010 04:06:05 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
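								Whenever you open Firefox and click a URL, the status bar in the lower-left corner shows "Looking up xxx.com". Sometimes this is unbearably slow, with the lookup lasting tens of seconds. This is the DNS query, which is almost unavoidable in network applications. System APIs such as gethostbyname are synchronous, so they block the program and seriously hurt its performance.<br><br>There are several ways to speed up DNS queries:<br>1. A local DNS cache server. dnsmasq is a typical example: install a DNS caching server locally and add 127.0.0.1 to /etc/resolv.conf.<br>2. A DNS cache inside the program. This is common in network applications, squid for example.<br>……  ]]></description>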
		</item>	
			<item>
			<title><![CDATA[Implementing a Stress-Testing Tool]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2167736]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Sat, 27 Feb 2010 04:24:07 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
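				When developing network applications, we often need to stress-test server programs, either to measure the server's performance or to expose bugs. In a recent project we also built a business-specific stress-testing tool dedicated to testing our server code.<br><br>Well-known stress-testing tools include the HTTP ones: ab (apache benchmark), zb (zeus benchmark), httperf and others. Their implementations fall into roughly two kinds:<br>1. Raw sockets, as in httperf.<br>2. Non-blocking sockets plus I/O multiplexing, as in ab and zb. ab depends on apr (<em>Apache Portable Runtime</em>), while zb uses select, ……  ]]></description>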
		</item>	
			<item>
			<title><![CDATA[High Performance Network Programming]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2166499]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Mon, 15 Mar 2010 13:19:28 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
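																This slide deck had been put off for a long time. Last weekend I finally spent some time pulling it together. It may be a bit long, and some junior colleagues suggested trimming the basics at the front, but I kept them for completeness, mainly because those are exactly the problems that keep showing up in their code.<br><br>An outline of the contents:<br>C10K Problem<br>Core Framework of Network App<br>IO Model<br>TCP State Drive Machine<br>Unify Multiple Event Sources<br>Event Handling Patterns <br>&nbsp;&nbsp;&nbsp; Reactor<br>&nbsp;&nbsp;&nbsp; Proactor<br>Concurrency Patterns<br>&nbsp;&nbsp;&nbsp; Half Sync/Half As……  ]]></description>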
		</item>	
			<item>
			<title><![CDATA[Poll and Push in Internet Architecture Design]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2163327]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Sat, 30 Jan 2010 01:49:59 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
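																With the year-end approaching, I have been preparing slides for the annual report and, on the side, a deck on high-concurrency network programming. It reminded me of the interrupt and polling methods for managing peripherals in computer organization courses: a socket can be likened to a peripheral, with the socket non-blocking and all events asynchronous, of course. I then wanted to extend the analogy to Internet architecture and its two methods of data synchronization: poll and push.<br><br><font style="font-weight: bold;" size="3">Poll</font><br>Poll, also called polling, is a data-synchronization method everyone knows well: the client periodically queries the server to find out whether the data it needs is available. For example, in a software-update module the client software has to periodically……  ]]></description>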
		</item>	
			<item>
			<title><![CDATA[List of Upcoming Blog Posts]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2130836]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Mon, 15 Mar 2010 13:20:58 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
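												Lately the project has kept me busy and left little time for blogging, but I learned a lot from it and plan to write some of that up. Here is a list to start with; I will work through it over the coming weeks:<br>1. Implementing asynchronous DNS lookups<br>2. Notes on high-concurrency network programming (C100K server side and C60K client side) [see <font><b><a href="http://blog.chinaunix.net/u/12592/showart_2166499.html" target="_blank" class="list1"><font style="font-size: 10pt;" color="#2a5200"><b>High Performance Network Programming</b></font></a></b></font>]<br>3. Server stress-testing tool [DONE]……  ]]></description>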
		</item>	
			<item>
			<title><![CDATA[A Program That Captures Web Passwords]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2108128]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Tue, 01 Dec 2009 14:02:16 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
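						I was in no mood to keep coding tonight, so I wrote a little toy instead: a small password-capture program built on pypcap. It grabs the user names and passwords in POST requests as well as some cookies from GET requests. Not sure how to use a cookie? Then google it ;-) Or look at what follows; remember you have to select it with the mouse to see it: <span style="color: rgb(255, 255, 255);">curl -b</span><br><br><table style="border: 1px solid rgb(153, 153, 153); width: 80%; font-size: 12px;" align="center"><tbody><tr><td>#!/usr/bin/env python<br># -*- coding: utf8 -*-<br><br>import pcap<br>import dpkt<br><br>dev='eth0'<br>filter……  ]]></description>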
		</item>	
			<item>
			<title><![CDATA[An Unusual TCP Four-Way Handshake]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2093774]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Fri, 13 Nov 2009 10:12:11 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
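						Everyone is familiar with TCP's three-way handshake; the classic textbooks all cover it.<br><br><font style="color: rgb(0, 1, 2); font-family: 黑体; font-weight: bold;" size="4">Three-way handshake:</font><br><div align="center"><img src="http://blogimg.chinaunix.net/blog/upfile2/091113154914.png" onload="javascript:if(this.width>500)this.width=500;" border="0"></div><br>Each end has to send its initial sequence number to the peer and acknowledge the peer's. In theory this would take four segments, just like closing a TCP connection, but the receiving end combines its SYN and ACK into one segment, which yields the three-way handshake.<br><pre>    1) A --&gt; B  ……  ]]></description>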
		</item>	
			<item>
			<title><![CDATA[A Written-Test Question from EMC]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2084111]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Fri, 06 Nov 2009 05:40:26 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
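		A classmate came back from EMC's written test with one of its questions that he could not answer, and asked me. I found it interesting, so here is an analysis.<br><br>======================================<br>int main(int argc, char* argv[])<br>{<br>&nbsp;&nbsp; fork();<br>&nbsp;&nbsp; fork() &amp;&amp; fork() || fork();<br>&nbsp;&nbsp; fork();<br>}<br><br>Not counting the main process itself, how many processes are created in total?<br>======================================<br><br>To answer this, let's cheat a little and first check with a program how many processes there are.<br>int main(int argc,……  ]]></description>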
		</item>	
			<item>
			<title><![CDATA[Problems Using gethostbyname on a Host with Multiple IPs]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2081411]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Wed, 28 Oct 2009 13:04:56 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
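																In our project each server is configured with several hundred public IP addresses. The code has to obtain the host's addresses and then bind to one of them to talk to outside machines. The old code worked fine, but once the machines had multiple IPs configured, the IP it returned was always wrong.<br><br>The code:<br><table style="border: 1px solid rgb(153, 153, 153); width: 80%; font-size: 12px;" align="center"><tbody><tr><td>#include &lt;stdio.h&gt;<br>#include &lt;netdb.h&gt;<br><br>int main()<br>{<br>&nbsp;&nbsp;&nbsp; struct hostent *hptr;<br>&nbsp;&nbsp;&nbsp; char **pptr;<br>&nbsp;&nbsp;……  ]]></description>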
		</item>	
			<item>
			<title><![CDATA[Trying Out pypcap]]></title>
			<link><![CDATA[http://blog.chinaunix.net/u/12592/showart.php?id=2079182]]></link>
			<author></author>
			<guid></guid>
			<category></category>
			<pubDate>Wed, 28 Oct 2009 11:53:49 GMT</pubDate>
			<comments></comments>
			<description><![CDATA[
						I needed a small packet-capture-and-analysis program that runs on Windows. Rather than find a Windows machine and install a development environment, I decided to write it in Python; that way it is cross-platform ;-)<br><br><font style="font-weight: bold;" size="3"><span style="font-family: 黑体;">Installation</span></font><br>1. python 2.5<br>2. pypcap<br>http://code.google.com/p/pypcap/downloads/list<br>3. dpkt<br>http://code.google.com/p/dpkt/downloads/list<br>4. winpcap<br>If you use Wireshark anyway, just install Wireshark; it bundles WinPcap.<br><br><font style="font-family: 黑体; font-weight: bold;" size="3">Code</font><br>Capture every UDP packet received, check whether the payload looks like bencode, and output the source IP and source port of each bencode packet.<br><table style="border: 1px solid rgb(153, 153, 153); width: 584px; font-size: 12px; height: 566px;" align="center"><tbody><tr><td>#!/usr/bin/env python<br># -*- coding: utf8 -*-<br><br>import pcap<br>import dpkt<br><br>dev='eth0' # On Windows, fill this in from Wireshark's output: \Device\NPF_{87AF0973-017E-4479-9654-A6384FDBB030}<br>filter='ip dst 192.168.1.2 and udp'<br><br>pc=pcap.pcap(dev)<br>pc.setfilter(filter)<br><br>file = open('./dht_nodes.txt','w')<br>for ptime,pdata in pc:<br>&nbsp;&nbsp;&nbsp; ether=dpkt.ethernet.Ethernet(pdata)<br>&nbsp;&nbsp;&nbsp; #p=dpkt.ip.IP(pdata)<br>&nbsp;&nbsp;&nbsp; ip=ether.data<br>&nbsp;&nbsp;&nbsp; ip_str='%d.%d.%d.%d'%tuple(map(ord,list(ip.src)))<br><br>&nbsp;&nbsp;&nbsp; udp=ip.data<br>&nbsp;&nbsp;&nbsp; port=udp.sport<br>&nbsp;&nbsp;&nbsp; node='%s:%d'%(ip_str,port)<br>&nbsp;&nbsp;&nbsp; content_len=len(udp)-8 # payload length: UDP length minus the 8-byte header<br><br>&nbsp;&nbsp;&nbsp; # Rough check for bencode: a dictionary starts with 'd' and ends with 'e'; skip empty payloads<br>&nbsp;&nbsp;&nbsp; if content_len &gt; 0 and udp.data[0] == 'd' and udp.data[content_len-1] == 'e':<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; print node <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; file.write(node+'\n')<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; file.flush()</td></tr></tbody></table><br>Installing on Windows was not much trouble, nor on my Debian box, but on the remote Red Hat machine it was a real pain: one missing package after another. Fed up, I wrote a C version.<br><table style="border: 1px solid rgb(153, 153, 153); width: 80%; font-size: 12px;" 
align="center"><tbody><tr><td>#include&lt;stdio.h&gt;<br>
#include&lt;stdlib.h&gt;<br>
#include&lt;string.h&gt;<br>
#include&lt;sys/stat.h&gt;<br>
#include&lt;fcntl.h&gt;<br>
#include&lt;arpa/inet.h&gt;<br>
#include&lt;netinet/in.h&gt;<br>
#include&lt;sys/types.h&gt;<br>
#include&lt;unistd.h&gt;<br>
#include&lt;assert.h&gt;<br>
#include&lt;pcap.h&gt;<br><br>
#define ETHHDR_SIZE 14<br><br>
/* WORDS_BIGENDIAN is expected to come from the build system (for example autoconf); without it the little-endian bit-field order is used. */<br>
struct ip_header<br>
{<br>
#ifdef WORDS_BIGENDIAN<br>
&nbsp;&nbsp;&nbsp; u_int8_t ip_version:4, ip_header_length:4;<br>
#else<br>
&nbsp;&nbsp;&nbsp; u_int8_t ip_header_length:4, ip_version:4;<br>
#endif<br><br>
&nbsp;&nbsp;&nbsp; u_int8_t ip_tos;<br>
&nbsp;&nbsp;&nbsp; u_int16_t ip_length;<br>
&nbsp;&nbsp;&nbsp; u_int16_t ip_id;<br>
&nbsp;&nbsp;&nbsp; u_int16_t ip_off;<br>
&nbsp;&nbsp;&nbsp; u_int8_t ip_ttl;<br>
&nbsp;&nbsp;&nbsp; u_int8_t ip_protocol;<br>
&nbsp;&nbsp;&nbsp; u_int16_t ip_checksum;<br>
&nbsp;&nbsp;&nbsp; struct in_addr ip_src_address;<br>
&nbsp;&nbsp;&nbsp; struct in_addr ip_dst_address;<br>
};<br><br>
struct tcp_header<br>
{<br>
&nbsp;&nbsp;&nbsp; u_int16_t tcp_src_port;<br>
&nbsp;&nbsp;&nbsp; u_int16_t tcp_dst_port;<br>
&nbsp;&nbsp;&nbsp; u_int32_t tcp_sequence;<br>
&nbsp;&nbsp;&nbsp; u_int32_t tcp_ack;<br><br>
#ifdef WORDS_BIGENDIAN<br>
&nbsp;&nbsp;&nbsp; u_int8_t tcp_offset:4, tcp_reserved:4;<br>
#else<br>
&nbsp;&nbsp;&nbsp; u_int8_t tcp_reserved:4, tcp_offset:4;<br>
#endif<br><br>
&nbsp;&nbsp;&nbsp; u_int8_t tcp_flags;<br>
&nbsp;&nbsp;&nbsp; u_int16_t tcp_windows;<br>
&nbsp;&nbsp;&nbsp; u_int16_t tcp_checksum;<br>
&nbsp;&nbsp;&nbsp; u_int16_t tcp_urgent_pointer;<br>
};<br><br>
struct udp_header<br>
{<br>
&nbsp;&nbsp;&nbsp; u_int16_t udp_src_port;<br>
&nbsp;&nbsp;&nbsp; u_int16_t udp_dst_port;<br>
&nbsp;&nbsp;&nbsp; u_int16_t udp_len;<br>
&nbsp;&nbsp;&nbsp; u_int16_t udp_checksum;<br>
};<br><br>
FILE *fp;<br><br>
void GoDaemon(void);<br>
void process_pkt(u_char *argument,const struct 
pcap_pkthdr* packet_header,const u_char *packet_content);<br><br>
int main(int argc, char *argv[])<br>
{<br>
&nbsp;&nbsp;&nbsp; pcap_t* pcap_handle;<br>
&nbsp;&nbsp;&nbsp; char err_content[PCAP_ERRBUF_SIZE];<br>
&nbsp;&nbsp;&nbsp; char *net_interface=NULL;<br>
&nbsp;&nbsp;&nbsp; struct bpf_program bpf_filter;<br>
&nbsp;&nbsp;&nbsp; char bpf_filter_string[64]="ip dst 172.16.11.11 and udp";<br>
&nbsp;&nbsp;&nbsp; char filename[64]="./DHT_nodes.sav";<br>
&nbsp;&nbsp;&nbsp; bpf_u_int32 net_mask;<br>
&nbsp;&nbsp;&nbsp; bpf_u_int32 net_ip;<br><br>
&nbsp;&nbsp;&nbsp; if(argc==4)<br>
&nbsp;&nbsp;&nbsp; {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; net_interface = argv[1];<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; /* snprintf and strncpy keep over-long arguments from overflowing the 64-byte buffers */<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; snprintf(bpf_filter_string, sizeof(bpf_filter_string), "ip dst %s and udp", argv[2]);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; memset(filename, 0, sizeof(filename));<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; strncpy(filename, argv[3], sizeof(filename)-1);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; //filename=argv[3];<br>
&nbsp;&nbsp;&nbsp; }<br>
&nbsp;&nbsp;&nbsp; else<br>
&nbsp;&nbsp;&nbsp; {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; fprintf(stderr, "Usage: find_dhtnode INTERFACE IP FILENAME\n");<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; fprintf(stderr, "&nbsp; e.g.: find_dhtnode eth0 172.16.11.11 ./DHT_nodes.sav\n");<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; return 1;<br>
&nbsp;&nbsp;&nbsp; }<br><br>
&nbsp;&nbsp;&nbsp; /* open the output file named on the command line */<br>
&nbsp;&nbsp;&nbsp; fp = fopen(filename,"w");<br>
&nbsp;&nbsp;&nbsp; if(fp==NULL)<br>
&nbsp;&nbsp;&nbsp; {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; perror("fopen");<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; return 1;<br>
&nbsp;&nbsp;&nbsp; }<br><br>
&nbsp;&nbsp;&nbsp; assert(net_interface);<br><br>
&nbsp;&nbsp;&nbsp; //net_interface = pcap_lookupdev(err_content);<br>
&nbsp;&nbsp;&nbsp; if(pcap_lookupnet(net_interface,&amp;net_ip,&amp;net_mask,err_content) == -1)<br>
&nbsp;&nbsp;&nbsp; {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; /* not fatal: fall back to an unknown network and netmask */<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; fprintf(stderr, "pcap_lookupnet: %s\n", err_content);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; net_ip = 0;<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; net_mask = 0;<br>
&nbsp;&nbsp;&nbsp; }<br><br>
&nbsp;&nbsp;&nbsp; printf("Device: %s\n", 
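net_interface);<br>
&nbsp;&nbsp;&nbsp; printf("Filter: %s\n", bpf_filter_string);<br><br>
&nbsp;&nbsp;&nbsp; pcap_handle = pcap_open_live(net_interface,BUFSIZ,1,0,err_content);<br>
&nbsp;&nbsp;&nbsp; if(pcap_handle == NULL)<br>
&nbsp;&nbsp;&nbsp; {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; fprintf(stderr, "pcap_open_live: %s\n", err_content);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; return 1;<br>
&nbsp;&nbsp;&nbsp; }<br><br>
&nbsp;&nbsp;&nbsp; /* the last argument of pcap_compile is the netmask, not the network address */<br>
&nbsp;&nbsp;&nbsp; if(pcap_compile(pcap_handle,&amp;bpf_filter,bpf_filter_string,0,net_mask) == -1<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; || pcap_setfilter(pcap_handle,&amp;bpf_filter) == -1)<br>
&nbsp;&nbsp;&nbsp; {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; fprintf(stderr, "filter: %s\n", pcap_geterr(pcap_handle));<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; return 1;<br>
&nbsp;&nbsp;&nbsp; }<br><br>
&nbsp;&nbsp;&nbsp; if ( pcap_datalink(pcap_handle) != DLT_EN10MB )<br>
&nbsp;&nbsp;&nbsp; {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; fprintf(stderr, "%s is not an Ethernet device\n", net_interface);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; return 1;<br>
&nbsp;&nbsp;&nbsp; }<br><br>
&nbsp;&nbsp;&nbsp; GoDaemon();<br><br>
&nbsp;&nbsp;&nbsp; pcap_loop(pcap_handle,-1,process_pkt,NULL);<br><br>
&nbsp;&nbsp;&nbsp; pcap_close(pcap_handle);<br><br>
&nbsp;&nbsp;&nbsp; fclose(fp);<br>
&nbsp;&nbsp;&nbsp; return 0;<br>
}<br><br>
void process_pkt(u_char *argument,const struct pcap_pkthdr* packet_header,const u_char *packet_content)<br>
{<br>
&nbsp;&nbsp;&nbsp; struct ip_header *ip_protocol;<br>
&nbsp;&nbsp;&nbsp; struct udp_header* udp_protocol;<br>
&nbsp;&nbsp;&nbsp; unsigned char *data;<br>
&nbsp;&nbsp;&nbsp; int ip_length;<br>
&nbsp;&nbsp;&nbsp; int ip_header_length;<br>
&nbsp;&nbsp;&nbsp; int data_length;<br><br>
&nbsp;&nbsp;&nbsp; /* ignore frames too short for the Ethernet, minimal IP and UDP headers */<br>
&nbsp;&nbsp;&nbsp; if(packet_header-&gt;caplen &lt; ETHHDR_SIZE + 20 + 8)<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; return;<br><br>
&nbsp;&nbsp;&nbsp; ip_protocol = (struct ip_header *) (packet_content+ETHHDR_SIZE);<br>
&nbsp;&nbsp;&nbsp; /* ip_length must be converted from network to host byte order */<br>
&nbsp;&nbsp;&nbsp; ip_length = ntohs(ip_protocol-&gt;ip_length);<br>
&nbsp;&nbsp;&nbsp; ip_header_length = ip_protocol-&gt;ip_header_length*4;<br>
&nbsp;&nbsp;&nbsp; if(ip_header_length &lt; 20 || packet_header-&gt;caplen &lt; (bpf_u_int32)(ETHHDR_SIZE + ip_header_length + 8))<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; return;<br><br>
&nbsp;&nbsp;&nbsp; udp_protocol = (struct udp_header *)(packet_content+ETHHDR_SIZE+ip_header_length);<br>
&nbsp;&nbsp;&nbsp; data_length = ntohs(udp_protocol-&gt;udp_len) - 8;<br>
&nbsp;&nbsp;&nbsp; data = (unsigned char *)(packet_content + ETHHDR_SIZE + ip_header_length + 8);<br><br>
&nbsp;&nbsp;&nbsp; /* skip empty payloads and payloads cut off by the capture length */<br>
&nbsp;&nbsp;&nbsp; if(data_length &lt;= 0 || (bpf_u_int32)(ETHHDR_SIZE + ip_header_length + 8 + data_length) &gt; packet_header-&gt;caplen)<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; return;<br><br>
&nbsp;&nbsp;&nbsp; if(data[0] == 'd' &amp;&amp; data[data_length-1] == 'e')<br>
&nbsp;&nbsp;&nbsp; {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; /* inet_ntoa is not thread-safe */<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; fprintf(fp, 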
"%s:%d\n",inet_ntoa(ip_protocol-&gt;ip_src_address),ntohs(udp_protocol-&gt;udp_src_port));<br>&nbsp;&nbsp; &nbsp; fflush(fp);<br>&nbsp;&nbsp;&nbsp; }<br><br>}<br><br>void GoDaemon(void)<br>{<br>&nbsp;&nbsp; &nbsp;pid_t fs;<br>&nbsp;&nbsp; &nbsp;printf("Initializing daemon mode\n");<br><br>&nbsp;&nbsp; &nbsp;if (getppid() != 1)<br>&nbsp;&nbsp; &nbsp;{<br>&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;fs = fork();<br><br>&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;if (fs &gt; 0)<br>&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;{<br>&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;exit(0);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* parent */<br>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; }<br><br>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; if (fs &lt; 0)<br>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; {<br>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; perror("fork");<br>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; exit(1);<br>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; }<br>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; setsid();<br>&nbsp;&nbsp;&nbsp; }<br><br>&nbsp;&nbsp;&nbsp; chdir("/");<br><br>&nbsp;&nbsp;&nbsp; /* redirect stdin/stdout/stderr to /dev/null */<br>&nbsp;&nbsp;&nbsp; close(0);<br>&nbsp;&nbsp;&nbsp; close(1);<br>&nbsp;&nbsp;&nbsp; close(2);<br><br>&nbsp;&nbsp;&nbsp; open("/dev/null", O_RDWR);<br><br>&nbsp;&nbsp;&nbsp; dup(0);<br>&nbsp;&nbsp;&nbsp; dup(0);<br><br>&nbsp;&nbsp;&nbsp; return;<br>}</td></tr></tbody></table>
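<br><br>Both programs apply the same steps to each frame: skip the 14-byte Ethernet header, read the IP header length, locate the UDP header, and test the first and last payload bytes. As a rough sketch of that logic without pypcap or dpkt, the following uses only the Python 3 standard library (not the Python 2.5 of the script above). It assumes untagged Ethernet II frames carrying IPv4; the function names and the <code>build_test_frame</code> helper are hypothetical and exist only for illustration.<br>
<pre>
```python
import socket
import struct

ETHHDR_SIZE = 14   # Ethernet II header length, as in the C program
UDPHDR_SIZE = 8    # fixed UDP header length


def looks_like_bencoded_dict(payload):
    """Rough heuristic from the post: first byte 'd', last byte 'e'."""
    return payload[:1] == b"d" and payload[-1:] == b"e"


def dht_node_from_frame(frame):
    """Return 'ip:port' of the sender if the UDP payload passes the check, else None."""
    if len(frame) < ETHHDR_SIZE + 20 + UDPHDR_SIZE:
        return None
    # The low nibble of the first IP byte is the header length in 32-bit words.
    ip_header_length = (frame[ETHHDR_SIZE] & 0x0F) * 4
    if ip_header_length < 20 or frame[ETHHDR_SIZE + 9] != 17:  # protocol 17 is UDP
        return None
    udp_start = ETHHDR_SIZE + ip_header_length
    if len(frame) < udp_start + UDPHDR_SIZE:
        return None
    src_ip = socket.inet_ntoa(frame[ETHHDR_SIZE + 12:ETHHDR_SIZE + 16])
    src_port, _dst_port, udp_len, _checksum = struct.unpack(
        "!HHHH", frame[udp_start:udp_start + UDPHDR_SIZE])
    # udp_len covers the UDP header plus the payload.
    payload = frame[udp_start + UDPHDR_SIZE:udp_start + udp_len]
    if not looks_like_bencoded_dict(payload):
        return None
    return "%s:%d" % (src_ip, src_port)


def build_test_frame(payload, src_ip="10.0.0.1", src_port=6881):
    """Hypothetical helper: build a minimal Ethernet/IPv4/UDP frame (checksums left at 0)."""
    ether = b"\x00" * 12 + b"\x08\x00"
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + 8 + len(payload), 0, 0, 64, 17, 0,
                     socket.inet_aton(src_ip), socket.inet_aton("192.168.1.2"))
    udp = struct.pack("!HHHH", src_port, 6881, 8 + len(payload), 0)
    return ether + ip + udp + payload
```
</pre>
For a frame built with <code>build_test_frame(b"d1:y1:qe")</code>, <code>dht_node_from_frame</code> returns <code>10.0.0.1:6881</code>; a payload such as <code>b"hello"</code> yields <code>None</code>. The first-and-last-byte test is only a coarse filter, so other UDP traffic that happens to start with 'd' and end with 'e' passes it as well.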
		
		  ]]></description>
		</item>	
			</channel>
	</rss>
