Category: LINUX

2013-09-11 17:37:23

Reposting is welcome; please credit the source: http://forever.blog.chinaunix.net

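I recently used the netperf suite to test network performance and ran into a strange problem, shown below:

[Screenshot: netperf output with repeated "enable_enobufs failed" errors]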
The test did produce a result in the end, but only after printing a large number of "enable_enobufs failed" errors. So I decided to track down the cause.

First, here is the netperf code that prints this message:


    #if defined(__linux)
    /*
     * Linux has this odd behavior where if the socket buffers are larger
     * than a device's txqueuelen, the kernel will silently drop transmits
     * which would not fit into the tx queue, and not pass an ENOBUFS
     * error back to the application. As a result, a UDP stream test can
     * report absurd transmit bandwidths (like 20Gb/s on a 1GbE NIC).
     * This behavior can be avoided if you request extended error
     * reporting on the socket. This is done by setting the IP_RECVERR
     * socket option at the IP level.
     */
    static void
    enable_enobufs(int s)
    {
      struct protoent *pr;
      int on = 1;

      if ((pr = getprotobyname("ip")) == NULL) {
        fprintf(where, "%s failed: getprotobyname\n",__FUNCTION__);
        fflush(where);
        return;
      }
      if (setsockopt(s, pr->p_proto, IP_RECVERR, (char *)&on, sizeof(on)) < 0) {
        fprintf(where, "%s failed: setsockopt\n",__FUNCTION__);
        fflush(where);
        return;
      }
    }
    #endif

The error is printed by the setsockopt() failure branch in the code above. To avoid recompiling netperf, I decided to use SystemTap to find out why the kernel returns a value less than 0 for this call.

On an IPv4 socket, this setsockopt() call ends up in the kernel function ip_setsockopt, so I wrote the following stap script to locate the problem:


    /*******************************************************************************

      Copyright(c) 2008-2013

      This program is free software; you can redistribute it and/or modify it
      under the terms and conditions of the GNU General Public License,
      version 2, as published by the Free Software Foundation.

      This program is distributed in the hope it will be useful, but WITHOUT
      ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
      FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
      more details.

      You should have received a copy of the GNU General Public License along with
      this program; if not, write to the Free Software Foundation, Inc.,
      51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.

      The full GNU General Public License is included in this distribution in
      the file called "COPYING".

      Date: 2013-09-11 15:59:41 CST

      Contact Information:
      Tony <tingw.liu@gmail.com>
      Qingdao, China.
    *******************************************************************************/

    probe kernel.function("ip_setsockopt").return
    {
            printf("ip_setsockopt protocol=%d optname=%d return %d\n", $level, $optname, $return);
    }
Running it produced the following output:

[Screenshot: stap script output]

OK, that pinpoints the cause: the protocol level passed to setsockopt is wrong.
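
The failure can also be reproduced outside netperf. The following is only a minimal sketch, not netperf code: try_recverr and report_recverr are hypothetical helper names, and the sketch assumes a Linux system whose netinet/in.h exposes IP_RECVERR, IPPROTO_IP (0) and IPPROTO_IPIP (4).

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of netperf): try to enable IP_RECVERR on a
 * fresh UDP socket using the given level. Returns 0 on success, otherwise
 * the errno value of the call that failed. */
int try_recverr(int level)
{
    int on = 1;
    int err = 0;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return errno;
    if (setsockopt(s, level, IP_RECVERR, &on, sizeof(on)) < 0)
        err = errno;   /* save errno before close() can overwrite it */
    close(s);
    return err;
}

/* Hypothetical helper: print the outcome for one level. */
void report_recverr(const char *label, int level)
{
    int err = try_recverr(level);

    printf("%-12s level=%d -> %s\n", label, level,
           err ? strerror(err) : "ok");
}
```

Calling report_recverr("IPPROTO_IP", IPPROTO_IP) and report_recverr("IPPROTO_IPIP", IPPROTO_IPIP) from a small main() shows the two cases side by side. On a Linux kernel I would expect the first to succeed and the second to be rejected, since level 4 is not the IP level.
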
Now back to the netperf code: the proto value comes from getprotobyname(), and that function reads the /etc/protocols file. Here is the content of that file on SUSE:


    # See also: protocols(5), http://www.sethwklein.net/projects/iana-etc/
    #
    #
    # PROTOCOL NUMBERS
    #
    # (last updated 28 March 2006)
    #
    # In the Internet Protocol version 4 (IPv4) [RFC791] there is a field,
    # called "Protocol", to identify the next level protocol. This is an 8
    # bit field. In Internet Protocol version 6 (IPv6) [RFC1883] this field
    # is called the "Next Header" field.
    #
    # Assigned Internet Protocol Numbers
    #
    # Decimal Keyword Protocol References
    # ------- ------- -------- ----------
    # protocol num aliases # comments
    hopopt 0 HOPOPT # IPv6 Hop-by-Hop Option [RFC1883]
    icmp 1 ICMP # Internet Control Message [RFC792]
    igmp 2 IGMP # Internet Group Management [RFC1112]
    ggp 3 GGP # Gateway-to-Gateway [RFC823]
    ip 4 IP # IP in IP (encapsulation) [RFC2003]
    st 5 ST # Stream [RFC1190,RFC1819]
    tcp 6 TCP # Transmission Control [RFC793]
    cbt 7 CBT # CBT [Ballardie]
    egp 8 EGP # Exterior Gateway Protocol [RFC888,DLM1]
    igp 9 IGP # any private interior gateway [IANA]
    # (used by Cisco for their IGRP)
    bbn-rcc-mon 10 BBN-RCC-MON # BBN RCC Monitoring [SGC]
    nvp-ii 11 NVP-II # Network Voice Protocol [RFC741,SC3]

Look at the "ip" entry. Is SUSE kidding? It has mixed up ip and ipencap: the name ip is assigned number 4, which belongs to IP-in-IP encapsulation, so getprotobyname("ip") returns 4 instead of 0.
After the fix, /etc/protocols looks roughly like this:


    # See also: protocols(5), http://www.sethwklein.net/projects/iana-etc/
    #
    #
    # PROTOCOL NUMBERS
    #
    # (last updated 28 March 2006)
    #
    # In the Internet Protocol version 4 (IPv4) [RFC791] there is a field,
    # called "Protocol", to identify the next level protocol. This is an 8
    # bit field. In Internet Protocol version 6 (IPv6) [RFC1883] this field
    # is called the "Next Header" field.
    #
    # Assigned Internet Protocol Numbers
    #
    # Decimal Keyword Protocol References
    # ------- ------- -------- ----------
    # protocol num aliases # comments
    ip 0 IP # internet protocol, pseudo protocol number
    hopopt 0 HOPOPT # IPv6 Hop-by-Hop Option [RFC1883]
    icmp 1 ICMP # Internet Control Message [RFC792]
    igmp 2 IGMP # Internet Group Management [RFC1112]
    ggp 3 GGP # Gateway-to-Gateway [RFC823]
    ipencap 4 IP-ENCAP # IP in IP (encapsulation) [RFC2003]
    st 5 ST # Stream [RFC1190,RFC1819]
    tcp 6 TCP # Transmission Control [RFC793]
    cbt 7 CBT # CBT [Ballardie]
    egp 8 EGP # Exterior Gateway Protocol [RFC888,DLM1]
    igp 9 IGP # any private interior gateway [IANA]
    # (used by Cisco for their IGRP)
    bbn-rcc-mon 10 BBN-RCC-MON # BBN RCC Monitoring [SGC]
    nvp-ii 11 NVP-II # Network Voice Protocol [RFC741,SC3]
    pup 12 PUP # PUP [PUP,XEROX]
    argus 13 ARGUS # ARGUS [RWS4]
    emcon 14 EMCON # EMCON [BN7]
    xnet 15 XNET # Cross Net Debugger [IEN158,JFH2]
    chaos 16 CHAOS # Chaos [NC3]
    udp 17 UDP # User Datagram [RFC768,JBP]
    mux 18 MUX # Multiplexing [IEN90,JBP]
    dcn-meas 19 DCN-MEAS # DCN Measurement Subsystems [DLM1]
    hmp 20 HMP # Host Monitoring [RFC869,RH6]
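
To make the effect of the two files concrete, here is a simplified sketch of a by-name lookup over protocols(5)-style text. It is not glibc's implementation: lookup_proto is a hypothetical function, aliases are ignored, and the two excerpts are shortened copies of the entries shown above.

```c
#include <stdio.h>
#include <string.h>

/* Shortened excerpts of the two /etc/protocols variants discussed above. */
const char SUSE_EXCERPT[] =
    "# protocol num aliases # comments\n"
    "hopopt 0 HOPOPT # IPv6 Hop-by-Hop Option [RFC1883]\n"
    "ip 4 IP # IP in IP (encapsulation) [RFC2003]\n";

const char FIXED_EXCERPT[] =
    "# protocol num aliases # comments\n"
    "ip 0 IP # internet protocol, pseudo protocol number\n"
    "hopopt 0 HOPOPT # IPv6 Hop-by-Hop Option [RFC1883]\n"
    "ipencap 4 IP-ENCAP # IP in IP (encapsulation) [RFC2003]\n";

/* Hypothetical, simplified lookup: return the number of the first entry
 * whose first field equals name, or -1 if there is no such entry. */
int lookup_proto(const char *text, const char *name)
{
    const char *line = text;

    while (*line != '\0') {
        size_t len = strcspn(line, "\n");
        char buf[256];
        char entry[64];
        int number;
        size_t n = len < sizeof(buf) - 1 ? len : sizeof(buf) - 1;

        /* Copy one line so sscanf cannot read into the next one. */
        memcpy(buf, line, n);
        buf[n] = '\0';

        /* Comment lines yield "#" as the first token, which never
         * equals a protocol name. */
        if (sscanf(buf, "%63s %d", entry, &number) == 2 &&
            strcmp(entry, name) == 0)
            return number;

        line += len;
        if (*line == '\n')
            line++;
    }
    return -1;
}
```

With these excerpts, looking up "ip" in SUSE_EXCERPT gives 4, while the same lookup in FIXED_EXCERPT gives 0, which is the value setsockopt() needs as the IP level.
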

After the fix I ran netperf again, and the errors were gone.
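
A quick way to confirm the repaired entry without running netperf is to compare what getprotobyname() returns with the IPPROTO_IP constant. This is only a sketch; ip_entry_matches is a hypothetical helper, not something netperf provides.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>

/* Hypothetical helper: compare the "ip" entry returned by getprotobyname()
 * with the IPPROTO_IP constant.
 * Returns 1 if they match, 0 if they differ, -1 if there is no "ip" entry. */
int ip_entry_matches(void)
{
    struct protoent *pr = getprotobyname("ip");

    if (pr == NULL)
        return -1;
    return pr->p_proto == IPPROTO_IP ? 1 : 0;
}
```

A return value of 0 would indicate the same kind of mismatch as in the original SUSE file. Code that needs the IP level for setsockopt() can also pass IPPROTO_IP directly and so avoid depending on the contents of /etc/protocols.
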

Comment from bollobas (2014-01-15 17:58:03):

Very good!