Reposting is welcome; please credit the source: http://forever.blog.chinaunix.net
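Recently I was testing network performance with the netperf toolkit and ran into a strange problem, as shown in the figure below:
The test does eventually produce a result, but it also prints a large number of "enable_enobufs failed" errors. So I decided to track down the cause.
First, here is the netperf code that prints this message: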
#if defined(__linux)
/*
 * Linux has this odd behavior where if the socket buffers are larger
 * than a device's txqueuelen, the kernel will silently drop transmits
 * which would not fit into the tx queue, and not pass an ENOBUFS
 * error back to the application. As a result, a UDP stream test can
 * report absurd transmit bandwidths (like 20Gb/s on a 1GbE NIC).
 * This behavior can be avoided if you request extended error
 * reporting on the socket. This is done by setting the IP_RECVERR
 * socket option at the IP level.
 */
static void
enable_enobufs(int s)
{
    struct protoent *pr;
    int on = 1;

    if ((pr = getprotobyname("ip")) == NULL) {
        fprintf(where, "%s failed: getprotobyname\n",__FUNCTION__);
        fflush(where);
        return;
    }
    if (setsockopt(s, pr->p_proto, IP_RECVERR, (char *)&on, sizeof(on)) < 0) {
        fprintf(where, "%s failed: setsockopt\n",__FUNCTION__);
        fflush(where);
        return;
    }
}
#endif
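The message comes from the failure branch of the setsockopt() call in the code above. To avoid recompiling netperf, I decided to use SystemTap to find out why this call comes back from the kernel with a negative return value.
For IP-level options, setsockopt() ends up in the kernel function ip_setsockopt, so I wrote the following stap script to locate the problem: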
/*******************************************************************************

  Copyright(c) 2008-2013

  This program is free software; you can redistribute it and/or modify it
  under the terms and conditions of the GNU General Public License,
  version 2, as published by the Free Software Foundation.

  This program is distributed in the hope it will be useful, but WITHOUT
  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
  more details.

  You should have received a copy of the GNU General Public License along with
  this program; if not, write to the Free Software Foundation, Inc.,
  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.

  The full GNU General Public License is included in this distribution in
  the file called "COPYING".

  Date: 2013-09-11 15:59:41 CST

  Contact Information:
  Tony <tingw.liu@gmail.com>
  Qingdao, China.
*******************************************************************************/

probe kernel.function("ip_setsockopt").return
{
    printf("ip_setsockopt protocol=%d optname=%d return %d\n", $level, $optname, $return);
}
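Running it produces the output shown below:
OK, that pins down the cause: the protocol level that netperf passes to setsockopt() is wrong.
Now back to the netperf code. The protocol value comes from getprotobyname(), and that function reads /etc/protocols. Here is what the file looks like on SUSE: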
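The same failure can also be reproduced from userspace, without SystemTap. What follows is only a minimal sketch and not part of netperf: the helper name try_recverr is hypothetical and a Linux host is assumed. It tries to set IP_RECVERR on a socket at a caller-supplied level and returns the errno value on failure.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper (not netperf code): try to enable IP_RECVERR on
 * socket s at the given protocol level.
 * Returns 0 on success, or the errno value reported by setsockopt(). */
int try_recverr(int s, int level)
{
    int on = 1;

    if (setsockopt(s, level, IP_RECVERR, &on, sizeof(on)) < 0)
        return errno;
    return 0;
}
```

On a UDP socket, try_recverr(s, IPPROTO_IP) passes level 0 and should succeed. Passing any other level, for example 4, should fail on Linux, which I would expect to show up as ENOPROTOOPT.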
# See also: protocols(5), http://www.sethwklein.net/projects/iana-etc/
#
#
# PROTOCOL NUMBERS
#
# (last updated 28 March 2006)
#
# In the Internet Protocol version 4 (IPv4) [RFC791] there is a field,
# called "Protocol", to identify the next level protocol. This is an 8
# bit field. In Internet Protocol version 6 (IPv6) [RFC1883] this field
# is called the "Next Header" field.
#
# Assigned Internet Protocol Numbers
#
# Decimal Keyword Protocol References
# ------- ------- -------- ----------
# protocol  num aliases     # comments
hopopt      0   HOPOPT      # IPv6 Hop-by-Hop Option [RFC1883]
icmp        1   ICMP        # Internet Control Message [RFC792]
igmp        2   IGMP        # Internet Group Management [RFC1112]
ggp         3   GGP         # Gateway-to-Gateway [RFC823]
ip          4   IP          # IP in IP (encapsulation) [RFC2003]
st          5   ST          # Stream [RFC1190,RFC1819]
tcp         6   TCP         # Transmission Control [RFC793]
cbt         7   CBT         # CBT [Ballardie]
egp         8   EGP         # Exterior Gateway Protocol [RFC888,DLM1]
igp         9   IGP         # any private interior gateway [IANA]
                            # (used by Cisco for their IGRP)
bbn-rcc-mon 10  BBN-RCC-MON # BBN RCC Monitoring [SGC]
nvp-ii      11  NVP-II      # Network Voice Protocol [RFC741,SC3]
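Look at the ip entry (ip 4 IP). Is SUSE kidding? It has mixed up ip and ipencap. Protocol number 4 is IP-in-IP encapsulation, whereas IP_RECVERR has to be set at level 0 (IPPROTO_IP).
The corrected /etc/protocols file looks roughly like this: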
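A quick way to confirm what netperf actually gets is to ask the resolver directly. The following is a minimal sketch, assuming a getprotobyname() that consults /etc/protocols as described above; the function name lookup_proto_number is hypothetical. It returns the number registered for a protocol name, so the result can be compared with IPPROTO_IP (0).

```c
#include <netdb.h>
#include <stddef.h>

/* Hypothetical helper: return the protocol number that the system's
 * protocols database maps `name` to, or -1 if the name is unknown. */
int lookup_proto_number(const char *name)
{
    struct protoent *pr = getprotobyname(name);

    return pr != NULL ? pr->p_proto : -1;
}
```

With the SUSE file shown above, lookup_proto_number("ip") would be expected to return 4 instead of 0.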
# See also: protocols(5), http://www.sethwklein.net/projects/iana-etc/
#
#
# PROTOCOL NUMBERS
#
# (last updated 28 March 2006)
#
# In the Internet Protocol version 4 (IPv4) [RFC791] there is a field,
# called "Protocol", to identify the next level protocol. This is an 8
# bit field. In Internet Protocol version 6 (IPv6) [RFC1883] this field
# is called the "Next Header" field.
#
# Assigned Internet Protocol Numbers
#
# Decimal Keyword Protocol References
# ------- ------- -------- ----------
# protocol  num aliases     # comments
ip          0   IP          # internet protocol, pseudo protocol number
hopopt      0   HOPOPT      # IPv6 Hop-by-Hop Option [RFC1883]
icmp        1   ICMP        # Internet Control Message [RFC792]
igmp        2   IGMP        # Internet Group Management [RFC1112]
ggp         3   GGP         # Gateway-to-Gateway [RFC823]
ipencap     4   IP-ENCAP    # IP in IP (encapsulation) [RFC2003]
st          5   ST          # Stream [RFC1190,RFC1819]
tcp         6   TCP         # Transmission Control [RFC793]
cbt         7   CBT         # CBT [Ballardie]
egp         8   EGP         # Exterior Gateway Protocol [RFC888,DLM1]
igp         9   IGP         # any private interior gateway [IANA]
                            # (used by Cisco for their IGRP)
bbn-rcc-mon 10  BBN-RCC-MON # BBN RCC Monitoring [SGC]
nvp-ii      11  NVP-II      # Network Voice Protocol [RFC741,SC3]
pup         12  PUP         # PUP [PUP,XEROX]
argus       13  ARGUS       # ARGUS [RWS4]
emcon       14  EMCON       # EMCON [BN7]
xnet        15  XNET        # Cross Net Debugger [IEN158,JFH2]
chaos       16  CHAOS       # Chaos [NC3]
udp         17  UDP         # User Datagram [RFC768,JBP]
mux         18  MUX         # Multiplexing [IEN90,JBP]
dcn-meas    19  DCN-MEAS    # DCN Measurement Subsystems [DLM1]
hmp         20  HMP         # Host Monitoring [RFC869,RH6]
After the fix I ran netperf again, and the world was quiet at last...
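Editing /etc/protocols fixes the lookup for every program on the host, but a program can also avoid depending on that file altogether. As a sketch only (this is not netperf's actual code, and the name enable_enobufs_fixed_level and its FILE * parameter are hypothetical), the function could pass the IPPROTO_IP constant from <netinet/in.h> instead of the looked-up value, and print the errno text when the call fails:

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Sketch of a variant of enable_enobufs() that does not consult
 * /etc/protocols. Returns 0 on success, -1 on failure. */
int enable_enobufs_fixed_level(int s, FILE *where)
{
    int on = 1;

    if (setsockopt(s, IPPROTO_IP, IP_RECVERR, &on, sizeof(on)) < 0) {
        /* Report why the kernel rejected the option, not just that it did. */
        fprintf(where, "%s failed: setsockopt: %s\n", __func__, strerror(errno));
        fflush(where);
        return -1;
    }
    return 0;
}
```

Including strerror(errno) in the message would have pointed at the wrong protocol level without needing a kernel probe.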