AIX Performance Monitoring Series: CPU Bottleneck Analysis - raybinbin - ChinaUnix Blog

AIX Performance Monitoring Series: CPU Bottleneck Analysis

Category: System Operations

2011-05-19 11:26:45

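CPU Bottlenecks

This post gives a brief introduction to using the vmstat, tprof, and netpmon commands to check whether a system has a CPU bottleneck.

1. vmstat

Run the command: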

# vmstat 1 10

P650A:/#vmstat 1 10

System configuration: lcpu=16 mem=15744MB

kthr  memory        page              faults        cpu
----- ------------- ----------------- ------------- -----------
 r  b     avm   fre re pi po fr sr cy  in   sy   cs us sy id wa
 0  0 3208684 10343  0  0  0  0  0  0  19 1447  290  0  0 99  0
 0  0 3208686 10341  0  0  0  0  0  0   2 1268  248  0  0 99  0
 0  0 3208686 10341  0  0  0  0  0  0   1 1265  246  0  0 99  0
 0  0 3208687 10340  0  0  0  0  0  0   3 1260  254  0  0 99  0
 0  0 3208687 10340  0  0  0  0  0  0   1 1320  264  0  0 99  0
 0  0 3208687 10337  0  0  0  0  0  0  24 4145  321  0  3 97  0
 0  0 3208687 10337  0  0  0  0  0  0   9 1438  313  0  0 99  0
 0  0 3208687 10334  0  3  0  0  0  0  40 2348 1110  0  0 99  0
 0  0 3208687 10334  0  0  0  0  0  0   1 1323  257  0  0 99  0
 0  0 3208687 10334  0  0  0  0  0  0   5 1251  242  0  0 99  0

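Note: when threads are waiting in the run queue, the system slows down.

id: percentage of time the CPU was idle with no outstanding I/O requests.

wa: percentage of time the CPU was idle while waiting for I/O to complete.

r: number of threads in the run queue.

If id and wa both stay close to 0, the CPU is busy. In the sample above, id stays at 97 to 99 and wa at 0, so this system is almost entirely idle.

Next, look at the r field (the number of threads in the run queue).

The more threads waiting in the run queue, the greater the impact on system performance.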
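
The check described above can be scripted. The following is a minimal sketch, assuming the 17-column layout of the sample output (r is field 1, id is field 16, wa is field 17). The function name cpu_busy_samples and the IDLE_MAX threshold are hypothetical names chosen for illustration, not part of AIX.

```shell
# Flag CPU-bound samples in vmstat output read from standard input.
# Assumes the 17-column AIX layout shown above:
#   r b avm fre re pi po fr sr cy in sy cs us sy id wa
# IDLE_MAX is a hypothetical threshold (default 10), not an AIX tunable.
cpu_busy_samples() {
    awk -v idle_max="${IDLE_MAX:-10}" '
        # Data rows have exactly 17 fields and start with a number;
        # this skips the banner, the header, and the ruler lines.
        NF == 17 && $1 ~ /^[0-9]+$/ {
            total++
            r = $1; id = $16; wa = $17
            # id and wa both near 0 means the CPU had almost no idle time.
            if (id + wa <= idle_max) {
                busy++
                printf "busy: r=%d id=%d wa=%d\n", r, id, wa
            }
        }
        END { printf "%d of %d samples CPU-bound\n", busy, total }
    '
}

# Typical use on the monitored host:
#   vmstat 1 10 | cpu_busy_samples
```

For the sample output above, no row would be flagged, because id never drops below 97. When rows are flagged, comparing the printed r value with the lcpu count from the "System configuration" line gives a rough idea of how many threads are queued per logical CPU.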

2. tprof

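The tprof command reports CPU usage for each process.

Run the following command as root to find out how much CPU time each process consumes:

# tprof -x sleep 30

This command runs for 30 seconds and creates a file named sleep.prof in the current directory (the report is named after the profiled command). During those 30 seconds the CPU is sampled roughly 3000 times; the actual count varies by system, and the sample report below shows Total Samples = 8113.

The Total field in the report is the number of samples in which the process was found on a CPU. If the Total value for a process is 1500, the process was running in 1500 of the 3000 samples, which means it used about half of the CPU time. In the AIX 5.3 report shown below, Total is printed as a percentage of all samples rather than as a raw count.

The tprof output shows precisely which processes are using CPU time.

Example: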

P650A:/#tprof -x sleep 30
Mon May 16 16:08:54 2011
System: AIX 5.3 Node: P650A Machine: 00C3EE9E4C00
Starting Command sleep 30
stopping trace collection.
Generating sleep.prof

P650A:/#more sleep.prof
Configuration information
=========================
System: AIX 5.3 Node: P650A Machine: 00C3EE9E4C00
Tprof command was:
tprof -x sleep 30
Trace command was:
/usr/bin/trace -ad -M -L 283203993 -T 500000 -j 000,00A,001,002,003,38F,005,006,134,139,5A2,5A5,465,234, -o -
Total Samples = 8113
Traced Time = 30.01s (out of a total execution time of 30.01s)

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Process              Freq  Total Kernel   User Shared  Other
=======              ====  ===== ======   ==== ======  =====
wait                   16  99.68  99.68   0.00   0.00   0.00
/usr/bin/ps             4   0.18   0.18   0.00   0.00   0.00
swapper                 1   0.11   0.11   0.00   0.00   0.00
/usr/sbin/syncd         1   0.01   0.01   0.00   0.00   0.00
ora_mrp0_standby        1   0.01   0.00   0.01   0.00   0.00
=======              ====  ===== ======   ==== ======  =====
Total                  23 100.00  99.99   0.01   0.00   0.00

Process                  PID      TID  Total Kernel   User Shared  Other
=======                  ===      ===  ===== ======   ==== ======  =====
wait                   65568    90157  18.80  18.80   0.00   0.00   0.00
wait                    8196     8197  15.88  15.88   0.00   0.00   0.00
wait                     296      309   7.25   7.25   0.00   0.00   0.00
wait                   33080    33093   7.08   7.08   0.00   0.00   0.00
wait                   73764    98353   7.06   7.06   0.00   0.00   0.00
wait                   24884    24897   6.64   6.64   0.00   0.00   0.00
wait                   16688    16701   3.70   3.70   0.00   0.00   0.00
wait                   69666    94255   3.70   3.70   0.00   0.00   0.00
wait                   37178    37191   3.70   3.70   0.00   0.00   0.00
wait                   77862   102451   3.70   3.70   0.00   0.00   0.00
wait                   12590    12603   3.70   3.70   0.00   0.00   0.00
wait                   57372    77863   3.70   3.70   0.00   0.00   0.00
wait                   28982    28995   3.70   3.70   0.00   0.00   0.00
wait                   20786    20799   3.70   3.70   0.00   0.00   0.00
wait                   53274    73765   3.70   3.70   0.00   0.00   0.00
wait                   61470    86059   3.70   3.70   0.00   0.00   0.00
swapper                    0        3   0.11   0.11   0.00   0.00   0.00
/usr/bin/ps          9646118 10195121   0.07   0.07   0.00   0.00   0.00
/usr/bin/ps          9580580 10076383   0.05   0.05   0.00   0.00   0.00
/usr/bin/ps          9646122 10195125   0.05   0.05   0.00   0.00   0.00
ora_mrp0_standby      364696   860273   0.01   0.00   0.01   0.00   0.00
/usr/sbin/syncd       242164   442539   0.01   0.01   0.00   0.00   0.00
/usr/bin/ps          9580582 10076385   0.01   0.01   0.00   0.00   0.00
=======                  ===      ===  ===== ======   ==== ======  =====
Total                                 100.00  99.99   0.01   0.00   0.00
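
In a report like this one, the wait (idle) threads dominate, so it helps to filter them out and total what is left. The following is a rough sketch, assuming the AIX 5.3 summary table layout shown above (a header row beginning "Process Freq", Total as the fifth column from the right, process names without spaces). The function name tprof_busy_share is hypothetical.

```shell
# Print the Total share of every non-idle process from the per-process
# summary table of a tprof report (for example sleep.prof) on standard input.
# Assumes the AIX 5.3 layout shown above, where Total is a percentage.
tprof_busy_share() {
    awk '
        # The summary table starts at the "Process Freq ..." header.
        $1 == "Process" && $2 == "Freq" { in_summary = 1; next }
        # Skip the ruler lines made of "=" characters.
        in_summary && $1 ~ /^=+$/       { next }
        # The "Total" row closes the table.
        in_summary && $1 == "Total"     { in_summary = 0 }
        # Every remaining row except the idle "wait" process is real work.
        in_summary && NF >= 7 && $1 != "wait" {
            busy += $(NF - 4)
            printf "%-20s %6.2f\n", $1, $(NF - 4)
        }
        END { printf "%-20s %6.2f\n", "non-wait total", busy }
    '
}

# Typical use after running "tprof -x sleep 30":
#   tprof_busy_share < sleep.prof
```

If the report used raw sample counts instead of percentages, the same sum divided by the Total Samples value would give the share, as in the 1500 out of 3000 example above.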

3. netpmon

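The netpmon command monitors network-related I/O and CPU usage.

Run the following command as root to find out how much CPU time each process uses and how much of that time is spent in network-related code:

# netpmon -o /tmp/netpmon.out -O cpu -v; sleep 30; trcstop

This command sequence traces for 30 seconds and writes its report to /tmp/netpmon.out. In the report, CPU Time is the total CPU time used by the process, CPU % is that time as a percentage, and Network CPU % is the percentage of CPU used by the network-related code in the process.

Example: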

P650A:/#netpmon -o /tmp/netpmon.out -O cpu -v; sleep 30; trcstop
Mon May 16 16:13:36 2011
System: AIX 5.3 Node: P650A Machine: 00C3EE9E4C00
/usr/bin/trace -ad -L 283203993 -T 1000000 -j 000,00A,001,002,003,38F,005,006,106,10C,4B0,210,139,134,135,100,200,102,103,101,104,465,467,46A,419,256,255,262,26A,26B,32D,32E,2A7,2A8,351,352,320,321,30A,30B,330,331,334,335,2C3,2C4,2A4,2A5,2E6,2E7,2DA,2DB,2EA,2EB,473,474,470,471,252,216,211, -o -
Run trcstop command to signal end of trace.

P650A:/#more /tmp/netpmon.out

Process CPU Usage Statistics:
-----------------------------
                                                Network
Process                   PID  CPU Time   CPU %   CPU %
----------------------------------------------------------
netpmon               7995686   27.3816   5.876   0.003
xmwlm                  360786   15.0879   3.238   0.000
UNKNOWN               9637940   14.7336   3.162   0.000
aioserver             1536082    6.0583   1.300   0.000
ora_p010_standby       401556    0.6207   0.133   0.000
sched                    4394    0.2329   0.050   0.000
ps                    1511926    0.0678   0.015   0.000
ps                    9625748    0.0631   0.014   0.000
oraclestandby          405822    0.0468   0.010   0.000
dtgreet                233958    0.0338   0.007   0.000
syncd                  242164    0.0315   0.007   0.000
swapper                     0    0.0269   0.006   0.000
sh                    9666742    0.0155   0.003   0.000
ora_mrp0_standby       364696    0.0130   0.003   0.000
wrapper-aix-ppc-32    7827780    0.0107   0.002   0.000
ora_dbw0_standby       373114    0.0095   0.002   0.000
java                  9584824    0.0085   0.002   0.000
ora_p001_standby       376966    0.0069   0.001   0.000
init                        1    0.0058   0.001   0.000
ora_dbw1_standby       328002    0.0047   0.001   0.000
grep                  9666746    0.0047   0.001   0.000
ora_p005_standby       389262    0.0046   0.001   0.000
dsmrecalld             254234    0.0041   0.001   0.000
ora_p004_standby       385164    0.0041   0.001   0.000
gil                     45374    0.0039   0.001   0.001
ora_p002_standby       381064    0.0036   0.001   0.000
ora_p014_standby       413852    0.0036   0.001   0.000
ora_p009_standby       368772    0.0033   0.001   0.000
oraclestandby         7852356    0.0031   0.001   0.000
ora_pmon_standby       241746    0.0030   0.001   0.000
grep                  9625744    0.0027   0.001   0.000
ora_p003_standby       422282    0.0026   0.001   0.000
ora_p000_standby       372868    0.0025   0.001   0.000
ora_ckpt_standby       381286    0.0022   0.000   0.000
ora_p006_standby       393360    0.0021   0.000   0.000
aioserver             8032318    0.0020   0.000   0.000
----------------------------------------------------------
Total (all processes)           64.5909  13.862   0.004
Idle time                      431.2581  92.550

========================================================================

First Level Interrupt Handler CPU Usage Statistics:
---------------------------------------------------
                                                Network
FLIH                           CPU Time   CPU %   CPU %
----------------------------------------------------------
data page fault                  0.2858   0.061   0.000
PPC decrementer                  0.1983   0.043   0.000
external device                  0.1106   0.024   0.000
queued interrupt                 0.0045   0.001   0.000
instruction page fault           0.0001   0.000   0.000
----------------------------------------------------------
Total (all FLIHs)                0.5993   0.129   0.000

========================================================================

Second Level Interrupt Handler CPU Usage Statistics:
----------------------------------------------------
                                                Network
SLIH                           CPU Time   CPU %   CPU %
----------------------------------------------------------
goentdd64                        0.0080   0.002   0.002
sisraid_dd64                     0.0063   0.001   0.000
----------------------------------------------------------
Total (all SLIHs)                0.0143   0.003   0.002

========================================================================

Detailed Second Level Interrupt Handler CPU Usage Statistics:
-------------------------------------------------------------
SLIH: goentdd64
cpu time (msec): avg 0.072 min 0.003 max 0.199 sdev 0.057

SLIH: sisraid_dd64
cpu time (msec): avg 0.007 min 0.005 max 0.032 sdev 0.002

COMBINED (All SLIHs)
cpu time (msec): avg 0.015 min 0.003 max 0.199 sdev 0.028

P650A:/#
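
Because the process table in a netpmon report can be long, a small filter makes the heaviest CPU consumers easier to spot. The following is a minimal sketch, assuming the report layout shown above (five whitespace-separated fields per process row: name, PID, CPU Time, CPU %, Network CPU %). The function name netpmon_top_cpu and the MIN_CPU_PCT threshold are hypothetical.

```shell
# List processes at or above a CPU % threshold from the "Process CPU Usage
# Statistics" table of a netpmon report read from standard input.
# MIN_CPU_PCT is a hypothetical threshold (default 1.0) for illustration.
netpmon_top_cpu() {
    awk -v min_pct="${MIN_CPU_PCT:-1.0}" '
        # Start reading at the process table heading.
        /Process CPU Usage Statistics:/ { in_table = 1; next }
        # The "Total (all processes)" row ends the table.
        in_table && $1 == "Total"       { in_table = 0 }
        # Data rows: name, PID, CPU Time, CPU %, Network CPU %
        in_table && NF == 5 && $2 ~ /^[0-9]+$/ && $4 + 0 >= min_pct + 0 {
            printf "%s pid=%s cpu=%.3f%% net=%.3f%%\n", $1, $2, $4, $5
        }
    '
}

# Typical use after the trace has been stopped with trcstop:
#   netpmon_top_cpu < /tmp/netpmon.out
```

With the default threshold, the report above would yield four rows: netpmon, xmwlm, UNKNOWN, and the first aioserver entry. Note that netpmon itself is the largest consumer in this sample, so tracing overhead should be kept in mind when reading the numbers.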

Reposted from: http://yunlongzheng.blog.51cto.com/788996/566538
