衡量I/O闲忙程度的指标
下面是一些衡量I/O闲忙程度的经用指标:
磁盘利用率(disk utilization)
磁盘队列长度(disk queue length)
磁头/逻辑卷的读/写速率(read/write rates per spindle/logical volume)
原始I/O(raw I/O):主要用于数据库
交换队列的长度(swap queue length)
缓存命中率(buffer cache hit ratio)
网络文件系统和无盘工作站速率(NFS and diskless rates(server))
I/O资源成为系统性能的瓶颈的征兆
当I/O成为瓶颈时,会出现下面这些典型的症状:
过高的磁盘利用率(high disk utilization)
太长的磁盘等待队列(large disk queue length)
等待磁盘I/O的时间所占的百分率太高(large percentage of time waiting for disk I/O)
太高的物理I/O速率:large physical I/O rate(not sufficient in itself)
过低的缓存命中率(low buffer cache hit ratio(not sufficient in itself))
太长的运行进程队列,但CPU却空闲(large run queue with idle CPU)
哪些活动是占用I/O资源的大户?
下面是一些占用大量I/O资源的活动:
换页(paging):paging不仅会引起内存问题,还可能引起磁盘问题;
open,creat,and stat system calls:系统调用会引起大量的磁盘I/O;
multiuser I/O and random I/O
relational database
core dumps
利用iostat分析I/O的利用率
iostat - report I/O statistics
iostat iteratively reports I/O statistics for each active disk on the system.
If two or more disks are present, data is presented on successive lines for each disk.
With the advent of new disk technologies, such as data striping, where a single data transfer is spread across several disks, the number of milliseconds per average seek becomes impossible to compute accurately. At best it is only an approximation, varying greatly, based on several dynamic system conditions. For this reason and to maintain backward compatibility, the milliseconds per average seek ( msps ) field is set to the value 1.0.
它的语法为:
iostat [-t] [interval [count]]
其选项的含义为:
-t:Report terminal statistics as well as disk statistics.
interval: Display successive lines which are summaries of the last interval seconds. The first line reported is for the time since a reboot and each subsequent line is for the last interval only.
count: Repeat the statistics count times.
对结果的分析:
通过查看bps列和sps列的值我们可以知道哪些磁盘比较忙,哪些磁盘比较闲。
利用SAR命令分析磁盘活动
通过命令sar -d,我们可以分析系统中的每个磁盘和磁带的活动情况。
Report activity for each block device, e.g., disk or tape drive. One line is printed for each device that had activity during the last interval. If no devices were active, a blank line is printed.Each line contains the following data:
device:设备名;
%busy: Portion of time device was busy servicing a request; statistics.
avque: Average number of requests outstanding for the device;
r+w/s: Number of data transfers per second (read and writes) from and to the device;
blks/s: Number of bytes transferred (in 512-byte units) from and to the device;
avwait: Average time (in milliseconds) that transfer requests waited idly on queue for the device;
avserv: Average time (in milliseconds) to service each transfer request (includes seek, rotational latency, and data transfer times) for the device.
对结果的分析:
如果某个磁盘的%busy列的值大于50%,则说明该磁盘可能存在瓶颈;
如果某个磁盘的avwait珍的值大于avserv列的值,也说明该磁盘可能存在瓶颈;
利用SAR命令分析缓冲区的活动
通过命令sar -b,我们可以分析系统中的缓冲区的活动情况。
Report activity for each block device, e.g., disk or tape drive. One line is printed for each device that had activity during the last interval. If no devices were active, a blank line is printed.Each line contains the following data:
bread/s Number of physical reads per second from the disk (or other block devices) to the buffer cache;
bwrit/s: Number of physical writes per second from the buffer cache to the disk (or other block device);
lread/s: Number of reads per second from buffer cache;
lwrit/s: Number of writes per second to buffer cache;
%rcache: Buffer cache hit ratio for read requests e.g., 1 - bread/lread;
%wcache: Buffer cache hit ratio for write requests e.g., 1 - bwrit/lwrit;
pread/s: Number of reads per second from character device using the physio() (raw I/O) mechanism;
pwrit/s: Number of writes per second to character device using the physio() (i.e., raw I/O ) mechanism; mechanism.
对结果的分析:
如果%rcache列的值小于90%,并且%wcache列的值不在70-70%之间,我们必须观察系统中什么应用在做什么样的读/写操作,我们是否需要增加缓冲欧的大小。
利用SAR命令分析交换区的活动
通过命令sar -w,我们可以分析系统中的交换区的活动情况。
Report activity for each block device, e.g., disk or tape drive. One line is printed for each device that had activity during the last interval. If no devices were active, a blank line is printed.Each line contains the following data:
swpin/s: Number of process swapins per second;
swpot/s: Number of process swapouts per second;
bswin/s: Number of 512-byte units transferred for swapins per second;
bswot/s: Number of 512-byte units transferred for swapouts per second;
pswch/s: Number of process context switches per second.
对结果的分析:
如果swpin/s的值大于零,那么swpot的值必须引起注意;
同时必须注意pswch/s的值,如果很大,说明进程切换频繁。
阅读(4488) | 评论(0) | 转发(2) |