2013-04-22 13:30:19
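When a system's memory utilization stays high for a long time, we need to analyze how the memory is being used. High memory utilization drives up paging space usage. When free paging space runs low, processes can be lost, and if paging space is exhausted the system goes down. So when physical memory utilization is high and paging space usage keeps growing, the cause must be analyzed immediately.
First, we need to be clear about how memory is classified.
Based on the memory-related fields in the topas output, memory breaks down as follows: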
PAGING                 MEMORY
Faults      2228       Real,MB     6144
Steals         0       % Comp      20.7
PgspIn         0       % Noncomp   11.3
PgspOut        0       % Client     0.9
PageIn         0
PageOut        0       PAGING SPACE
Sios           0       Size,MB     8448
                       % Used       0.6
                       % Free      99.3
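A. Physical memory: the MEMORY section
A1. % Comp: Reports real memory allocated to computational pages. Computational page frames are generally those that are backed by paging space. This is the so-called computational memory. When it runs short, it is replenished mainly by paging out to paging space, so this segment is the direct cause of paging space growth.
A2. % Noncomp: Reports real memory allocated to non-computational pages. Non-computational page frames are generally those that are backed by file space, either data files, executable files, or shared library files. This is the so-called non-computational memory, used mainly to hold file data. When it runs short, its pages are exchanged directly with the files on disk.
A3. % Client: Reports on the amount of memory that is currently used to cache remotely mounted files. This segment serves as the cache for remotely mounted file systems. Note: when the remote file system is very large and the operating system sets no limit, this segment can take too large a share of memory and squeeze out the physical memory available to A1 and A2, which makes paging space grow steadily. Watch for this in particular.
A2 and A3 are collectively called file memory.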
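To make the percentages concrete, here is a minimal sketch that converts the MEMORY figures from the sample topas screen into megabytes. The function name memory_breakdown_mb is made up for this illustration; the numbers are copied from the sample above.

```python
def memory_breakdown_mb(real_mb, percents):
    """Convert topas MEMORY percentages into megabytes of real memory."""
    return {name: real_mb * pct / 100.0 for name, pct in percents.items()}


# Values copied from the sample topas screen (Real,MB = 6144).
sample = {"Comp": 20.7, "Noncomp": 11.3, "Client": 0.9}
breakdown = memory_breakdown_mb(6144, sample)

for name, mb in breakdown.items():
    print(f"{name:8s} {mb:8.1f} MB")
```

On this sample, computational memory comes to roughly 1.2 GB of the 6 GB installed, so the machine is under no memory pressure.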
B. Virtual memory: AIX manages it through paging space devices. It corresponds to the PAGING SPACE section above.
The fields in the PAGING section mean the following:
Faults: Reports the number of faults.
Steals: Reports the number of 4 KB pages of memory stolen by the Virtual Memory Manager per second.
PgspIn: Reports the number of 4 KB pages read in from the paging space per second.
PgspOut: Reports the number of 4 KB pages written to the paging space per second.
PageIn: Reports the number of 4 KB pages read per second.
PageOut: Reports the number of 4 KB pages written per second.
Sios: Reports the number of input/output requests per second issued by the Virtual Memory Manager.
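The topas output has already been explained above.
Next, run vmstat -v.
This command returns VMM statistics. It shows in detail how physical memory is allocated, together with the corresponding operating system threshold parameters: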
1572864 memory pages
1504963 lruable pages
1086639 free pages
0 memory pools
135736 pinned pages
80.0 maxpin percentage
20.0 minperm percentage
80.0 maxperm percentage
11.2 numperm percentage
169926 file pages
0.0 compressed percentage
0 compressed pages
0.4 numclient percentage
80.0 maxclient percentage
6803 client pages
0 remote pageouts scheduled
0 pending disk I/Os blocked with no pbuf
0 paging space I/Os blocked with no psbuf
10721 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
0 external pager filesystem I/Os blocked with no fsbuf
The documentation explains the fields as follows:
memory pages: Size of real memory in number of 4 KB pages.
lruable pages: Number of 4 KB pages considered for replacement. This number excludes the pages used for VMM internal pages and the pages used for the pinned part of the kernel text.
free pages: Number of free 4 KB pages.
memory pools: Tuning parameter (managed using vmo) specifying the number of pools.
pinned pages: Number of pinned 4 KB pages.
maxpin percentage: Tuning parameter (managed using vmo) specifying the percentage of real memory that can be pinned.
file pages: Number of 4 KB pages currently used by the file cache.
numperm percentage: Percentage of memory currently used by the file cache.
minperm percentage: Tuning parameter (managed using vmo) in percentage of real memory. This specifies the point below which file pages are protected from the re-page algorithm.
maxperm percentage: Tuning parameter (managed using vmo) in percentage of real memory. This specifies the point above which the page stealing algorithm steals only file pages.
client pages: Number of client pages.
numclient percentage: Percentage of memory occupied by client pages.
maxclient percentage: Tuning parameter (managed using vmo) specifying the maximum percentage of memory that can be used for client pages.
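Comparing this with the topas output, note the following points:
1) The share of file pages corresponds to % Noncomp,
the share of client pages corresponds to % Client,
and % Comp has no direct counterpart.
2) pinned pages: memory that stays resident and cannot be paged out to paging space or to disk.
3) The command also shows the operating system parameters that control how much of physical memory file pages and client pages may occupy.
maxpin percentage: the maximum share of physical memory that pinned pages may occupy.
minperm percentage and maxperm percentage act as thresholds on the share of memory held by file pages. When file pages fall below minperm percentage, they are protected from the re-page algorithm. When file pages rise above maxperm percentage, the page stealing algorithm steals only file pages and leaves computational pages alone, so it stops pushing pages out to paging space.
maxclient percentage sets the maximum share of memory that client pages may occupy, so it can be used to limit the cache that the operating system reserves for remotely mounted file systems.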
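The relationship between the page counts and the percentages can be checked with a short script. The following is only a sketch: it parses a trimmed copy of the sample vmstat -v output, and the helper names parse_vmstat_v and pages_to_mb are invented here. The 4 KB page size comes from the documentation quoted above.

```python
import re

PAGE_KB = 4  # vmstat -v counts 4 KB pages, per the documentation quoted above

# Trimmed copy of the sample vmstat -v output.
SAMPLE = """\
1572864 memory pages
1504963 lruable pages
1086639 free pages
135736 pinned pages
20.0 minperm percentage
80.0 maxperm percentage
11.2 numperm percentage
169926 file pages
0.4 numclient percentage
80.0 maxclient percentage
6803 client pages
"""


def parse_vmstat_v(text):
    """Map each 'value label' line of vmstat -v output to {label: number}."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*([\d.]+)\s+(.+?)\s*$", line)
        if match:
            stats[match.group(2)] = float(match.group(1))
    return stats


def pages_to_mb(pages):
    """Convert a count of 4 KB pages to megabytes."""
    return pages * PAGE_KB / 1024.0


stats = parse_vmstat_v(SAMPLE)
file_share = 100.0 * stats["file pages"] / stats["lruable pages"]
client_share = 100.0 * stats["client pages"] / stats["lruable pages"]

print(f"real memory : {pages_to_mb(stats['memory pages']):.0f} MB")
print(f"file pages  : {file_share:.1f}% of lruable pages")
print(f"client pages: {client_share:.1f}% of lruable pages")
```

In this sample, dividing by lruable pages reproduces the reported numperm and numclient figures more closely than dividing by memory pages does. Treat that as an observation about this one sample rather than a documented rule.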
vmstat -v does not show the % Comp figure. If you need it, svmon -G provides it. Its output looks like this:
root% /$svmon -G
size inuse free pin virtual
memory 1572864 486247 1086617 135736 309924
pg space 2162688 3357
work pers clnt
pin 135736 0 0
in use 309924 169520 6803
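Matching this against the two commands above,
work, pers, and clnt correspond to Comp, Noncomp, and Client respectively.
svmon also shows how pin (memory that cannot be paged out) and in use (all physical memory in use) are distributed across these three types.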
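As a quick sanity check on this mapping, the sketch below parses the sample svmon -G output and expresses work, pers, and clnt as shares of real memory. The column alignment in SAMPLE is reconstructed, the parser name parse_svmon_g is made up, and the layout is assumed to match the sample shown here; other AIX levels may print additional columns or sections.

```python
# Sample svmon -G output from above, with column alignment reconstructed.
SAMPLE = """\
               size      inuse       free        pin    virtual
memory      1572864     486247    1086617     135736     309924
pg space    2162688       3357

               work       pers       clnt
pin          135736          0          0
in use       309924     169520       6803
"""


def parse_svmon_g(text):
    """Return (memory, in_use) dicts keyed by the svmon -G column names."""
    memory, in_use, columns = {}, {}, []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("size", "work"):
            columns = fields  # remember the most recent header row
        elif fields[0] == "memory":
            memory = dict(zip(columns, map(int, fields[1:])))
        elif fields[:2] == ["in", "use"]:
            in_use = dict(zip(columns, map(int, fields[2:])))
    return memory, in_use


memory, in_use = parse_svmon_g(SAMPLE)
for kind, pages in in_use.items():
    share = 100.0 * pages / memory["size"]
    print(f"{kind}: {pages} pages, {share:.1f}% of real memory")
```

In the sample, work + pers + clnt adds up exactly to the inuse column of the memory row, which confirms that the three types together account for all memory in use.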
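With this breakdown we can get a first idea of where the memory is going. If client takes too much, we can conclude right away that too much cache is being allocated for remotely mounted file systems. If work or pers is too high, the usage has to be broken down further, and svmon's rich set of options lets us do that.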
# svmon -Pu -t 10|grep -p Pid|grep '^.*[0-9]'
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1589282 db2dasstm 37414 7725 0 35102 Y Y N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1560828 db2fmp 36871 7725 0 36781 Y Y N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1511470 db2sysc 35356 7717 0 35329 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1404992 db2sysc 35218 7717 0 35179 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1519756 db2sysc 35125 7717 0 35098 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1392824 db2sysc 35064 7717 0 35037 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1609756 db2sysc 34833 7717 0 34806 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1564908 db2sysc 34762 7717 0 34735 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1437728 db2fmp 34734 7722 0 34298 Y Y N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1421384 db2sysc 34624 7717 0 34585 Y N N
# svmon -Pg -t 10 |grep -p Pid|grep '^.*[0-9]'
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
0 swapper 12066 7704 0 12066 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1048576 aioserver 12064 7703 0 12064 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1 init 17599 7698 0 17576 N N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
876716 aioserver 12062 7703 0 12062 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1052674 aioserver 12064 7703 0 12064 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
880814 aioserver 12062 7703 0 12062 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
8196 wait 12060 7704 0 12060 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
1056774 dtlogin 17620 7698 0 17578 N N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
884912 aioserver 12062 7703 0 12062 Y N N
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
root% /$svmon -Uu -t 3|grep -p User
===============================================================================
User Inuse Pin Pgsp Virtual
root 70244 10499 0 57433
===============================================================================
User Inuse Pin Pgsp Virtual
testinst 67279 8239 0 64869
===============================================================================
User Inuse Pin Pgsp Virtual
gsjinst 44970 8030 0 44504
HQHFDB1:/ # svmon -Ug -t 3|grep -p User
===============================================================================
User Inuse Pin Pgsp Virtual LPageCap
db2inst1 1192556 8685 313850 1183044 -
===============================================================================
User Inuse Pin Pgsp Virtual LPageCap
root 33914 10413 19756 51603 -
===============================================================================
User Inuse Pin Pgsp Virtual LPageCap
db2fenc1 19116 4604 3519 23934 -
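The per-user report is easier to read once the page counts are converted to megabytes. The following sketch ranks the users from the second sample above by paging space usage. It assumes svmon is reporting 4 KB pages (no other unit was requested on the command line), and parse_svmon_users is a name invented for this illustration.

```python
PAGE_KB = 4  # assumption: svmon is reporting in 4 KB pages

# Second svmon -U sample from above, with column alignment reconstructed.
SAMPLE = """\
===============================================================================
User        Inuse   Pin    Pgsp  Virtual LPageCap
db2inst1  1192556  8685  313850  1183044 -
===============================================================================
User        Inuse   Pin    Pgsp  Virtual LPageCap
root        33914 10413   19756    51603 -
===============================================================================
User        Inuse   Pin    Pgsp  Virtual LPageCap
db2fenc1    19116  4604    3519    23934 -
"""


def parse_svmon_users(text):
    """Return one dict per user row, with the page counts as integers."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("="):
            continue  # skip blank lines and separator rules
        if fields[0] == "User":
            header = fields
            continue
        if header:
            row = dict(zip(header, fields))
            for key in ("Inuse", "Pin", "Pgsp", "Virtual"):
                row[key] = int(row[key])
            rows.append(row)
    return rows


rows = parse_svmon_users(SAMPLE)
ranked = sorted(rows, key=lambda r: r["Pgsp"], reverse=True)
for row in ranked:
    pgsp_mb = row["Pgsp"] * PAGE_KB / 1024.0
    print(f"{row['User']:10s} {pgsp_mb:8.1f} MB in paging space")
```

In this sample db2inst1 holds by far the most paging space, which is what points the investigation at DB2.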
The db2mtrk command reports the memory used by DB2.
1. db2mtrk -i
Query by instance. The result is as follows:
Tracking Memory on: 2010/03/10 at 12:47:27
Memory for instance
monh other
320.0K 2.7M
2. db2mtrk -d
Query by database. The result is as follows:
Tracking Memory on: 2010/03/10 at 12:46:22
Memory for database: TESTDB
utilh pckcacheh catcacheh bph (2) bph (1) bph (S32K) bph (S16K)
64.0K 192.0K 64.0K 31.6M 4.2M 704.0K 448.0K
bph (S8K) bph (S4K) shsorth lockh dbh other
320.0K 256.0K 0 640.0K 4.4M 192.0K
3. db2mtrk -p
Query by agent. The result is as follows:
testinst% ~$db2mtrk -p
Tracking Memory on: 2010/03/18 at 14:39:09
Memory for agent 1601556
other apph appctlh
64.0K 64.0K 64.0K
Memory for agent 1519756
other apph appctlh
64.0K 64.0K 64.0K
Memory for agent 1401018
other apph appctlh
64.0K 64.0K 64.0K
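db2mtrk prints each heap with a K or M suffix, which makes totals hard to read at a glance. As a rough sketch, the script below adds up the heaps from the db2mtrk -d sample for TESTDB. It assumes the suffixes are binary multiples (K = 1024 bytes), and the helper name to_bytes is made up here.

```python
# Assumption: db2mtrk suffixes are binary multiples.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(size):
    """Convert a db2mtrk-style size such as '31.6M' or '0' to bytes."""
    if size[-1] in UNITS:
        return int(round(float(size[:-1]) * UNITS[size[-1]]))
    return int(float(size))


# Heap sizes copied from the db2mtrk -d sample for TESTDB above.
testdb = {
    "utilh": "64.0K", "pckcacheh": "192.0K", "catcacheh": "64.0K",
    "bph (2)": "31.6M", "bph (1)": "4.2M", "bph (S32K)": "704.0K",
    "bph (S16K)": "448.0K", "bph (S8K)": "320.0K", "bph (S4K)": "256.0K",
    "shsorth": "0", "lockh": "640.0K", "dbh": "4.4M", "other": "192.0K",
}

total = sum(to_bytes(v) for v in testdb.values())
bufferpools = sum(to_bytes(v) for k, v in testdb.items() if k.startswith("bph"))

print(f"total       : {total / 1024 ** 2:.1f} MB")
print(f"buffer pools: {bufferpools / 1024 ** 2:.1f} MB")
```

For this sample the buffer pool heaps (the bph entries) make up most of the database memory, so they are the first place to look if DB2's footprint needs to shrink.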
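In summary, when an AIX system has a memory problem, we can use topas, svmon -G, and vmstat -v for an overall view, then svmon -P and svmon -U to break the usage down by process and by user. If DB2 is involved, db2mtrk shows the memory that DB2 is using, which leads to the final conclusion.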