4 Linux Commands To View Page Faults Statistics

How do I view minor and major page faults statistics for a process under Linux operating systems?

You can use page fault statistics to improve Linux server performance. Make sure you optimize your daemons / programs to reduce the number of page faults. Once the number of page faults goes down, the performance of the daemons and of the entire Linux operating system will go up.

Linux (like most Unix-like systems) uses virtual memory that is mapped into a physical address space. The Linux kernel manages this mapping as and when required, using an "on demand" technique. A page fault occurs when a process accesses a page that is mapped in its virtual address space but not loaded in physical memory. In most cases, page faults are not errors. They are used to increase the amount of memory available to programs in Linux and Unix-like operating systems that use virtual memory. Virtual memory is a memory management technique, used by Linux and many other modern operating systems, that combines active RAM and inactive memory on the disk drive (hard disk / SSD) to form a large range of contiguous addresses.

Tutorial details
Difficulty	Intermediate
Root privileges	Yes
Requirements	ps, top, sar

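A major fault occurs when disk access is required. For example, start an app called Firefox. The Linux kernel looks for the requested page in physical memory (including the page cache). If the page is not there, the kernel must read it from disk, and this is counted as a major page fault.
A minor fault occurs when the page can be made available without disk access, for example when the kernel only has to allocate a new page or map one that is already in memory.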

You can use standard Linux commands such as ps, top, time, and sar to view page faults for all processes or for a specific process.

Example: ps command

Use the ps command to view page faults for PID #1, enter:

  ps -o min_flt,maj_flt 1

Sample outputs:

 MINFL  MAJFL
  3104     36

Where,

min_flt : Number of minor page faults.
maj_flt : Number of major page faults.
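
On Linux, ps reads these counters from the /proc file system. As a minimal sketch (assuming the /proc/PID/stat layout described in the proc(5) man page, where minflt is field 10 and majflt is field 12; the page_faults function name is my own), the same numbers can be read directly:

```shell
#!/bin/sh
# page_faults PID: print the minor and major fault counters of one process.
# Hypothetical helper; it relies on the Linux /proc/PID/stat field order.
page_faults() {
    pid=${1:-$$}
    stat=$(cat "/proc/$pid/stat") || return 1
    # Drop "pid (comm) " so that spaces inside the command name cannot
    # shift the field positions.
    rest=${stat##*) }
    # rest now starts at field 3 (state), so original field 10 (minflt)
    # is $8 and original field 12 (majflt) is ${10}.
    set -- $rest
    echo "minflt=$8 majflt=${10}"
}

# Show the counters for the current shell.
page_faults "$$"
```

The values are cumulative since the process started, so run the function twice and subtract if you want the faults for a given interval.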

To see other details for PID 1 such as the user, group, command, and its arguments, enter:
# ps -o min_flt,maj_flt,cmd,args,uid,gid 1
Sample outputs:

 MINFL  MAJFL CMD                         COMMAND                       UID   GID
  3104     36 /sbin/init                  /sbin/init                      0     0

To see every process on the system:
# ps -eo min_flt,maj_flt,cmd,args,uid,gid | less
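
On a busy system the full listing is long, so it can help to sort it. The following is a minimal sketch, assuming a procps-style ps that accepts the maj_flt and min_flt format names used above; the top_faulters function name is hypothetical:

```shell
#!/bin/sh
# top_faulters [N]: list the N processes with the most major faults.
# Columns printed: major faults, minor faults, PID, command name.
# An empty header (name=) suppresses the header line so that sort
# only sees data rows.
top_faulters() {
    ps -e -o maj_flt= -o min_flt= -o pid= -o comm= |
        sort -k1,1nr -k2,2nr |
        head -n "${1:-10}"
}

# Show the five processes with the most major faults.
top_faulters 5
```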

Example: top command

Type the following (you can also use atop and htop):
# top
OR start the top command with a delay time interval:
# top -d 1
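Type F to open the sort menu and type u to sort by faults. Finally, hit the [Enter] key. These keys apply to older procps versions of top. In newer procps-ng versions, type f, move the cursor to the nMaj or nMin field, type s to make it the sort field, and type q to return.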

Example: sar command

The sar command (part of the sysstat package) reports system activity, including paging activity. Type the following command:
# sar -B
# sar -B 1 10
Sample outputs:

 
Linux 2.6.32-279.el6.x86_64 (server1.cyberciti.biz)   Monday 05 November 2012   _x86_64_   (8 CPU)

12:46:48 CST  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
12:46:49 CST      0.00    460.61     68.69      0.00    452.53      0.00      0.00      0.00      0.00
12:46:50 CST      0.00    276.00    170.00      0.00    642.00      0.00      0.00      0.00      0.00
12:46:51 CST      0.00    460.00     47.00      0.00    550.00      0.00      0.00      0.00      0.00
12:46:52 CST      0.00    228.00     49.00      0.00    705.00      0.00      0.00      0.00      0.00
12:46:53 CST      0.00    320.00    146.00      0.00    420.00      0.00      0.00      0.00      0.00
12:46:54 CST      0.00    164.00     69.00      0.00    479.00      0.00      0.00      0.00      0.00
12:46:55 CST      0.00    501.01   1144.44      0.00    991.92      0.00      0.00      0.00      0.00
12:46:56 CST      0.00    220.00     65.00      0.00    503.00      0.00      0.00      0.00      0.00
12:46:57 CST      0.00    280.00    156.00      0.00    514.00      0.00      0.00      0.00      0.00
12:46:58 CST      0.00    160.00    941.00      0.00    949.00      0.00      0.00      0.00      0.00
Average:          0.00    306.61    284.97      0.00    620.44      0.00      0.00      0.00      0.00

From the sar man page:

-B : Report paging statistics. Some of the metrics below are available only with post 2.5 kernels. The following values are displayed:

pgpgin/s : Total number of kilobytes the system paged in from disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes).
pgpgout/s : Total number of kilobytes the system paged out to disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes).
fault/s : Number of page faults (major + minor) made by the system per second. This is not a count of page faults that generate I/O, because some page faults can be resolved without I/O.
majflt/s : Number of major faults the system has made per second, those which have required loading a memory page from disk.
pgfree/s : Number of pages placed on the free list by the system per second.
pgscank/s : Number of pages scanned by the kswapd daemon per second.
pgscand/s : Number of pages scanned directly per second.
pgsteal/s : Number of pages the system has reclaimed from cache (pagecache and swapcache) per second to satisfy its memory demands.
%vmeff : Calculated as pgsteal / pgscan, this is a metric of the efficiency of page reclaim. If it is near 100% then almost every page coming off the tail of the inactive list is being reaped. If it gets too low (e.g. less than 30%) then the virtual memory is having some difficulty. This field is displayed as zero if no pages have been scanned during the interval of time.
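
If sar is not installed, the fault/s and majflt/s columns can be approximated by hand. This is a rough sketch, assuming a Linux kernel that exposes the cumulative pgfault and pgmajfault counters in /proc/vmstat (to my knowledge the same counters sar uses on modern kernels); the vmstat_counter and fault_rates names are hypothetical:

```shell
#!/bin/sh
# vmstat_counter NAME: print the value of one counter from /proc/vmstat.
vmstat_counter() {
    awk -v key="$1" '$1 == key { print $2 }' /proc/vmstat
}

# fault_rates SECONDS: sample the counters twice, SECONDS apart, and
# print system-wide per-second rates (integer division, so rounded down).
fault_rates() {
    secs=${1:-1}
    f1=$(vmstat_counter pgfault)
    m1=$(vmstat_counter pgmajfault)
    sleep "$secs"
    f2=$(vmstat_counter pgfault)
    m2=$(vmstat_counter pgmajfault)
    echo "fault/s=$(( (f2 - f1) / secs )) majflt/s=$(( (m2 - m1) / secs ))"
}

# One sample over a one second interval.
fault_rates 1
```

Like sar -B, this reports system-wide rates, not per-process figures.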

Example: time command

Use the /usr/bin/time command (not the shell's built-in time keyword) to run a program and summarize its system resource usage, including page faults. First, find the path to the real time command:
# type -a time
Sample outputs:

time is a shell keyword
time is /usr/bin/time

Now, type the following command to see the page faults for the ls command:
$ /usr/bin/time -v ls /etc/resolv.conf
Sample outputs:

/etc/resolv.conf
	Command being timed: "ls /etc/resolv.conf"
	User time (seconds): 0.00
	System time (seconds): 0.00
	Percent of CPU this job got: 0%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 3456
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 280
	Voluntary context switches: 1
	Involuntary context switches: 3
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

In this example, I run the xclock program two times (note the page fault lines in the output):

$ /usr/bin/time -v xclock
Major (requiring I/O) page faults: 4
Minor (reclaiming a frame) page faults: 1083
$ /usr/bin/time -v xclock
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1087

The first time xclock starts, there are a few major faults because some of its pages have to be read from disk. The second time xclock starts, the Linux kernel does not report any major faults because those pages are already in memory.
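
The same first run versus second run comparison can be repeated for any command. Here is a minimal sketch, assuming GNU time is installed as /usr/bin/time (its -f option accepts %F for major faults and %R for minor faults); the fault_summary function name is hypothetical, and ls /etc simply stands in for xclock:

```shell
#!/bin/sh
# fault_summary CMD [ARGS...]: run CMD under GNU time and print only its
# page fault counts. GNU time writes its report to stderr, so stderr is
# sent to the pipe and the command's own stdout is discarded.
fault_summary() {
    /usr/bin/time -f 'major=%F minor=%R' "$@" 2>&1 >/dev/null | tail -n 1
}

# Run the same command twice and compare the two lines.
fault_summary ls /etc
fault_summary ls /etc
```

If the files involved are already cached, both runs will typically show zero major faults.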

If you find a large number of page faults for a specific process, try the following suggestions to improve the situation:

Optimize the server process.
Reduce the memory used by the process by tweaking parameters in configuration files such as php.ini, httpd.conf, or lighttpd.conf.
Add more RAM to the system.
Use a better page replacement algorithm that can reduce the incidence of page faults.