
Category: Virtualization

2012-09-22 08:42:52

The swappiness value

The default swappiness value is 60. The system accepts values between zero and 100. When you set the swappiness value to zero on Intel Nehalem systems, in most cases the virtual memory manager removes page cache and buffer cache rather than swapping out program memory. In KVM environments, program memory likely consists of a large amount of memory that the guest operating system uses.

Intel Nehalem systems use an extended page table (EPT). EPT does not set an access bit for pages that the guest operating systems use. Therefore, the Linux virtual memory manager cannot use the least recently used (LRU) algorithm to accurately determine which pages are not needed. In other words, the LRU algorithm cannot accurately determine which of the pages that back guest operating system memory are the best candidates to swap out. In this situation, the best performance option is to avoid swapping for as long as possible by setting the swappiness value to zero.

Systems with Advanced Micro Devices (AMD) processors and nested page table (NPT) technology do not have this issue.

echo 0 > /proc/sys/vm/swappiness

or:

Edit the /etc/sysctl.conf file by adding the following information:
vm.swappiness=0

Swapping size

The swap partition is used for swapping underused memory out to the hard drive so that physical memory stays available for active pages. The default size of the swap partition is calculated from the amount of RAM and the overcommit ratio. If you intend to overcommit memory with KVM, it is recommended to make your swap partition larger. A recommended overcommit ratio is 50% (0.5). The formula used is:

(0.5 * RAM) + (overcommit ratio * RAM) = Recommended swap size
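Plugging numbers into the formula makes it concrete. The RAM size below is an illustrative assumption, not a value read from the running system:

```shell
# Sketch: recommended swap size for a KVM host, per the formula above.
# RAM_MB and the 50% overcommit ratio are assumed example figures.
RAM_MB=16384                                     # 16 GB of RAM
OVERCOMMIT_PCT=50                                # 0.5 overcommit ratio
BASE=$(( RAM_MB / 2 ))                           # 0.5 * RAM
OVERCOMMIT=$(( RAM_MB * OVERCOMMIT_PCT / 100 ))  # overcommit ratio * RAM
echo "Recommended swap: $(( BASE + OVERCOMMIT )) MB"
# -> Recommended swap: 16384 MB
```

With a 50% overcommit ratio the recommended swap simply equals the RAM size; a higher ratio grows it proportionally.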

dirty_ratio 

Contains, as a percentage of total system memory, 
the number of pages at which a process which is generating 
disk writes will itself start writing out dirty data.

Lower the amount of unwritten write cache to reduce lags when a huge write is required.

echo 10 > /proc/sys/vm/dirty_ratio
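As a rough sketch of what the ratio means in absolute terms (the 4 GB total below is an assumed example figure, not a reading from this system):

```shell
# Sketch: absolute dirty-data threshold implied by dirty_ratio=10.
# TOTAL_KB is an assumed 4 GB machine, purely for illustration.
TOTAL_KB=4194304
DIRTY_RATIO=10
echo "Writers start flushing at $(( TOTAL_KB * DIRTY_RATIO / 100 )) kB of dirty data"
# -> Writers start flushing at 419430 kB of dirty data
```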

dirty_background_ratio 

Contains, as a percentage of total system memory, the number of pages 
at which the pdflush background writeback daemon will start writing 
out dirty data.

echo 4 > /proc/sys/vm/dirty_background_ratio

min_free_kbytes

This is used to force the Linux VM to keep a minimum number of kilobytes 
free. The VM uses this number to compute a pages_min value for each 
lowmem zone in the system. Each lowmem zone gets a number of reserved 
free pages based proportionally on its size.

Increase minimum free memory; in theory this should make the kernel less 
likely to run out of memory suddenly.

echo 4096 > /proc/sys/vm/min_free_kbytes
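The proportional split described above can be sketched in arithmetic; the zone sizes here are invented example figures, not values from a real `/proc/zoneinfo`:

```shell
# Sketch: min_free_kbytes is divided among lowmem zones in proportion
# to their size. ZONE_*_KB are assumed example zone sizes.
MIN_FREE_KB=4096
ZONE_DMA_KB=16384
ZONE_NORMAL_KB=884736
TOTAL_KB=$(( ZONE_DMA_KB + ZONE_NORMAL_KB ))
echo "DMA reserve:    $(( MIN_FREE_KB * ZONE_DMA_KB    / TOTAL_KB )) kB"
echo "Normal reserve: $(( MIN_FREE_KB * ZONE_NORMAL_KB / TOTAL_KB )) kB"
```

The larger zone absorbs almost all of the reserve, which is why raising min_free_kbytes mainly pads the Normal zone on typical systems.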

dirty_expire_centisecs

This tunable is used to define when dirty data is old enough to be 
eligible for writeout by the pdflush daemons. It is expressed in 100'ths 
of a second. Data which has been dirty in memory for longer than this 
interval will be written out next time a pdflush daemon wakes up.

echo 200 > /proc/sys/vm/dirty_expire_centisecs

dirty_writeback_centisecs 

The pdflush writeback daemons will periodically wake up and write "old" 
data out to disk. This tunable expresses the interval between those 
wakeups, in 100'ths of a second.

Setting this to zero disables periodic writeback altogether.

echo 1000 > /proc/sys/vm/dirty_writeback_centisecs

laptop_mode

When laptop mode is enabled, Linux tries to be smart about when to do 
disk I/O, so the disk can spend as much time as possible in a low 
power state. 
Laptop mode is disabled when the value is zero (0); to enable it, use 
a non-zero value such as 5.

echo 0 > /proc/sys/vm/laptop_mode

vfs_cache_pressure

Controls the tendency of the kernel to reclaim the memory which is used for
caching of directory and inode objects.

At the default value of vfs_cache_pressure = 100 the kernel will attempt
to reclaim dentries and inodes at a "fair" rate with respect to pagecache 
and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel 
to prefer to retain dentry and inode caches. Increasing vfs_cache_pressure 
beyond 100 causes the kernel to prefer to reclaim dentries and inodes.

echo 5 > /proc/sys/vm/vfs_cache_pressure

drop_caches

Will cause the kernel to drop clean caches, dentries and inodes from 
memory, causing that memory to become free.

1         to free pagecache
2         to free dentries and inodes
3         to free pagecache, dentries and inodes

Because this is a non-destructive operation and dirty objects are not 
freeable, the user should run "sync" first so that dirty data is 
written out and the corresponding pages become droppable.

echo 3 > /proc/sys/vm/drop_caches

page-cluster

Controls the number of pages which are written to swap in a single 
attempt (the swap I/O size).

It is a logarithmic value 
0           means "1 page" 
1           means "2 pages" 
2           means "4 pages"
3           means "8 pages"

The default value is three (eight pages at a time). 
There may be some small benefits in tuning this to a different value 
if your workload is swap-intensive.

echo 3 > /proc/sys/vm/page-cluster
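Because the value is logarithmic, the table above is just powers of two, which a one-liner can confirm:

```shell
# Sketch: page-cluster is the base-2 log of the swap I/O size in pages.
for v in 0 1 2 3; do
  echo "page-cluster=$v -> $(( 1 << v )) pages per swap I/O"
done
```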

overcommit_memory

Controls overcommit of system memory, possibly allowing processes to 
allocate (but not use) more memory than is actually available.

0           Heuristic overcommit handling. Obvious overcommits of 
            address space are refused. Used for a typical system. 
            It ensures a seriously wild allocation fails while allowing 
            overcommit to reduce swap usage. root is allowed to allocate 
            slightly more memory in this mode. This is the default.
1           Always overcommit. Appropriate for some scientific applications.
2           Don't overcommit. The total address space commit for the system 
            is not permitted to exceed swap plus a configurable percentage
            (default is 50) of physical RAM. Depending on the percentage 
            you use, in most situations this means a process will not be 
            killed while attempting to use already-allocated memory but 
            will receive errors on memory allocation as appropriate.

echo 1 > /proc/sys/vm/overcommit_memory

How is virtual memory calculated?

Total virtual memory = available physical memory × percentage + swap space
 
#cat /proc/meminfo |grep -i commit
CommitLimit:   4114264 kB           // maximum available virtual memory
Committed_AS:  3821756 kB           // virtual memory already committed
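The sample CommitLimit above can be reproduced from the overcommit formula. The swap and physical-memory sizes below are assumed figures chosen to match that sample output, using the default overcommit ratio of 50:

```shell
# Sketch: CommitLimit = swapspace + physmem * (overcommit_ratio / 100).
# SWAP_KB and PHYSMEM_KB are assumed example figures, not probed values.
SWAP_KB=2097144
PHYSMEM_KB=4034240
RATIO=50
echo "CommitLimit: $(( SWAP_KB + PHYSMEM_KB * RATIO / 100 )) kB"
# -> CommitLimit: 4114264 kB
```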

overcommit_ratio 

Percentage of physical memory size to include in overcommit calculations.

Memory allocation limit = swapspace + physmem * (overcommit_ratio / 100)

swapspace = total size of all swap areas
physmem = size of physical memory in system

echo 100 > /proc/sys/vm/overcommit_ratio

zone_reclaim_mode

On some hardware there is a Non-Uniform Memory Architecture (NUMA) penalty for remote memory access. If the NUMA penalty is high enough, the operating system performs zone reclaim.

For example, an operating system allocates memory to a NUMA node, but the NUMA node is full. In this case, the operating system reclaims memory for the local NUMA node rather than immediately allocating the memory to a remote NUMA node. The performance benefit of allocating memory to the local node outweighs the performance drawback of reclaiming the memory. However, in some situations reclaiming memory decreases performance to the extent that the opposite is true. In other words, in these situations, allocating memory to a remote NUMA node generates better performance than reclaiming memory for the local node.

A guest operating system can sometimes cause zone reclaim to occur when you pin memory. For example, a guest operating system causes zone reclaim in the following situations:
  • When you configure the guest operating system to use huge pages.
  • When you use Kernel Samepage Merging (KSM) to share memory pages between guest operating systems.
Configuring huge pages and running KSM are both best practices for KVM environments. Therefore, to optimize performance in KVM environments, disable zone reclaim.

echo 0 > /proc/sys/vm/zone_reclaim_mode

  1. echo 0 > /proc/sys/vm/swappiness
  2. echo 10 > /proc/sys/vm/dirty_ratio
  3. echo 4 > /proc/sys/vm/dirty_background_ratio
  4. echo 4096 > /proc/sys/vm/min_free_kbytes
  5. echo 5 > /proc/sys/vm/vfs_cache_pressure
  6. echo 0 > /proc/sys/vm/laptop_mode
  7. echo 0 > /proc/sys/vm/panic_on_oom
  8. echo 1 > /proc/sys/vm/drop_caches
  9. echo 3 > /proc/sys/vm/page-cluster
  10. echo 1 > /proc/sys/vm/overcommit_memory
  11. echo 100 > /proc/sys/vm/overcommit_ratio
  12. echo 200 > /proc/sys/vm/dirty_expire_centisecs
  13. echo 500 > /proc/sys/vm/dirty_writeback_centisecs
  14. echo 0 > /proc/sys/vm/zone_reclaim_mode
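To make the tuning above survive a reboot, the same settings can be expressed as an /etc/sysctl.conf fragment and loaded with "sysctl -p" (a sketch mirroring the commands above; adjust values to your workload):

```
vm.swappiness = 0
vm.dirty_ratio = 10
vm.dirty_background_ratio = 4
vm.min_free_kbytes = 4096
vm.vfs_cache_pressure = 5
vm.laptop_mode = 0
vm.panic_on_oom = 0
vm.page-cluster = 3
vm.overcommit_memory = 1
vm.overcommit_ratio = 100
vm.dirty_expire_centisecs = 200
vm.dirty_writeback_centisecs = 500
vm.zone_reclaim_mode = 0
```

Note that drop_caches is deliberately absent: it is a one-shot trigger, not a persistent setting, so it does not belong in sysctl.conf.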
