
Category: Virtualization

2012-09-22 08:42:52

The swappiness value

The default swappiness value is 60. The system accepts values between zero and 100. When you set the swappiness value to zero on Intel Nehalem systems, in most cases the virtual memory manager removes page cache and buffer cache rather than swapping out program memory. In KVM environments, program memory likely consists of a large amount of memory that the guest operating system uses.

Intel Nehalem systems use an extended page table (EPT). EPT does not set an access bit for pages that the guest operating systems use. Therefore, the Linux virtual memory manager cannot use the least recently used (LRU) algorithm to accurately determine which pages are not needed. In other words, the LRU algorithm cannot accurately determine which of the pages that back guest operating system memory are the best candidates to swap out. In this situation, the best performance option is to avoid swapping for as long as possible by setting the swappiness value to zero.

Systems with Advanced Micro Devices (AMD) processors and nested page table (NPT) technology do not have this issue.

echo 0 > /proc/sys/vm/swappiness

or:

Edit the /etc/sysctl.conf file by adding the following information:
vm.swappiness=0

Swapping size

The swap partition is used for swapping underused memory out to the hard drive so that physical memory stays available for active pages. The default size of the swap partition is calculated from the amount of RAM and the overcommit ratio. If you intend to overcommit memory with KVM, it is recommended to make your swap partition larger. A recommended overcommit ratio is 50% (0.5). The formula used is:

(0.5 * RAM) + (overcommit ratio * RAM) = Recommended swap size
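Plugging numbers into the formula makes it concrete. The RAM size below is an illustrative assumption, not a value read from the running system:

```shell
# Sketch: recommended swap size for a KVM host, per the formula above.
# RAM_MB and the 50% overcommit ratio are assumed example figures.
RAM_MB=16384                                     # 16 GB of RAM
OVERCOMMIT_PCT=50                                # 0.5 overcommit ratio
BASE=$(( RAM_MB / 2 ))                           # 0.5 * RAM
OVERCOMMIT=$(( RAM_MB * OVERCOMMIT_PCT / 100 ))  # overcommit ratio * RAM
echo "Recommended swap: $(( BASE + OVERCOMMIT )) MB"
# -> Recommended swap: 16384 MB
```

With a 50% overcommit ratio the recommended swap simply equals the RAM size; a higher ratio grows it proportionally.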

dirty_ratio 

Contains, as a percentage of total system memory, 
the number of pages at which a process which is generating 
disk writes will itself start writing out dirty data.

Lower the amount of unwritten write cache to reduce lags when a huge write is required.

echo 10 > /proc/sys/vm/dirty_ratio
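As a rough sketch of what the ratio means in absolute terms (the 4 GB total below is an assumed example figure, not a reading from this system):

```shell
# Sketch: absolute dirty-data threshold implied by dirty_ratio=10.
# TOTAL_KB is an assumed 4 GB machine, purely for illustration.
TOTAL_KB=4194304
DIRTY_RATIO=10
echo "Writers start flushing at $(( TOTAL_KB * DIRTY_RATIO / 100 )) kB of dirty data"
# -> Writers start flushing at 419430 kB of dirty data
```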

dirty_background_ratio 

Contains, as a percentage of total system memory, the number of pages 
at which the pdflush background writeback daemon will start writing 
out dirty data.

echo 4 > /proc/sys/vm/dirty_background_ratio

min_free_kbytes

This is used to force the Linux VM to keep a minimum number of kilobytes 
free. The VM uses this number to compute a pages_min value for each 
lowmem zone in the system. Each lowmem zone gets a number of reserved 
free pages based proportionally on its size.

Increase minimum free memory; in theory this should make the kernel less 
likely to run out of memory suddenly.

echo 4096 > /proc/sys/vm/min_free_kbytes
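The proportional split described above can be sketched in arithmetic; the zone sizes here are invented example figures, not values from a real `/proc/zoneinfo`:

```shell
# Sketch: min_free_kbytes is divided among lowmem zones in proportion
# to their size. ZONE_*_KB are assumed example zone sizes.
MIN_FREE_KB=4096
ZONE_DMA_KB=16384
ZONE_NORMAL_KB=884736
TOTAL_KB=$(( ZONE_DMA_KB + ZONE_NORMAL_KB ))
echo "DMA reserve:    $(( MIN_FREE_KB * ZONE_DMA_KB    / TOTAL_KB )) kB"
echo "Normal reserve: $(( MIN_FREE_KB * ZONE_NORMAL_KB / TOTAL_KB )) kB"
```

The larger zone absorbs almost all of the reserve, which is why raising min_free_kbytes mainly pads the Normal zone on typical systems.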

dirty_expire_centisecs

This tunable is used to define when dirty data is old enough to be 
eligible for writeout by the pdflush daemons. It is expressed in 100'ths 
of a second. Data which has been dirty in memory for longer than this 
interval will be written out next time a pdflush daemon wakes up.

echo 200 > /proc/sys/vm/dirty_expire_centisecs

dirty_writeback_centisecs 

The pdflush writeback daemons will periodically wake up and write "old" 
data out to disk. This tunable expresses the interval between those 
wakeups, in 100'ths of a second.

Setting this to zero disables periodic writeback altogether.

echo 1000 > /proc/sys/vm/dirty_writeback_centisecs

laptop_mode

When laptop mode is enabled, Linux tries to be smart about when to do 
disk I/O, so the disk can spend as much time as possible in a low 
power state. 
Laptop mode is disabled when the value is zero (0); to enable it, use 
a non-zero value such as 5.

echo 0 > /proc/sys/vm/laptop_mode

vfs_cache_pressure

Controls the tendency of the kernel to reclaim the memory which is used for
caching of directory and inode objects.

At the default value of vfs_cache_pressure = 100 the kernel will attempt
to reclaim dentries and inodes at a "fair" rate with respect to pagecache 
and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel 
to prefer to retain dentry and inode caches. Increasing vfs_cache_pressure 
beyond 100 causes the kernel to prefer to reclaim dentries and inodes.

echo 5 > /proc/sys/vm/vfs_cache_pressure

drop_caches

Will cause the kernel to drop clean caches, dentries and inodes from 
memory, causing that memory to become free.

1         to free pagecache
2         to free dentries and inodes
3         to free pagecache, dentries and inodes

Because this is a non-destructive operation and dirty objects are not 
freeable, the user should run "sync" first so that dirty data is 
written out and the corresponding pages become droppable.

echo 3 > /proc/sys/vm/drop_caches

page-cluster

Controls the number of pages which are written to swap in a single 
attempt (the swap I/O size).

It is a logarithmic value 
0           means "1 page" 
1           means "2 pages" 
2           means "4 pages"
3           means "8 pages"

The default value is three (eight pages at a time). 
There may be some small benefits in tuning this to a different value 
if your workload is swap-intensive.

echo 3 > /proc/sys/vm/page-cluster
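Because the value is logarithmic, the table above is just powers of two, which a one-liner can confirm:

```shell
# Sketch: page-cluster is the base-2 log of the swap I/O size in pages.
for v in 0 1 2 3; do
  echo "page-cluster=$v -> $(( 1 << v )) pages per swap I/O"
done
```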

overcommit_memory

Controls overcommit of system memory, possibly allowing processes to 
allocate (but not use) more memory than is actually available.

0           Heuristic overcommit handling. Obvious overcommits of 
            address space are refused. Used for a typical system. 
            It ensures a seriously wild allocation fails while allowing 
            overcommit to reduce swap usage. root is allowed to allocate 
            slightly more memory in this mode. This is the default.
1           Always overcommit. Appropriate for some scientific applications.
2           Don't overcommit. The total address space commit for the system 
            is not permitted to exceed swap plus a configurable percentage
            (default is 50) of physical RAM. Depending on the percentage 
            you use, in most situations this means a process will not be 
            killed while attempting to use already-allocated memory but 
            will receive errors on memory allocation as appropriate.

echo 1 > /proc/sys/vm/overcommit_memory

How is virtual memory calculated?

Total virtual memory = available physical memory × percentage + swap space
 
#cat /proc/meminfo |grep -i commit
CommitLimit:   4114264 kB           // maximum available virtual memory
Committed_AS:  3821756 kB           // virtual memory already committed
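The sample CommitLimit above can be reproduced from the overcommit formula. The swap and physical-memory sizes below are assumed figures chosen to match that sample output, using the default overcommit ratio of 50:

```shell
# Sketch: CommitLimit = swapspace + physmem * (overcommit_ratio / 100).
# SWAP_KB and PHYSMEM_KB are assumed example figures, not probed values.
SWAP_KB=2097144
PHYSMEM_KB=4034240
RATIO=50
echo "CommitLimit: $(( SWAP_KB + PHYSMEM_KB * RATIO / 100 )) kB"
# -> CommitLimit: 4114264 kB
```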

overcommit_ratio 

Percentage of physical memory size to include in overcommit calculations.

Memory allocation limit = swapspace + physmem * (overcommit_ratio / 100)

swapspace = total size of all swap areas
physmem = size of physical memory in system

echo 100 > /proc/sys/vm/overcommit_ratio

zone_reclaim_mode

On some hardware there is a Non-Uniform Memory Architecture (NUMA) penalty for remote memory access. If the NUMA penalty is high enough, the operating system performs zone reclaim.

For example, an operating system allocates memory to a NUMA node, but the NUMA node is full. In this case, the operating system reclaims memory for the local NUMA node rather than immediately allocating the memory to a remote NUMA node. The performance benefit of allocating memory to the local node outweighs the performance drawback of reclaiming the memory. However, in some situations reclaiming memory decreases performance to the extent that the opposite is true. In other words, in these situations, allocating memory to a remote NUMA node generates better performance than reclaiming memory for the local node.

A guest operating system can sometimes cause zone reclaim to occur when you pin memory. For example, a guest operating system causes zone reclaim in the following situations:
  • When you configure the guest operating system to use huge pages.
  • When you use Kernel Samepage Merging (KSM) to share memory pages between guest operating systems.
Configuring huge pages and running KSM are both best practices for KVM environments. Therefore, to optimize performance in KVM environments, disable zone reclaim.

echo 0 > /proc/sys/vm/zone_reclaim_mode

  1. echo 0 > /proc/sys/vm/swappiness
  2. echo 10 > /proc/sys/vm/dirty_ratio
  3. echo 4 > /proc/sys/vm/dirty_background_ratio
  4. echo 4096 > /proc/sys/vm/min_free_kbytes
  5. echo 5 > /proc/sys/vm/vfs_cache_pressure
  6. echo 0 > /proc/sys/vm/laptop_mode
  7. echo 0 > /proc/sys/vm/panic_on_oom
  8. echo 1 > /proc/sys/vm/drop_caches
  9. echo 3 > /proc/sys/vm/page-cluster
  10. echo 1 > /proc/sys/vm/overcommit_memory
  11. echo 100 > /proc/sys/vm/overcommit_ratio
  12. echo 200 > /proc/sys/vm/dirty_expire_centisecs
  13. echo 500 > /proc/sys/vm/dirty_writeback_centisecs
  14. echo 0 > /proc/sys/vm/zone_reclaim_mode
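To make the tuning above survive a reboot, the same settings can be expressed as an /etc/sysctl.conf fragment and loaded with "sysctl -p" (a sketch mirroring the commands above; adjust values to your workload):

```
vm.swappiness = 0
vm.dirty_ratio = 10
vm.dirty_background_ratio = 4
vm.min_free_kbytes = 4096
vm.vfs_cache_pressure = 5
vm.laptop_mode = 0
vm.panic_on_oom = 0
vm.page-cluster = 3
vm.overcommit_memory = 1
vm.overcommit_ratio = 100
vm.dirty_expire_centisecs = 200
vm.dirty_writeback_centisecs = 500
vm.zone_reclaim_mode = 0
```

Note that drop_caches is deliberately absent: it is a one-shot trigger, not a persistent setting, so it does not belong in sysctl.conf.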
