2011-11-01 12:57:07

Original post: "zone的几个水位值" (The zone watermark values), by flw2

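The Linux buddy allocator works per zone. Each zone has several watermarks that determine how pages are allocated under different conditions. These values matter most when memory is extremely tight.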


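Every watermark is derived from a single value, min_free_kbytes. It ensures that even under an extreme memory shortage the kernel can still allocate some memory to complete essential work.
Documentation/sysctl/vm.txt describes it as follows: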

min_free_kbytes:

This is used to force the Linux VM to keep a minimum number
of kilobytes free.  The VM uses this number to compute a
watermark[WMARK_MIN] value for each lowmem zone in the system.
Each lowmem zone gets a number of reserved free pages based
proportionally on its size.


Some minimal amount of memory is needed to satisfy PF_MEMALLOC
allocations; if you set this to lower than 1024KB, your system will
become subtly broken, and prone to deadlock under high loads.

Setting this too high will OOM your machine instantly.

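It can be read and modified through /proc/sys/vm/min_free_kbytes. The default is computed at boot as:

    min_free_kbytes = int_sqrt(lowmem_kbytes * 16);

lowmem_kbytes is the size of low memory in KB, i.e. ZONE_DMA plus ZONE_NORMAL. On 32-bit x86 with the default 3G/1G split this is typically about 896 MB.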

On this machine the resulting min_free_kbytes is:

    xcm@u32:/u64/home/xcm/source/linux-2.6.37$ cat /proc/sys/vm/min_free_kbytes
    3798
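I am not sure what value is most appropriate. Few people ever change it, so the default is presumably reasonable.
It is tunable, though, so feel free to experiment if you are curious.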
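As a quick check on the formula, the sketch below recomputes the default in user space. The names isqrt_floor and default_min_free_kbytes are illustrative, not kernel symbols. Low memory is approximated here by the present sizes of the DMA and Normal zones on this machine (15796 kB + 885944 kB), and the clamp to the range 128..65536 is my recollection of init_per_zone_wmark_min in kernels of this era, so treat both as assumptions.

```c
/* Integer square root: largest r with r*r <= x.
 * A simple stand-in for the kernel's int_sqrt(); fine for small inputs. */
static unsigned long long isqrt_floor(unsigned long long x)
{
    unsigned long long r = 0;

    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Default min_free_kbytes for a given amount of low memory (in KB). */
static unsigned long default_min_free_kbytes(unsigned long long lowmem_kbytes)
{
    unsigned long long v = isqrt_floor(lowmem_kbytes * 16);

    /* Clamp assumed from the kernel source of this era. */
    if (v < 128)
        v = 128;
    if (v > 65536)
        v = 65536;
    return (unsigned long)v;
}
```

Calling default_min_free_kbytes(15796 + 885944) gives 3798, which matches the value read from /proc above.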


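The min watermarks of DMA and NORMAL split the total min_free_kbytes of 3798 in proportion to zone size. The other two watermarks follow from min:
low = min*5/4
high = min*3/2
The calculation is done in the setup_per_zone_wmarks function.
The HIGHMEM zone is computed differently.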

DMA free:15904kB min:64kB low:80kB high:96kB
present:15796kB
lowmem_reserve[]: 0 865 1983

Normal free:739872kB min:3728kB low:4660kB high:5592kB
present:885944kB
lowmem_reserve[]: 0 0 8943

HighMem free:278048kB min:512kB low:1716kB high:2920kB

lowmem_reserve[]: 0 0 0
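The figures in this dump can be reproduced with a small user-space sketch of the per-zone calculation. It follows my reading of setup_per_zone_wmarks in kernels of this era, so the details are assumptions rather than a copy of the source: 4 KB pages, a HIGHMEM min of present_pages/1024 clamped to the range 32..128 pages, and low/high obtained by adding 1/4 and 1/2 of the zone's proportional share. The struct and function names are hypothetical. The HighMem present size of 1144720 kB is the figure used in the lowmem_reserve arithmetic later in this post.

```c
#include <assert.h>

struct wmarks {
    unsigned long min, low, high;   /* all in pages */
};

/* Watermarks of one zone.
 * min_free_kbytes: the global sysctl value
 * present_pages:   size of this zone in pages
 * lowmem_pages:    total pages in all non-highmem zones
 * is_highmem:      nonzero for a HIGHMEM zone */
static struct wmarks zone_wmarks(unsigned long min_free_kbytes,
                                 unsigned long present_pages,
                                 unsigned long lowmem_pages,
                                 int is_highmem)
{
    struct wmarks w;
    unsigned long pages_min = min_free_kbytes >> 2;   /* KB -> 4 KB pages */
    unsigned long long share;

    assert(lowmem_pages > 0);
    /* This zone's proportional share of pages_min. */
    share = (unsigned long long)pages_min * present_pages / lowmem_pages;

    if (is_highmem) {
        /* Assumed rule: present/1024, clamped to 32..128 pages. */
        unsigned long min_pages = present_pages / 1024;

        if (min_pages < 32)
            min_pages = 32;
        if (min_pages > 128)
            min_pages = 128;
        w.min = min_pages;
    } else {
        w.min = (unsigned long)share;
    }
    w.low  = w.min + (unsigned long)(share >> 2);
    w.high = w.min + (unsigned long)(share >> 1);
    return w;
}
```

With min_free_kbytes = 3798 and lowmem_pages = 3949 + 221486 = 225435, the DMA zone (3949 pages) comes out at 16/20/24 pages, i.e. 64/80/96 kB, Normal (221486 pages) at 3728/4660/5592 kB, and HighMem (286180 pages) at 512/1716/2920 kB, in line with the dump.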


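The lowmem_reserve array holds back additional low memory. A HIGHMEM request that has to fall back to the NORMAL zone because HIGHMEM is short of free pages is treated differently from a request made for NORMAL directly.
The zone_watermark_ok function takes lowmem_reserve into account: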

    /*
     * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the ZONE_NORMAL zone, etc.
     */
    #define zone_idx(zone)        ((zone) - (zone)->zone_pgdat->node_zones)

    int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
             int classzone_idx, int alloc_flags)
    {
        /* free_pages may go negative - that's OK */
        long min = mark;
        long free_pages = zone_nr_free_pages(z) - (1 << order) + 1;
        int o;

        /* high-priority allocations may dip further below the watermark */
        if (alloc_flags & ALLOC_HIGH)
            min -= min / 2;
        if (alloc_flags & ALLOC_HARDER)
            min -= min / 4;

        /* classzone_idx selects the reserve for the zone originally requested */
        if (free_pages <= min + z->lowmem_reserve[classzone_idx])
            return 0;
        ...
    }
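To see the effect of the reserve in isolation, here is a user-space sketch of just the part of the check shown above. The per-order loop hidden behind the "..." is deliberately left out. The function name and the SK_ flag values are hypothetical and exist only for this sketch, and passing the low watermark as mark is an assumption about the caller.

```c
/* Hypothetical flag values for this sketch only. */
#define SK_ALLOC_HARDER 0x10
#define SK_ALLOC_HIGH   0x20

/* Returns 1 if the zone may satisfy the request, 0 otherwise.
 * nr_free:        free pages in the zone
 * mark:           watermark to test against, in pages
 * lowmem_reserve: the zone's lowmem_reserve[] entry for the zone
 *                 that was originally requested, in pages */
static int watermark_ok_sketch(long nr_free, int order, long mark,
                               long lowmem_reserve, int alloc_flags)
{
    long min = mark;
    long free_pages = nr_free - (1L << order) + 1;

    if (alloc_flags & SK_ALLOC_HIGH)
        min -= min / 2;
    if (alloc_flags & SK_ALLOC_HARDER)
        min -= min / 4;

    return free_pages > min + lowmem_reserve;
}
```

Take a DMA zone with a low watermark of 20 pages (80 kB) and only 1000 free pages. A direct DMA request uses lowmem_reserve[0] = 0, so watermark_ok_sketch(1000, 0, 20, 0, 0) returns 1. A HIGHMEM request that fell back to DMA uses lowmem_reserve[2] = 1983, so watermark_ok_sketch(1000, 0, 20, 1983, 0) returns 0.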


lowmem_reserve itself is computed in the setup_per_zone_lowmem_reserve function. The ratios it uses can be read from /proc/sys/vm/lowmem_reserve_ratio:

    xcm@u32:/u64/home/xcm/source/linux-2.6.37$ cat /proc/sys/vm/lowmem_reserve_ratio
    256    32    32

    for_each_online_pgdat(pgdat) {
        for (j = 0; j < MAX_NR_ZONES; j++) {
            struct zone *zone = pgdat->node_zones + j;
            unsigned long present_pages = zone->present_pages;

            zone->lowmem_reserve[j] = 0;

            idx = j;
            while (idx) {
                struct zone *lower_zone;

                idx--;

                if (sysctl_lowmem_reserve_ratio[idx] < 1)
                    sysctl_lowmem_reserve_ratio[idx] = 1;

                lower_zone = pgdat->node_zones + idx;
                lower_zone->lowmem_reserve[j] = present_pages /
                    sysctl_lowmem_reserve_ratio[idx];
                present_pages += lower_zone->present_pages;
            }
        }
    }

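With 4 KB pages this gives the following reserves, in pages (1144720 kB is the present size of HighMem):

DMA:
[0] = 0
[1] = present_pages_of_normal/256 = (885944 kB/4)/256 = 865
[2] = present_pages_of_normal_and_high/256 = (885944/4 + 1144720/4)/256 = 1983

NORMAL:
[0] = 0
[1] = 0
[2] = present_pages_of_high/32 = (1144720/4)/32 = 8943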

The ratios can also be changed; 256, 32, 32 are the defaults.
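The same table can be reproduced in user space. The sketch below mirrors the kernel loop for a single node with three zones. The function name and the fixed NR_ZONES of 3 are assumptions made for illustration, and all sizes are in pages.

```c
#define NR_ZONES 3   /* DMA, NORMAL, HIGHMEM on this machine */

/* reserve[i][j]: pages zone i holds back from requests aimed at zone j. */
static void compute_lowmem_reserve(const unsigned long present[NR_ZONES],
                                   const unsigned long ratio[NR_ZONES],
                                   unsigned long reserve[NR_ZONES][NR_ZONES])
{
    int i, j, idx;

    /* Start from an all-zero table, as a zero-initialised struct zone would. */
    for (i = 0; i < NR_ZONES; i++)
        for (j = 0; j < NR_ZONES; j++)
            reserve[i][j] = 0;

    for (j = 0; j < NR_ZONES; j++) {
        /* Running total: zone j plus every zone between idx and j. */
        unsigned long pages = present[j];

        for (idx = j - 1; idx >= 0; idx--) {
            unsigned long r = ratio[idx] < 1 ? 1 : ratio[idx];

            reserve[idx][j] = pages / r;
            pages += present[idx];
        }
    }
}
```

With present = {3949, 221486, 286180} and ratio = {256, 32, 32}, the DMA row comes out as 0 865 1983 and the NORMAL row as 0 0 8943, the same numbers as in the lowmem_reserve[] lines of the zone dump.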

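So when a HIGHMEM request ends up at NORMAL, the watermark is raised by 8943 pages, and when it ends up at DMA, by 1983 pages.
In other words, the DMA zone, which is only 16 MB in total, must still have roughly 7.7 MB free (1983 pages × 4 KB) before it will satisfy a HIGHMEM request.
Otherwise memory reclaim has to be started.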

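Scaling the reserve with zone size makes sense.
If HIGHMEM has only 100 MB, a HIGHMEM request that falls back to NORMAL should be satisfied whenever possible, and NORMAL->lowmem_reserve[2] is small (100 MB / 32 ≈ 3 MB).
If HIGHMEM has 2 GB, NORMAL->lowmem_reserve[2] holds back much more (2 GB / 32 = 64 MB).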

