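The Linux buddy allocator is organized per zone. Each zone has several watermarks that decide how pages are allocated under different conditions, and these values matter most when memory is extremely tight.
All of them are derived from a single value, min_free_kbytes, which ensures that even under extreme memory shortage the kernel can still allocate some memory to get essential work done.
The description in Documentation/sysctl/vm.txt: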
min_free_kbytes:
This is used to force the Linux VM to keep a minimum number
of kilobytes free. The VM uses this number to compute a
watermark[WMARK_MIN] value for each lowmem zone in the system.
Each lowmem zone gets a number of reserved free pages based
proportionally on its size.
Some minimal amount of memory is needed to satisfy PF_MEMALLOC
allocations; if you set this to lower than 1024KB, your system will
become subtly broken, and prone to deadlock under high loads.
Setting this too high will OOM your machine instantly.
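It can be read and modified through /proc/sys/vm/min_free_kbytes. Its default value is computed as:
min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
lowmem_kbytes is the size of low memory, i.e. ZONE_DMA plus ZONE_NORMAL, which is typically about 896 MB on 32-bit x86.
On this machine the resulting min_free_kbytes is: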
xcm@u32:/u64/home/xcm/source/linux-2.6.37$ cat /proc/sys/vm/min_free_kbytes
3798
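As a sanity check, the 3798 above can be reproduced in user space. The sketch below is not kernel code: isqrt_ul and default_min_free_kbytes are names made up for illustration, lowmem_kbytes is approximated by the DMA and Normal present sizes from the zone dump later in this post (15796 kB + 885944 kB = 901740 kB), and the [128, 65536] clamp is my recollection of what the 2.6.37 init code applies.

```c
/* Integer square root: largest r with r * r <= x (bit-by-bit method). */
static unsigned long isqrt_ul(unsigned long x)
{
    unsigned long r = 0;
    unsigned long bit = 1UL << (sizeof(unsigned long) * 8 - 2);

    /* Start from the highest power of four that does not exceed x. */
    while (bit > x)
        bit >>= 2;

    while (bit != 0) {
        if (x >= r + bit) {
            x -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return r;
}

/*
 * Hypothetical helper: default min_free_kbytes for a given lowmem size.
 * Assumption: the result is clamped to [128, 65536] kB.
 */
static unsigned long default_min_free_kbytes(unsigned long lowmem_kbytes)
{
    unsigned long v = isqrt_ul(lowmem_kbytes * 16);

    if (v < 128)
        v = 128;
    if (v > 65536)
        v = 65536;
    return v;
}
```

With lowmem_kbytes = 901740, default_min_free_kbytes returns 3798, which matches the value read from /proc.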
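I am not sure what the ideal value is. Hardly anyone changes it, so the default is presumably reasonable.
It is tunable, though, so feel free to experiment if you are curious.
The min watermarks of DMA and NORMAL share the total min_free_kbytes of 3798 in proportion to zone size:
low = min*5/4
high = min*3/2
The calculation is done in the setup_per_zone_wmarks function.
The HIGHMEM zone is calculated differently.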
DMA free:15904kB min:64kB low:80kB high:96kB
present:15796kB
lowmem_reserve[]: 0 865 1983
Normal free:739872kB min:3728kB low:4660kB high:5592kB
present:885944kB
lowmem_reserve[]: 0 0 8943
HighMem free:278048kB min:512kB low:1716kB high:2920kB
lowmem_reserve[]: 0 0 0
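The min/low/high figures in this dump can be reproduced with a small user-space sketch modelled on my reading of setup_per_zone_wmarks in 2.6.37. Everything here is an assumption for illustration: the names struct wmarks and zone_wmarks are made up, pages are taken to be 4 KB, the HighMem min is assumed to be present_pages/1024 clamped to [32, 128] pages, and the HighMem present size (1144720 kB) is the one used in the lowmem_reserve arithmetic further down.

```c
#define PAGE_KB 4UL /* assumes 4 KB pages */

/* Watermarks of one zone, in pages. */
struct wmarks {
    unsigned long min, low, high;
};

/*
 * lowmem_pages is the sum of present pages of all non-highmem zones
 * and must be non-zero.
 */
static struct wmarks zone_wmarks(unsigned long min_free_kbytes,
                                 unsigned long present_pages,
                                 unsigned long lowmem_pages,
                                 int is_highmem)
{
    struct wmarks w;
    unsigned long pages_min = min_free_kbytes / PAGE_KB;
    /* This zone's proportional share of pages_min. */
    unsigned long long tmp =
        (unsigned long long)pages_min * present_pages / lowmem_pages;

    if (is_highmem) {
        /* Highmem min is not a share of min_free_kbytes. */
        unsigned long min_pages = present_pages / 1024;

        if (min_pages < 32)
            min_pages = 32;
        if (min_pages > 128)
            min_pages = 128;
        w.min = min_pages;
    } else {
        w.min = (unsigned long)tmp;
    }

    w.low = w.min + (unsigned long)(tmp >> 2);  /* min + share/4 */
    w.high = w.min + (unsigned long)(tmp >> 1); /* min + share/2 */
    return w;
}
```

With lowmem_pages = 3949 + 221486 = 225435, the Normal zone (221486 pages) comes out as 932/1165/1398 pages, i.e. 3728/4660/5592 kB, and HighMem (286180 pages) as 128/429/730 pages, i.e. 512/1716/2920 kB. Note that low = min*5/4 and high = min*3/2 hold only for the lowmem zones, where min equals the proportional share.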
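The lowmem_reserve array protects low memory further: an allocation that asked for HIGHMEM but has to fall back to the NORMAL zone because HIGHMEM is short is treated differently from an allocation that asked for NORMAL directly.
The zone_watermark_ok function takes lowmem_reserve into account: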
/*
 * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the ZONE_NORMAL zone, etc.
 */
#define zone_idx(zone) ((zone) - (zone)->zone_pgdat->node_zones)

int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
		      int classzone_idx, int alloc_flags)
{
	/* free_pages may go negative - that's OK */
	long min = mark;
	long free_pages = zone_nr_free_pages(z) - (1 << order) + 1;
	int o;

	if (alloc_flags & ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & ALLOC_HARDER)
		min -= min / 4;

	if (free_pages <= min + z->lowmem_reserve[classzone_idx])
		return 0;
	...
}
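To see the effect of lowmem_reserve in this check, here is a user-space sketch of only the part of zone_watermark_ok quoted above, restricted to order-0 requests (the elided remainder of the function is deliberately not modelled). struct zone_sketch, watermark_ok_order0 and the SKETCH_ALLOC_* flag values are invented for illustration and are not the kernel's definitions.

```c
#define SKETCH_MAX_ZONES 3

/* Illustrative flag bits only; the kernel defines its own values. */
#define SKETCH_ALLOC_HARDER 0x10
#define SKETCH_ALLOC_HIGH   0x20

/* Minimal stand-in for struct zone: just what the check needs. */
struct zone_sketch {
    long free_pages;
    long lowmem_reserve[SKETCH_MAX_ZONES];
};

/*
 * Returns 1 if an order-0 request whose preferred zone has index
 * classzone_idx may take a page from z at watermark 'mark' (in pages).
 */
static int watermark_ok_order0(const struct zone_sketch *z, long mark,
                               int classzone_idx, int alloc_flags)
{
    long min = mark;
    /* For order 0, "- (1 << order) + 1" cancels out. */
    long free_pages = z->free_pages;

    if (alloc_flags & SKETCH_ALLOC_HIGH)
        min -= min / 2;
    if (alloc_flags & SKETCH_ALLOC_HARDER)
        min -= min / 4;

    return free_pages > min + z->lowmem_reserve[classzone_idx];
}
```

Taking the DMA zone from the dump (min of 16 pages, lowmem_reserve of 0, 865, 1983) and supposing it had only 1500 free pages, a request that was aimed at DMA itself (classzone_idx 0) or at NORMAL (classzone_idx 1) would still pass, while one aimed at HIGHMEM (classzone_idx 2) would be refused because 1500 is not above 16 + 1983.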
lowmem_reserve itself is computed in the setup_per_zone_lowmem_reserve function:
xcm@u32:/u64/home/xcm/source/linux-2.6.37$ cat /proc/sys/vm/lowmem_reserve_ratio
256 32 32
for_each_online_pgdat(pgdat) {
	for (j = 0; j < MAX_NR_ZONES; j++) {
		struct zone *zone = pgdat->node_zones + j;
		unsigned long present_pages = zone->present_pages;

		zone->lowmem_reserve[j] = 0;

		idx = j;
		while (idx) {
			struct zone *lower_zone;

			idx--;

			if (sysctl_lowmem_reserve_ratio[idx] < 1)
				sysctl_lowmem_reserve_ratio[idx] = 1;

			lower_zone = pgdat->node_zones + idx;
			lower_zone->lowmem_reserve[j] = present_pages /
				sysctl_lowmem_reserve_ratio[idx];
			present_pages += lower_zone->present_pages;
		}
	}
}
DMA:
[0] = 0;
[1] = present_pages_of_normal/256 = (885944KB/4)/256 = 865
[2] = present_pages_of_normal_and_high/256 = (885944/4 + 1144720/4)/256 = 1983
NORMAL:
[0] = 0;
[1] = 0;
[2] = (1144720/4)/32 = 8943
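The hand calculation above can be checked with a user-space version of the same loop. This is only a sketch: compute_lowmem_reserve and SKETCH_NZONES are made-up names, a single node with three zones (DMA, Normal, HighMem) is assumed, and unlike the kernel loop it does not write the clamped ratio back.

```c
#define SKETCH_NZONES 3

/*
 * present[i]    : present pages of zone i (0 = DMA, 1 = Normal, 2 = HighMem)
 * ratio[i]      : lowmem_reserve_ratio entry for zone i
 * reserve[i][j] : pages zone i holds back from requests aimed at zone j
 */
static void compute_lowmem_reserve(const unsigned long present[SKETCH_NZONES],
                                   const unsigned long ratio[SKETCH_NZONES],
                                   unsigned long reserve[SKETCH_NZONES][SKETCH_NZONES])
{
    int i, j;

    for (i = 0; i < SKETCH_NZONES; i++)
        for (j = 0; j < SKETCH_NZONES; j++)
            reserve[i][j] = 0;

    for (j = 0; j < SKETCH_NZONES; j++) {
        /* Pages in zone j, growing as lower zones are walked. */
        unsigned long pages = present[j];
        int idx = j;

        while (idx) {
            unsigned long r;

            idx--;
            r = ratio[idx] < 1 ? 1 : ratio[idx];
            reserve[idx][j] = pages / r;
            pages += present[idx];
        }
    }
}
```

Fed with present = {3949, 221486, 286180} pages and ratio = {256, 32, 32}, it yields 0 865 1983 for DMA, 0 0 8943 for Normal and 0 0 0 for HighMem, the same rows that appear in the lowmem_reserve[] lines of the dump.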
The ratios can be changed as well; 256, 32, 32 are the defaults.
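This way, when a HIGHMEM request ends up falling back to NORMAL, the watermark is raised by 8943 pages, and when it falls all the way back to DMA, by 1983 pages.
In other words, the DMA zone, which has only 16 MB in total, must still have roughly 7.7 MB free (1983 pages of 4 KB, plus the watermark itself) before it will serve a HIGHMEM request;
otherwise memory reclaim has to be kicked off.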
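Making the reserve proportional makes sense.
For example, if HIGHMEM has only 100 MB, a HIGHMEM request that ends up in NORMAL should be satisfied as far as possible (NORMAL->lowmem_reserve[2] is small),
whereas if HIGHMEM has 2 GB, NORMAL->lowmem_reserve[2] holds back as much as possible.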
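To put numbers on these two cases, a tiny sketch follows. The helper name normal_reserve_for_highmem is hypothetical, 4 KB pages are assumed, and HIGHMEM is assumed to be the only zone above NORMAL.

```c
/*
 * Hypothetical helper: pages that NORMAL holds back from HIGHMEM-class
 * requests, given the HIGHMEM size in MB and NORMAL's reserve ratio.
 */
static unsigned long normal_reserve_for_highmem(unsigned long highmem_mb,
                                                unsigned long ratio)
{
    unsigned long highmem_pages = highmem_mb * 1024 / 4; /* 4 KB pages */

    if (ratio < 1)
        ratio = 1; /* same lower bound the kernel loop applies */
    return highmem_pages / ratio;
}
```

With the default ratio of 32, 100 MB of HIGHMEM gives a reserve of 800 pages (about 3 MB), while 2 GB gives 16384 pages (64 MB).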