Category: LINUX

2011-10-26 11:32:46

Problem description: When Lisa ran the little tool `runmem` on her test Appliance domain 0, after a few runs the Java Tomcat server process was killed by the system, while each runmem process did get its 200 MB of memory. At that critical point there was still nearly 600 MB of memory free.

 

- runmem is a small tool I wrote for testing: each runmem process consumes 200 MB of RAM, and every additional runmem process you start consumes another 200 MB (a minimal sketch of such a tool is shown right after this list).

- The test Appliance domain 0 has 4 GB of physical memory of its own and no swap.

- "Critical point" means the system is fine at that moment, but as soon as one more runmem process is started, the Java Tomcat process gets killed.
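
The original runmem binary is not included here, but a minimal sketch of a tool that behaves the same way (allocate 200 MB with malloc, touch every page so the memory is really committed, then hold it until the process is killed) could look roughly like this, assuming plain C on Linux:

/* runmem-like sketch: grab 200 MB, touch it, then hold it. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (200UL * 1024 * 1024)   /* 200 MB */

int main(void)
{
    char *buf = malloc(CHUNK);
    if (buf == NULL) {
        perror("malloc");
        return 1;
    }
    memset(buf, 1, CHUNK);            /* touch every page so it is backed by real RAM */
    printf("holding 200 MB, pid %d\n", (int)getpid());
    pause();                          /* keep the memory until the process is killed */
    return 0;
}

Each extra copy you start takes another 200 MB, which matches the behaviour described above.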

 

Findings: After some research I found that Linux has a self-protection mechanism called the oom-killer (out-of-memory killer). It kills a process when Linux cannot satisfy some other process's request for new memory. But there were still about 600 MB of free memory in this case, so why did the oom-killer kill the Java Tomcat process? (We can see from /var/log/messages that the oom-killer fired.)

 

Using /proc/zoneinfo we can get more detail about memory. There are three zones here: the DMA zone, the Normal zone and the HighMem zone (see below). User-space applications normally get their memory from the HighMem zone, while kernel-space allocations use the Normal zone.

[root@lisa1043 mem and swap]# cat /proc/zoneinfo
Node 0, zone      DMA
  pages free     3160
        min      17
        low      21
        high     25
        active   0
        inactive 0
        scanned  0 (a: 17 i: 17)
        spanned  4096
        present  4096
    nr_anon_pages 0
    nr_mapped    1
    nr_file_pages 0
    nr_slab      0
    nr_page_table_pages 0
    nr_dirty     0
    nr_writeback 0
    nr_unstable  0
    nr_bounce    0
        protection: (0, 0, 851, 24149)
  pagesets
  all_unreclaimable: 1
  prev_priority:     12
  start_pfn:         0
Node 0, zone   Normal
  pages free     137757
        min      924
        low      1155
        high     1386
        active   48
        inactive 35
        scanned  97 (a: 5 i: 7)
        spanned  218110
        present  218110
    nr_anon_pages 0
    nr_mapped    1
    nr_file_pages 80
    nr_slab      4052
    nr_page_table_pages 1827
    nr_dirty     0
    nr_writeback 0
    nr_unstable  0
    nr_bounce    0
        protection: (0, 0, 0, 186383)
  pagesets
    cpu: 0 pcp: 0
              count: 9
              high:  186
              batch: 31
    cpu: 0 pcp: 1
              count: 61
              high:  62
              batch: 15
  vm stats threshold: 24
    cpu: 1 pcp: 0
              count: 46
              high:  186
              batch: 31
    cpu: 1 pcp: 1
              count: 59
              high:  62
              batch: 15
  vm stats threshold: 24
    cpu: 2 pcp: 0
              count: 60
              high:  186
              batch: 31
    cpu: 2 pcp: 1
              count: 51
              high:  62
              batch: 15
  vm stats threshold: 24
    cpu: 3 pcp: 0
              count: 121
              high:  186
              batch: 31
    cpu: 3 pcp: 1
              count: 53
              high:  62
              batch: 15
  vm stats threshold: 24
  all_unreclaimable: 0
  prev_priority:     2
  start_pfn:         4096
Node 0, zone  HighMem
  pages free     11114
        min      128
        low      6449
        high     12770
        active   795251
        inactive 381
        scanned  297953 (a: 0 i: 20)
        spanned  5964270
        present  5964270
    nr_anon_pages 793116
    nr_mapped    2155
    nr_file_pages 2494
    nr_slab      0
    nr_page_table_pages 0
    nr_dirty     38
    nr_writeback 0
    nr_unstable  0
    nr_bounce    0
        protection: (0, 0, 0, 0)
  pagesets
    cpu: 0 pcp: 0
              count: 152
              high:  186
              batch: 31
    cpu: 0 pcp: 1
              count: 0
              high:  62
              batch: 15
  vm stats threshold: 54
    cpu: 1 pcp: 0
              count: 184
              high:  186
              batch: 31
    cpu: 1 pcp: 1
              count: 5
              high:  62
              batch: 15
  vm stats threshold: 54
    cpu: 2 pcp: 0
              count: 71
              high:  186
              batch: 31
    cpu: 2 pcp: 1
              count: 3
              high:  62
              batch: 15
  vm stats threshold: 54
    cpu: 3 pcp: 0
              count: 22
              high:  186
              batch: 31
    cpu: 3 pcp: 1
              count: 5
              high:  62
              batch: 15
  vm stats threshold: 54
  all_unreclaimable: 0
  prev_priority:     2
  start_pfn:         222206

 

In this case the HighMem zone has 11114 free pages (11114 * 4 / 1024 ≈ 43 MB), but runmem asks for 200 MB at a time, so the allocation falls back to the Normal zone. Here we have to look at "pages free 137757" and "protection: (0, 0, 0, 186383)" in the Normal zone: since 137757 < 186383, the Normal zone refuses to satisfy this user-space request, the oom-killer is woken up to pick a process to kill, and it picks the Java Tomcat process.
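
Spelled out in megabytes (with 4 KB pages), the situation is:

    HighMem free                             : 11114  pages * 4 KB ≈  43 MB  (too small for a 200 MB request)
    Normal free                              : 137757 pages * 4 KB ≈ 538 MB
    Normal protection against user-space use : 186383 pages * 4 KB ≈ 728 MB

So almost all of the "nearly 600 MB free" reported by the system sits in the Normal zone, below its protection level, where a user-space (HighMem-class) allocation is not allowed to take it.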

 

Note: the memory held back by these protection values can still be allocated by kernel-space requests.

 

So it seems we have found the cause. Is there any way to fix or at least optimize this, that is, to make more memory available to user-space applications?

 

One knob is /proc/sys/vm/lowmem_reserve_ratio: the larger the values written here, the smaller the protection values become. You can check the effect like this:

 

# echo "1024 1024 256" > /proc/sys/vm/lowmem_reserve_ratio
# cat /proc/zoneinfo | grep protection
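
As far as I can tell, each protection value is roughly the number of pages in the zones above it divided by that zone's entry in lowmem_reserve_ratio, and the numbers above match the usual 32-bit defaults (256 for DMA, 32 for Normal):

    Normal zone, HighMem entry : 5964270 / 32             ≈ 186383   (the value we ran into)
    DMA zone,    Normal entry  : 218110 / 256              ≈ 851
    DMA zone,    HighMem entry : (218110 + 5964270) / 256  ≈ 24149

With the echo above, the last value (which appears to be the Normal zone's ratio) becomes 256, so the Normal zone's reserve against user-space allocations shrinks to about 5964270 / 256 ≈ 23298 pages (roughly 91 MB), leaving much more of the Normal zone usable by user-space applications.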

 

I think a better approach would be for the *alloc functions to return NULL to the user application when no more memory can be allocated, because most user applications can handle that failure. For now I have not figured out how to make the system behave that way; I will keep digging into it, and comments are welcome.
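
For reference, this is the kind of handling I have in mind on the application side: check what *alloc returns and degrade gracefully instead of assuming the memory is there. It is only a sketch of the calling pattern in C; whether the kernel ever lets malloc() return NULL here instead of invoking the oom-killer is exactly the open question above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Try to allocate and commit 'bytes'; return NULL if it cannot be done. */
static void *grab(size_t bytes)
{
    void *p = malloc(bytes);
    if (p == NULL) {
        fprintf(stderr, "allocation of %lu bytes failed, running with less memory\n",
                (unsigned long)bytes);
        return NULL;
    }
    memset(p, 0, bytes);   /* touch the pages so they are really committed */
    return p;
}

int main(void)
{
    void *buf = grab(200UL * 1024 * 1024);   /* the same 200 MB that runmem asks for */
    if (buf == NULL)
        return 1;          /* the application decides how to cope, instead of being killed */
    /* ... use the buffer ... */
    free(buf);
    return 0;
}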
