Category: Cloud computing
2016-08-20 16:08:53
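We run an open-source file service, and one of its processes showed steadily growing memory usage. Right after startup, top reported 17.8G of virtual memory and 500M of physical memory for the process. After it had run for a while, physical memory usage climbed to 3G, which drove available memory on the host to 0 and caused memory allocations to fail. Restarting the process released the memory.
Memory that keeps growing usually points to a memory leak, so the first step was to locate the leak with a tool.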
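To track the growth described above without watching top by hand, the same two numbers can be read from /proc/<pid>/status on Linux, where VmSize is the virtual size and VmRSS is the resident (physical) size, both in kB. The following is a minimal sketch; the function names and the sample text are illustrative assumptions, not part of the original investigation.

```python
import re

def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        m = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def read_vm_fields(pid):
    """Read the live values for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_fields(f.read())

# Hypothetical sample text, shaped like a real status file.
sample = "Name:\tdemo\nVmSize:\t18664653 kB\nVmRSS:\t  512000 kB\n"
usage = parse_vm_fields(sample)
print(usage)
```

Calling read_vm_fields(pid) at intervals and logging the result gives a simple growth curve for the suspect process.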
1. First pass: checking for memory leaks with valgrind
A. Start the process with the following command
valgrind --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=40 --show-reachable=yes --log-file=/home/xxxxx/memcheck.log /usr/local/bin/ffff /etc/ffff /fffd.conf start
B. The valgrind report does show two memory leaks, at sched_thread.c:400 and ddf_binlog_read (sync.c:1517). In both places memory is allocated with malloc and never freed.
==21056== LEAK SUMMARY:
==21056== definitely lost: 1,317 bytes in 2 blocks
==21056== indirectly lost: 0 bytes in 0 blocks
==21056== possibly lost: 0 bytes in 0 blocks
==21056== still reachable: 17,188,006,888 bytes in 65,564 blocks
==21056== suppressed: 0 bytes in 0 blocks
==21056== 288 bytes in 1 blocks are definitely lost in loss record 5 of 10
==21056== at 0x4C27A2E: malloc (vg_replace_malloc.c:270)
==21056== by 0x40D397: sched_dup_array (sched_thread.c:400)
==21056== by 0x40D4CA: sched_start (sched_thread.c:487)
==21056== by 0x403629: main (fffd.c:499)
==21056== 1,029 bytes in 1 blocks are definitely lost in loss record 6 of 10
==21056== at 0x4C27A2E: malloc (vg_replace_malloc.c:270)
==21056== by 0x41A793: ffft_binlog_read (sync.c:1517)
==21056== by 0x41C707: ffft_sync_thread_entrance (sync.c:1785)
==21056== by 0x51AAA50: start_thread (in /lib64/libpthread-2.12.so)
==21056== by 0x41E76FF: ???
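C. Locate the corresponding code and fix both leaks.
2. Second pass: checking again with valgrind after the fix
A. The valgrind report now shows no definite leaks, yet memory still keeps growing. Why? Could the memory be held by the cache?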
==14069== LEAK SUMMARY:
==14069== definitely lost: 0 bytes in 0 blocks
==14069== indirectly lost: 0 bytes in 0 blocks
==14069== possibly lost: 544 bytes in 2 blocks
==14069== still reachable: 17,189,181,517 bytes in 65,780 blocks
==14069== suppressed: 0 bytes in 0 blocks
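B. Force the kernel to drop its caches by running echo "3">/proc/sys/vm/drop_caches, then check with free -g. Available memory did not increase, so the memory is not being held by the cache.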
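The two LEAK SUMMARY blocks above can also be compared programmatically, which makes it obvious that the fix removed the definite leaks while the "still reachable" total barely moved. This is a small sketch that parses only the summary lines as they appear in this post; the function name and the regular expression are my own and not part of valgrind.

```python
import re

# Matches summary lines such as:
# ==21056== definitely lost: 1,317 bytes in 2 blocks
SUMMARY_LINE = re.compile(
    r"==\d+==\s+(definitely lost|indirectly lost|possibly lost|"
    r"still reachable|suppressed):\s+([\d,]+) bytes in ([\d,]+) blocks"
)

def parse_leak_summary(log_text):
    """Return {category: (bytes, blocks)} for each LEAK SUMMARY line found."""
    result = {}
    for line in log_text.splitlines():
        m = SUMMARY_LINE.search(line)
        if m:
            nbytes = int(m.group(2).replace(",", ""))
            nblocks = int(m.group(3).replace(",", ""))
            result[m.group(1)] = (nbytes, nblocks)
    return result

first_run = """\
==21056== definitely lost: 1,317 bytes in 2 blocks
==21056== still reachable: 17,188,006,888 bytes in 65,564 blocks
"""
second_run = """\
==14069== definitely lost: 0 bytes in 0 blocks
==14069== possibly lost: 544 bytes in 2 blocks
==14069== still reachable: 17,189,181,517 bytes in 65,780 blocks
"""

before = parse_leak_summary(first_run)
after = parse_leak_summary(second_run)
for category in ("definitely lost", "still reachable"):
    print(category, before[category], "->", after[category])
```

In both runs the "still reachable" figure is about 17 GB, which is the hint that the real problem is a large, deliberate allocation rather than a leak.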
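1. Going further through the valgrind report, one allocation stands out as very large, about 17G, which is roughly the same as the virtual memory size reported by top.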
==14069== 17,179,607,040 bytes in 65,535 blocks are still reachable in loss record 51 of 51
==14069== at 0x4C27A2E: malloc (vg_replace_malloc.c:270)
==14069== by 0x4112E8: malloc_mpool (fff_task_queue.c:86)
==14069== by 0x4115BE: free_queue_init (ffft_task_queue.c:211)
==14069== by 0x41972B: work_thread_init (work_thread.c:95)
==14069== by 0x403273: main (hhtf.c:227)
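2. The corresponding code initializes the queue used to process received messages, and the amount of memory it allocates depends on the configured maximum number of concurrent connections.
3. The configuration file sets max_connections=65535, while netstat -an|grep 11411 shows that fewer than 300 connections are actually in use.
4. After changing the configuration to max_connections=512 and restarting the process, physical memory usage dropped after running for a while and stayed at a normal level.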
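The numbers in the valgrind record make the link to max_connections concrete: 65,535 blocks matches max_connections=65535, and dividing the total by the block count gives the size of each block. The sketch below assumes one fixed-size block per configured connection, which is inferred from that record rather than confirmed from the source code; pool_bytes is a hypothetical helper.

```python
# Figures taken from the "still reachable" record in the valgrind report.
total_bytes = 17_179_607_040
blocks = 65_535

# The total divides evenly, so every block has the same size.
assert total_bytes % blocks == 0
per_block = total_bytes // blocks
print(per_block)  # 262144 bytes, i.e. 256 KiB per block

def pool_bytes(max_connections, block_size=per_block):
    """Estimated pool size, assuming one block per configured connection."""
    return max_connections * block_size

print(pool_bytes(65535) / 2**30)  # just under 16 GiB
print(pool_bytes(512) / 2**20)    # 128.0 MiB
```

With fewer than 300 connections in use, 512 still leaves headroom while shrinking the pool from roughly 16 GiB to 128 MiB.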
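1. Make good use of tools to analyze and solve problems.
2. When using open-source software, understand what every configuration option means and set each one to a sensible value. Ideally, read the source code as well, since that makes problems easier to analyze and fix.
3. When a service is migrated to a different host, it needs tuning again: adjust the configuration values to suit the new machine.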