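One of our servers crashed and rebooted last night.
I need to use the crash tool to analyze the state the server was in when it went down.
My OS environment: CentOS release 6.3 (Final)
Install crash: yum -y install crash
Install kernel-debuginfo: yum -y install kernel-debuginfo-2.6.32-902.279.9.1* ## the version depends on your OS kernel version; it must match the kernel that wrote the vmcore.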
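The version suffix can be derived rather than typed. A minimal sketch, assuming the kernel that crashed is the same one that is running now (not true if the box rebooted into a different kernel); debuginfo_pkg is a hypothetical helper name, and the install line is left commented out:

```shell
# Hypothetical helper: build the kernel-debuginfo package name for a release
# string, defaulting to the running kernel reported by `uname -r`.
debuginfo_pkg() {
    release="${1:-$(uname -r)}"
    printf 'kernel-debuginfo-%s\n' "$release"
}

# Show what would be installed; uncomment the last line to install it.
echo "would install: $(debuginfo_pkg)"
# yum -y install "$(debuginfo_pkg)"
```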
Then go into /var/crash/, find the matching vmcore, and run:
crash 127.0.0.1-2014-01-21-23\:36\:14/vmcore /usr/lib/debug/lib/modules/2.6.32-902.279.9.1.***.el6.x86_64/vmlinux
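Typing the escaped timestamp directory by hand is error-prone. As a rough sketch, both paths can be looked up instead; latest_vmcore and vmlinux_for are hypothetical helper names, and the directory layout assumed is the one used in the command above:

```shell
# Hypothetical helper: newest vmcore (by modification time) under a crash directory.
latest_vmcore() {
    ls -t "${1:-/var/crash}"/*/vmcore 2>/dev/null | head -n 1
}

# Hypothetical helper: debuginfo vmlinux path for a kernel release string.
vmlinux_for() {
    printf '/usr/lib/debug/lib/modules/%s/vmlinux\n' "${1:-$(uname -r)}"
}

core="$(latest_vmcore)"
if [ -n "$core" ]; then
    # Print the command rather than running it, so it can be reviewed first.
    echo "crash $core $(vmlinux_for)"
else
    echo "no vmcore found under /var/crash"
fi
```

Note that vmlinux_for defaults to the running kernel; pass the release string explicitly if the dump came from a different one.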
The crash command above prints the following summary on startup:
KERNEL: /usr/lib/debug/lib/modules/2.6.32-902.279.9.1.***.el6.x86_64/vmlinux
DUMPFILE: 127.0.0.1-2014-01-21-23:36:14/vmcore [PARTIAL DUMP]
CPUS: 4
DATE: Tue Jan 21 23:34:53 2014
UPTIME: 380 days, 04:43:06
LOAD AVERAGE: 427.67, 324.75, 163.66 ## the load is extremely high: 427 on a 4-CPU machine
TASKS: 1132
NODENAME: cdn.oss.***.com
RELEASE: 2.6.32-902.279.9.1.***.el6.x86_64
VERSION: #1 SMP Thu Sep 27 15:00:13 CST 2012
MACHINE: x86_64 (2266 Mhz)
MEMORY: 6 GB
PANIC: "[32794617.007664] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 1"
PID: 18431
COMMAND: "java"
TASK: ffff8801f98a2080 [THREAD_INFO: ffff8801f9b06000]
CPU: 1
STATE: TASK_RUNNING (PANIC)
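With a load average above 400 on 4 CPUs, the kernel log, the task list and memory usage are natural things to look at next. A sketch of collecting them in one pass, assuming the crash options -s (silent startup) and -i (read commands from a file) behave as described in the crash man page; write_crash_cmds is a hypothetical helper and the file names are arbitrary:

```shell
# Hypothetical helper: write a list of crash commands to a file.
#   sys     - system summary      log     - kernel message buffer
#   bt -a   - backtrace of the active task on every CPU
#   ps      - process list        kmem -i - memory usage summary
write_crash_cmds() {
    cat > "$1" <<'EOF'
sys
log
bt -a
ps
kmem -i
exit
EOF
}

cmds="${TMPDIR:-/tmp}/crash-cmds.txt"
write_crash_cmds "$cmds"
# Printed, not executed: substitute the real vmcore and vmlinux paths.
echo "crash -s -i $cmds VMCORE VMLINUX > crash-report.txt"
```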
Then run bt at the crash> prompt:
PID: 18431 TASK: ffff8801f98a2080 CPU: 1 COMMAND: "java"
#0 [ffff880028227b00] machine_kexec at ffffffff8103284b
#1 [ffff880028227b60] crash_kexec at ffffffff810ba962
#2 [ffff880028227c30] panic at ffffffff814fdb01
#3 [ffff880028227cb0] watchdog_overflow_callback at ffffffff810db5bd
#4 [ffff880028227cd0] __perf_event_overflow at ffffffff8110e1fd
#5 [ffff880028227d70] perf_event_overflow at ffffffff8110e7b4
#6 [ffff880028227d80] intel_pmu_handle_irq at ffffffff8101e976
#7 [ffff880028227e90] perf_event_nmi_handler at ffffffff815021b9
#8 [ffff880028227ea0] notifier_call_chain at ffffffff81503d05
#9 [ffff880028227ee0] atomic_notifier_call_chain at ffffffff81503d6a
#10 [ffff880028227ef0] notify_die at ffffffff810981de
#11 [ffff880028227f20] do_nmi at ffffffff81501983
#12 [ffff880028227f50] nmi at ffffffff81501290
[exception RIP: _spin_lock_irqsave+41]
RIP: ffffffff815009e9 RSP: ffff8800282233d0 RFLAGS: 00000097
RAX: 000000000000061b RBX: ffffffff81a974d0 RCX: 000000000000061a
RDX: 0000000000000006 RSI: 0000000000000002 RDI: ffffffff81fce528
RBP: ffff8800282233d0 R8: 0000000000000002 R9: 0000000000000f5a
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800282234d8
R13: ffff88000001dd80 R14: ffff8801f98a2080 R15: 0000000000000020
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#13 [ffff8800282233d0] _spin_lock_irqsave at ffffffff815009e9
#14 [ffff8800282233d8] __ratelimit at ffffffff8127870c
#15 [ffff8800282233f8] printk_ratelimit at ffffffff8106be55
#16 [ffff880028223408] __alloc_pages_nodemask at ffffffff81127816
#17 [ffff880028223528] kmem_getpages at ffffffff811620a2
#18 [ffff880028223558] fallback_alloc at ffffffff81162cba
#19 [ffff8800282235d8] ____cache_alloc_node at ffffffff81162a39
#20 [ffff880028223638] kmem_cache_alloc_node_notrace at ffffffff811638ff
#21 [ffff880028223678] __kmalloc_node at ffffffff81163b3b
#22 [ffff8800282236c8] __alloc_skb at ffffffff814304bd
#23 [ffff880028223718] refill_skbs at ffffffff8144f693
#24 [ffff880028223738] find_skb at ffffffff814504c5
#25 [ffff880028223768] netpoll_send_udp at ffffffff814507a6
#26 [ffff8800282237b8] write_msg at ffffffffa00d42eb [netconsole]
#27 [ffff880028223818] __call_console_drivers at ffffffff8106b935
#28 [ffff880028223848] _call_console_drivers at ffffffff8106b99a
#29 [ffff880028223868] release_console_sem at ffffffff8106bf68
#30 [ffff8800282238a8] vprintk at ffffffff8106c668
#31 [ffff880028223948] printk at ffffffff814fdc03
#32 [ffff8800282239a8] __ratelimit at ffffffff812787cf
#33 [ffff8800282239c8] printk_ratelimit at ffffffff8106be55
#34 [ffff8800282239d8] __alloc_pages_nodemask at ffffffff81127816
#35 [ffff880028223af8] kmem_getpages at ffffffff811620a2
#36 [ffff880028223b28] fallback_alloc at ffffffff81162cba
#37 [ffff880028223ba8] ____cache_alloc_node at ffffffff81162a39
#38 [ffff880028223c08] kmem_cache_alloc_node_notrace at ffffffff811638ff
#39 [ffff880028223c48] __kmalloc_node at ffffffff81163b3b
#40 [ffff880028223c98] __alloc_skb at ffffffff814304bd
#41 [ffff880028223ce8] __netdev_alloc_skb at ffffffff81430656
#42 [ffff880028223d08] bnx2_poll_work at ffffffffa01e4a3c [bnx2]
#43 [ffff880028223e18] bnx2_poll at ffffffffa01e5579 [bnx2]
#44 [ffff880028223e68] net_rx_action at ffffffff8143f503
#45 [ffff880028223ec8] __do_softirq at ffffffff81073f41
#46 [ffff880028223f38] call_softirq at ffffffff8100c24c
#47 [ffff880028223f50] do_softirq at ffffffff8100de85
#48 [ffff880028223f70] irq_exit at ffffffff81073d25
#49 [ffff880028223f80] do_IRQ at ffffffff815064d5
--- <IRQ stack> ---
#50 [ffff8801f9b07598] ret_from_intr at ffffffff8100ba53
[exception RIP: shrink_inactive_list+1636]
RIP: ffffffff8112e454 RSP: ffff8801f9b07648 RFLAGS: 00000282
RAX: 0000000000000003 RBX: ffff8801f9b077f8 RCX: ffff8800000261c0
RDX: 0000000000000001 RSI: 0000000000000016 RDI: ffff88000001dd80
RBP: ffffffff8100ba4e R8: 0000000000000024 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8800000261c0 R14: 0000000000000024 R15: 0000000000000000
ORIG_RAX: ffffffffffffff46 CS: 0010 SS: 0018
#51 [ffff8801f9b07640] shrink_inactive_list at ffffffff8112e302
#52 [ffff8801f9b07800] shrink_zone at ffffffff8112ee8f
#53 [ffff8801f9b078b0] do_try_to_free_pages at ffffffff8112f11e
#54 [ffff8801f9b07940] try_to_free_pages at ffffffff8112f72d
#55 [ffff8801f9b079f0] __alloc_pages_nodemask at ffffffff8112752d
#56 [ffff8801f9b07b10] alloc_pages_current at ffffffff8115c51a
#57 [ffff8801f9b07b40] __page_cache_alloc at ffffffff811147e7
#58 [ffff8801f9b07b70] __do_page_cache_readahead at ffffffff8112a40b
#59 [ffff8801f9b07c00] ra_submit at ffffffff8112a561
#60 [ffff8801f9b07c10] ondemand_readahead at ffffffff8112a8d5
#61 [ffff8801f9b07c70] page_cache_sync_readahead at ffffffff8112aaf3
#62 [ffff8801f9b07c80] generic_file_aio_read at ffffffff81116168
#63 [ffff8801f9b07d60] xfs_file_aio_read at ffffffffa02776df [xfs]
#64 [ffff8801f9b07dc0] do_sync_read at ffffffff8117b1ea
#65 [ffff8801f9b07ef0] vfs_read at ffffffff8117bbf5
#66 [ffff8801f9b07f30] sys_read at ffffffff8117bd31
#67 [ffff8801f9b07f80] system_call_fastpath at ffffffff8100b0f2
RIP: 0000003fa340e54d RSP: 00007f788db54250 RFLAGS: 00000202
RAX: 0000000000000000 RBX: ffffffff8100b0f2 RCX: 0000000782c323e0
RDX: 0000000000001000 RSI: 00007f788db52310 RDI: 00000000000001dc
RBP: 00007f788db522f0 R8: 00007f78bcb0b500 R9: 00000007a21b13a8
R10: 0000000000019cfc R11: 0000000000000293 R12: 00007f788db52310
R13: 0000000000001000 R14: 0000000000001000 R15: 0000000000001000
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
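The trace is long, and part of it repeats: the page allocation path shows up more than once. One way to make that kind of re-entry visible is to count how often each function name occurs. A small sketch, assuming the bt output has been saved to a text file (bt.txt is an arbitrary name); repeated_frames is a hypothetical helper:

```shell
# Hypothetical helper: list functions that occur more than once in saved
# `bt` output. Frame lines look like: "#14 [ffff...] __ratelimit at ffffffff..."
repeated_frames() {
    awk '$1 ~ /^#[0-9]+$/ && $4 == "at" { count[$3]++ }
         END { for (f in count) if (count[f] > 1) print count[f], f }' "$@" |
        sort -k1,1nr -k2,2
}

# Usage (bt.txt is assumed to hold the saved backtrace):
if [ -f bt.txt ]; then
    repeated_frames bt.txt
fi
```

In the trace above, __ratelimit appears at #14 and at #32. Frame #32 called printk, which went out through netconsole, failed another allocation, and came back into __ratelimit at #14, where the CPU is spinning in _spin_lock_irqsave. One plausible reading, not confirmed here, is that #14 is waiting for a lock that #32 already holds on the same CPU, which would fit the "hard LOCKUP on cpu 1" panic message.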
Then run:
crash> dis -l ffff8801f98a2080
dis: WARNING: ffff8801f98a2080: no associated kernel symbol found
0xffff8801f98a2080: add %al,(%rax)
######
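At this point it is down to assembly, and I am out of my depth. (The warning is expected, though: ffff8801f98a2080 is the TASK pointer from the summary above, which is data, not code.)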
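What dis expects is a code address, such as the RIP value printed under each "[exception RIP: ...]" line of the bt output. A minimal sketch that turns saved bt output into dis commands to paste at the crash> prompt; exception_dis_cmds is a hypothetical helper, bt.txt is an arbitrary file name, and the -r (disassemble from the start of the routine up to the address) and -l (show source line numbers) flags are used as I understand them from the crash help for dis:

```shell
# Hypothetical helper: for each "[exception RIP: ...]" block in saved `bt`
# output, emit a `dis -rl <address>` command using the RIP on the next line.
exception_dis_cmds() {
    awk '/exception RIP:/ { want = 1; next }
         want {
             for (i = 1; i < NF; i++)
                 if ($i == "RIP:") print "dis -rl " $(i + 1)
             want = 0
         }' "$@"
}

# Usage (bt.txt is assumed to hold the saved backtrace):
if [ -f bt.txt ]; then
    exception_dis_cmds bt.txt
fi
```

The user-space RIP at the bottom of the trace is skipped on purpose, since it is not preceded by an exception line and is not a kernel text address.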