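A customer's HP RX8640 developed a fault. It runs in a two-node cluster, and the machine with the problem is the standby node. The fault was traced to the processor on cell 0. We first shipped the customer a tested processor, but after the customer replaced it themselves, the system still crashed. The customer suspected our processor was faulty and sent it back. I ran it in an RX7640 for a long time and saw no crashes or related alerts. The customer then suggested we send someone on site to troubleshoot. To be safe, I brought two CPUs and a cell board: if it really was the CPU, I would swap the CPU, and if it was the cell board, I would swap the board. Before going, my hunch was that the cell board was at fault.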
The crash analysis shows a panic caused by a spinlock deadlock.
There was no crash dump for CPUs 0 and 1, which belong to the same processor socket. They hit a fault and did not release the lock as expected, so a timeout occurred and the system panicked.
The solution is to replace the processor in cell 0, socket 0.
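To make the failure mode in the analysis concrete, here is a toy user-space model of it in Python. It is only a sketch under stated assumptions, not HP-UX kernel code: the names `ToySpinlock`, `SpinlockTimeout` and `simulate`, the CPU numbers used for the waiter, and the timeout value are all hypothetical. One "CPU" takes the lock and never releases it (standing in for the faulted processor), and a second "CPU" spins until its timeout expires, which stands in for the panic.

```python
import threading
import time


class SpinlockTimeout(Exception):
    """Raised when a waiter gives up; stands in for the kernel panic."""


class ToySpinlock:
    """Hypothetical spinlock model; not the HP-UX implementation."""

    def __init__(self):
        # threading.Lock is used only as an atomic test-and-set flag.
        self._flag = threading.Lock()
        self.owner = None

    def acquire(self, cpu, timeout_s):
        deadline = time.monotonic() + timeout_s
        # Busy-wait (spin) until the flag is free or the deadline passes.
        while not self._flag.acquire(blocking=False):
            if time.monotonic() >= deadline:
                raise SpinlockTimeout(
                    f"cpu {cpu} timed out waiting for lock held by cpu {self.owner}"
                )
        self.owner = cpu

    def release(self):
        self.owner = None
        self._flag.release()


def simulate(timeout_s=0.05):
    """CPU 0 takes the lock and 'faults'; CPU 2 then spins until timeout."""
    lock = ToySpinlock()
    lock.acquire(cpu=0, timeout_s=timeout_s)  # never released: models the fault
    try:
        lock.acquire(cpu=2, timeout_s=timeout_s)
    except SpinlockTimeout as exc:
        return str(exc)
    return "acquired"


if __name__ == "__main__":
    print(simulate())
```

The point of the model is that the waiter is healthy; it only reports the problem. The processor that matters is the owner that stopped responding, which fits the analysis pointing at the socket holding CPUs 0 and 1 rather than at the CPU that raised the panic.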
*** A system crash has occurred. (See the above messages for details.)
*** The system is now preparing to dump physical memory to disk, for use
*** in debugging the crash.
*** The dump will be compressed.
*** To change this dump type, press any key within 10 seconds.
*** Select one of the following dump types, by pressing the corresponding key:
C) The dump will be compressed.
S) The dump will be without compression.
N) There will be NO DUMP performed
*** Enter your selection now.
*** Unrecognized response. Please try again.
HP-UX Start-up in progress
Configure system crash dumps ........................................ OK
Removing old vxvm files ............................................. OK
VxVM INFO V-5-2-3360 VxVM device node check ......................... OK
VxVM INFO V-5-2-3362 VxVM general startup ........................... OK
VxVM INFO V-5-2-3366 VxVM reconfiguration recovery .................. OK
Mount file systems .................................................. OK
Setting hostname .................................................... OK
Start Kernel Logging facility ....................................... OK
Set privilege group ................................................. OK
Display date ........................................................ N/A
Save system crash dump if needed ....................................