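Recently the Oracle server of a production system saw abnormally high load one night, for reasons still unknown. The last top record showed a load average above 500, and that same night we found Oracle was down. No planning had been done at deployment time, and the Oracle data had simply been dumped into the / root file system. As luck would have it, the / file system became corrupted that night and dropped into the "file system read-only" state, so nothing could be done on the machine.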
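As a small aside, the read-only state described above can be confirmed from the mount table. The following is only a sketch: the function name is_readonly is made up for illustration, and it assumes a Linux-style /proc/mounts whose fourth field holds the mount options.

```shell
#!/bin/sh
# Report whether a mount point is currently mounted read-only.
# $1 = mount point (e.g. /), $2 = mounts table (defaults to /proc/mounts).
# Exit status: 0 = read-only, 1 = not read-only, 2 = mount point not found.
is_readonly() {
    mnt=$1
    table=${2:-/proc/mounts}
    # Fields: device mountpoint fstype options dump pass.
    # The last matching entry wins, since later mounts shadow earlier ones.
    awk -v m="$mnt" '
        $2 == m { opts = $4; found = 1 }
        END {
            if (!found) exit 2
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++) if (o[i] == "ro") exit 0
            exit 1
        }' "$table"
}
```

Usage would be something like: is_readonly / && echo "root is read-only".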
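Because it was the root file system, the only option was to reboot and see whether the damage was merely logical. The boot failed as soon as it reached the / file system check, reporting a problem similar to the following.
“Your system appears to have shut down uncleanly”, “Press 'Y' in 1 seconds to force file system integrity check”
After that, the only option was to boot into linux rescue mode (set the machine to boot from CD, insert the installation disc, type linux rescue at the boot prompt of the installer screen, press Enter, and follow the wizard).
Once in the shell, run the following to repair the / file system.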
# umount /
# fsck -fn /
This may produce a pile of errors similar to the following:
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 16973826 has zero dtime. Fix? no
Inodes that were part of a corrupted orphan linked list found. Fix? no
Inode 16973829 was part of the orphaned inode list. IGNORED.
Inode 16973830 was part of the orphaned inode list. IGNORED.
Inode 16973831 was part of the orphaned inode list. IGNORED.
Inode 16973832 was part of the orphaned inode list. IGNORED.
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -10865691 -23558155
Fix? no
Free blocks count wrong for group #331 (10212, counted=10211).
Fix? no
Free blocks count wrong for group #689 (10, counted=2).
Fix? no
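Besides the messages above, fsck also reports its result through its exit status, which fsck(8) documents as a bit mask. The helper below is a sketch for decoding it. The function name decode_fsck_status is made up for illustration, and the wording of each message is mine rather than fsck's.

```shell
#!/bin/sh
# Decode an fsck exit status (a bit mask, per fsck(8)) into readable text.
# $1 = the numeric exit status, e.g. the value of $? right after fsck.
decode_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    for pair in "1:errors corrected" "2:reboot required" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage or syntax error" "32:canceled by user" \
                "128:shared-library error"; do
        bit=${pair%%:*}     # number before the first colon
        text=${pair#*:}     # description after the first colon
        if [ $(( status & bit )) -ne 0 ]; then
            out="${out:+$out; }$text"
        fi
    done
    echo "$out"
}
```

A read-only run with -n on a damaged file system, like the one shown above, would normally end with status 4, which this helper prints as "errors left uncorrected".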
In this incident, however, what we actually ran into was two files with inconsistent sizes. In the end we ran fsck -fy /
Once the repair finished:
mount /
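The "look first with -n, then repair with -y" sequence used here can be wrapped in a small function. This is a sketch under assumptions, not what we ran that night: check_then_repair and the FSCK override variable are names invented for illustration, /dev/sda2 is a hypothetical device, and the file system is assumed to be unmounted when the function is called.

```shell
#!/bin/sh
# FSCK may be overridden (e.g. with a stub) for testing; it defaults to fsck.
FSCK=${FSCK:-fsck}

# Check a device read-only first, and only repair if errors were reported.
# $1 = block device of the UNMOUNTED file system (hypothetical: /dev/sda2).
check_then_repair() {
    dev=$1
    rc=0
    "$FSCK" -fn "$dev" || rc=$?      # -f force the check, -n answer "no" to all
    if [ "$rc" -eq 0 ]; then
        echo "clean: $dev"
        return 0
    fi
    if [ $(( rc & 4 )) -eq 0 ]; then
        # No "errors left uncorrected" bit: fsck itself failed to run properly.
        echo "fsck could not check $dev (status $rc)" >&2
        return "$rc"
    fi
    echo "errors found on $dev, repairing"
    rc=0
    "$FSCK" -fy "$dev" || rc=$?      # -y answer "yes" to all
    # Status below 4 means any errors found were corrected.
    [ "$rc" -lt 4 ]
}
```

Running the read-only pass first costs some time, but it shows what fsck intends to change before anything on disk is touched.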
Finally, on reboot, the following error appeared:
init: /dev/initctl: no such file or directory
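Damn, that gave me a scare. A Google search eventually turned up a fix: run the following commands.
# mkfifo /dev/initctl
# reboot -f
The server came up cleanly, the database was fine, and no files were damaged.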
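For anyone who wants to script the /dev/initctl fix, here is a sketch that only creates the FIFO when it is missing. The function name ensure_fifo is made up for illustration, and the 600 mode is my assumption about sensible permissions, not something taken from the fix I found.

```shell
#!/bin/sh
# Create a FIFO at the given path unless one is already there.
# $1 = path (this post uses /dev/initctl).
ensure_fifo() {
    path=$1
    if [ -p "$path" ]; then
        echo "already a FIFO: $path"
        return 0
    fi
    if [ -e "$path" ]; then
        # Something else occupies the path; refuse to touch it.
        echo "exists but is not a FIFO: $path" >&2
        return 1
    fi
    mkfifo -m 600 "$path" && echo "created FIFO: $path"
}
```

Calling ensure_fifo /dev/initctl as root would then stand in for the bare mkfifo command above.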