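Recovering a RAID 5 array after two disks show amber lights
Environment: IBM P610 host with a 4-channel SCSI RAID adapter; four 18 GB disks in RAID 5, no hot spare.
Day 1: the customer reported that the amber light on one disk was on.
Day 3: the amber light on a second disk came on.
Running #lspv showed only: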
hdisk0 000b85cdf79f0ec5 rootvg
The disk built on the RAID array was no longer listed.
#lsdev -Cc disk
hdisk0 Available 10-60-00-0,0 16 Bit LVD SCSI Disk Drive
hdisk1 Defined 20-60-00-0,0 SCSI Disk Array RAID 5
The VG could not be varied on.
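The check above can be sketched as a small filter that flags any disk whose state is not Available. This is only an illustration: the function name `unavailable_disks` is hypothetical, and the sample text is copied from the `lsdev -Cc disk` output shown above.

```shell
# Sample text copied from the lsdev -Cc disk output above.
lsdev_output='hdisk0 Available 10-60-00-0,0 16 Bit LVD SCSI Disk Drive
hdisk1 Defined 20-60-00-0,0 SCSI Disk Array RAID 5'

# Hypothetical helper: print the name of every disk whose state
# (second field) is anything other than "Available".
unavailable_disks() {
    awk 'NF >= 2 && $2 != "Available" { print $1 }'
}

printf '%s\n' "$lsdev_output" | unavailable_disks
```

On a live system the same filter could presumably be fed directly, as in `lsdev -Cc disk | unavailable_disks`.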
Running diag reported the following errors:
The Service Request Number(s)/Probable Cause(s)
(causes are listed in descending order of probability):
66D-111: The disk has been failed by the adapter.
FRU: n/a CH/ID 2B
Physical Disk
66D-111: The disk has been failed by the adapter.
FRU: n/a CH/ID 2C
Physical Disk
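To pull the failed drive positions out of a saved diag report, something like the following could be used. It is a sketch under assumptions: the helper name `failed_positions` is hypothetical, and the sample text is modeled on the report above with terminal escape residue removed.

```shell
# Sample text modeled on the diag report above.
diag_output='66D-111: The disk has been failed by the adapter.
FRU: n/a CH/ID 2B
Physical Disk
66D-111: The disk has been failed by the adapter.
FRU: n/a CH/ID 2C
Physical Disk'

# Hypothetical helper: print each distinct channel/ID named on a
# "CH/ID" line (the last field of that line).
failed_positions() {
    awk '/CH\/ID/ { print $NF }' | sort -u
}

printf '%s\n' "$diag_output" | failed_positions
```

Two positions on the same four-disk RAID 5 array means the array has lost more members than it can tolerate, which matches the state seen in the next step.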
Procedure:
1. #smitty pdam
List PCI SCSI Disk Arrays
Create a PCI SCSI Disk Array
Delete a PCI SCSI Disk Array
Configure a Defined PCI SCSI Disk Array
Change/Show a PCI SCSI Disk Array
Reconstruct a PCI SCSI Disk Array
Revive a FAILED Drive in a PCI SCSI Disk Array
Fail a Drive in a PCI SCSI Disk Array
Change/Show PCI SCSI RAID Drive Status
Perform Consistency Check
Display Status of Adapter Write Cache
Recovery Options
Select List PCI SCSI Disk Arrays.
scraid0 Available 20-60 PCI 4-Channel Ultra3 SCSI RAID Adapter [a selection prompt appears]
hdisk1 Defined Raid 5 20-60-00-0,0 52072 MB Status DEAD
hdisk1 2A Channel 2 ID A ONLINE
hdisk1 2B Channel 2 ID B FAILED DRIVE
hdisk1 2C Channel 2 ID C FAILED DRIVE
hdisk1 2D Channel 2 ID D ONLINE
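The relationship between failed members and array status can be sketched as below. The status names DEAD, DEGRADED and OPTIMAL are the ones that appear in the listings in this post; the mapping from failed-drive count to status is an assumption based on RAID 5 tolerating a single failed member, and a real adapter may report other states (for example during a rebuild). The helper names `count_failed` and `raid5_state` are hypothetical.

```shell
# Sample text copied from the drive listing above.
pdam_listing='hdisk1 2A Channel 2 ID A ONLINE
hdisk1 2B Channel 2 ID B FAILED DRIVE
hdisk1 2C Channel 2 ID C FAILED DRIVE
hdisk1 2D Channel 2 ID D ONLINE'

# Hypothetical helper: count member drives reported as FAILED DRIVE.
count_failed() {
    awk '/FAILED DRIVE/ { n++ } END { print n + 0 }'
}

# Hypothetical helper: RAID 5 survives exactly one failed member.
raid5_state() {
    case "$1" in
        0) echo OPTIMAL ;;
        1) echo DEGRADED ;;
        *) echo DEAD ;;
    esac
}

failed=$(printf '%s\n' "$pdam_listing" | count_failed)
raid5_state "$failed"
```

This is why reviving one of the two failed drives is enough to make the array usable again: it moves the count from two back to one.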
2. #smitty pdam
Revive a FAILED Drive in a PCI SCSI Disk Array [purpose: force the drive back online]
2B Channel 2 ID B FAILED DRIVE
2C Channel 2 ID C FAILED DRIVE
Select 2B Channel 2 ID B FAILED DRIVE
PCI SCSI Disk Array hdisk1
Channel ID C2B
After pressing Enter, the following prompt appears:
Continuing may delete information you may want
to keep. This is your last chance to stop
before continuing.
Press Enter to continue.
Press Cancel to return to the application
Press Enter to continue [confirm that the command completes OK].
3. Check the RAID status with List PCI SCSI Disk Arrays.
scraid0 Available 20-60 PCI 4-Channel Ultra3 SCSI RAID Adapter appears [select it]
hdisk1 Defined Raid 5 20-60-00-0,0 52072 MB Status DEGRADED
hdisk1 2A Channel 2 ID A ONLINE
hdisk1 2B Channel 2 ID B ONLINE
hdisk1 2C Channel 2 ID C FAILED DRIVE
hdisk1 2D Channel 2 ID D ONLINE
Note that 2B is now ONLINE.
4. Run diag; the result is as follows:
The Service Request Number(s)/Probable Cause(s)
(causes are listed in descending order of probability):
66D-111: The disk has been failed by the adapter.
FRU: n/a CH/ID 2B
Physical Disk
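5. #varyonvg datavg
datavg can now be varied on, and the file system can be mounted.
6. Replace the disk at 2C (Channel 2, ID C); the RAID adapter rebuilds the data. The array is OK once the rebuild completes.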
#smitty pdam
List PCI SCSI Disk Arrays
The result:
hdisk1 Available Raid 5 20-60-00-0,0 52072 MB Status OPTIMAL
hdisk1 2A Channel 2 ID A ONLINE - 17357Meg
hdisk1 2B Channel 2 ID B ONLINE - 17357Meg
hdisk1 2C Channel 2 ID C ONLINE - 17357Meg
hdisk1 2D Channel 2 ID D ONLINE - 17357Meg
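As a sanity check on the listing above, the usable capacity of a RAID 5 array is one member less than the total, since one member's worth of space holds parity. The function name `raid5_usable_mb` is hypothetical; the member count and per-drive size are taken from the listing.

```shell
# Hypothetical helper: usable RAID 5 capacity in MB.
# $1 = number of member drives, $2 = size of each member in MB.
raid5_usable_mb() {
    members=$1
    member_mb=$2
    echo $(( (members - 1) * member_mb ))
}

raid5_usable_mb 4 17357   # prints 52071
```

The adapter reports 52072 MB, 1 MB more than this figure; the difference is presumably rounding in the per-drive size shown.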
7. #varyonvg datavg [result: OK]
8. #fsck -y /dev/datalv
** Checking /dev/rdatalv (/orada)
** Phase 0 - Check Log
log redo processing for /dev/rdatalv
** Phase 1 - Check Blocks and Sizes
Block count wrong, Inode=16388 (ADJUSTED)
Fragment allocated to file larger than 32k (Inode=16664)
Fragment allocated to file larger than 32k (Inode=16665)
Fragment allocated to file larger than 32k (Inode=16666)
Fragment allocated to file larger than 32k (Inode=16670)
Fragment allocated to file larger than 32k (Inode=16671)
Unknown file type I=16785 owner=root mode=0
size=0 mtime=Jan 18 21:05 1970 (CLEARED)
.......
.......
.......
size=0 mtime=Jan 01 08:00 1970 (CLEARED)
** Phase 5 - Check Inode Map
Bad Inode Map (SALVAGED)
** Phase 5b - Salvage Inode Map
** Phase 6 - Check Block Map
Bad Block Map (SALVAGED)
** Phase 6b - Salvage Block Map
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
map agsize bad, vm1->agsize = -16385 agrsize = 16384
-1 blocks missing
-1 blocks missing
Superblock is marked dirty (FIXED)
-430 files 70114432 blocks 53128488 free
***** Filesystem was modified *****
9. #mount /oradata
Read/write tests passed.
Everything is OK at the OS level; the file system can be read and written normally.
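The post does not say how the read/write test was done. One minimal way to do such a check, offered only as an assumption, is to write a file of known content, read it back, and compare checksums. The function name `rw_test` is hypothetical.

```shell
# Hypothetical helper: write 1 MiB of zeros into directory $1,
# read it back, and compare its checksum with that of the same
# data generated directly. Returns 0 when they match.
rw_test() {
    dir=$1
    file="$dir/rw_test.$$"
    dd if=/dev/zero of="$file" bs=1024 count=1024 2>/dev/null || return 1
    expected=$(dd if=/dev/zero bs=1024 count=1024 2>/dev/null | cksum)
    actual=$(cksum < "$file")
    rm -f "$file"
    [ "$expected" = "$actual" ]
}

# Demonstrated against a temporary directory; on the recovered
# system the argument would be the mount point, e.g. /oradata.
rw_test "${TMPDIR:-/tmp}" && echo "read/write OK"
```

A freshly written file may be served from cache on read-back, so this only shows that the file system accepts writes and returns consistent data at the OS level. It does not verify files that existed before the failure, which matters here because fsck cleared some inodes.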