Recently I built a test server with the operating system installed entirely on two disks in RAID1, and the remaining three disks configured as RAID5 for data. It turned out that the system would sometimes boot normally and sometimes fail to boot.
I then found the following article on the web.
Purpose of the test: verify failure recovery when the root partition of a 3.0.5 system is installed on a software RAID1 partition, as a way around the original server's unsupported HostRAID controller.
Test environment: a VMware virtual machine whose guest OS has two 2048 MB disks. The 3.0.5 system is split into three partitions (/boot, /, swap), all placed on software RAID1 devices. The partition layout is as follows:
- Disk /dev/sda: 2147 MB, 2147483648 bytes
- 255 heads, 63 sectors/track, 261 cylinders
- Units = cylinders of 16065 * 512 = 8225280 bytes
- Device Boot Start End Blocks Id System
- /dev/sda1 1 13 104391 fd Linux raid autodetect
- /dev/sda2 14 78 522112+ fd Linux raid autodetect
- /dev/sda3 79 261 1469947+ fd Linux raid autodetect
- Disk /dev/sdb: 2147 MB, 2147483648 bytes
- 255 heads, 63 sectors/track, 261 cylinders
- Units = cylinders of 16065 * 512 = 8225280 bytes
- Device Boot Start End Blocks Id System
- /dev/sdb1 * 1 13 104391 fd Linux raid autodetect
- /dev/sdb2 14 78 522112+ fd Linux raid autodetect
- /dev/sdb3 79 261 1469947+ fd Linux raid autodetect
- Personalities : [raid1]
- md1 : active raid1 sdb2[1] sda2[0]
- 522048 blocks [2/2] [UU]
- md2 : active raid1 sdb3[1] sda3[0]
- 1469824 blocks [2/2] [UU]
- md0 : active raid1 sdb1[1] sda1[0]
- 104320 blocks [2/2] [UU]
- Filesystem Size Used Avail Use% Mounted on
- /dev/md2 1.4G 775M 651M 55% /
- /dev/md0 99M 7.3M 87M 8% /boot
- /dev/md1 is used as the swap partition
Simulating the failure: shut the virtual machine down, then remove the original scsi0:0 disk directly in the VM's configuration.
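As an aside, if you only want to exercise the degrade-and-rebuild path without powering the VM off, mdadm can mark a member faulty and pull it from the array. This is only a sketch, using /dev/sda3 in /dev/md2 as an example, and it does not reproduce the boot-from-the-second-disk part of the test below:
- root@localhost ~# mdadm /dev/md2 -f /dev/sda3
- root@localhost ~# mdadm /dev/md2 -r /dev/sda3
The -f flag marks the member faulty and -r removes it, leaving md2 degraded much as a real disk failure would.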
Failure recovery:
Because the boot disk is now the original system's /dev/sdb, and a default 3.0.5 installation never writes GRUB to the MBR of any disk other than the first one, the system cannot boot at this point. Of course, on a production server you would not shut the machine down just because one disk had failed, so this particular problem would not come up.
Given that a shutdown may be needed, the way to avoid this problem is to remember to install GRUB onto the second disk as soon as the system installation is finished:
- root@localhost ~# grub
- grub> install (hd0,0)/grub/stage1 d (hd1) (hd0,0)/grub/stage2 p (hd0,0)/grub/grub.conf
With that in place, the system will boot normally after a restart.
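An alternative that some guides prefer for the same purpose is to temporarily map the second disk to (hd0) inside the grub shell, so that the stage1 written to its MBR points at its own copy of /boot once it boots as the first disk. This is only a sketch, assuming the second disk is /dev/sdb and /boot is on its first partition:
- root@localhost ~# grub
- grub> device (hd0) /dev/sdb
- grub> root (hd0,0)
- grub> setup (hd0)
- grub> quit
Here device remaps (hd0) to /dev/sdb for this grub session only, root selects the /boot partition on that disk, and setup writes stage1 into its MBR.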
Partition the newly added disk according to the layout of /dev/sda (a one-step shortcut using sfdisk is sketched after the fdisk session below).
- Check the existing partition layout:
- root@localhost ~# fdisk -l /dev/sda
- Disk /dev/sda: 2147 MB, 2147483648 bytes
- 255 heads, 63 sectors/track, 261 cylinders
- Units = cylinders of 16065 * 512 = 8225280 bytes
- Device Boot Start End Blocks Id System
- /dev/sda1 * 1 13 104391 fd Linux raid autodetect
- /dev/sda2 14 78 522112+ fd Linux raid autodetect
- /dev/sda3 79 261 1469947+ fd Linux raid autodetect
- Partition the new disk and change the type of each new partition to Linux raid autodetect:
- root@localhost ~# fdisk /dev/sdb
- Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
- Building a new DOS disklabel. Changes will remain in memory only,
- until you decide to write them. After that, of course, the previous
- content won't be recoverable.
- Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
- Command (m for help): n
- Command action
- e extended
- p primary partition (1-4)
- p
- Partition number (1-4): 1
- First cylinder (1-261, default 1):
- Using default value 1
- Last cylinder or +size or +sizeM or +sizeK (1-261, default 261): 13
- Command (m for help): n
- Command action
- e extended
- p primary partition (1-4)
- p
- Partition number (1-4): 2
- First cylinder (14-261, default 14):
- Using default value 14
- Last cylinder or +size or +sizeM or +sizeK (14-261, default 261): 78
- Command (m for help): n
- Command action
- e extended
- p primary partition (1-4)
- p
- Partition number (1-4): 3
- First cylinder (79-261, default 79):
- Using default value 79
- Last cylinder or +size or +sizeM or +sizeK (79-261, default 261):
- Using default value 261
- Command (m for help): t
- Partition number (1-4): 1
- Hex code (type L to list codes): fd
- Changed system type of partition 1 to fd (Linux raid autodetect)
- Command (m for help): t
- Partition number (1-4): 2
- Hex code (type L to list codes): fd
- Changed system type of partition 2 to fd (Linux raid autodetect)
- Command (m for help): t
- Partition number (1-4): 3
- Hex code (type L to list codes): fd
- Changed system type of partition 3 to fd (Linux raid autodetect)
- Command (m for help): w
- The partition table has been altered!
- Calling ioctl() to re-read partition table.
- Syncing disks.
- Check the partition layout again with fdisk -l /dev/sdb; it should now match /dev/sda.
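If sfdisk is available, the whole partition table can instead be copied from the surviving disk in one step rather than re-created by hand. This is a sketch assuming /dev/sda is the good disk and /dev/sdb the blank replacement; double-check the device names before running it:
- root@localhost ~# sfdisk -d /dev/sda | sfdisk /dev/sdb
sfdisk -d dumps sda's partition table (including the fd type codes) and the second sfdisk replays it onto sdb.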
Add the newly created partitions into the RAID arrays:
- root@localhost ~# mdadm /dev/md0 -a /dev/sdb1
- mdadm: added /dev/sdb1
- root@localhost ~# mdadm /dev/md1 -a /dev/sdb2
- mdadm: added /dev/sdb2
- root@localhost ~# mdadm /dev/md2 -a /dev/sdb3
- mdadm: added /dev/sdb3
Check that the arrays that were just re-added are rebuilding:
- root@localhost ~# more /proc/mdstat
- Personalities : [raid1]
- md1 : active raid1 sdb2[1] sda2[0]
- 522048 blocks [2/2] [UU]
- md2 : active raid1 sdb3[2] sda3[0]
- 1469824 blocks [2/1] [U_]
- [===========>.........] recovery = 55.8% (821632/1469824) finish=0.1min speed=63202K/sec
- md0 : active raid1 sdb1[1] sda1[0]
- 104320 blocks [2/2] [UU]
- unused devices: <none>
Check that the RAID has returned to normal:
- root@localhost ~# more /proc/mdstat
- Personalities : [raid1]
- md1 : active raid1 sdb2[1] sda2[0]
- 522048 blocks [2/2] [UU]
- md2 : active raid1 sdb3[1] sda3[0]
- 1469824 blocks [2/2] [UU]
- md0 : active raid1 sdb1[1] sda1[0]
- 104320 blocks [2/2] [UU]
- unused devices: <none>
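Besides /proc/mdstat, each array's state can also be checked with mdadm itself; a minimal check, whose exact output wording varies with the mdadm version, is:
- root@localhost ~# mdadm --detail /dev/md2
A healthy mirror should list both members as active sync with no failed devices.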
Finally, remember to save the new RAID configuration to /etc/mdadm.conf; otherwise the RAID configuration cannot be restored after the system reboots:
- root@localhost ~# mdadm -Ds >/etc/mdadm.conf
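A slightly more explicit variant writes a DEVICE line first so mdadm knows which partitions to scan when assembling; this is a sketch assuming the member partitions used above:
- root@localhost ~# echo 'DEVICE /dev/sda[123] /dev/sdb[123]' > /etc/mdadm.conf
- root@localhost ~# mdadm --detail --scan >> /etc/mdadm.conf
mdadm --detail --scan is the long form of the mdadm -Ds used above.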
Conclusion: the performance impact of software RAID1 has not been tested yet, so all that can be said for now is that when performance demands are modest and data safety comes first, deploying the entire Linux system on software RAID1 is entirely feasible.