2006-03-18 21:16:31
If the system has to be booted from a cdrom or from the network ("boot cdrom" or "boot net") in order to perform maintenance, the operator needs to account for the mirrored operating system. Because these alternate boot devices do not include the drivers necessary for Veritas Volume Manager (VxVM), they cannot be used to operate on Veritas volumes. This raises subtle issues, addressed below.
Administrators are often under pressure while performing this type of maintenance. Because simple mistakes at this stage can render the system unusable, it is important that the process be well documented and tested before it is used in production.
In the example below, the server pegasus has two internal disks (c0t0d0 and c0t1d0) under VxVM control. The operating system is mirrored between the two devices. Assume that the administrator has forgotten the root password on this server, and needs to boot from cdrom in order to edit the shadow file.
Password forgotten! Insert the Solaris operating system CD into the cdrom drive and boot from it into single-user mode:
pegasus console login: root
Password:
Login incorrect
Oct 25 17:33:44 pegasus login: REPEATED LOGIN FAILURES ON /dev/console, root
pegasus console login: root
Password:
Login incorrect
pegasus console login:
Type 'go' to resume
ok boot cdrom -s
Resetting ...
screen not found.
Can't open input device.
Keyboard not present. Using ttya for input and output.
Sun Ultra 30 UPA/PCI (UltraSPARC-II 296MHz), No Keyboard
OpenBoot 3.27, 512 MB memory installed, Serial #9377973.
Ethernet address 8:0:20:8f:18:b5, Host ID: 808f18b5.
Initializing Memory
Rebooting with command: boot cdrom -s
Boot device: /pci@1f,4000/scsi@3/disk@6,0:f File and args: -s
SunOS Release 5.8 Version Generic_108528-07 64-bit
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
Configuring /dev and /devices
Using RPC Bootparams for network configuration information.
INIT: SINGLE USER MODE
#
Fsck and mount the root disk's "/" partition in order to edit the /etc/shadow file:
# fsck -y /dev/rdsk/c0t0d0s0
# mount /dev/dsk/c0t0d0s0 /a
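Note that fsck is run against the raw device (/dev/rdsk/...) while mount takes the block device (/dev/dsk/...). As a minimal sketch, assuming a POSIX shell and sed are available, the mapping between the two paths could be scripted as follows. The helper names raw_device and check_and_mount are hypothetical, not Solaris utilities, and check_and_mount is only defined here, never called:

```shell
#!/bin/sh
# raw_device: map a block device path to its raw (character) counterpart.
raw_device() {
    printf '%s\n' "$1" | sed 's|^/dev/dsk/|/dev/rdsk/|'
}

# check_and_mount (hypothetical helper): fsck the raw device, then mount the
# block device on the given mount point only if fsck exited successfully.
check_and_mount() {
    fsck -y "$(raw_device "$1")" && mount "$1" "$2"
}

# Demo: only prints the derived path; nothing is checked or mounted here.
raw_device /dev/dsk/c0t0d0s0    # prints /dev/rdsk/c0t0d0s0
```

With this sketch, the two commands above would correspond to something like: check_and_mount /dev/dsk/c0t0d0s0 /a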
Remove the encrypted password from the /a/etc/shadow file:
# TERM=vt100; export TERM
# vi /a/etc/shadow
For example, if the entry for the root user looks like the following:
root:NqfAn3tWOy2Ro:6445::::::
Change it so that it looks as follows:
root::6445::::::
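The same edit can be expressed non-interactively. The following is a minimal sketch, assuming a POSIX awk (nawk on older Solaris releases, since the original awk there may not accept -v). The function name clear_password_field is hypothetical. It reads shadow-format lines on standard input and writes the result to standard output, so the real file is never modified in place:

```shell
#!/bin/sh
# clear_password_field USER
# Empties field 2 (the encrypted password) of USER's entry and passes every
# other line through unchanged. Reads stdin, writes stdout.
clear_password_field() {
    awk -F: -v OFS=: -v user="$1" '
        $1 == user { $2 = "" }   # blank the password field for this user only
        { print }                # emit every line, modified or not
    '
}

# Demo on the sample entry from the text (not on a real shadow file).
printf '%s\n' 'root:NqfAn3tWOy2Ro:6445::::::' | clear_password_field root
# prints root::6445::::::
```

In practice one might run it as clear_password_field root < /a/etc/shadow > /a/etc/shadow.new, review the result, and only then move it over the original while preserving the file's ownership and permissions.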
Unmount the root filesystem:
# cd /; umount /a
Since we've updated just one half of the mirror, we need to convince Veritas Volume Manager not to use the other, now out-of-date half. To do this, we "zero out" the public and private regions of rootmirror (c0t1d0) by setting the size of their slices (slice 4 and slice 3 in this example) to zero in the partition table. When Veritas boots and cannot detect the private region on c0t1d0, it will assume that the disk has failed:
# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0t0d0
          /pci@1f,4000/scsi@3/sd@0,0
       1. c0t1d0
          /pci@1f,4000/scsi@3/sd@1,0
Specify disk (enter its number): 1
selecting c0t1d0
[disk formatted]
. . .
partition> p
Current partition table (original):
Total disk cylinders available: 5266 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       1 - 3995        6.40GB    (3995/0/0) 13423200
  1       swap    wu    3996 - 4620        1.00GB    (625/0/0)   2100000
  2     backup    wm       0 - 5265        8.44GB    (5266/0/0) 17693760
  3          -    wu       0 -    0        1.64MB    (1/0/0)        3360
  4          -    wu       1 - 5265        8.44GB    (5265/0/0) 17690400
  5 unassigned    wm       0               0         (0/0/0)           0
  6        var    wm    4621 - 5245        1.00GB    (625/0/0)   2100000
  7 unassigned    wm       0               0         (0/0/0)           0

partition> 3
Part      Tag    Flag     Cylinders        Size            Blocks
  3          -    wu       0 -    0        1.64MB    (1/0/0)        3360

Enter partition id tag[unassigned]:
Enter partition permission flags[wu]:
Enter new starting cyl[0]:
Enter partition size[3360b, 1c, 1.64mb, 0.00gb]: 0
partition> 4
Part      Tag    Flag     Cylinders        Size            Blocks
  4          -    wu       1 - 5265        8.44GB    (5265/0/0) 17690400

Enter partition id tag[unassigned]:
Enter partition permission flags[wu]:
Enter new starting cyl[1]: 0
Enter partition size[17690400b, 5265c, 8637.89mb, 8.44gb]: 0
partition> p
Current partition table (unnamed):
Total disk cylinders available: 5266 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       1 - 3995        6.40GB    (3995/0/0) 13423200
  1       swap    wu    3996 - 4620        1.00GB    (625/0/0)   2100000
  2     backup    wm       0 - 5265        8.44GB    (5266/0/0) 17693760
  3 unassigned    wu       0               0         (0/0/0)           0
  4 unassigned    wu       0               0         (0/0/0)           0
  5 unassigned    wm       0               0         (0/0/0)           0
  6        var    wm    4621 - 5245        1.00GB    (625/0/0)   2100000
  7 unassigned    wm       0               0         (0/0/0)           0

partition> label
Ready to label disk, continue? y
partition> quit
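The partition table in this session can be cross-checked with simple arithmetic, which helps confirm which slices hold the VxVM regions before they are zeroed. Slice 3 is one cylinder and 3360 blocks, so this disk has 3360 blocks per cylinder. Slice 4 then works out to 17690400 blocks, the same figure vxprint reports as PUBLEN for these disks later in this walkthrough. A minimal sketch using POSIX shell arithmetic (slice_blocks is a hypothetical helper name):

```shell
#!/bin/sh
# Blocks per cylinder, taken from slice 3 in the table: 1 cylinder = 3360 blocks.
BLOCKS_PER_CYL=3360

# slice_blocks FIRST_CYL LAST_CYL -> number of blocks in that cylinder range
slice_blocks() {
    echo $(( ($2 - $1 + 1) * $BLOCKS_PER_CYL ))
}

slice_blocks 1 3995      # slice 0 (root):  prints 13423200
slice_blocks 3996 4620   # slice 1 (swap):  prints 2100000
slice_blocks 1 5265      # slice 4:         prints 17690400
```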
Return to the ok prompt (Stop-A on a Sun keyboard, or a break on a serial console such as this one) and boot from the primary boot device:
# stop-a
ok boot disk
Resetting ...
screen not found.
Can't open input device.
Keyboard not present. Using ttya for input and output.
Sun Ultra 30 UPA/PCI (UltraSPARC-II 296MHz), No Keyboard
OpenBoot 3.27, 512 MB memory installed, Serial #9377973.
Ethernet address 8:0:20:8f:18:b5, Host ID: 808f18b5.
Initializing Memory
Rebooting with command: boot disk
Boot device: /pci@1f,4000/scsi@3/disk@0,0 File and args:
SunOS Release 5.8 Version Generic_108528-16 64-bit
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
Starting VxVM restore daemon...
VxVM starting in boot mode...
/usr/sbin/prtconf: getexecname() failed
vxvm:vxconfigd: WARNING: Detaching plex rootvol-02 from volume rootvol
vxvm:vxconfigd: WARNING: Disk rootmirror in group rootdg: Disk device not found
configuring IPv4 interfaces: hme0.
Hostname: pegasus
VxVM starting special volumes ( swapvol rootvol var )...
VxVM general startup...
dumpadm: no swap devices could be configured as the dump device
The system is coming up. Please wait.
starting rpc services: rpcbind done.
Setting netmask of hme0 to 255.255.255.0
Setting default IPv4 interface for multicast: add net 224.0/4: gateway pegasus
Starting sshd...
This platform does not support both privilege separation and compression
Compression disabled
syslog service starting.
savecore: no dump device configured
savecore: no dump device configured
dumpadm: no swap devices could be configured as the dump device
Oct 25 17:44:21 pegasus savecore: no dump device configured
Print services started.
/dev/bd.off: not a serial device.
volume management starting.
No VVR license installed on the system; vradmind not started.
No VVR license installed on the system; in.vxrsyncd not started.
The system is ready.
Log in to the system (without a password) and confirm that VxVM has marked the rootmirror disk as failed:
pegasus console login: root
Last login: Fri Oct 25 17:02:05 from rambler.wakefie
Oct 25 17:44:46 pegasus login: ROOT LOGIN /dev/console
Sun Microsystems Inc. SunOS 5.8 Generic February 2000
You have new mail.
# vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0s2     sliced    rootdisk     rootdg       online
c0t1d0s2     sliced    -            -            error
-            -         rootmirror   rootdg       failed was:c0t1d0s2
# vxprint -ht
Disk group: rootdg

DG NAME          NCONFIG     NLOG       MINORS   GROUP-ID
DM NAME          DEVICE      TYPE       PRIVLEN  PUBLEN   STATE
RV NAME          RLINK_CNT   KSTATE     STATE    PRIMARY  DATAVOLS  SRL
RL NAME          RVG         KSTATE     STATE    REM_HOST REM_DG    REM_RLNK
V  NAME          RVG         KSTATE     STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME          VOLUME      KSTATE     STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME          PLEX        DISK       DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME          PLEX        VOLNAME    NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
DC NAME          PARENTVOL   LOGVOL
SP NAME          SNAPVOL     DCO

dg rootdg        default     default    0        1035555399.1025.pegasus

dm rootdisk      c0t0d0s2    sliced     3359     17690400 -
dm rootmirror    -           -          -        -        NODEVICE

v  rootvol       -           ENABLED    ACTIVE   13423200 ROUND     -        root
pl rootvol-01    rootvol     ENABLED    ACTIVE   13423200 CONCAT    -        RW
sd rootdisk-B0   rootvol-01  rootdisk   17690399 1        0         c0t0d0   ENA
sd rootdisk-02   rootvol-01  rootdisk   0        13423199 1         c0t0d0   ENA
pl rootvol-02    rootvol     DISABLED   NODEVICE 13423200 CONCAT    -        RW
sd rootmirror-01 rootvol-02  rootmirror 0        13423200 0         -        NDEV

v  swapvol       -           ENABLED    ACTIVE   2100000  ROUND     -        swap
pl swapvol-01    swapvol     ENABLED    ACTIVE   2100000  CONCAT    -        RW
sd rootdisk-01   swapvol-01  rootdisk   13423199 2100000  0         c0t0d0   ENA
pl swapvol-02    swapvol     DISABLED   NODEVICE 2100000  CONCAT    -        WO
sd rootmirror-02 swapvol-02  rootmirror 13423200 2100000  0         -        NDEV

v  var           -           ENABLED    ACTIVE   2100000  ROUND     -        fsgen
pl var-01        var         ENABLED    ACTIVE   2100000  CONCAT    -        RW
sd rootdisk-03   var-01      rootdisk   15523199 2100000  0         c0t0d0   ENA
pl var-02        var         DISABLED   NODEVICE 2100000  CONCAT    -        WO
sd rootmirror-03 var-02      rootmirror 15523200 2100000  0         -        NDEV
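The check for a failed disk can also be scripted rather than read by eye. A minimal sketch, assuming the five-column vxdisk list layout shown in this walkthrough (other VxVM versions may print different columns); failed_disks is a hypothetical helper name:

```shell
#!/bin/sh
# failed_disks: read `vxdisk list` output on stdin and print the disk media
# name (DISK column) of every entry whose STATUS column is "failed".
failed_disks() {
    awk 'NR > 1 && $5 == "failed" { print $3 }'
}

# Demo on the output captured in the text; prints: rootmirror
failed_disks <<'EOF'
DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0s2     sliced    rootdisk     rootdg       online
c0t1d0s2     sliced    -            -            error
-            -         rootmirror   rootdg       failed was:c0t1d0s2
EOF
```

On a live system the equivalent call would be: vxdisk list | failed_disks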
Change the root password for the system:
# passwd root
New password: *******
Re-enter new password:
passwd (SYSTEM): passwd successfully changed for root
Go through the standard process of replacing the c0t1d0 disk, without physically swapping the disk:
# vxdiskadm

Volume Manager Support Operations
Menu: VolumeManager/Disk

 1      Add or initialize one or more disks
 2      Encapsulate one or more disks
 3      Remove a disk
 4      Remove a disk for replacement
 5      Replace a failed or removed disk
 6      Mirror volumes on a disk
 7      Move volumes from a disk
 8      Enable access to (import) a disk group
 9      Remove access to (deport) a disk group
 10     Enable (online) a disk device
 11     Disable (offline) a disk device
 12     Mark a disk as a spare for a disk group
 13     Turn off the spare flag on a disk
 14     Unrelocate subdisks back to a disk
 15     Exclude a disk from hot-relocation use
 16     Make a disk available for hot-relocation use
 17     Prevent multipathing/Suppress devices from VxVM's view
 18     Allow multipathing/Unsuppress devices from VxVM's view
 19     List currently suppressed/non-multipathed devices
 20     Change the disk naming scheme
 21     Get the newly connected/zoned disks in VxVM view
 list   List disk information

 ?      Display help about menu
 ??     Display help about the menuing system
 q      Exit from menus

Select an operation to perform: 4

Remove a disk for replacement
Menu: VolumeManager/Disk/RemoveForReplace

  Use this menu operation to remove a physical disk from a disk
  group, while retaining the disk name. This changes the state
  for the disk name to a "removed" disk. If there are any
  initialized disks that are not part of a disk group, you will be
  given the option of using one of these disks as a replacement.

Enter disk name [,list,q,?] list

Disk group: rootdg

DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   STATE

dm rootdisk     c0t0d0s2     sliced   3359     17690400 -
dm rootmirror   -            -        -        -        NODEVICE

Enter disk name [ ,list,q,?] rootmirror

  The following volumes will lose mirrors as a result of this
  operation:

        rootvol swapvol var

  No data on these volumes will be lost.

  The requested operation is to remove disk rootmirror from disk group
  rootdg. The disk name will be kept, along with any volumes using
  the disk, allowing replacement of the disk.

  Select "Replace a failed or removed disk" from the main menu
  when you wish to replace the disk.

Continue with operation? [y,n,q,?] (default: y)

  Removal of disk rootmirror completed successfully.

Remove another disk? [y,n,q,?] (default: n)

Volume Manager Support Operations
Menu: VolumeManager/Disk

 1      Add or initialize one or more disks
 2      Encapsulate one or more disks
 3      Remove a disk
 4      Remove a disk for replacement
 5      Replace a failed or removed disk
 6      Mirror volumes on a disk
 7      Move volumes from a disk
 8      Enable access to (import) a disk group
 9      Remove access to (deport) a disk group
 10     Enable (online) a disk device
 11     Disable (offline) a disk device
 12     Mark a disk as a spare for a disk group
 13     Turn off the spare flag on a disk
 14     Unrelocate subdisks back to a disk
 15     Exclude a disk from hot-relocation use
 16     Make a disk available for hot-relocation use
 17     Prevent multipathing/Suppress devices from VxVM's view
 18     Allow multipathing/Unsuppress devices from VxVM's view
 19     List currently suppressed/non-multipathed devices
 20     Change the disk naming scheme
 21     Get the newly connected/zoned disks in VxVM view
 list   List disk information

 ?      Display help about menu
 ??     Display help about the menuing system
 q      Exit from menus

Select an operation to perform: 5

Replace a failed or removed disk
Menu: VolumeManager/Disk/ReplaceDisk

  Use this menu operation to specify a replacement disk for a disk
  that you removed with the "Remove a disk for replacement" menu
  operation, or that failed during use. You will be prompted for
  a disk name to replace and a disk device to use as a replacement.
  You can choose an uninitialized disk, in which case the disk will
  be initialized, or you can choose a disk that you have already
  initialized using the Add or initialize a disk menu operation.

Select a removed or failed disk [ ,list,q,?] list

Disk group: rootdg

DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   STATE

dm rootmirror   -            -        -        -        REMOVED

Select a removed or failed disk [ ,list,q,?] rootmirror

Select disk device to initialize [,list,q,?] list

DEVICE       DISK         GROUP        STATUS
c0t0d0       rootdisk     rootdg       online
c0t1d0       -            -            error

Select disk device to initialize [,list,q,?] c0t1d0

  The following disk device has a valid VTOC, but does not appear to have
  been initialized for the Volume Manager. If there is data on the disk
  that should NOT be destroyed you should encapsulate the existing disk
  partitions as volumes instead of adding the disk as a new disk.
  Output format: [Device_Name]

  c0t1d0

Encapsulate this device? [y,n,q,?] (default: y) n

  c0t1d0

Instead of encapsulating, initialize? [y,n,q,?] (default: n) y

  The requested operation is to initialize disk device c0t1d0 and
  to then use that device to replace the removed or failed disk
  rootmirror in disk group rootdg.

Continue with operation? [y,n,q,?] (default: y)

  Replacement of disk rootmirror in group rootdg with disk device
  c0t1d0 completed successfully.

Replace another disk? [y,n,q,?] (default: n) n
# vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0s2     sliced    rootdisk     rootdg       online
c0t1d0s2     sliced    rootmirror   rootdg       online
# vxprint -ht
Disk group: rootdg

DG NAME          NCONFIG     NLOG       MINORS   GROUP-ID
DM NAME          DEVICE      TYPE       PRIVLEN  PUBLEN   STATE
RV NAME          RLINK_CNT   KSTATE     STATE    PRIMARY  DATAVOLS  SRL
RL NAME          RVG         KSTATE     STATE    REM_HOST REM_DG    REM_RLNK
V  NAME          RVG         KSTATE     STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME          VOLUME      KSTATE     STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME          PLEX        DISK       DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME          PLEX        VOLNAME    NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
DC NAME          PARENTVOL   LOGVOL
SP NAME          SNAPVOL     DCO

dg rootdg        default     default    0        1035555399.1025.pegasus

dm rootdisk      c0t0d0s2    sliced     3359     17690400 -
dm rootmirror    c0t1d0s2    sliced     3359     17690400 -

v  rootvol       -           ENABLED    ACTIVE   13423200 ROUND     -        root
pl rootvol-01    rootvol     ENABLED    ACTIVE   13423200 CONCAT    -        RW
sd rootdisk-B0   rootvol-01  rootdisk   17690399 1        0         c0t0d0   ENA
sd rootdisk-02   rootvol-01  rootdisk   0        13423199 1         c0t0d0   ENA
pl rootvol-02    rootvol     ENABLED    STALE    13423200 CONCAT    -        WO
sd rootmirror-01 rootvol-02  rootmirror 0        13423200 0         c0t1d0   ENA

v  swapvol       -           ENABLED    ACTIVE   2100000  ROUND     -        swap
pl swapvol-01    swapvol     ENABLED    ACTIVE   2100000  CONCAT    -        RW
sd rootdisk-01   swapvol-01  rootdisk   13423199 2100000  0         c0t0d0   ENA
pl swapvol-02    swapvol     DISABLED   RECOVER  2100000  CONCAT    -        WO
sd rootmirror-02 swapvol-02  rootmirror 13423200 2100000  0         c0t1d0   ENA

v  var           -           ENABLED    ACTIVE   2100000  ROUND     -        fsgen
pl var-01        var         ENABLED    ACTIVE   2100000  CONCAT    -        RW
sd rootdisk-03   var-01      rootdisk   15523199 2100000  0         c0t0d0   ENA
pl var-02        var         DISABLED   RECOVER  2100000  CONCAT    -        WO
sd rootmirror-03 var-02      rootmirror 15523200 2100000  0         c0t1d0   ENA
#
Once the mirrors finish resynchronizing (monitor progress with the vxtask list command), operating system redundancy is restored.
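As a complement to vxtask list, the plex states in vxprint -ht output show which mirrors still need to catch up. The following is a minimal sketch, assuming the column layout shown in this walkthrough, where a healthy plex line reads ENABLED ACTIVE. The helper name unsynced_plexes is hypothetical:

```shell
#!/bin/sh
# unsynced_plexes: read `vxprint -ht` output on stdin and print the name and
# state of every plex ("pl" record) whose STATE column is not ACTIVE.
unsynced_plexes() {
    awk '$1 == "pl" && $5 != "ACTIVE" { print $2, $5 }'
}

# Demo on plex lines taken from the output in the text.
unsynced_plexes <<'EOF'
pl rootvol-01 rootvol ENABLED ACTIVE 13423200 CONCAT - RW
pl rootvol-02 rootvol ENABLED STALE 13423200 CONCAT - WO
pl swapvol-01 swapvol ENABLED ACTIVE 2100000 CONCAT - RW
pl swapvol-02 swapvol DISABLED RECOVER 2100000 CONCAT - WO
EOF
# prints:
# rootvol-02 STALE
# swapvol-02 RECOVER
```

On a live system, vxprint -ht | unsynced_plexes printing nothing would suggest that every plex has returned to the ACTIVE state.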