Category: LINUX

2008-06-12 09:45:15

It is possible, though difficult, to resize a Linux root partition while it's still mounted. What's more, it can be done remotely, without having to be at the console. You'll need about 2GB of RAM, because the temporary root filesystem lives in memory. Here is how:
  1. Stop all services other than the network and SSH, and stop SELinux interfering:
    # telinit 2
    # for SERVICE in \
    `chkconfig --list | grep 2:on | awk '{print $1}' | grep -v -e sshd -e network -e rawdevices`; \
    do service $SERVICE stop; done
    # service nfs stop
    # service rpcidmapd stop
    # setenforce 0
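
The service list in step 1 comes from filtering the output of chkconfig --list. A minimal sketch of that filter as a reusable function, assuming the usual RHEL-style output (service name first, then N:on or N:off columns). The name services_on_in_runlevel2 is hypothetical, and unlike the substring grep above it excludes exact name matches only:

```shell
# Hypothetical helper: read `chkconfig --list` output on stdin and print the
# services enabled in runlevel 2, except the ones needed to stay connected.
services_on_in_runlevel2() {
    awk '/2:on/ {print $1}' | grep -v -x -e sshd -e network -e rawdevices
}

# Usage on the live system (as root):
#   for SERVICE in $(chkconfig --list | services_on_in_runlevel2); do
#       service "$SERVICE" stop
#   done
```

The same function can be reused in step 9 with start in place of stop.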

  2. Unmount all filesystems:
    # umount -a

  3. Create a temporary filesystem:
    # mkdir /tmp/tmproot
    # mount none /tmp/tmproot -t tmpfs
    # mkdir /tmp/tmproot/{proc,sys,usr,var,oldroot}
    # cp -ax /{bin,etc,mnt,sbin,lib} /tmp/tmproot/
    # cp -ax /usr/{bin,sbin,lib} /tmp/tmproot/usr/
    # cp -ax /var/{account,empty,lib,local,lock,nis,opt,preserve,run,spool,tmp,yp} /tmp/tmproot/var/
    # cp -a /dev /tmp/tmproot/dev
    Note that this used up about 1.6GB of ramdisk on my Red Hat Enterprise Linux (AS) 4 server.

    Also note that on 64-bit systems you will need to copy /lib64 and /usr/lib64 as well; otherwise you will see errors like "lib64/ld-linux-x86-64.so.2: bad ELF interpreter: No such file or directory".
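
One way to avoid forgetting /lib64 is to build the copy list from what actually exists. This is only a sketch under that assumption; the helper name dirs_to_copy is hypothetical and it covers the top-level directories only:

```shell
# Hypothetical helper: print which of the top-level directories from step 3
# exist under the given root (default /), including lib64 on 64-bit systems.
dirs_to_copy() (
    root="${1:-/}"
    for d in bin etc mnt sbin lib lib64; do
        if [ -d "${root%/}/$d" ]; then
            echo "$d"
        fi
    done
)

# Usage on the live system (as root):
#   for d in $(dirs_to_copy); do cp -ax "/$d" /tmp/tmproot/; done
```

/usr/lib64 needs the same treatment when copying the /usr directories.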

  4. Switch the filesystem root to the temporary filesystem:
    # pivot_root /tmp/tmproot/ /tmp/tmproot/oldroot
    # mount none /proc -t proc
    # mount none /sys -t sysfs
    # mount none /dev/pts -t devpts
    Note that mounting /sys may fail on 2.4 systems.
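
Before going further it may be worth confirming that the switch took effect. A small sketch, assuming /proc/mounts lists the tmpfs as the entry mounted on / once pivot_root has run (the helper name root_fstype is hypothetical):

```shell
# Hypothetical helper: print the filesystem type of the last entry mounted
# on / in a mounts-format file (default /proc/mounts).
root_fstype() {
    awk '$2 == "/" {t = $3} END {if (t != "") print t}' "${1:-/proc/mounts}"
}

# Usage after step 4:
#   [ "$(root_fstype)" = tmpfs ] && echo "now running from the temporary root"
```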

  5. Restart the SSH daemon to close the old pty devices:
    # service sshd restart
    You should now try to make a new connection. If that succeeds, close your old one to release the old pty device. If it fails, get the SSH daemon properly restarted before proceeding.

  6. Close everything that's still using the old filesystem:
    # umount /oldroot/proc
    # umount /oldroot/dev/pts
    # umount /oldroot/selinux
    # umount /oldroot/sys
    # umount /oldroot/var/lib/nfs/rpc_pipefs
    Now try to find other things that are still holding on to the old filesystem, particularly /dev:
    # fuser -vm /oldroot/dev
    Common processes that will need killing:
    # killall udevd
    # killall gconfd-2
    # killall mingetty
    # killall minilogd
    Finally, you will need to re-execute init:
    # telinit u
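
The umount commands above cover the mounts I had; other systems may have more. A sketch for listing whatever is still mounted at or below the old root, assuming /proc/mounts shows those mount points with the /oldroot prefix (the helper name mounts_under is hypothetical):

```shell
# Hypothetical helper: list mount points at or below a directory from a
# mounts-format file (default /proc/mounts), sorted in reverse so that every
# mount point is printed before its parent.
mounts_under() {
    awk -v root="$1" '$2 == root || index($2, root "/") == 1 {print $2}' \
        "${2:-/proc/mounts}" | LC_ALL=C sort -r
}

# Usage in step 6, to see what is still mounted below the old root:
#   mounts_under /oldroot
```

Unmounting in the printed order means children go before their parents.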

  7. Unmount the old filesystem:
    # umount -l /oldroot/dev
    # umount /oldroot
    Note that we use the umount -l ("lazy") option, available only with kernels 2.4.11 and later, because /oldroot is actually mounted using an entry in /oldroot/dev, so it would be difficult if not impossible to unmount either of them otherwise.

  8. Now resize the root filesystem:
    # e2fsck -C 0 -f /dev/VolGroup00/LogVol00
    # resize2fs -p -f /dev/VolGroup00/LogVol00 8G
    # lvresize /dev/VolGroup00/LogVol00 -L 8G
    # resize2fs -p -f /dev/VolGroup00/LogVol00
    # e2fsck -C 0 -f /dev/VolGroup00/LogVol00
    In this example the root partition is /dev/VolGroup00/LogVol00 and it is being shrunk to 8GB. You don't necessarily have to run resize2fs twice; I just do so in case my idea of the size differs from what lvresize thinks.
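
The risk in step 8 is shrinking the logical volume below the size of the filesystem inside it. A minimal sketch of a check to run between the first resize2fs and lvresize, assuming tune2fs -l output with its "Block count" and "Block size" lines and assuming 8G means 8 GiB to both tools. The helper names are hypothetical:

```shell
# Hypothetical helper: read `tune2fs -l DEVICE` output on stdin and print the
# filesystem size in bytes (block count times block size).
fs_size_bytes() {
    awk -F': *' '
        /^Block count:/ {count = $2}
        /^Block size:/  {size = $2}
        END {printf "%.0f\n", count * size}'
}

# Hypothetical helper: succeed if a filesystem of $1 bytes fits in $2 GiB.
fits_in_gib() {
    awk -v bytes="$1" -v gib="$2" \
        'BEGIN {exit !(bytes <= gib * 1024 * 1024 * 1024)}'
}

# Usage between the first resize2fs and lvresize:
#   fits_in_gib "$(tune2fs -l /dev/VolGroup00/LogVol00 | fs_size_bytes)" 8 \
#       && echo "filesystem fits in an 8G volume"
```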

  9. We're done, so start putting everything back:
    # mount /dev/VolGroup00/LogVol00 /oldroot
    # pivot_root /oldroot /oldroot/tmp/tmproot
    # umount /tmp/tmproot/proc
    # mount none /proc -t proc
    # cp -ax /tmp/tmproot/dev/* /dev/
    # mount /dev/pts
    # mount /sys
    # killall mingetty
    # telinit u
    # service sshd restart
    Now make a new SSH connection, and if it works, close the old one. Note that sshd may still be running in the temporary filesystem at this point because of the way the service scripts work - check this with fuser, and if this is the case, kill the oldest sshd process and then do service sshd start. Then log in again and disconnect all other connections.

    Final steps to unmount the temporary filesystem:
    # umount -l /tmp/tmproot/dev/pts
    # umount -l /tmp/tmproot
    # rmdir /tmp/tmproot
    Now to re-mount our original filesystems and start services back up:
    # mount -a
    # umount /sys
    # mount /sys
    # for SERVICE in \
    `chkconfig --list | grep 2:on | awk '{print $1}' | grep -v -e sshd -e network -e rawdevices`; \
    do service $SERVICE start; done
    # telinit 3
    Replace 3 with your preferred runlevel. You may also want to put SELinux back into enforcing mode with setenforce 1.

The above has only been tested on RHEL AS 4, but something like it should work on most Linux variants that have pivot_root, tmpfs, and umount -l, so long as you can replace the chkconfig and service parts with whatever is appropriate for your distribution.
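
On another distribution it may help to check the prerequisites before stopping anything. A rough sketch, assuming a Linux /proc layout; all three helper names are hypothetical, and support for umount -l is not checked here:

```shell
# Hypothetical helper: print the names of any listed commands not in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Hypothetical helper: succeed if filesystem type $1 is listed in a
# /proc/filesystems-format file (default /proc/filesystems).
supports_fs() {
    awk -v fs="$1" '$NF == fs {found = 1} END {exit !found}' \
        "${2:-/proc/filesystems}"
}

# Hypothetical helper: print MemTotal in kB from a /proc/meminfo-format file.
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# Usage:
#   missing_tools pivot_root resize2fs e2fsck lvresize fuser
#   supports_fs tmpfs || echo "no tmpfs support"
#   mem_total_kb   # compare with the ~1.6GB the temporary root needed above
```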



Update: a commenter says, for CentOS 4.4, "I was not able to login after restarting sshd in step 5 until I did this: mount none /dev/pts -t devpts".



Update: Simetrical suggests that 64-bit systems also need to copy /lib64 and /usr/lib64, and that, after pivot_root, 2.6 kernels will also need mount none /sys -t sysfs and mount none /dev/pts -t devpts. (The above steps have been modified accordingly.)

5 Comments:

said...

I got this to work with CentOS 4.4.

Tip: I was not able to login after restarting sshd in step 5 until I did this:

# mount none /dev/pts -t devpts

27-Feb-2007 02:01:00  
z_s44@yahoo.com said...

Following the first pivot_root, every single shell command fails with this error message :(

-bash:/bin/ls: lib64/ld-linux-x86-64.so.2: bad ELF interpreter: No such file or directory

I might have missed /lib64 in the old root.
Can anyone help?

15-Aug-2007 19:42:00  
Simetrical said...

An excellent guide. I have a couple of comments to start: first of all, surely rather than

# mkdir /tmp/tmproot/{proc,sys,usr,var}
# cp -ax /{bin,etc,mnt,sbin,lib,oldroot} /tmp/tmproot/

you meant

# mkdir /tmp/tmproot/{proc,sys,usr,var,oldroot}
# cp -ax /{bin,etc,mnt,sbin,lib} /tmp/tmproot/

since /oldroot presumably doesn't exist.

Second of all, on 64-bit RHEL 5, I had to move /lib64 and /usr/lib64 as well:

# cp -ax /{bin,etc,mnt,sbin,lib,lib64} /tmp/tmproot/
# cp -ax /usr/{bin,sbin,lib,lib64} /tmp/tmproot/usr/

Third of all, since I'm using a 2.6 kernel, I made sure to mount none /sys -t sysfs; mount none /dev/pts -t devpts after the pivot_root in case those were needed by something.

Now, for the problems. Basically, I just get "Cannot unmount X: device is busy" errors when I try to unmount /oldroot. Processes like init are using all sorts of libraries, and their binaries are executing and so can't be touched either. A truncated sample of fuser output follows:

USER PID ACCESS COMMAND
/oldroot/lib64/ld-2.5.so:
root 1 ....m init
root 3562 ....m restorecond
root 3617 ....m irqbalance
root 3667 ....m sdpd
root 3708 ....m hidd
root 3773 ....m gpm
xfs 3817 ....m xfs
root 3847 ....m rhnsd
root 3864 ....m smartd

/oldroot/lib64/libdl-2.5.so:
root 1 ....m init
root 3562 ....m restorecond

/oldroot/sbin/init: root 1 ...e. init

/oldroot/sbin/scsi_id:
root 1 Frc.. init
root 2 .rc.. migration/0
root 3 .rc.. ksoftirqd/0
root 4 .rc.. watchdog/0
root 5 .rc.. migration/1
root 6 .rc.. ksoftirqd/1
root 7 .rc.. watchdog/1
root 8 .rc.. events/0
root 9 .rc.. events/1
root 10 .rc.. khelper
root 79 .rc.. kthread
root 84 .rc.. kblockd/0
root 85 .rc.. kblockd/1
root 86 .rc.. kacpid
root 160 .rc.. cqueue/0
root 161 .rc.. cqueue/1
root 164 .rc.. khubd
root 166 .rc.. kseriod
root 239 .rc.. pdflush
root 240 .rc.. pdflush
root 241 .rc.. kswapd0
root 242 .rc.. aio/0
root 243 .rc.. aio/1
root 390 .rc.. kpsmoused
root 422 .rc.. scsi_eh_0
root 426 .rc.. ata/0
root 427 .rc.. ata/1
root 428 .rc.. ata_aux
root 432 .rc.. scsi_eh_1
root 433 .rc.. scsi_eh_2
root 434 .rc.. kjournald
root 466 .rc.. kauditd
root 1306 .rc.. kmirrord
root 2180 .rc.. krfcommd
root 3562 .rc.. restorecond
root 3617 .rc.. irqbalance
root 3667 .rc.. sdpd
root 3708 .rc.. hidd
root 3773 .rc.. gpm
xfs 3817 .rc.. xfs
root 3847 .rc.. rhnsd
root 3864 .rc.. smartd
root 3901 Frce. sshd
root 3903 Frce. sshd
root 3905 .rce. bash
root 4048 Frce. mingetty
root 4054 Frce. mingetty
root 4056 Frce. mingetty
root 4059 Frce. mingetty
root 4061 Frce. mingetty
root 4064 Frce. mingetty
root 4185 Frce. auditd
root 4187 Frce. python
root 4214 Frce. agetty

/oldroot/sbin/telinit:
root 1 ...e. init

The scsi_id is particularly puzzling, since it's used by a ton of processes as root and current working directory, as well as other things, but it seems to be a symlink to a binary:

-bash-3.1# ls -l /oldroot/sbin/scsi_id
lrwxrwxrwx 1 root root 17 Aug 23 17:00 /oldroot/sbin/scsi_id -> /lib/udev/scsi_id
-bash-3.1# ls -l /oldroot/lib/udev/scsi_id
-rwxr-xr-x 1 root root 28216 Dec 18 2006 /oldroot/lib/udev/scsi_id

Do you (or anyone else) have any thoughts?

24-Aug-2007 02:37:00  
Andrew Wood said...

Thanks for the comments.

I've no idea why you are getting things still using /oldroot, but you can force it to be unmounted using "umount -l" (lazy unmount). This may or may not work correctly though - lazy unmounts are a bit dangerous if you're then going to modify the filesystem you just unmounted.

Make sure all of your other filesystems on /oldroot, such as /oldroot/sys etc, have been unmounted first. Use lazy unmounts on them if you have to. This may solve the problem without having to resort to a lazy unmount of the old root filesystem itself.

24-Aug-2007 08:10:00  
Simetrical said...

I eventually got it to work. I'm honestly not quite sure what caused init and friends to finally give up on /oldroot, but eventually umount /dev/sda3 worked and various resizing utilities were happy to do their work. Thanks a lot for this guide.

Incidentally, "lsof /oldroot" is very handy for finding all the files still using the old filesystem. fuser -vm /oldroot, oddly, didn't seem to give me complete results.
