Category: LINUX

2006-03-08 11:12:01

Diskless Linux with PXE HOWTO

You'll find all downloads and the current version of this document at

If you have comments or additions to this HOWTO, don't hesitate to write to the author.

Revision: 1.1 (2004-02-05):

  • Fix trim function in bootimage
  • Add fsck workaround
  • Add retries to dhcp client (thanks to Joe Robertson)
  • Add SuSE comment

Revision: 1.0 (2003-12-21): Original release

1. Legal stuff

Copyright (C) 2003,2004 by Gerd v. Egidy (gve@intra2net.com)

This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at ).
The source code found on this site is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation.
This document is provided WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

2. Pre-Requisites

  • a diskless client with PXE remote boot capabilities. Most on-board network cards have a PXE client in the BIOS. The more expensive NICs often come with their own small PXE BIOS. Sometimes you have to activate it first: I had to use a separate utility to activate PXE on my Intel EEPro/100+.
  • a Linux distribution CD you want to run on the diskless system
  • a DHCP server (e.g. dhcpd that comes with most distributions)
  • a TFTP server (e.g. tftp-server that comes with most distributions)
  • an NFS server
  • DHCP, TFTP and NFS can live happily together on one server

3. Installing the distribution

A ready-to-run distribution usually resides on the hard disk of the machine. Our client has no hard disk, so it loads the installed distribution from a server via NFS instead. The distribution therefore has to be installed into a directory tree on the server. There are two ways to accomplish this:
  1. Install the distribution on a disk and then copy all files into a directory onto your server
  2. Install the distribution directly into a directory on the server. You can use the mkfedora.sh script to install a minimal Fedora Core 1: mount the first Fedora CD, create an empty directory and run it like this: ./mkfedora.sh [TARGETDIR] [CDROMPATH]
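
For the first option, one common way to copy an installed system while keeping permissions, symlinks and device nodes is a tar pipe. This is only a minimal sketch and not part of the original scripts: copy_tree is a hypothetical helper name, the paths in the example are assumptions, and a real system copy has to run as root.

```shell
# copy_tree SRC DEST: copy a directory tree with tar, keeping permissions.
# Hypothetical helper; run as root when copying a real installed system.
copy_tree() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "copy_tree: $src is not a directory" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # --one-file-system keeps tar from descending into /proc or other mounts
    tar -C "$src" --one-file-system -cf - . | tar -C "$dest" -xpf -
}

# Example (assumed paths): installed system mounted at /mnt/installed
# copy_tree /mnt/installed /netboot/pxeclient
```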
In either case you need to prepare your distribution to be booted remotely. You can use the mkbootready script like this: ./mkbootready [SYSROOT], where SYSROOT is the root of your installed distribution. If you prefer to fix it by hand, do it like this (watch out for the tabs!):
# we'll have a read-only /etc
ln -sf /proc/mounts ${SYSROOT}/etc/mtab
ln -sf /var/resolv.conf ${SYSROOT}/etc/resolv.conf

# replace the entry for / in fstab: rc.sysinit otherwise tries to do a fsck.ext2...
cp ${SYSROOT}/etc/fstab ${SYSROOT}/etc/fstab.orig
# [[:space:]] matches both spaces and tabs around the mount point
grep -v -e "[[:space:]]/[[:space:]]" ${SYSROOT}/etc/fstab >${SYSROOT}/etc/fstab.new
echo "none / tmpfs defaults 0 0" >>${SYSROOT}/etc/fstab.new
mv -f ${SYSROOT}/etc/fstab.new ${SYSROOT}/etc/fstab

# disable networking (we have it already set up if the rc's are running)
/usr/sbin/chroot ${SYSROOT} /sbin/chkconfig --del network

# disable kudzu (won't do anything since we are on a read-only filesystem)
/usr/sbin/chroot ${SYSROOT} /sbin/chkconfig --del kudzu

# move RPM database to /usr (saves ram since /var will live in tmpfs)
mkdir -p ${SYSROOT}/usr/var/lib/
mv ${SYSROOT}/var/lib/rpm/ ${SYSROOT}/usr/var/lib/
ln -s /usr/var/lib/rpm ${SYSROOT}/var/lib/rpm

# remove temp files from RPM database
rm -f ${SYSROOT}/usr/var/lib/rpm/__db.*

# compress /dev to speed up booting
tar czf ${SYSROOT}/dev.tar.gz -C ${SYSROOT} dev
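
Before touching the real tree it can help to preview the fstab change. The sketch below is an illustration, not the mkbootready script itself: preview_fstab is a hypothetical helper, and it selects the root entry by its mount point field with awk instead of the grep pattern used above.

```shell
# preview_fstab FILE: print FILE as it would look after the root entry
# has been replaced by a tmpfs entry. Hypothetical helper for illustration.
preview_fstab() {
    # keep comments and every entry whose mount point (field 2) is not "/"
    awk '/^[ \t]*#/ || $2 != "/"' "$1"
    # append the tmpfs root entry used in this HOWTO
    echo "none / tmpfs defaults 0 0"
}

# Example: preview_fstab ${SYSROOT}/etc/fstab
```

The output should contain exactly one entry for /, the tmpfs line, with all other entries unchanged.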

4. Configuring NFS

The next step is to allow mounting the distribution directory via NFS. Add a line like this to your /etc/exports on the NFS server:
/netboot/pxeclient  192.168.1.0/24(ro,no_root_squash)
/netboot/pxeclient is the distribution root directory in this example. DO NOT allow mounting this directory in read-write mode since you need to set no_root_squash. Allowing writes on a no_root_squash export is a major security risk.

Make sure your NFS server is started, and don't forget to reload its configuration after changing the exports. Take a look at the NFS documentation if you run into problems.
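
Since a writable no_root_squash export is the thing to avoid, a quick check of the exports file can catch mistakes. This is a rough line-based sketch: risky_exports is a hypothetical helper, and it will also flag lines that export to several clients with mixed options.

```shell
# risky_exports FILE: print export lines that combine no_root_squash with rw.
# Returns 0 if at least one such line was found, non-zero otherwise.
risky_exports() {
    grep -v '^[[:space:]]*#' "$1" | grep 'no_root_squash' | grep -E '[(,]rw[,)]'
}

# Example:
# risky_exports /etc/exports && echo "WARNING: writable no_root_squash export"
```

On most Linux NFS servers, running exportfs -ra after editing /etc/exports re-exports everything without restarting the server.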

5. Preparing the initrd image

Download my base image. If the options you need are not built directly into your kernel but as modules, you need to add them to the initrd image. You can use the mkbootimage.sh script to handle this. Copy it into the same directory as the base image and start it like this:

./mkbootimage.sh [SYSROOT] [SYSTEMMAP] [KERNELVER]

SYSROOT is the root directory of the system you want to take the kernel modules from. Usually it is something like /netboot/pxeclient/. There must be a valid modules.dep in [SYSROOT]/lib/modules/[KERNELVER].

SYSTEMMAP is the System.map file that came with your kernel. Usually something like /netboot/pxeclient/boot/System.map-[KERNELVER]

KERNELVER is the full version of your kernel. For Fedora Core 1 this would be 2.4.22-1.2115.nptl.
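
The three arguments depend on each other, so it can be worth checking them before running the script. The following sketch assumes the usual layout described above (modules.dep below lib/modules, System.map below boot); check_bootimage_inputs is a hypothetical helper name.

```shell
# check_bootimage_inputs SYSROOT KERNELVER
# Prints the System.map path to pass as SYSTEMMAP; returns 1 if a file is missing.
check_bootimage_inputs() {
    sysroot=${1%/}          # strip one trailing slash, if any
    kver=$2
    moddep="$sysroot/lib/modules/$kver/modules.dep"
    sysmap="$sysroot/boot/System.map-$kver"
    [ -f "$moddep" ] || { echo "missing $moddep" >&2; return 1; }
    [ -f "$sysmap" ] || { echo "missing $sysmap" >&2; return 1; }
    echo "$sysmap"
}

# Example:
# SYSTEMMAP=$(check_bootimage_inputs /netboot/pxeclient/ 2.4.22-1.2115.nptl) &&
#     ./mkbootimage.sh /netboot/pxeclient/ "$SYSTEMMAP" 2.4.22-1.2115.nptl
```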

If you want to do it by hand, you've got to do it like this:
# unzip and mount base image
gunzip ${IMAGE}.gz
mkdir __pxeboot-tmp__
mount -o loop $IMAGE __pxeboot-tmp__
[ $? -ne 0 ] && echo "error mounting image" && exit 1

# create module directories
mkdir -p __pxeboot-tmp__/lib/modules/${KERNELVER}/net
mkdir -p __pxeboot-tmp__/lib/modules/${KERNELVER}/nfs
Now copy nfs.o and its dependencies (usually sunrpc.o and lockd.o) into .../nfs.
Then grab your network driver modules and their dependencies (often mii.o) and put them into .../net. It is possible to copy different modules for multiple clients with different hardware in there. The loader will pick the right one.
# recalculate dependencies
/sbin/depmod -a -F $SYSTEMMAP -b __pxeboot-tmp__ -C /dev/null $KERNELVER

# unmount and zip again
umount __pxeboot-tmp__
gzip -9 $IMAGE
rmdir __pxeboot-tmp__
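
The module copy step in the middle of this procedure could look like the sketch below. copy_modules is a hypothetical helper, and eepro100 only stands in for whatever NIC driver your client needs; check modules.dep for the real dependencies of your modules.

```shell
# copy_modules MODDIR DESTDIR NAME...
# Copies NAME.o for each NAME from anywhere below MODDIR into DESTDIR.
copy_modules() {
    moddir=$1
    destdir=$2
    shift 2
    mkdir -p "$destdir" || return 1
    for name in "$@"; do
        # take the first file with a matching name
        file=$(find "$moddir" -name "$name.o" -type f | head -n 1)
        if [ -z "$file" ]; then
            echo "copy_modules: $name.o not found below $moddir" >&2
            return 1
        fi
        cp "$file" "$destdir/" || return 1
    done
}

# Example, run while the image is mounted (driver name is an assumption):
# M=${SYSROOT}/lib/modules/${KERNELVER}
# copy_modules "$M" __pxeboot-tmp__/lib/modules/${KERNELVER}/nfs nfs sunrpc lockd
# copy_modules "$M" __pxeboot-tmp__/lib/modules/${KERNELVER}/net eepro100 mii
```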

6. Configuring DHCP

Install and configure your DHCP server. There are plenty of guides covering this, so I'm not going to explain it here.

Add a section like this for your client to /etc/dhcpd.conf:
host pxeclient {
    hardware ethernet AA:BB:CC:DD:EE:FF;
    fixed-address 192.168.1.55;
    option host-name "pxeclient";

    filename "/pxelinux.0";
    next-server tftpserver;
    option root-path "nfsserver:/netboot/pxeclient";
}
Important for a diskless client are:
filename
the name of the PXE boot image on the TFTP server (always "/pxelinux.0" here)
next-server
name or IP of the TFTP server
root-path
NFS path to the root directory of the client. You can omit the "server:" part if it is on the same machine as the TFTP server. You can add comma-separated NFS options (see the mount(8) manpage) at the end, like "nfsserver:/netboot/pxeclient,retry=1,rsize=8192,wsize=8192".

Don't forget to restart your dhcpd after adding these options.
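
With more than one diskless client, the host sections differ only in a few values, so they can be generated. This is a small sketch using only the options shown above; dhcp_host_entry is a hypothetical helper, and the second client's name, MAC and IP in the example are made up.

```shell
# dhcp_host_entry NAME MAC IP TFTPSERVER ROOTPATH
# Prints a dhcpd.conf host section for one diskless client.
dhcp_host_entry() {
    cat <<EOF
host $1 {
    hardware ethernet $2;
    fixed-address $3;
    option host-name "$1";

    filename "/pxelinux.0";
    next-server $4;
    option root-path "$5";
}
EOF
}

# Example (values are assumptions):
# dhcp_host_entry pxeclient2 AA:BB:CC:DD:EE:01 192.168.1.56 \
#     tftpserver nfsserver:/netboot/pxeclient >> /etc/dhcpd.conf
```

If you use ISC dhcpd, dhcpd -t should let you check the configuration syntax before restarting the daemon.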

7. Configuring TFTP

Install the tftp-server package. If your distribution uses xinetd, you usually just have to set "disable = no" in /etc/xinetd.d/tftp and restart xinetd. Otherwise, start tftpd manually with the "-s /tftpboot" option. The downloadable files usually reside in /tftpboot.

Now download the newest version of syslinux. It should contain a file called pxelinux.0. Copy this file into your /tftpboot directory.

Cd to /tftpboot and create a subdirectory called "pxelinux.cfg". Put a file called "default" in there:
LABEL linux
KERNEL vmlinuz-2.4.22-1.2115.nptl
APPEND initrd=pxeboot.img.gz ramdisk_size=8192
Of course you should use the name of the kernel image you are planning to use. This example is for Fedora Core 1. Copy your kernel and initrd image into /tftpboot too.
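
A boot fails early if one of the files named in the config is missing from the TFTP directory, so a sketch like this writes the config only after checking for them. write_pxe_default is a hypothetical helper; the file names follow the example above.

```shell
# write_pxe_default TFTPDIR KERNELVER [INITRD]
# Checks that pxelinux.0, the kernel and the initrd exist in TFTPDIR,
# then writes TFTPDIR/pxelinux.cfg/default.
write_pxe_default() {
    tftpdir=$1
    kernel="vmlinuz-$2"
    initrd=${3:-pxeboot.img.gz}
    for f in pxelinux.0 "$kernel" "$initrd"; do
        [ -f "$tftpdir/$f" ] || { echo "missing $tftpdir/$f" >&2; return 1; }
    done
    mkdir -p "$tftpdir/pxelinux.cfg" || return 1
    cat > "$tftpdir/pxelinux.cfg/default" <<EOF
LABEL linux
KERNEL $kernel
APPEND initrd=$initrd ramdisk_size=8192
EOF
}

# Example: write_pxe_default /tftpboot 2.4.22-1.2115.nptl
```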

8. Booting

Now you can switch on your client and watch it booting.

Don't forget that you have a read-only image. That means your client can write on the root ramdisk into /var and /tmp but nearly nowhere else. The easiest way to change something is to do it in a chroot on the NFS server. Execute e.g. chroot /netboot/pxeclient to start a chrooted shell.

9. Reducing boot times

There are some ways to speed up the boot process. The most valuable ones are
  • Use a custom kernel. Throw out everything you don't need. In my experience the IDE driver in particular takes a long time to find out that no hard disk is connected. You'll need support for your network card, initrd images, NFS, ext2fs and tmpfs.
  • Tweak your /etc/rc.d/rc.sysinit. It usually does things like fsck and depmod that take time but don't help since you are on a read-only image. Remove everything you don't need.
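
When trimming the kernel it is easy to drop something the boot process needs. The sketch below checks a kernel .config for the generic options listed above; missing_kernel_options is a hypothetical helper, and the network driver is not checked because its option name depends on your card.

```shell
# missing_kernel_options CONFIGFILE
# Prints each required option that is not set to y or m; returns 1 if any is missing.
missing_kernel_options() {
    status=0
    for opt in CONFIG_BLK_DEV_INITRD CONFIG_NFS_FS CONFIG_EXT2_FS CONFIG_TMPFS; do
        if ! grep -q "^$opt=[ym]\$" "$1"; then
            echo "$opt"
            status=1
        fi
    done
    return $status
}

# Example: missing_kernel_options /usr/src/linux/.config
```

Options built as modules (=m) still have to be copied into the initrd image as described in section 5.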

10. Known problems

  • Shutdown doesn't work: this is due to the client trying to unmount its NFS root. I haven't found a distribution-independent way to fix this. If you want to try it: you've got to make sure that the NFS root isn't unmounted and the portmapper (if running) isn't killed. Please send me your solutions and I'll add them.
  • No DHCP client daemon running: The system isn't running a DHCP daemon to renew its lease. This isn't a problem if you use fixed addresses with DHCP like in the example above. But if you are using a DHCP pool your client might get thrown out and the IP reassigned if the lease has expired. I didn't want to keep the daemon from the initrd alive since that would mean that we can't unmount initrd and will waste the 8 MB RAM. Starting a new dhclient from the real distribution is possible, but you usually have to tweak dhclient-script to make sure it isn't switching off the network device or changing the IP during discovery.

11. Differences for SuSE Linux

  • SuSE kernels usually have nfs compiled in so you don't need to add it as module. But the af_packet module is needed for dhcp. You'll need to change mkbootimage (or do it by hand) and linuxrc (in pxeboot.img.gz).

Appendix A. The initrd image

The initrd image contains BusyBox. Busybox is built against a regular glibc 2.3.2 (taken from FC1); the important parts of glibc are in the image too. find is included because the busybox find does not support -exec. The network card autodetection is done by lspci using the "-k" option, which prints the corresponding module name. lspci -k is an addition by Diego Torres Milano.

This is the linuxrc contained in the initrd image:
#!/bin/sh

# (C) Copyright 2003,2004 by Gerd v. Egidy
#
# PCI autodetect Copyright 2001-2003 by Diego Torres Milano
# PCI autodetect adapted from PXES by Gerd v. Egidy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#

autodetect_pci() # PCICLASS MODCLASS
{
    local PCICLASS NOTFOUND line

    PCICLASS=$1
    NOTFOUND=1

    if [ -z "$PCICLASS" ]; then
        echo "$PROGNAME: autodetect_pci missing parameter" >&2
        exit 1
    fi

    if [ ! -r /proc/bus/pci ]; then
        return $NOTFOUND
    fi

    # I can't make a pipe of this because ash treats it as a subshell,
    # that's why the temp file
    /sbin/lspci -i /dev/null -knm | /bin/grep "$PCICLASS" > /tmp/pci.$$
    while read line
    do
        set -- $line
        eval mod=\$$#
        eval mod=$mod
        if [ "$mod" != "UNKNOWN" ]; then
            echo "loading $mod..."
            /sbin/modprobe $mod
            NOTFOUND=0
        fi
    done < /tmp/pci.$$
    /bin/rm -f /tmp/pci.$$

    return $NOTFOUND
}

echo "Remount / read-write"
mount -n -o remount,rw /

echo "Mounting /proc filesystem"
mount -n -t proc /proc /proc

echo "Autodetecting network devices..."
autodetect_pci 'Class 02..'
if [ $? -ne 0 ]; then
    echo "WARNING: no network interface found"
fi

echo "Loading nfs modules"
modprobe nfs

echo "Initializing network loopback device"
ifconfig lo 127.0.0.1 up

echo "Configuring eth0"
udhcpc --now --quit --interface=eth0 --script=/bin/udhcpc.script
if [ $? -ne 0 ]; then
    echo "ERROR: couldn't get DHCP lease, trying again"
    udhcpc --now --quit --interface=eth0 --script=/bin/udhcpc.script
    if [ $? -ne 0 ]; then
        echo "ERROR: couldn't get DHCP lease, trying again"
        udhcpc --now --quit --interface=eth0 --script=/bin/udhcpc.script
        if [ $? -ne 0 ]; then
            echo "ERROR: can't get DHCP lease"
            exit 0
        fi
    fi
fi

# load DHCP parameter
. /etc/udhcpc-eth0.info

echo "Mounting nfs root filesystem"
if ! echo $ROOTPATH | grep -q ":/" ; then
    # we haven't got a full path, use next-server
    ROOTPATH="${NEXTSERVER}:${ROOTPATH}"
fi

if echo $ROOTPATH | grep -q "," ; then
    # we have options: split at the first comma so that all options are kept
    NFSOPTIONS=`echo $ROOTPATH | sed -e "s/^\([^,]*\)\(,.*\)$/\2/"`
    ROOTPATH=`echo $ROOTPATH | sed -e "s/^\([^,]*\)\(,.*\)$/\1/"`
fi

echo "Mounting root filesystem"
mount -rw -t tmpfs none /sysroot/

echo "Mounting NFS root-base"
mkdir /sysroot/nfsroot
mount -n -o "ro,nolock${NFSOPTIONS}" -t nfs "$ROOTPATH" /sysroot/nfsroot
if [ $? -ne 0 ]; then
    echo "ERROR: can't mount root filesystem via NFS"
    exit 0
fi

echo "Setting root symlinks"
cd /sysroot
find ./nfsroot -maxdepth 1 -mindepth 1 -exec ln -s \{\} \;
cd /

echo "Handling special directories"
rm -f /sysroot/initrd
mkdir /sysroot/initrd
rm -f /sysroot/tmp
mkdir /sysroot/tmp
rm -f /sysroot/proc
mkdir /sysroot/proc

echo "Copying /var"
rm -f /sysroot/var
mkdir /sysroot/var
cp -a /sysroot/nfsroot/var /sysroot
cp /etc/resolv.conf /sysroot/var/resolv.conf

# /dev handling
rm -f /sysroot/dev
mkdir /sysroot/dev
if [ -f /sysroot/nfsroot/dev.tar.gz ]; then
    echo "Unpacking /dev"
    tar xzf /sysroot/nfsroot/dev.tar.gz -C /sysroot
else
    echo "Copying /dev"
    cp -a /sysroot/nfsroot/dev /sysroot
fi

echo 0x0100 > /proc/sys/kernel/real-root-dev

echo "Unmounting temporary mounts"
umount /proc

echo "Changing to new NFS root"
pivot_root /sysroot /sysroot/initrd
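
The root-path handling in linuxrc (prepend next-server when no server is given, split off the comma-separated NFS options) can be tried outside the initrd before booting a client. The following is a sketch of that logic using shell parameter expansion instead of grep and sed; split_rootpath is a hypothetical helper and the server address in the example is an assumption.

```shell
# split_rootpath ROOTPATH NEXTSERVER
# Prints "server:/path" on the first line and the NFS option string
# (with leading comma, possibly empty) on the second line.
split_rootpath() {
    rootpath=$1
    nextserver=$2
    options=""
    case $rootpath in
        *,*)
            options=",${rootpath#*,}"   # everything after the first comma
            rootpath=${rootpath%%,*}    # everything before the first comma
            ;;
    esac
    case $rootpath in
        *:/*) ;;                                   # already has a server part
        *) rootpath="$nextserver:$rootpath" ;;     # fall back to next-server
    esac
    echo "$rootpath"
    echo "$options"
}

# Example:
# split_rootpath "nfsserver:/netboot/pxeclient,retry=1,rsize=8192,wsize=8192" 192.168.1.1
```

The second output line is what gets appended to "ro,nolock" in the mount call, which is why it keeps its leading comma.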