What is it?
OSLO (short for "Operating System LOader") is the code name for the new node setup project.
OSLO is designed as a complete replacement for the old, slow, and buggy setup_nodes.sh and its related helper scripts.
It provides faster setup, a smoother setup procedure, a better
command line interface, much greater fault tolerance, and highly
modularized components.
What platforms does it support?
Currently the supported distributions are RHEL3, RHEL4, RHEL5, CENTOS5, SLES9, SLES10, SLES11, Scientific Linux 5, Oracle Enterprise Linux 5 and a virtual vanilla distro, which is in fact SLES10.
The supported architectures are i686, x86_64 and ia64.
We are planning to extend this support in the near future.
Note that sles9/ia64, centos5/ia64, sl5/ia64 and sles11/ia64, which are not used at all, are currently not supported.
Where is it?
It is available on
lts-head:/bin/oslo
and
lts-head:/bin/setupnode
They are equivalent, so you can invoke either of them on lts-head.
How to use it?
You can use it as a normal user without 'sudo'.
Note: Using 'sudo' is OK, but it doesn't give you any magic powers.
The detailed usage can be obtained by executing
oslo -h
or
oslo --help
You can also see examples by executing
oslo -e
or
oslo --show-examples
But I want to learn about it here
OK. Here is the usage:
Usage: oslo [OPTION]...
Mandatory arguments:
-n, --nodelist=NODELIST List of nodes to be set up.
-d, --distro=DISTRO Distribution that the node will run.
-a, --arch=ARCH Architecture that the node will run.
Optional arguments:
-p, --package-dir=PKGDIR A directory that contains packages to be installed.
-k, --kernel-package=KERNELPKG Kernel package to install and
boot into. If not provided, and if there's no kernel rpm specified in
PKGDIR, the default kernel of the distro will be used.
-s, --postsetup-scripts=SCRIPTLIST A list of scripts that will
be executed on each of the nodes after setup. Note that the scripts
will be executed in "first come, first served" order, so you are
responsible for ordering the list correctly.
-f, --force-install Force package installation. This option can
be used when there are dependency issues or if you are installing
obsolete packages.
-m, --memory-size=MEMORYSIZE Set the specific amount of memory used by the kernel. This can be used when simulating low-memory situations.
-l, --list-default-kernels List the default kernels that supported distributions use.
-u, --use-default-kernel Boot the nodes with the default kernel provided by the OS image.
-i, --install-source-debuginfo-packages Install source and debuginfo packages (if available).
-z, --xen-host Setup node(s) as a Xen host - (NOT CURRENTLY IMPLEMENTED)
-o, --boot-options Specify boot options for the nodes
-O, --rpm-options Specify RPM installation options
-N, --rpminst-noscripts Use the '--noscripts' option when installing RPMs
-D, --disable-crashdump Disable crash dump
-h, --help This message.
-e, --show-examples Show me some examples please.
-y, --yes Skip confirmation and proceed without prompt.
For those of you who need to set up nodes with an LBATS build, the following arguments can be quite handy:
-t, --lbats-tag=TAG LBATS build tag.
-b, --branch=BRANCH Lustre branch.
-c, --patchless-client Do patchless client install.
And if you want to use lustre release packages:
-r, --lustre-release=VERSION Lustre release version
NODELIST is a list of nodes separated by a comma ','. Of course,
you can also specify multiple '-n' options.
DISTRO can be one of rhel3 rhel4 rhel5 sles9 sles10.
ARCH can be one of i686 x86_64 ia64.
SCRIPTLIST is a list of scripts separated by a comma ','.
MEMORYSIZE should be in the format of 'n[kKmMgG]' where n is an integer.
TAG should be the tag that you specified when submitting LBATS build request.
BRANCH should be a valid lustre branch name.
VERSION should be a valid lustre release version.
Show me some examples
- setup node1,node2 with rhel4/i686 running the default kernel:
oslo -n node1,node2 -d rhel4 -a i686
- setup node1 with sles10/x86_64 running kernel /path/to/kernel/file.rpm:
oslo -n node1 -d sles10 -a x86_64 -k /path/to/kernel/file.rpm
- setup node1 with sles10/x86_64 running kernel /path/to/kernel/file.rpm, force installing the rpm:
oslo -n node1 -d sles10 -a x86_64 -k /path/to/kernel/file.rpm -f
- setup node1 with sles10/x86_64, installing rpms in /path/to/package/dir. If there is a kernel rpm in /path/to/package/dir, boot node1 using that kernel:
oslo -n node1 -d sles10 -a x86_64 -p /path/to/package/dir
- setup node1 with sles10/x86_64, installing rpms in /path/to/package/dir, boot node1 with the default kernel whether or not there is a kernel rpm in /path/to/package/dir:
oslo -n node1 -d sles10 -a x86_64 -p /path/to/package/dir -u
- setup node1 with sles10/x86_64, installing rpms in /path/to/package/dir, and kernel /path/to/kernel/file.rpm:
oslo -n node1 -d sles10 -a x86_64 -k /path/to/kernel/file.rpm -p /path/to/package/dir
- setup node1,node2 with sles10/i686, telling the kernel to use only 500M of memory:
oslo -n node1 -n node2 -d sles10 -a i686 -m 500M
- setup node1 with rhel5/i686 installing b1_6 LBATS packages tagged by 'mytag':
oslo -n node1 -d rhel5 -a i686 -t mytag -b b1_6
- setup node1 with rhel5/i686 installing b1_6 LBATS patchless packages tagged by 'mytag':
oslo -n node1 -d rhel5 -a i686 -t mytag -b b1_6 -c
- setup node1 with rhel5/i686, executing /path/to/script1,/path/to/utility1,/path/to/script2 after setup:
oslo -n node1 -d rhel5 -a i686 -s /path/to/script1,/path/to/utility1,/path/to/script2
- setup node1 with rhel5/i686, installing lustre 1.6.5.1 release packages:
oslo -n node1 -d rhel5 -a i686 -r 1.6.5.1
- setup node1 with rhel5/i686, adding "mem=1G" to the kernel boot argument list:
oslo -n node1 -d rhel5 -a i686 -o "mem=1G"
- setup node1 with sles10/x86_64, installing rpms in /path/to/package/dir using the "--nodeps --force" RPM options:
oslo -n node1 -d sles10 -a x86_64 -p /path/to/package/dir -O "--nodeps --force"
Can I keep a copy of oslo?
Sure you can, but I don't recommend it.
OSLO is subject to change at any time. If you keep your own copy, you
run the risk of using an obsolete version that may malfunction.
So please use the official OSLO as much as you can.
How long does it take to set up some nodes?
Normally it takes 2-5 minutes to complete the whole process. But sometimes it can take up to 8-10 minutes or even more.
The total setup time depends on the following factors:
- the hardware
ia64 nodes boot much more slowly than ia32 nodes. Some sfire nodes take time to initialize their RAID controllers.
- the number of packages to install
The more packages you need to install, the more time it takes.
- the system load of the OSLO server
If multiple OSLO clients are requesting node setups, the setup process will surely slow down. When the OSLO server is doing a massive pool resync (a very rare situation), the setup speed will also be affected.
- the system configuration
If there is an InfiniBand adapter installed on the node, it takes some time (about 1 minute each) for the InfiniBand interfaces to be initialized.
Do I still need to re-set up the node after rebooting it?
Absolutely NOT, unless somebody else reserved the node and set it up after you did. No re-setup is needed after rebooting.
This is not the old setup_nodes.sh. Re-setup on reboot is a total
waste of your precious time and, of course, a waste of server resources.
So what do I get after setting up the nodes?
After setup, the nodes boot into the specified (or default)
kernel and the specified platform (distribution/architecture combination)
with the specified packages installed.
All the nodes you just set up share a common READ ONLY NFS root.
Only /etc, /var, and /tmp are writable so that you can do normal operations on each of the nodes.
Also, the following directories are mounted R/W through NFS from lts-head: /home, /testsuite, /opt/lts, /var/cache/cfs, /notbackedup.
These directories are mounted for YALA use exclusively: /export mounted R/W and /export/yala R/O.
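For instance, you can see this layout on a freshly set-up node (a quick manual check, not OSLO output; exact devices will vary):
mount | grep -E 'nfs|tmpfs'     # NFS root and lts-head exports vs. the tmpfs mounts
touch /etc/foo && rm /etc/foo   # succeeds: /etc is writable (tmpfs)
touch /usr/foo                  # fails with 'Read-only file system': the NFS root is R/O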
But what if I want to change something on the rootfs?
If you want to change something (like installing your own packages) on the rootfs, you can do it by chrooting into "/rw" on ONE of the nodes and making the changes. "/rw" is the same rootfs, mounted R/W.
Keep in mind that all the nodes you have set up share the same rootfs.
So when you modify anything under /rw on one of the nodes, it takes
effect on all your nodes (except, of course, /etc, /var and /tmp, because
they are mounted as tmpfs).
EXCEPTION: In some distributions, "/rw" is still READ ONLY due to an upstream bug described in . But you can remount it R/W when you need to. The command to remount the rootfs R/W is:
mount -o remount,rw $(awk '$2 == "/rw" {print $1}' /proc/mounts) /rw
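Once "/rw" is writable, installing your own package into the shared rootfs might look like this (a minimal sketch; mytool.rpm and /home/yourdir are hypothetical placeholders):
cp /home/yourdir/mytool.rpm /rw/tmp/   # stage the package inside the R/W rootfs
chroot /rw rpm -Uvh /tmp/mytool.rpm    # install it into the shared rootfs
rm /rw/tmp/mytool.rpm                  # clean up; the change now shows on all your nodes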
Is there any non-privileged user that can be used to run MPI programs?
YES.
mpiuser is a pre-existing non-privileged user set up to satisfy this need.
Once you log in as root, you can switch to it with "su - mpiuser".
Why does OSLO fail with a message like "nodeX is being setup by someone"?
The OSLO server tries to lock the nodes you are going to set up
before it actually sets them up. When it fails to lock some of the nodes,
it complains with the message you are seeing.
This happens mostly because you (or someone else) are setting up nodeX
using OSLO. If you suspend OSLO by sending it SIGSTOP (using
'CTRL+z', kill or whatever means you know of) and then execute
another OSLO instance to set up the same node, you'll probably see this
message. Sometimes, though very rarely, it happens when you (or someone else)
aborted OSLO (using 'CTRL+c', kill or whatever means you know of) but
the server is still doing cleanup before aborting.
So when you see such a message, make sure that you don't have a
running or suspended OSLO instance that is also trying to set up the same node. If
you are sure that you don't have such an OSLO instance running, then
probably someone else is setting up the node.
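A quick way to check on lts-head (a minimal sketch; the process-name match is an assumption):
jobs                                               # in the shell where you may have hit CTRL+z
ps -u $(whoami) -o pid,stat,args | grep '[o]slo'   # STAT 'T' means a stopped (suspended) instance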
Why do I get a message like "/tmp/xyz: No space left on device"?
You have probably set up your node with low-memory simulation using the '-m' or '--memory-size' option.
Keep in mind that /tmp, /etc and /var are mounted as tmpfs, which uses system memory to store its data.
So when you specify a relatively low memory size for the node, you'll
probably get ENOSPC while writing large files to /tmp, /etc or /var.
A workaround for this issue is to direct your application to write
files to a locally mounted device or an NFS-based directory such as
/home/yourdir or /testsuite.
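For example (myapp and its output flag are hypothetical placeholders):
df -h /tmp /etc /var                   # see how much tmpfs space is left
myapp --output=/home/yourdir/big.out   # write large files to an NFS-backed path instead of /tmp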
Why do I sometimes fail to ssh to the node(s) after setup or reboot?
THIS ISSUE IS NOW FIXED
Simply put, it was caused by the messy
mkdisklessinitrd script. When booting, the files
needed by PAM for SSH authentication were not successfully copied into
tmpfs. For a detailed description of this issue, please refer to and .
You can enable SSH access to the node(s) using this command on lts-head:
pdsh -S -w $node 'cp -a /rw/etc/pam.d/* /etc/pam.d'
I found some bugs
If you feel like you've found an OSLO bug, please file a bug blocking .
I want to add a feature to OSLO
Treat it as a bug. :-) Please refer to the section above.
I have some advice and/or suggestions
Advice, suggestions and comments are always welcome!
Please write or talk to Wang Yibin, the OSLO author and maintainer, via email (yibin.wang@sun.com) or IRC (nickname wangyb).
View current bugs
Click please.
Known issues
- sfire4 can't boot into its specified IP address due to BIOS settings. See bug .
- sfire6 console baud rate is set to 115200 rather than 9600. See bug .
- Sometimes the node is not accessible via SSH. See bug . Workarounds: 1) pdsh -S -w $node 'cp -a /rw/etc/pam.d/* /etc/pam.d'; or 2) reboot the node; or 3) re-set up the node.
Technical overview
This section is aimed at those who are interested in OSLO development.
Features
These are new features compared to the old setup_nodes.sh:
- Modularized functions and hierarchical function calls;
- Setupnode job scheduler daemon;
- Automatic job dispatcher to the corresponding server;
- Server/Client model job handling;
- Zombie (suspended) client auto-detection;
- Locking nodes before actually setting them up;
- Supplier/Consumer model adaptive OS image pool;
- On-demand pool refreshing;
- Spinlock support in various modules;
- Fault-tolerant power management and PXE configuration;
- Flexible configuration;
- Easy setup on different distributions (Ubuntu/RedHat);
- OS image reuse.
Components
Infrastructure
- Pristine OS images
These are the operating system images used as 'upstream', in other words
'pristine', images. They are the original copies that the OS image pool
mirrors. They are located on the OSLO server, and the location can be
configured in the server's profile under
$OSLO_ROOT/common/profile/$SERVER_HOSTNAME
Current OS images include RHEL3/RHEL4/RHEL5/SLES9/SLES10 on i686/x86_64/ia64 architectures, except SLES9/ia64.
- NFS server
It is located on the OSLO server, and the OS images that client nodes use are exported through it. Note that the RHEL version of the NFS server contains a bug that prevents it from restarting successfully. I have fixed this, and the working nfs service script can be found in $OSLO_ROOT/bin/daemons/nfs.ia64.modified
- RPM package management
Currently OSLO only supports RPM package installation, but new package management backends can be added.
- Power management device
This is a facility located on lts-head that does node power management. OSLO uses it to power-cycle the nodes.
- TFTP boot server
This is a facility located on lts-head that enables nodes to boot disklessly via PXE.
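For illustration, generating a per-node PXE configuration of the kind such a server serves might look like this (the MAC-based file name follows the pxelinux convention; all paths, the MAC address and the NFS server address are made-up examples, not OSLO's actual layout):
cat > /tftpboot/pxelinux.cfg/01-00-11-22-33-44-55 <<EOF
DEFAULT linux
LABEL linux
    KERNEL images/sles10-x86_64/vmlinuz
    APPEND initrd=images/sles10-x86_64/initrd.img root=/dev/nfs nfsroot=10.0.0.1:/export/images/sles10-x86_64 ro
EOF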
Modules
- LVM management
Handles logical volume operations like create, rename, mount, delete, etc.
- NFS management
Handles NFS operations like export, unexport, etc.
- Adaptive OS image pool
Maintains a pool of OS images ready for use. Provides an API to the OSLO server for OS image renting.
- Package management
Deals with package installation, upgrade, deletion, etc.
- PXE boot configuration
Deals with diskless boot through PXE.
- Node management
Deals with node power management etc.
- Spinlock system
Provides a common mechanism for exclusive operations on shared objects.
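For illustration, a shell spinlock can be built on the atomicity of mkdir (a minimal sketch of the idea, not OSLO's actual code; the lock path is hypothetical):
LOCKDIR=/var/lock/oslo/node1.lock      # hypothetical lock path
while ! mkdir "$LOCKDIR" 2>/dev/null; do
    sleep 1                            # spin: mkdir either atomically succeeds or fails
done
# ... perform the exclusive operations on the shared object here ...
rmdir "$LOCKDIR"                       # release the lock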
Daemons
- OS image pool service daemon
This daemon provides the following functionalities:
- Replaying (actually remounting and re-exporting) user OS images on startup or server reboot;
- Removing OS images left incomplete by an abnormal service exit or server reboot;
- Reusing OS images when an unused OS image is detected;
- Garbage collection;
- Adaptive pool refilling;
- On-demand pool refreshing.
- Job scheduler service daemon
This daemon monitors OSLO client setup requests and serves them by dispatching each request to the corresponding OSLO server.
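Conceptually, dispatching from a shared queue directory might look like this (an illustrative sketch only; the file layout and the pick_server helper are hypothetical, not OSLO's actual implementation):
while true; do
    for job in "$JOBDIR_ROOT"/queue/*.job; do
        [ -e "$job" ] || continue            # glob matched nothing
        server=$(pick_server "$job")         # hypothetical dispatch policy
        mv "$job" "$JOBDIR_ROOT/$server/"    # hand the job to that server's queue
    done
    sleep 5
done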
Utilities
- Make diskless initrd utility
- OSLO Job scheduler
- OSLO server
- OSLO client
- OSLO installer
Deployment guide
The following steps are done on the box that you choose to act as the OSLO server:
- set up either RHEL or Ubuntu on the box;
- set up LVM (optionally on RAID);
- create pristine OS images;
- configure NFS/LVM to be usable;
- check out the OSLO source from CVS (qe/oslo) to, say, /root/oslo;
- create a profile for your server in /root/oslo/common/profile/$(hostname). You can copy /root/oslo/common/profile/sample and edit it as needed;
- cd /root/oslo/bin/misc and run ./do_install_checks to check whether the server is ready to act as OSLO server. Fix any issue that is found during the check;
- cd /root/oslo/bin/daemons;
- edit setup_daemons and modify "instdir" to where you want to install oslo;
- execute setup_daemons and there you go!
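Put together, the checkout and daemon setup steps above might look like this (a sketch assuming CVSROOT already points at the repository containing qe/oslo):
cd /root && cvs checkout qe/oslo && mv qe/oslo /root/oslo
cp /root/oslo/common/profile/sample /root/oslo/common/profile/$(hostname)
vi /root/oslo/common/profile/$(hostname)        # edit the profile as needed
cd /root/oslo/bin/misc && ./do_install_checks   # fix any issue it reports
cd /root/oslo/bin/daemons
vi setup_daemons                                # point "instdir" at the install location
./setup_daemons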
To install an oslo client, all you need to do is:
- copy oslo/bin/snclient to a dir that you choose, say /bin;
- make sure $JOBDIR_QUEUE_ROOT is the same as $JOBDIR_ROOT in oslo/bin/snjsd. This dir should be an NFS-exported dir.
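For example (the grep is just a rough way to eyeball the two settings; adjust it to wherever the variables are actually defined):
cp oslo/bin/snclient /bin/
grep 'JOBDIR_QUEUE_ROOT=' /bin/snclient   # compare this path...
grep 'JOBDIR_ROOT=' oslo/bin/snjsd        # ...with this one; they must match and be NFS-exported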