Pre-install Configuration
- Configure RSH
rsh must be configured to allow passwordless communication between the head node and the compute nodes. This is done by creating /etc/hosts.equiv on the head node and on each compute node.
- Head node /etc/hosts.equiv
#This file contains a list of all of the compute node hostnames
node01
node02
-
-
nodeXX
- Compute node /etc/hosts.equiv
#This file contains the head node hostname
node00
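For clusters with many nodes, the head node's list can be generated rather than typed. The following is a minimal sketch that assumes the nodeXX naming scheme used in this guide; the function name gen_compute_nodes is hypothetical.

```shell
#!/bin/sh
# Print compute-node hostnames node01..nodeNN, one per line.
# Assumes the two-digit nodeXX naming scheme used in this guide.
gen_compute_nodes() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'node%02d\n' "$i"
        i=$((i + 1))
    done
}

# Example: preview the list for a 4-node cluster.
gen_compute_nodes 4
```

Once the preview looks right, the output can be redirected as root, for example `gen_compute_nodes 16 > /etc/hosts.equiv`. Note that this overwrites any existing file.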
- Enable rsh, rlogin, and rexec
Change "disable = yes" to "disable = no" in each service's xinetd configuration file in /etc/xinetd.d, then reload xinetd:
# /sbin/service xinetd reload
Security note: Make sure you are on a private network, or that your firewall is properly configured so as to deny access from all untrusted IP addresses.
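The edit to the xinetd files can also be scripted. This is a sketch only: the function name enable_xinetd_service is hypothetical, and it assumes the file contains a line of the form "disable = yes" (with or without spaces around the equals sign).

```shell
#!/bin/sh
# Rewrite "disable = yes" to "disable = no" in one xinetd service file.
# The file is rewritten in place via a temporary copy, so its ownership
# and permissions are left unchanged.
enable_xinetd_service() {
    file=$1
    tmp=$(mktemp) || return 1
    sed 's/^\([[:space:]]*disable[[:space:]]*=[[:space:]]*\)yes[[:space:]]*$/\1no/' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Assuming the files are named after the services, it could be applied as root with `for svc in rsh rlogin rexec; do enable_xinetd_service /etc/xinetd.d/$svc; done`, followed by the xinetd reload shown above.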
- Test rsh
From the head node, use rsh to log in to a compute node as a non-root user
$ rsh nodeXX
From the compute node, use rsh to log in to the head node as a non-root user
$ rsh node00
If a password prompt appears in either case, rsh is not configured correctly.
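With more than a few nodes, the check from the head node can be run in a loop. The sketch below runs "echo ok" on each node and compares the output, so it does not rely on the exit status reported by rsh. The function name check_rsh is hypothetical; the remote shell command is passed as the first argument.

```shell
#!/bin/sh
# check_rsh REMOTE_SHELL NODE...
# Run "echo ok" on every node through REMOTE_SHELL and report the result.
# Returns 1 if any node did not answer as expected.
check_rsh() {
    remote_shell=$1
    shift
    failed=0
    for node in "$@"; do
        # stdin comes from /dev/null so the remote shell does not
        # consume the caller's input.
        out=$("$remote_shell" "$node" echo ok </dev/null 2>/dev/null)
        if [ "$out" = "ok" ]; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

Run as a non-root user, for example `check_rsh rsh node01 node02`. Any node reported as FAILED should have its /etc/hosts.equiv and xinetd settings rechecked.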
- Download OpenPBS
OpenPBS can be obtained for free from ; registration is required. Download OpenPBS_2_3_16.tar.gz into a directory such as /home/download on the head node. RPMs are also available, but they are not yet compatible with the gcc 3.0+ that ships with recent versions of Linux.
Installation
If you are using the pre-compiled RPMs, you may simply install them and skip this section. Note that the RPMs are incompatible with gcc 3.0+ and may complain of a binary incompatibility; in that case, compile OpenPBS from source as described below.
- Untar PBS
# cd /home/download
# tar -xvzf OpenPBS_2_3_16.tar.gz
# cd OpenPBS_2_3_16/
- Patch PBS files
Download the following .
# cd /home/download/OpenPBS_2_3_16
# patch -p1 -b < pbs.patch
- Compile PBS
- Head Node
# mkdir /home/download/OpenPBS_2_3_16/head
# cd /home/download/OpenPBS_2_3_16/head
# ../configure --disable-gui --set-server-home=/var/spool/PBS --set-default-server=node00
# make
# make install
This disables the GUI and sets the PBS home directory (where the configuration, log, and spool files are kept) to /var/spool/PBS instead of /usr/spool/PBS. It also sets the default PBS server to node00, which should be the hostname of the head node. The source is then compiled and the binaries are installed.
- Compute Nodes
# mkdir /home/download/OpenPBS_2_3_16/compute
# cd /home/download/OpenPBS_2_3_16/compute
# ../configure --disable-gui --set-server-home=/var/spool/PBS --disable-server --set-default-server=node00 --set-sched=no
# make
# rsh nodeXX 'cd /home/download/OpenPBS_2_3_16/compute; make install'
After the head node is installed successfully, the configure script is run again in a separate build directory for the compute nodes. As before, this disables the GUI, sets the PBS home directory to /var/spool/PBS instead of /usr/spool/PBS, and sets the default PBS server to node00, which should be the hostname of the head node. It also disables the server and scheduler processes on the compute nodes, since they are not needed there. The source is then compiled on the head node and installed on each compute node over rsh.
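The rsh install command has to be repeated for every compute node. The sketch below prints one such command per hostname listed in a file, so the list can be reviewed before anything is run. It assumes the build directory is visible at the same path on every compute node (for example, a shared /home); the function name list_install_cmds is hypothetical.

```shell
#!/bin/sh
# list_install_cmds HOSTS_FILE BUILD_DIR
# Print an "rsh NODE 'cd BUILD_DIR; make install'" line for each hostname
# in HOSTS_FILE. Blank lines and lines starting with "#" are skipped.
list_install_cmds() {
    hosts_file=$1
    build_dir=$2
    awk -v dir="$build_dir" '
        /^[[:space:]]*(#|$)/ { next }
        { printf "rsh %s '\''cd %s; make install'\''\n", $1, dir }
    ' "$hosts_file"
}
```

On the head node, `list_install_cmds /etc/hosts.equiv /home/download/OpenPBS_2_3_16/compute` prints the commands; piping the output to `sh` as root would run them.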
- Install documentation
# cd /home/download/OpenPBS_2_3_16/head/doc
# make install
Configure PBS
- Create PBS node description file
On the head node, create the file /var/spool/PBS/server_priv/nodes.
#This file contains a list of all of the compute node hostnames
node01 np=1
node02 np=1
-
-
nodeXX np=1
where nodeXX is the hostname of the compute node and np is the number of processors on that node.
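If the processor count of a node is not known, it can be read from /proc/cpuinfo on Linux. This is a sketch under that assumption; the function names count_processors and nodes_entry are hypothetical.

```shell
#!/bin/sh
# Count the "processor" entries in a cpuinfo-style file.
# Defaults to /proc/cpuinfo when no file is given.
count_processors() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Print one nodes-file entry in the form "HOSTNAME np=N".
nodes_entry() {
    printf '%s np=%s\n' "$1" "$2"
}
```

Running count_processors on a compute node gives the value to use for np, and `nodes_entry node01 2` prints the matching line for the nodes file.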
- Configure PBS mom
On the head node and on each compute node, create the file /var/spool/PBS/mom_priv/config.
#/var/spool/PBS/mom_priv/config
$logevent 0x0ff
$clienthost node00
$restricted node00
This causes all messages except for debugging messages to be logged and sets the primary server to node00. It also allows node00 to monitor OpenPBS.
On the head node and on each compute node, start PBS mom with
# /usr/local/sbin/pbs_mom
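Since the same mom configuration is needed on every machine, writing it from a small script avoids copy errors. The following sketch reproduces the three directives above; the function name write_mom_config is hypothetical, and the target path can be overridden for a dry run.

```shell
#!/bin/sh
# write_mom_config SERVER [PATH]
# Write the three mom directives used in this guide, with SERVER as the
# client and restricted host. PATH defaults to the mom_priv/config file.
write_mom_config() {
    server=$1
    path=${2:-/var/spool/PBS/mom_priv/config}
    cat > "$path" <<EOF
\$logevent 0x0ff
\$clienthost $server
\$restricted $server
EOF
}
```

For example, `write_mom_config node00 /tmp/mom_config` writes a copy for inspection, while `write_mom_config node00` run as root writes the real file. pbs_mom must be started (or restarted) afterwards to read it.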
- Configure PBS server
On the head node,
# /usr/local/sbin/pbs_server -t create
# qmgr
>c q workq
>s q workq queue_type=execution
>s q workq enabled=true
>s q workq started=true
>s s default_queue=workq
>s s scheduling=true
>s s query_other_jobs=true
>s s node_pack=false
>s s log_events=511
>s s scheduler_iteration=600
>s s resources_default.neednodes=1
>s s resources_default.nodect=1
>s s resources_default.nodes=1
>quit
This creates an execution queue called workq that is enabled and started. It is then declared the default queue for the server. Logging and scheduling are enabled on the server and node_pack is set to false. The default number of nodes is set to 1.
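The interactive session uses qmgr's one-letter abbreviations (c for create, s for set, q for queue, s for server). For a repeatable setup, the same settings can be written out in full and fed to qmgr on standard input. This is a sketch; the function name pbs_server_settings is hypothetical, and the queue name is a parameter that defaults to workq.

```shell
#!/bin/sh
# Print the queue and server settings from this section as unabbreviated
# qmgr directives, one per line. The queue name defaults to workq.
pbs_server_settings() {
    queue=${1:-workq}
    cat <<EOF
create queue $queue
set queue $queue queue_type = execution
set queue $queue enabled = true
set queue $queue started = true
set server default_queue = $queue
set server scheduling = true
set server query_other_jobs = true
set server node_pack = false
set server log_events = 511
set server scheduler_iteration = 600
set server resources_default.neednodes = 1
set server resources_default.nodect = 1
set server resources_default.nodes = 1
EOF
}

# Preview the directives.
pbs_server_settings
```

On the head node, with pbs_server already running, `pbs_server_settings | qmgr` run as root would apply them in one step.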
It is useful to back up the PBS server configuration with
# qmgr -c "print server" > /var/spool/PBS/qmgr.conf
so that the PBS server can later be restored with
# qmgr < /var/spool/PBS/qmgr.conf
- Start PBS scheduler
On the head node, the PBS scheduler needs to be started after the server configuration is complete. This is done with
# /usr/local/sbin/pbs_sched
- Enable PBS on startup
On the head node and on each compute node,
- Create the script .
- Set PBS to automatically restart when the computer boots
# chkconfig pbs on
- Restart PBS
On the head node and on each compute node, manually restart OpenPBS so that all of the configuration changes take effect:
# /sbin/service pbs stop
# /sbin/service pbs start
Testing PBS
After installing and configuring OpenPBS, test it to verify that everything is working properly. To do this, create a simple test script:
#!/bin/sh
#testpbs
echo This is a test
echo Today is `date`
echo This is `hostname`
echo The current working directory is `pwd`
ls -alF /home
uptime
and then submitted to PBS using the qsub command as a non-root user.
$ qsub testpbs
After the job is executed, the output is stored in the directory from which the job was submitted. If errors occur or no output is received, check the log files for messages about the job (especially the precise name used by PBS for the head node in the server_logs).
# more /var/spool/PBS/*_logs/*
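To find the output of the test job, it helps to know the file name in advance. The sketch below assumes PBS's default naming, where qsub prints a job ID of the form sequence.server and the job's standard output is written to jobname.osequence; the function name output_file is hypothetical.

```shell
#!/bin/sh
# output_file JOB_NAME JOB_ID
# Print the default stdout file name for a job, given the job ID that
# qsub printed (for example "12.node00").
output_file() {
    job_name=$1
    job_id=$2
    # Keep only the sequence number in front of the first dot.
    seq=${job_id%%.*}
    printf '%s.o%s\n' "$job_name" "$seq"
}
```

For example, if `qsub testpbs` printed 12.node00, then `output_file testpbs 12.node00` prints testpbs.o12, which should appear in the submission directory once the job has finished.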