Category: LINUX
2008-10-24 18:32:55
pdsh - issue commands to groups of hosts in parallel
pdsh [options]... command
pdsh is a variant of the rsh(1) command. Unlike rsh(1), which runs commands on a single remote host, pdsh can run multiple remote commands in parallel. pdsh uses a "sliding window" (or fanout) of threads to conserve resources on the initiating host while allowing some connections to time out.
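The sliding window can be pictured with a small Python sketch. It is not pdsh's implementation: it only assumes that at most `fanout` hosts are serviced at once and that a pending host starts as soon as a slot frees up. The names `run_with_fanout` and `fake_remote` are hypothetical, and `fake_remote` stands in for a real remote command.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_fanout(hosts, run_one, fanout=4):
    """Call run_one(host) for every host, with at most `fanout` running at once."""
    hosts = list(hosts)
    # The pool keeps up to `fanout` workers busy. When one host finishes,
    # the next pending host takes its slot: the "sliding window".
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        results = pool.map(run_one, hosts)
        return dict(zip(hosts, results))


def fake_remote(host):
    # Hypothetical stand-in for a remote command; returns an exit status.
    return 1 if host == "foo3" else 0


statuses = run_with_fanout(["foo1", "foo2", "foo3", "foo4"], fake_remote, fanout=2)
print(statuses)
```

A small fanout bounds the number of threads and sockets used on the initiating host, at the cost of a longer total run time when many hosts are slow.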
When pdsh receives SIGINT (ctrl-C), it lists the status of current threads. A second SIGINT within one second terminates the program. Pending threads may be canceled by issuing ctrl-Z within one second of ctrl-C. Pending threads are those that have not yet been initiated, or are still in the process of connecting to the remote host.
If a remote command is not specified on the command line, pdsh runs interactively, prompting for commands and executing them when terminated with a carriage return. In interactive mode, target nodes that time out on the first command are not contacted for subsequent commands, and commands prefixed with an exclamation point will be executed on the local system.
The core functionality of pdsh may be supplemented by dynamically loadable modules. The modules may provide a new connection protocol (replacing the standard rcmd(3) protocol used by rsh(1)), filtering options (e.g., removing hosts that are "down" from the target list), and/or host selection options (e.g., -a selects all hosts from a configuration file). By default, pdsh must have at least one "rcmd" module loaded. See the RCMD MODULES section for more information.
RCMD MODULES
The method by which pdsh runs commands on remote hosts may be selected at runtime using the -R option (See OPTIONS below). This functionality is ultimately implemented via dynamically loadable modules, and so the list of available options may be different from installation to installation. A list of currently available rcmd modules is printed when using any of the -h, -V, or -L options. The default rcmd module will also be displayed with the -h and -V options.
A list of rcmd modules currently distributed with pdsh follows.
rsh
Uses an internal, thread-safe implementation of BSD rcmd(3) to run commands using the standard rsh(1) protocol.
ssh
Uses a variant of popen(3) to run multiple copies of the ssh(1) command.
mrsh
This module uses the mrsh(1) protocol to execute jobs on remote hosts. The mrsh protocol uses credential-based authentication, forgoing the need to allocate reserved ports. In other respects, it acts just like rsh. Remote nodes must be running mrshd(8) in order for the mrsh module to work.
qsh
Allows pdsh to execute MPI jobs over QsNet. Qshell propagates the current working directory, pdsh environment, and Elan capabilities to the remote process. The following environment variables are also appended to the environment: RMS_RANK, RMS_NODEID, RMS_PROCID, RMS_NNODES, and RMS_NPROCS. Since pdsh needs to run setuid root for qshell support, qshell does not directly support propagation of LD_LIBRARY_PATH and LD_PREOPEN. Instead, the QSHELL_REMOTE_LD_LIBRARY_PATH and QSHELL_REMOTE_LD_PREOPEN environment variables may be used; if set, the qshell daemon remaps them to LD_LIBRARY_PATH and LD_PREOPEN.
mqsh
Similar to qshell, but uses the mrsh protocol instead of the rsh protocol.
krb4
The krb4 module allows users to execute remote commands after authenticating with Kerberos. The remote rshd daemons must be kerberized.
xcpu
The xcpu module uses the xcpu service to execute remote commands.
OPTIONS
The list of available options is determined at runtime by supplementing the list of standard pdsh options with any options provided by loaded rcmd and misc modules. In some cases, options provided by modules may conflict with each other. In these cases, the modules are incompatible and the first module loaded wins.
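The "first module loaded wins" rule can be sketched as follows. This is an illustration, not pdsh's module loader: it assumes that a module whose options collide with an already loaded module is skipped entirely, and the function name `load_modules` and the load order shown are hypothetical. The option letters are the ones this page lists for the genders and SDR modules.

```python
def load_modules(modules):
    """modules: list of (name, [option letters]) in load order.

    Returns (loaded module names, mapping of option letter -> owning module).
    """
    owner = {}
    loaded = []
    for name, options in modules:
        if any(opt in owner for opt in options):
            # Assumption: a conflicting module is rejected as a whole.
            continue
        for opt in options:
            owner[opt] = name
        loaded.append(name)
    return loaded, owner


loaded, owner = load_modules([
    ("genders", ["a", "A", "i"]),
    ("sdr", ["a", "i", "v", "G"]),   # clashes with genders on -a and -i
])
print(loaded)
```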
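-w [rcmd_type:][user@]host,host,...
Target the specified list of hosts. An optional "rcmd_type:" prefix selects the rcmd module used for that host, and an optional "user@" prefix selects the remote user name.
-S
Return the largest of the remote command return values.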
-h
Output usage menu and quit. A list of available rcmd modules will also be printed at the end of the usage message.
-s
Only on AIX, separate remote command stderr and stdout into two sockets.
-q
List option values and the target nodelist and exit without action.
-b
Disable the ctrl-C status feature so that a single ctrl-C kills the parallel job (batch mode).
-L
List info on all loaded pdsh modules and quit.
-d
Include more complete thread status when SIGINT is received, and display connect and command time statistics on stderr when done.
-V
Output pdsh version information, along with list of currently loaded modules, and exit.
-n tasks_per_node
Set the number of tasks spawned per node.
-a
Target all nodes from machines file.
In addition to the genders options presented below, the genders attribute pdsh_rcmd_type may also be used in the genders database to specify an rcmd connect type other than the pdsh default for hosts with this attribute. For example, the following line in the genders file
host0 pdsh_rcmd_type=ssh
would cause pdsh to use ssh to connect to host0, even if rsh were the default. This can be overridden on the command line with the "rcmd_type:host0" syntax.
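The precedence described here (command-line prefix first, then the genders attribute, then the default) can be sketched in Python. The function `rcmd_type_for` and the dictionary standing in for the genders database are hypothetical, and an optional "user@" part is ignored to keep the sketch short.

```python
def rcmd_type_for(target, genders_attrs, default="rsh"):
    """Return (rcmd_type, host) for one target such as 'ssh:host0' or 'host0'."""
    if ":" in target:
        # An explicit "rcmd_type:host" prefix on the command line wins.
        rcmd_type, host = target.split(":", 1)
        return rcmd_type, host
    # Otherwise use the pdsh_rcmd_type attribute, falling back to the default.
    attrs = genders_attrs.get(target, {})
    return attrs.get("pdsh_rcmd_type", default), target


# Stand-in for the genders line "host0 pdsh_rcmd_type=ssh".
genders = {"host0": {"pdsh_rcmd_type": "ssh"}}

print(rcmd_type_for("host0", genders))
print(rcmd_type_for("rsh:host0", genders))
print(rcmd_type_for("host1", genders))
```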
-A
Target all nodes in genders database. The -A option will target every host listed in genders -- if you want to omit some hosts by default, see the -a option below.
-a
Target all nodes in genders database except those with the "pdsh_all_skip" attribute. This is shorthand for running "pdsh -A -X pdsh_all_skip ..."
-i
Request translation between canonical and alternate hostnames.
-v
Eliminate target nodes that are considered "down" by libnodeupdown.
The slurm module allows pdsh to target nodes based on currently running SLURM jobs. The slurm module is typically called after all other node selection options have been processed, and if no nodes have been selected, the module will attempt to read a running jobid from the SLURM_JOBID environment variable (which is set when running under a SLURM allocation). If SLURM_JOBID references an invalid job, it will be silently ignored.
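The fallback order described for the slurm module can be sketched as below. This is only an illustration of the stated behavior: `slurm_targets` is a hypothetical name, and `nodes_for_job` is a hypothetical lookup that raises KeyError for an unknown job (the real module queries SLURM).

```python
import os


def slurm_targets(selected, nodes_for_job, environ=os.environ):
    """Return the target list after the slurm fallback has been applied."""
    if selected:
        return selected              # other options already chose the nodes
    jobid = environ.get("SLURM_JOBID")
    if jobid is None:
        return []
    try:
        return nodes_for_job(jobid)
    except KeyError:
        return []                    # invalid job: silently ignored


# Hypothetical job table used only for this example.
jobs = {"42": ["foo1", "foo2"]}
print(slurm_targets([], jobs.__getitem__, {"SLURM_JOBID": "42"}))
```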
The rms module allows pdsh to target nodes based on an RMS resource. The rms module is typically called after all other node selection options, and if no nodes have been selected, the module will examine the RMS_RESOURCEID environment variable and attempt to set the target list of hosts to the nodes in the RMS resource. If an invalid resource is denoted, the variable is silently ignored.
The SDR module supports targeting hosts via the System Data Repository on IBM SPs.
-a
Target all nodes in the SDR. The list is generated from the "reliable hostname" in the SDR by default.
-i
Translate hostnames between reliable and initial in the SDR, when applicable. If a target hostname matches either the initial or reliable hostname in the SDR, the alternate name will be substituted. Thus a list composed of initial hostnames will instead be replaced with a list of reliable hostnames. For example, when used with -a above, all initial hostnames in the SDR are targeted.
-v
Do not target nodes that are marked as not responding in the SDR on the targeted interface. (If a hostname does not appear in the SDR, then that name will remain in the target hostlist.)
-G
In combination with -a, include all partitions.
The nodeattr module supports access to the genders database via the nodeattr(1) command. See the genders section above for a list of supported options with this module. The option usage with the nodeattr module is the same as genders, above, with the exception that the -i option may only be used with -a or -g.
The dshgroup module allows pdsh to use dsh (or Dancer's shell) style group files from /etc/dsh/group/ or ~/.dsh/group/.
The netgroup module allows pdsh to use standard netgroup entries to build lists of target hosts. (/etc/netgroup or NIS)
PDSH_RCMD_TYPE
Equivalent to the -R option: sets the default rcmd module for pdsh to use (e.g., ssh, rsh).
WCOLL
If no other node selection option is used, the WCOLL environment variable may be set to a filename from which a list of target hosts will be read. The file should contain a list of hosts, one per line, though each line may contain a hostlist expression (see the HOSTLIST EXPRESSIONS section below).
FANOUT
Set the pdsh fanout (See description of -f above).
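The WCOLL behavior described above amounts to reading one entry per line from the named file. A minimal Python sketch, assuming blank lines are skipped and leaving hostlist expressions unexpanded (the function name `hosts_from_wcoll` is hypothetical):

```python
import os


def hosts_from_wcoll(environ=os.environ):
    """Return the entries of the file named by WCOLL, or [] if WCOLL is unset."""
    path = environ.get("WCOLL")
    if not path:
        return []
    with open(path) as handle:
        # One host (or hostlist expression) per line; skip blank lines.
        return [line.strip() for line in handle if line.strip()]
```

For example, with WCOLL pointing at a file that contains the lines "foo1" and "foo[2-4]", the function returns those two entries as written.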
HOSTLIST EXPRESSIONS
As noted in sections above, pdsh accepts lists of hosts in the general form: prefix[n-m,l-k,...], where n < m and l < k, etc., as an alternative to explicit lists of hosts. This form should not be confused with regular expression character classes (also denoted by "[]"). For example, foo[19] does not represent an expression matching foo1 or foo9, but rather represents the degenerate hostlist: foo19.
The hostlist syntax is meant only as a convenience on clusters with a "prefixNNN" naming convention and specification of ranges should not be considered necessary -- thus foo1,foo9 could be specified as such, or by the hostlist foo[1,9].
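The expansion rule can be made concrete with a short Python sketch. It is not pdsh's parser: it handles a single prefix[...] expression, and it assumes that zero padding follows the width of the lower bound of each range. The name `expand_hostlist` is hypothetical.

```python
import re


def expand_hostlist(expr):
    """Expand 'prefix[n-m,l-k,...]' into a list of hostnames."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]", expr)
    if match is None:
        return [expr]               # plain hostname, nothing to expand
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        low, _, high = part.partition("-")
        high = high or low          # a single number is a range of one
        width = len(low)            # assumption: padding follows the lower bound
        for n in range(int(low), int(high) + 1):
            hosts.append(f"{prefix}{n:0{width}d}")
    return hosts


print(expand_hostlist("foo[01-05]"))
print(expand_hostlist("foo[7,9-10]"))
print(expand_hostlist("foo[19]"))
```

Note that foo[19] yields the single host foo19, which is the degenerate case described above.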
Some examples of usage follow:
Run command on foo01,foo02,...,foo05
pdsh -w foo[01-05] command
Run command on foo7,foo9,foo10
pdsh -w foo[7,9-10] command
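Run command on foo0,foo4,foo5
pdsh -w foo[0-5] -x foo[1-3] command
Some shells interpret brackets ('[' and ']') for pattern matching, so it may be necessary to enclose ranged lists in quotes:
pdsh -w "foo[01-05]" command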
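Brackets are also glob characters in most shells, which is why the last example quotes the hostlist. When a pdsh command line is assembled from a script, one way to get the quoting right is to build the argument list first and let the standard library quote it. The sketch below only builds and prints the command line; it does not run pdsh, and "uptime" is just a placeholder remote command.

```python
import shlex

# Arguments as pdsh should receive them, one list element per argument.
argv = ["pdsh", "-w", "foo[01-05]", "uptime"]

# shlex.join quotes any argument containing shell-special characters.
command_line = shlex.join(argv)
print(command_line)  # pdsh -w 'foo[01-05]' uptime
```

Passing `argv` directly to a process-spawning call that does not go through a shell avoids the quoting question altogether.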
Originally a rewrite of IBM dsh(1) by Jim Garlick on LLNL's ASCI Blue-Pacific IBM SP system. It is now used on Linux clusters at LLNL.
When using ssh for remote execution, expect the stderr of ssh to be folded in with that of the remote command. When invoked by pdsh, it is not possible for ssh to prompt for passwords if RSA/DSA keys are not configured properly, etc. Additionally, the connect timeout is not adjustable when ssh is used. Finally, there is no reliable way for pdsh to ensure that remote commands are actually terminated when using a command timeout. Thus, if -u is used with ssh, commands may be left running on remote hosts even after the timeout has killed the local ssh processes.
Output from multiple processes per node may be interspersed when using qshell or mqshell rcmd modules.
Hostlist parsing assumes the numerical part of the hostname is at the end only, e.g., specifying foo[0-5]bar will not work.
The number of nodes that pdsh can simultaneously execute remote jobs on is limited by the maximum number of threads that can be created concurrently, as well as the availability of reserved ports in the rsh and qshell rcmd modules. On systems that implement POSIX threads, the limit is typically defined by the constant PTHREAD_THREADS_MAX.