HP XC System 2.x Software User Manual
Table 6-1: SLURM Commands (cont.)
Command     Function
sinfo       Reports the state of partitions and nodes managed by SLURM. It has
            a wide variety of filtering, sorting, and formatting options. sinfo
            displays a summary of available partition and node (not job)
            information (such as partition names, nodes/partition, and
            CPUs/node).
scontrol    Is an administrative tool used to view or modify the SLURM state.
            Typically, users do not need to access this command. Therefore, the
            scontrol command can only be executed as user root. Refer to the
            HP XC System Software Administration Guide for information about
            using this command.
The --help command option also provides a brief summary of SLURM options. Note that
command options are case sensitive; for example, srun treats -n (number of tasks) and
-N (number of nodes) as different options.
6.3 Accessing the SLURM Manpages
You can also view online descriptions of these commands by accessing the SLURM manpages.
Manpages are provided for all SLURM commands and API functions. If the SLURM manpages
are not already available through your MANPATH environment variable, you can set and
export it as follows:
$ MANPATH=$MANPATH:/opt/hptc/man
$ export MANPATH
You can now access the SLURM manpages with the standard man command. For example:
$ man srun
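If the MANPATH assignment above is placed in a shell startup file, the directory is appended again each time the file is read. The following is a minimal sketch of one way to add it only once. The function name add_manpath is hypothetical and is not part of HP XC or SLURM.

```shell
# Hypothetical helper: append a directory to MANPATH only if it is not
# already listed. The append uses the same form as the manual's command.
add_manpath() {
    dir=$1
    case ":${MANPATH:-}:" in
        *":$dir:"*) ;;                        # already present, nothing to do
        *) MANPATH="${MANPATH:-}:$dir" ;;     # same as MANPATH=$MANPATH:dir
    esac
    export MANPATH
}

add_manpath /opt/hptc/man
```

Calling add_manpath a second time with the same directory leaves MANPATH unchanged.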
6.4 Launching Jobs with the srun Command
The srun command submits jobs to run under SLURM management. Jobs can be submitted to
run in parallel on multiple compute nodes. srun is used to submit a job for execution,
allocate resources, attach to an existing allocation, or initiate job steps. srun can
perform the following:
• Submit a batch job and then terminate
• Submit an interactive job and then persist to shepherd the job as it runs
• Allocate resources to a shell and then spawn that shell for use in running subordinate jobs
Jobs can be submitted for immediate execution or for later execution (batch). srun has a
wide variety of options to specify resource requirements, including minimum and maximum
node count, processor count, specific nodes to use or not use, and specific node
characteristics (such as the amount of memory or disk space, or certain required
features). Besides securing a resource allocation, srun is used to initiate job steps.
These job steps can execute sequentially or in parallel on independent or shared nodes
within the job's node allocation.
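As a rough illustration of how the resource requirements described above can map onto srun options, the sketch below assembles a command line and prints it without submitting anything. The long option names are taken from the srun manpage and are assumed to be available in the installed SLURM version; confirm them with man srun. The node names n1 and n3, the feature name bigmem, and the function name build_srun are hypothetical.

```shell
# Hypothetical helper: print an srun command line for a set of resource
# requirements. Nothing is submitted; the command is only echoed.
build_srun() {
    # --nodes=2-4         minimum of 2 and maximum of 4 nodes
    # --ntasks=8          8 tasks (processors) in total
    # --nodelist=n1       node n1 must be part of the allocation
    # --exclude=n3        node n3 must not be used
    # --constraint=bigmem only nodes with the (assumed) feature "bigmem"
    echo srun --nodes=2-4 --ntasks=8 --nodelist=n1 --exclude=n3 \
        --constraint=bigmem "$@"
}

build_srun hostname
```

On a system with SLURM installed, removing the echo would launch the job instead of printing the command.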
Example 6-1: Simple Launch of a Serial Program
$ srun -n2 -l hostname
0: n1
1: n1
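In Example 6-1, -n2 requests two tasks and -l labels each output line with the number of the task that produced it, so the output shows tasks 0 and 1 both running on node n1. The following sketch summarizes such labeled output as a count of tasks per node. The function name tasks_per_node is hypothetical, and the sample input is copied from the example so that the block runs without a cluster.

```shell
# Hypothetical helper: count how many tasks ran on each node, given
# "task: host" lines (as produced by srun -l hostname) on standard input.
tasks_per_node() {
    awk -F': ' '{ count[$2]++ } END { for (host in count) print host, count[host] }' | sort
}

# Sample input copied from Example 6-1; on a cluster, pipe in the output of
#   srun -n2 -l hostname
printf '0: n1\n1: n1\n' | tasks_per_node
```

For the output of Example 6-1 this prints "n1 2", meaning that both tasks ran on node n1.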
6.4.1 The srun Roles and Modes
The srun command submits jobs to run under SLURM management. The srun command can
perform many roles in launching and managing your job. srun also provides several
distinct usage modes to accommodate the roles it performs.