HP XC System 2.x Software User Manual
Each partition's node limits supersede those specified by -N. Jobs that request more nodes than the partition allows never leave the PENDING state. To use a specific partition, use the srun -p option. Combinations of -n and -N control how job processes are distributed among nodes according to the following srun policies:
-n/-N combinations
srun infers your intended number of processes per node if you specify both the number of processes and the number of nodes for your job. Thus -n16 -N8 normally results in running 2 processes/node. But see the next policy for exceptions.
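As a sketch of this even-distribution policy (the request below is hypothetical, and ./my_app is a placeholder program), the per-node count srun infers is simply the process count divided by the node count:

```shell
# Hypothetical request: srun -n16 -N8 ./my_app
NPROCS=16                        # total processes requested with -n
NNODES=8                         # nodes requested with -N
PER_NODE=$(( NPROCS / NNODES ))  # srun spreads processes evenly
echo "$PER_NODE processes per node"   # prints "2 processes per node"
```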
Minimum interpretation
srun interprets all node requests as minimum node requests (-N16 means "at least 16 nodes"). If some nodes lack enough CPUs to cover the process count specified by -n, srun will automatically allocate more nodes (than mentioned with -N) to meet the need. For example, if not all nodes have 2 working CPUs, then -n32 -N16 together will allocate more than 16 nodes so that all processes are supported. The actual number of nodes assigned (not the number requested) is stored in the environment variable SLURM_NNODES.
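The arithmetic behind that growth can be sketched under an assumed cluster state: suppose that of the 16 requested nodes, 4 have only 1 working CPU and 12 have 2, so they cover only 28 of the 32 processes requested by -n32 -N16, and srun must add enough 2-CPU nodes to cover the shortfall:

```shell
NPROCS=32                               # -n32
CPUS_COVERED=$(( 4 * 1 + 12 * 2 ))      # CPUs on the 16 requested nodes: 28
SHORTFALL=$(( NPROCS - CPUS_COVERED ))  # processes still needing CPUs: 4
EXTRA_NODES=$(( (SHORTFALL + 1) / 2 ))  # ceil(shortfall / 2 CPUs per extra node)
TOTAL=$(( 16 + EXTRA_NODES ))           # what SLURM_NNODES would report
echo "allocation grows to $TOTAL nodes"
```

Here SLURM_NNODES would report 18, not the 16 asked for with -N16.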
CPU overcommitment
By default, srun never allocates more than one process per CPU. If you intend to assign multiple processes per CPU, you must invoke the srun -O option along with -n and -N. Thus, -n16 -N4 -O together allow 2 processes per CPU on the 4 allocated 2-CPU nodes.
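The overcommit factor in that example follows from dividing the process count by the total CPU count (the 2-CPU node size is the manual's assumption; ./my_app is a placeholder):

```shell
# Hypothetical request: srun -n16 -N4 -O ./my_app
NPROCS=16
NNODES=4
CPUS_PER_NODE=2                                   # assumed node size
PER_CPU=$(( NPROCS / (NNODES * CPUS_PER_NODE) ))  # 16 / 8
echo "$PER_CPU processes per CPU"                 # prints "2 processes per CPU"
```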
Inconsistent allocation
srun rejects as errors inconsistent -n/-N combinations. For example, -n15 -N16 requests the impossible assignment of 15 processes to 16 nodes.
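The consistency rule can be sketched as a simple check: every allocated node must receive at least one process, so -n must be at least -N:

```shell
NPROCS=15   # -n15
NNODES=16   # -N16
if [ "$NPROCS" -lt "$NNODES" ]; then
    echo "error: cannot place $NPROCS processes on $NNODES nodes"
fi
```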
-c cpt (--cpus-per-task=cpt)
The -c cpt option assigns cpt CPUs per process for this job (the default is one CPU per process). This option supports multithreaded programs that require more than a single CPU per process for best performance.
For multithreaded programs where the density of CPUs is more important than a specific node count, use both -n and -c on the srun command line rather than -N. The options -n16 and -c2 result in whatever node allocation is needed to yield the requested 2 CPUs/process. This is the reverse of CPU overcommitment (see the -N and -O options).
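On the 2-CPU nodes this manual assumes elsewhere, the allocation that -n16 -c2 implies can be sketched as (./my_app is a placeholder):

```shell
# Hypothetical request: srun -n16 -c2 ./my_app
NPROCS=16
CPT=2                                     # CPUs per process (-c)
CPUS_PER_NODE=2                           # assumed node size
CPUS_NEEDED=$(( NPROCS * CPT ))           # 32 CPUs in total
NODES=$(( CPUS_NEEDED / CPUS_PER_NODE ))  # srun sizes the allocation to fit
echo "allocation grows to $NODES nodes"   # prints "allocation grows to 16 nodes"
```

Each 2-CPU node then hosts exactly one 2-CPU process, without specifying -N at all.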
-p part (--partition=part)
The -p part option requests nodes only from the part partition. The default partition is assigned by the system administrator.
-t minutes (--time=minutes)
The -t minutes option allocates a total number of minutes for this job to run (the default is the current partition's time limit). If the number of minutes exceeds the partition's time limit, the job never leaves the PENDING state. When the time limit has been reached, SLURM sends each job process SIGTERM followed (after a pause specified by SLURM's KillWait configuration parameter) by SIGKILL.
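A minimal sketch of a job script that uses this shutdown sequence to shut down cleanly (the checkpoint step is a placeholder; the kill line merely simulates SLURM's first signal):

```shell
# At the -t limit SLURM sends SIGTERM, pauses KillWait seconds, then SIGKILLs.
CAUGHT=0
on_term() {
    echo "caught SIGTERM; checkpoint here before KillWait expires"
    CAUGHT=1
}
trap on_term TERM
kill -s TERM $$   # simulate SLURM's first signal to this process
```

Work left unfinished when KillWait expires is lost, because SIGKILL cannot be trapped.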
-T nthreads (--threads=nthreads)
The -T nthreads option requests that srun allocate nthreads threads to initiate and control the parallel tasks in this job. The default is the smaller of either 10 or the number of nodes actually allocated, SLURM_NNODES.
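That default can be sketched as min(10, SLURM_NNODES); the node count below is an example value, not anything srun reports here:

```shell
SLURM_NNODES=4   # example: a 4-node allocation
DEFAULT_T=$(( SLURM_NNODES < 10 ? SLURM_NNODES : 10 ))
echo "srun would use $DEFAULT_T control threads by default"
```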
6-6
Using SLURM