4 srun resource-allocation options – HP XC System 2.x Software User Manual

If you specify a script at the end of the srun command line (not as an argument to -A), the spawned shell executes that script using the allocated resources (interactively, without a queue). See the -b option for script requirements.

If you specify no script, you can then execute other instances of srun interactively, within the spawned subshell, to run multiple parallel jobs on the resources that you allocated to the subshell. Resources (such as nodes) are freed for other jobs only when you terminate the subshell.

-a=jobid (--attach=jobid)

The -a=jobid option attaches (or reattaches) your current srun session to the already running job whose SLURM ID is jobid. The job to which you attach must have its resources managed by SLURM, but it can be either interactive ("allocated," started with -A) or batch (started with -b). This option allows you to monitor or intervene in previously started srun jobs. You cannot use -a with -b or -A. Because the running job to which you attach already has its resources specified, you cannot use -a with -n, -N, or -c. You can only attach to jobs for which you are the authorized owner.
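
The restrictions above can be expressed as a small pre-flight check. The following is a minimal sketch in Python and is not part of srun: the names ATTACH_CONFLICTS and conflicting_attach_options are hypothetical, and the check covers only the short option forms named in this section.

```python
# Short options that this section says cannot be combined with -a (attach).
ATTACH_CONFLICTS = {"-A", "-b", "-c", "-n", "-N"}


def conflicting_attach_options(options):
    """Return, sorted, the options in `options` that may not accompany -a.

    `options` is an iterable of short option names such as "-a" or "-n".
    An empty list means no conflict was found (or -a is not in use).
    """
    used = set(options)
    if "-a" not in used:
        return []
    return sorted(used & ATTACH_CONFLICTS)


if __name__ == "__main__":
    # -n conflicts with -a because the running job already has its resources.
    print(conflicting_attach_options(["-a", "-j", "-n"]))
```

A wrapper script could call such a check before invoking srun, so that an invalid combination is reported before any attempt to attach.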

By default, -a attaches to the designated job read-only. stdout and stderr are copied to the attaching srun, just as if the current srun session had started the job. However, signals are not forwarded to the remote processes (and a single Ctrl/C will detach the read-only srun from the job).

If you use -j (--join) or -s (--steal) along with -a, your srun session joins the running job and can forward signals to it as well as receive stdout and stderr from it. If you join a SLURM batch (-b) job, you can send signals to its batch script. Join (-j) does not forward stdin, but steal (-s, which closes other open sessions with the job) does forward stdin as well as signals.
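
The three ways of attaching described above (read-only, join, and steal) differ only in what they forward. The following minimal Python sketch tabulates those differences as stated in this section. The names AttachMode, ATTACH_MODES, and modes_that are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class AttachMode(NamedTuple):
    receives_output: bool        # stdout and stderr are copied to this srun
    forwards_signals: bool       # signals reach the job's script or processes
    forwards_stdin: bool         # stdin is passed through to the job
    closes_other_sessions: bool  # other open sessions with the job are closed


# Keys are the options used together with -a; "" means -a on its own.
ATTACH_MODES = {
    "": AttachMode(True, False, False, False),   # read-only (default)
    "-j": AttachMode(True, True, False, False),  # join
    "-s": AttachMode(True, True, True, True),    # steal
}


def modes_that(capability):
    """List the attach options whose mode has the named capability."""
    return [opt for opt, mode in ATTACH_MODES.items() if getattr(mode, capability)]


if __name__ == "__main__":
    print(modes_that("forwards_signals"))
    print(modes_that("forwards_stdin"))
```

Read this way, -j is the choice when you need to signal a job without disturbing other sessions, and -s is the only mode that also passes stdin to the job.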

-j (--join)

The -j option joins a running SLURM job. It is always used with the -a option, which specifies the jobid. Joining not only duplicates stdout and stderr to the attaching srun session, but also forwards signals to the job’s script or processes.

-s (--steal)

The -s option steals all connections to a running SLURM job. It is always used with the -a option, which specifies the jobid. --steal closes any open sessions with the specified job, then copies stdout and stderr to the attaching srun session and forwards both signals and stdin to the job’s script or processes.

6.4.4 srun Resource-Allocation Options

The srun options assign compute resources to your parallel SLURM-managed job. These options can be used alone or in combination. Also, refer to the other srun options that can affect node management for your job, especially the control options and constraint options.

-n procs (--nprocs=procs)

The -n procs option requests that srun execute procs processes. To control how these processes are distributed among nodes and CPUs, combine -n with -c or -N as explained below (default is one process per node).
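
To illustrate combining -n with -N, the following minimal Python sketch assembles an srun argument list without running it. The function name build_srun_args and the hostname command are hypothetical placeholders. Only the -n and -N options described in this section are used.

```python
def build_srun_args(command, procs, nodes=None):
    """Build an srun argument list using the -n and -N options.

    command: list of strings, the program to run and its arguments
    procs:   number of processes to request with -n
    nodes:   optional node count or range for -N (for example 4 or "2-4")
    """
    if procs < 1:
        raise ValueError("procs must be a positive integer")
    args = ["srun", "-n", str(procs)]
    if nodes is not None:
        args += ["-N", str(nodes)]
    return args + list(command)


if __name__ == "__main__":
    # Request 8 processes on at least 4 nodes; the list is only printed here.
    print(" ".join(build_srun_args(["hostname"], procs=8, nodes=4)))
```

On a system with SLURM installed, the resulting list could be passed to subprocess.run to launch the job.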

-N n (--nodes=n)

The -N n option allocates at least n nodes to this job, where n may be one of the following:

• a specific node count (such as -N 16)

• a node count range (such as -N 14-18)
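
The two forms of n shown above can be told apart with a short parser. This is a minimal Python sketch that handles only those two forms. The name parse_node_spec is hypothetical, and returning None as the upper bound for a plain count is an assumption made here because this page states only that at least n nodes are allocated.

```python
def parse_node_spec(spec):
    """Parse the value given to -N into (min_nodes, max_nodes).

    "16"    -> (16, None)  a count; only the lower bound is stated here
    "14-18" -> (14, 18)    a range
    """
    text = str(spec).strip()
    low, sep, high = text.partition("-")
    if not low.isdigit() or (sep and not high.isdigit()):
        raise ValueError(f"not a node count or range: {spec!r}")
    min_nodes = int(low)
    max_nodes = int(high) if sep else None
    if min_nodes < 1 or (max_nodes is not None and max_nodes < min_nodes):
        raise ValueError(f"invalid node count or range: {spec!r}")
    return min_nodes, max_nodes


if __name__ == "__main__":
    print(parse_node_spec("16"))
    print(parse_node_spec("14-18"))
```

Rejecting a reversed range such as 18-14 is a choice made for this sketch, not behavior documented for srun.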
