
HP XC System 3.x Software User Manual


Be sure to unset SLURM_JOBID when you are finished with the allocation, to prevent a stale SLURM_JOBID
from interfering with future jobs:

$ unset SLURM_JOBID
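As a minimal sketch (not from the manual), the cleanup above can be made defensive in a script, clearing a stale value only when one is actually set:

```shell
# Clear a stale SLURM_JOBID left over from a previous allocation, if any.
# Without this, a later srun or mpirun could attach to the wrong job.
if [ -n "${SLURM_JOBID:-}" ]; then
    echo "Clearing stale SLURM_JOBID=${SLURM_JOBID}"
    unset SLURM_JOBID
fi
```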

The following examples illustrate launching interactive MPI jobs. They use the hellompi job script
introduced in Section (page 48).

Example 9-8 Launching an Interactive MPI Job

$ mpirun -srun --jobid=150 hellompi
Hello! I'm rank 0 of 4 on n1
Hello! I'm rank 1 of 4 on n2
Hello! I'm rank 2 of 4 on n3
Hello! I'm rank 3 of 4 on n4

Example 9-9 Launching an Interactive MPI Job on All Cores in the Allocation

This example assumes 2 cores per node.

$ mpirun -srun --jobid=150 -n8 hellompi
Hello! I'm rank 0 of 8 on n1
Hello! I'm rank 1 of 8 on n1
Hello! I'm rank 2 of 8 on n2
Hello! I'm rank 3 of 8 on n2
Hello! I'm rank 4 of 8 on n3
Hello! I'm rank 5 of 8 on n3
Hello! I'm rank 6 of 8 on n4
Hello! I'm rank 7 of 8 on n4

Alternatively, you can use the following:

$ export SLURM_JOBID=150
$ export SLURM_NPROCS=8
$ mpirun -srun hellompi
Hello! I'm rank 0 of 8 on n1
Hello! I'm rank 1 of 8 on n1
Hello! I'm rank 2 of 8 on n2
Hello! I'm rank 3 of 8 on n2
Hello! I'm rank 4 of 8 on n3
Hello! I'm rank 5 of 8 on n3
Hello! I'm rank 6 of 8 on n4
Hello! I'm rank 7 of 8 on n4

Use ssh to launch a TotalView debugger session, assuming that TotalView is installed and licensed and
that ssh X forwarding is properly configured:

$ export SLURM_JOBID=150
$ export SLURM_NPROCS=4
$ mpirun -tv srun [additional parameters as needed]

After you finish with this interactive allocation, exit the /bin/bash process in the first terminal; this ends
the LSF job.
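For example (assuming the allocation's /bin/bash shell is still running in the first terminal):

```shell
$ exit    # ends the /bin/bash process, which terminates the LSF job
```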

Note

If you exported any variables, such as SLURM_JOBID and SLURM_NPROCS, be sure to unset them as
follows before submitting any further jobs from the second terminal:

$ unset SLURM_JOBID
$ unset SLURM_NPROCS

You do not need to launch the /bin/bash shell to interact with compute node resources; any running
job will suffice. This is useful for checking on long-running jobs. For example, if you had submitted a
CPU-intensive job, you could execute the uptime command on all nodes in the allocation to check their load.
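Assuming the allocation with job ID 150 from the earlier examples is still running, that spot check might look like this (a sketch; not verified against a live cluster):

```shell
# Run uptime on every node in the existing allocation (job ID 150 is
# assumed from the earlier examples) to check each node's load averages.
$ srun --jobid=150 uptime
```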

Using LSF-HPC    85