HP XC System 3.x Software User Manual
Be sure to unset SLURM_JOBID when you are finished with the allocation, so that a previous SLURM job ID
does not interfere with future jobs:
$ unset SLURM_JOBID
The following examples illustrate launching interactive MPI jobs. They use the hellompi job script introduced
elsewhere in this manual.
Example 9-8 Launching an Interactive MPI Job
$ mpirun -srun --jobid=150 hellompi
Hello! I'm rank 0 of 4 on n1
Hello! I'm rank 1 of 4 on n2
Hello! I'm rank 2 of 4 on n3
Hello! I'm rank 3 of 4 on n4
Example 9-9 Launching an Interactive MPI Job on All Cores in the Allocation
This example assumes 2 cores per node.
$ mpirun -srun --jobid=150 -n8 hellompi
Hello! I'm rank 0 of 8 on n1
Hello! I'm rank 1 of 8 on n1
Hello! I'm rank 2 of 8 on n2
Hello! I'm rank 3 of 8 on n2
Hello! I'm rank 4 of 8 on n3
Hello! I'm rank 5 of 8 on n3
Hello! I'm rank 6 of 8 on n4
Hello! I'm rank 7 of 8 on n4
Alternatively, you can use the following:
$ export SLURM_JOBID=150
$ export SLURM_NPROCS=8
$ mpirun -srun hellompi
Hello! I'm rank 0 of 8 on n1
Hello! I'm rank 1 of 8 on n1
Hello! I'm rank 2 of 8 on n2
Hello! I'm rank 3 of 8 on n2
Hello! I'm rank 4 of 8 on n3
Hello! I'm rank 5 of 8 on n3
Hello! I'm rank 6 of 8 on n4
Hello! I'm rank 7 of 8 on n4
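The rank-to-node layout in Example 9-9 follows from simple arithmetic: with 2 cores per node, consecutive ranks fill one node before moving to the next. The sketch below reproduces that layout. The placement function is hypothetical and only illustrates the arithmetic; the layout on a real system depends on how SLURM distributes tasks.

```shell
#!/bin/sh
# Hypothetical helper (not an HP XC, SLURM, or HP-MPI tool): print the block
# placement of MPI ranks, assuming each node's cores are filled in order
# and that nodes are named n1, n2, ... as in the example output.
placement() {
    nprocs=$1
    cores_per_node=$2
    rank=0
    while [ "$rank" -lt "$nprocs" ]; do
        # Integer division maps consecutive ranks onto the same node.
        node=$((rank / cores_per_node + 1))
        echo "rank $rank of $nprocs on n$node"
        rank=$((rank + 1))
    done
}

# 8 ranks, 2 cores per node, as assumed in Example 9-9.
placement 8 2
```

For 8 ranks and 2 cores per node this prints ranks 0 and 1 on n1 through ranks 6 and 7 on n4, matching the output shown above.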
Use ssh to launch a TotalView debugger session, assuming that TotalView is installed and licensed and that
ssh X forwarding is properly configured:
$ export SLURM_JOBID=150
$ export SLURM_NPROCS=4
$ mpirun -tv -srun additional parameters as needed
After you finish with this interactive allocation, exit the /bin/bash process in the first terminal; this ends
the LSF job.
Note
If you exported any variables, such as SLURM_JOBID and SLURM_NPROCS, be sure to unset them as
follows before submitting any further jobs from the second terminal:
$ unset SLURM_JOBID
$ unset SLURM_NPROCS
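One way to avoid leftover variables is to set them for a single command only, instead of exporting them into the shell. The following is a minimal sketch using the standard env utility; the with_alloc function name is hypothetical and is not part of HP XC, LSF-HPC, or HP-MPI.

```shell
# Hypothetical helper: run one command with SLURM_JOBID and SLURM_NPROCS
# set only in that command's environment.
with_alloc() {
    jobid=$1
    nprocs=$2
    shift 2
    # env passes the variables to the child process only; the calling
    # shell's environment is left untouched, so nothing needs to be unset.
    env SLURM_JOBID="$jobid" SLURM_NPROCS="$nprocs" "$@"
}

# Example use, assuming the allocation from the examples above:
#   with_alloc 150 8 mpirun -srun hellompi
```

Because the variables never enter the interactive shell's environment, later job submissions from the same terminal are not affected by them.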
You do not need to launch the /bin/bash shell to interact with compute node resources; any running job
will suffice. This is useful for checking on long-running jobs. For example, if you had
submitted a CPU-intensive job, you could execute the uptime command on all nodes in the allocation to