
1.5.3 Standard LSF

Standard LSF is also available on the HP XC system. The information for using standard LSF is
documented in the LSF manuals from Platform Computing. For your convenience, the HP XC
documentation CD contains these manuals.

1.5.4 How LSF-HPC and SLURM Interact

In the HP XC environment, LSF-HPC cooperates with SLURM to combine the powerful scheduling
functionality of LSF-HPC with the scalable parallel job launching capabilities of SLURM. LSF-HPC
acts primarily as a workload scheduler on top of the SLURM system, providing policy and
topology-based scheduling for end users. SLURM provides an execution and monitoring layer
for LSF-HPC. LSF-HPC uses SLURM to obtain system topology information, make scheduling
decisions, and launch jobs on the allocated resources.

When a job is submitted to LSF-HPC, LSF-HPC schedules the job based on job resource
requirements. LSF-HPC communicates with SLURM to allocate the required HP XC compute
nodes for the job from the SLURM lsf partition. LSF-HPC provides node-level scheduling for
parallel jobs, and core-level scheduling for serial jobs. Because of node-level scheduling, a parallel
job may be allocated more cores than it requested when its request does not fill whole nodes; the
srun or mpirun -srun launch commands within the job still honor the original request. LSF-HPC
always tries to pack multiple serial jobs on the same node, with one core per job. Parallel jobs
and serial jobs cannot coexist on the same node.
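
For example, on a system whose lsf partition consists of two-core nodes, a parallel job might be
submitted as follows (the commands are standard; the allocation details are illustrative):

    $ bsub -n 3 -I srun hostname
        # LSF-HPC allocates whole nodes, so two 2-core nodes (4 cores) are
        # allocated, but srun still launches only the 3 requested tasks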

After the LSF-HPC scheduler allocates the SLURM resources for a job, the SLURM allocation
information is recorded with the job. You can view this information with the bjobs and bhist
commands.
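
For example, where 123 is a hypothetical job ID:

    $ bjobs -l 123    # long-format job report; includes the SLURM allocation information
    $ bhist -l 123    # historical job report, with the same allocation details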

When LSF-HPC starts a job, it sets the SLURM_JOBID and SLURM_NPROCS environment variables
in the job environment. SLURM_JOBID associates the LSF-HPC job with SLURM's allocated
resources. The SLURM_NPROCS environment variable is set to the originally requested number
of cores. LSF-HPC dispatches the job from the LSF execution host, which is the same node on
which the LSF-HPC daemons run. The LSF-HPC JOB_STARTER script, which is configured for all
queues, uses the srun command to launch the user job on the first node in the allocation. Your job
can contain additional srun or mpirun commands to launch tasks on all nodes in the allocation.
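
For example, a job script might make use of these variables and launch commands as follows
(the script and application names are hypothetical; this is a sketch, not a tested script):

    #!/bin/sh
    # myjob.sh -- submit with: bsub -n 8 ./myjob.sh
    echo "SLURM job: $SLURM_JOBID, requested cores: $SLURM_NPROCS"
    srun hostname              # launch one task per requested core across the allocation
    mpirun -srun ./mpi_app     # launch HP-MPI ranks through SLURM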

While a job is running, all resource limits supported by LSF-HPC are enforced, including the core
limit, CPU time limit, data limit, file size limit, memory limit, and stack limit. When you terminate a
job, LSF-HPC uses the SLURM scancel command to propagate the signal to the entire job.
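
For example, resource limits can be supplied at submission time, and a running job can be
terminated with bkill (the application name and job ID are hypothetical):

    $ bsub -n 4 -c 10 -M 500000 mpirun -srun ./mpi_app
        # -c 10 sets a 10-minute CPU time limit; -M 500000 sets a memory limit in KB
    $ bkill 123
        # LSF-HPC propagates the termination signal to every task via scancel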

After a job finishes, LSF-HPC releases all allocated resources.

A detailed description, along with an example and illustration, of how LSF-HPC and SLURM
cooperate to launch and manage jobs is provided in “How LSF-HPC and SLURM Launch and
Manage a Job”. It is highly recommended that you review this information.

In summary, and in general:

LSF-HPC
    Determines WHEN and WHERE the job will run. LSF-HPC communicates with
    SLURM to determine WHICH resources are available, and SELECTS the appropriate
    set of nodes for the job.

SLURM
    Allocates nodes for jobs as determined by LSF-HPC. It CONTROLS task/rank
    distribution within the allocated nodes. SLURM also starts the executables on each
    host as requested by the HP-MPI mpirun command.

HP-MPI
    Determines HOW the job runs. It is part of the application, so it performs
    communication. HP-MPI can also pinpoint the processor on which each rank runs.
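
The following single submission illustrates this division of labor (the application name is
hypothetical):

    $ bsub -n 16 mpirun -srun ./mpi_app
        # LSF-HPC: decides WHEN and WHERE, selecting nodes from the lsf partition
        # SLURM: allocates those nodes, distributes the 16 tasks, and starts the executables
        # HP-MPI: determines HOW the ranks run and handles their communication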

1.5.5 HP-MPI

HP-MPI is a high-performance implementation of the Message Passing Interface (MPI) standard
and is included with the HP XC system. HP-MPI uses SLURM to launch jobs on an HP XC system.
