1.5.2 Load Sharing Facility (LSF-HPC)
The Load Sharing Facility for High Performance Computing (LSF-HPC) from Platform Computing
Corporation is a batch system resource manager that has been integrated with SLURM for use on the HP
XC system. LSF-HPC for SLURM is included with the HP XC System Software, and is an integral part of
the HP XC environment. LSF-HPC interacts with SLURM to obtain and allocate available resources, and
to launch and control all the jobs submitted to LSF-HPC. LSF-HPC accepts, queues, schedules, dispatches,
and controls all the batch jobs that users submit, according to policies and configurations established by
the HP XC site administrator. On an HP XC system, LSF-HPC for SLURM is installed and runs on one HP
XC node, known as the LSF execution host.
A complete description of LSF-HPC is provided later in this manual. In addition, for your
convenience, the HP XC Documentation CD contains the LSF manuals from Platform Computing.
1.5.3 Standard LSF
Standard LSF is also available on the HP XC system. The information for using standard LSF is
documented in the LSF manuals from Platform Computing. For your convenience, the HP XC
Documentation CD contains these manuals.
1.5.4 How LSF-HPC and SLURM Interact
In the HP XC environment, LSF-HPC cooperates with SLURM to combine the powerful scheduling
functionality of LSF-HPC with the scalable parallel job launching capabilities of SLURM. LSF-HPC acts
primarily as a workload scheduler on top of the SLURM system, providing policy and topology-based
scheduling for end users. SLURM provides an execution and monitoring layer for LSF-HPC. LSF-HPC
obtains system topology information from SLURM, uses that information to make scheduling decisions,
and relies on SLURM to launch jobs on the allocated resources.
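Because SLURM is the execution layer, its standard commands show the compute resources underneath
LSF-HPC, that is, the nodes in the lsf partition described in the next paragraph. A minimal
illustration (node names and counts are hypothetical):

    $ sinfo
    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    lsf          up   infinite      4   idle n[1-4]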
When a job is submitted to LSF-HPC, LSF-HPC schedules the job based on job resource requirements.
LSF-HPC communicates with SLURM to allocate the required HP XC compute nodes for the job from the
SLURM lsf partition. LSF-HPC provides node-level scheduling for parallel jobs, and core-level scheduling
for serial jobs. Because of node-level scheduling, a parallel job may be allocated more cores than it
requested, depending on how its request maps onto whole nodes; the srun or mpirun -srun launch
commands within the job still
honor the original request. LSF-HPC always tries to pack multiple serial jobs on the same node, with one
core per job. Parallel jobs and serial jobs cannot coexist on the same node.
After the LSF-HPC scheduler allocates the SLURM resources for a job, the SLURM allocation information
is recorded with the job. You can view this information with the bjobs and bhist commands.
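For example, the allocation information appears in the long-format output (the job ID, user, and node
names are illustrative, and the exact output format depends on the LSF-HPC version):

    $ bjobs -l 24
    Job <24>, User <smith>, Project <default>, Status <RUN>, ...
       ...
       slurm_id=150;ncpus=8;slurm_alloc=n[13-14];
       ...

The bhist -l command shows the same SLURM allocation information in the job's event history.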
When LSF-HPC starts a job, it sets the SLURM_JOBID and SLURM_NPROCS environment variables in the
job environment. SLURM_JOBID associates the LSF-HPC job with SLURM's allocated resources. The
SLURM_NPROCS
environment variable is set to the originally requested number of cores. LSF-HPC dispatches
the job from the LSF execution host, which is the same node on which the LSF-HPC daemons run. The
LSF-HPC JOB_STARTER script, which is configured for all queues, uses the srun command to launch a
user job on the first node in the allocation. Your job can contain additional srun or mpirun commands
to launch tasks on all nodes in the allocation.
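A minimal sketch of a job script that uses these environment variables (the script name and its
contents are hypothetical):

    $ cat myjob.sh
    #!/bin/sh
    # SLURM_JOBID and SLURM_NPROCS are already set by LSF-HPC, so srun
    # runs inside the existing SLURM allocation and launches the
    # originally requested number of tasks.
    echo "SLURM job: $SLURM_JOBID, requested cores: $SLURM_NPROCS"
    srun hostname
    $ bsub -n4 ./myjob.sh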
While a job is running, LSF-HPC enforces all of the resource limits it supports, including the core,
CPU time, data, file size, memory, and stack limits. When you terminate a job, LSF-HPC uses the
SLURM scancel command to propagate the signal to the entire job.
After a job finishes, LSF-HPC releases all allocated resources.
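For example, terminating a job by its LSF-HPC job ID (the job ID is illustrative):

    $ bkill 24

LSF-HPC translates this into a SLURM scancel operation so that every task in the job's allocation
receives the signal; bkill -s can likewise deliver a specific signal, such as USR1, to the job.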
A detailed description of how LSF-HPC and SLURM cooperate to launch and manage jobs, along with an
example and illustration, is provided in “How LSF-HPC and SLURM Launch and Manage a Job”. It is
highly recommended that you review this information.
In summary, and in general:

LSF-HPC   Determines WHEN and WHERE the job will run. LSF-HPC communicates with SLURM to
          determine WHICH resources are available, and SELECTS the appropriate set of nodes for
          the job.

SLURM     Allocates nodes for jobs as determined by LSF-HPC. It CONTROLS task/rank distribution
          within the allocated nodes. SLURM also starts the executables on each host as requested
          by the HP-MPI mpirun command.