Introduction to LSF-HPC in the HP XC Environment, Overview of LSF-HPC – HP XC System 3.x Software User Manual
job management and information capabilities. LSF-HPC schedules, launches, controls, and tracks jobs that
are submitted to it according to the policies established by the HP XC site administrator.
This section describes the functionality of LSF-HPC in an HP XC system, and discusses how to use some basic
LSF commands to submit jobs, manage jobs, and access job information. The following topics are discussed:
• Introduction to LSF-HPC in the HP XC Environment (page 68)
• Determining the LSF Execution Host (page 75)
• Determining Available LSF-HPC System Resources (page 75)
• Getting Information About Jobs (page 80)
• Translating SLURM and LSF-HPC JOBIDs (page 83)
• Working Interactively Within an LSF-HPC Allocation (page 84)
• LSF-HPC Equivalents of SLURM srun Options (page 86)
Introduction to LSF-HPC in the HP XC Environment
This section introduces LSF-HPC in the HP XC environment. It provides an overview of how LSF-HPC
works and discusses some features of LSF-HPC on an HP XC system and how it differs from standard LSF.
This section also contains an important discussion of how LSF-HPC and SLURM work together to
provide the HP XC job management environment. A description of SLURM is provided in Chapter
Overview of LSF-HPC
LSF-HPC was integrated with SLURM for the HP XC system to merge the scalable and efficient resource
management of SLURM with the extensive scheduling capabilities of LSF. In this integration, SLURM manages
the compute resources while LSF-HPC performs the job management. SLURM extends the parallel capabilities
of LSF with its own fast parallel launcher (which is integrated with HP-MPI), full parallel I/O and signal
support, and parallel job accounting capabilities. Managing the compute resources of the HP XC system
with SLURM means that the LSF daemons run only on one HP XC node and can present the HP XC system
as a single LSF host.
LSF-HPC interacts with SLURM to obtain resource information about the HP XC system. This information is
consolidated, and key values such as the total number of cores and the maximum memory available on
all nodes become the characteristics of the single HP XC “LSF Execution Host”. Additional resource information
from SLURM, such as pre-configured node “features”, is noted and processed during scheduling through
the external SLURM scheduler for LSF-HPC.
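The consolidation described above can be pictured with a small sketch. This is only an illustration of the idea, not how LSF-HPC is implemented: the NodeInfo record, the consolidate function, and the field names are all hypothetical, and the sketch assumes that "maximum memory available on all nodes" means the largest per-node memory value.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeInfo:
    """Per-node data as a resource manager might report it (hypothetical shape)."""
    name: str
    cores: int
    memory_mb: int


def consolidate(nodes: List[NodeInfo]) -> Dict[str, int]:
    """Collapse per-node data into the figures advertised for one virtual host."""
    if not nodes:
        raise ValueError("at least one node is required")
    return {
        # Cores are summed: the single host appears to own every core.
        "total_cores": sum(n.cores for n in nodes),
        # Memory is not summed: the largest per-node value is reported (assumption).
        "max_memory_mb": max(n.memory_mb for n in nodes),
    }


if __name__ == "__main__":
    cluster = [
        NodeInfo("n1", cores=4, memory_mb=8192),
        NodeInfo("n2", cores=4, memory_mb=16384),
        NodeInfo("n3", cores=2, memory_mb=4096),
    ]
    print(consolidate(cluster))
```

Summing cores while taking a maximum for memory reflects the point that the single LSF host stands in for the whole system, yet a task still runs within the memory of one node.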
Integrating LSF-HPC with SLURM on HP XC systems provides you with a parallel launch command to distribute
and manage parallel tasks efficiently. The SLURM srun command offers considerable flexibility in specifying
resource requirements across an HP XC system; for example, you can:
• Request contiguous nodes
• Execute only one task per node
• Request nodes with specific features
This flexibility is preserved in LSF-HPC through the external SLURM scheduler; this is discussed in more detail
in the section titled "HP XC Compute Node Resource Support".
In an HP XC system, only one node runs LSF-HPC, but all the nodes are configured as LSF-HPC Client Hosts:
every node is able to access LSF-HPC. You can submit jobs from any node in the HP XC system.
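Because jobs can be submitted from any node, it may help to see how a submission carrying the srun-style requests described above (contiguous nodes, one task per node, nodes with a given feature) might be assembled. The sketch below only builds a command line; it does not run anything. It assumes the external SLURM scheduler is reached through the bsub -ext option with a "SLURM[...]" string and that the keys nodes, contiguous, and constraint are accepted there; verify these against the documentation for your installation. The helper names slurm_ext and bsub_command and the feature name "dualcore" are hypothetical.

```python
import shlex
from typing import List, Optional


def slurm_ext(nodes: Optional[int] = None,
              contiguous: bool = False,
              constraint: Optional[str] = None) -> str:
    """Build the external-scheduler string (key names are assumptions)."""
    parts = []
    if nodes is not None:
        parts.append(f"nodes={nodes}")
    if contiguous:
        parts.append("contiguous=yes")
    if constraint:
        parts.append(f"constraint={constraint}")
    return "SLURM[" + ";".join(parts) + "]"


def bsub_command(ntasks: int, job: List[str], **ext) -> List[str]:
    """Return an argument list for bsub; nothing is executed here."""
    return ["bsub", "-n", str(ntasks), "-ext", slurm_ext(**ext)] + list(job)


if __name__ == "__main__":
    # Four tasks on four contiguous nodes, which places one task on each node.
    cmd = bsub_command(4, ["srun", "hostname"], nodes=4, contiguous=True)
    print(shlex.join(cmd))
    # Two tasks restricted to nodes carrying a hypothetical "dualcore" feature.
    cmd = bsub_command(2, ["srun", "hostname"], constraint="dualcore")
    print(shlex.join(cmd))
```

Returning an argument list rather than a single string keeps quoting of the bracketed scheduler string out of the caller's hands; shlex.join is used only to display the result.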
The differences described in HP XC System Software documentation take precedence over descriptions in
the LSF documentation from Platform Computing Corporation. See
"Differences Between LSF-HPC and
and the lsf_diff(1) manpage for more information on the subtle differences between standard
LSF and LSF-HPC.