HP XC System 3.x Software User Manual
10.6 Submitting Jobs
The bsub command submits jobs to LSF-HPC; it is used to request a set of resources on which to launch
a job. This section focuses on enhancements to this command from the LSF-HPC integration with SLURM
on the HP XC system; it does not discuss standard bsub functionality. See the Platform LSF
documentation and the bsub(1) manpage for more information on this important command.
Submitting jobs with the LSF-SLURM external scheduler is explored in detail in
“Submitting a Parallel Job Using the SLURM External Scheduler”.
The HP XC system has several features that make it optimal for running parallel applications. You can
use the bsub command's -n option to request more than one core for a job. This option, coupled with
the SLURM external scheduler discussed in “LSF-SLURM External Scheduler” (Section 10.7), gives you
much flexibility in selecting resources and shaping how the job is executed on those resources.
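For example, a multi-core submission might look like the following (the script name and output file are illustrative, not from this manual):

```bash
# Request 8 cores for the job; LSF-HPC reserves matching resources
# and starts one instance of the script on the first reserved node.
bsub -n 8 -o myjob.out ./myscript
```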
When you request multiple nodes, LSF-HPC reserves the requested number of nodes and executes one
instance of the job on the first reserved node. Use the srun command, or the mpirun command with the
-srun option, in your jobs to launch parallel applications. The -srun option can be set implicitly for
the mpirun command; see “Submitting a Parallel Job That Uses the HP-MPI Message Passing Interface”
for more information on using the mpirun -srun command.
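A sketch of how such a launch might look (application names are placeholders):

```bash
# Submit a 4-core job; inside the job, srun launches the parallel
# tasks across the reserved nodes.
bsub -n 4 -o out.%J srun ./my_parallel_app

# The equivalent launch through HP-MPI, using the -srun option.
bsub -n 4 -o out.%J mpirun -srun ./my_mpi_app
```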
Most parallel applications rely on rsh or ssh to launch remote tasks. The ssh utility is installed on the
HP XC system by default. If you configured SSH keys to allow unprompted access to other nodes in
the HP XC system, parallel applications can use ssh. See “Enabling Remote Execution with OpenSSH”
for more information on ssh.
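A typical way to enable unprompted ssh access with OpenSSH is sketched below; this is a generic procedure, so consult “Enabling Remote Execution with OpenSSH” for the steps specific to HP XC:

```bash
# Generate a key pair with no passphrase, then authorize it for
# logins to other nodes that share the same home directory.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```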
10.7 LSF-SLURM External Scheduler
The external scheduler option is an important option that can be included when submitting parallel jobs
with LSF-HPC integrated with SLURM. This option:
• Provides application-specific external scheduling capabilities for jobs
• Lets you include several SLURM options on the LSF command line
For example, you can submit a job to run one task per node when you have a resource-intensive job that
needs to have sole access to the node's full resources. If your job needs particular resources found only on
a specific set of nodes, you can use this option to submit a job to those nodes.
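With LSF-HPC integrated with SLURM, SLURM allocation constraints are passed through the bsub external scheduler option. The exact option syntax depends on your LSF-HPC version, so treat the following as a sketch rather than a definitive form:

```bash
# Run one task per node: request 4 cores spread across 4 nodes,
# giving each task sole access to its node's resources.
bsub -n 4 -ext "SLURM[nodes=4]" srun ./resource_hungry_app

# Constrain the job to a specific set of nodes (node names assumed).
bsub -n 2 -ext "SLURM[nodelist=n5,n6]" srun ./special_app
```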
The LSF host options enable you to identify an HP XC system "host" within a larger LSF cluster. After the
HP XC system is selected, LSF-HPC's external SLURM scheduler provides the additional flexibility to
request specific resources within the HP XC system.
You can use the LSF-HPC external scheduler functionality within the bsub command and in LSF-HPC
queue configurations. See the LSF bqueues(1) command for more information on determining how the
available queues are configured on HP XC systems.
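For example, you might inspect the queue configuration as follows (the queue name shown is a placeholder; your site's queues may differ):

```bash
# List all queues, then show the full configuration of one queue.
bqueues
bqueues -l normal
```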
See “Submitting a Parallel Job Using the SLURM External Scheduler” for information and examples on
submitting jobs with the LSF-SLURM external scheduler.
10.8 How LSF-HPC and SLURM Launch and Manage a Job
This section describes what happens in the HP XC system when a job is submitted to LSF-HPC. The
accompanying figure illustrates this process; use the numbered steps in the text and in the figure as an
aid to understanding the process.
Consider an HP XC system configuration in which lsfhost.localdomain is the virtual IP name
assigned to the node running LSF-HPC, node n16 is the login node, and nodes n[1-10] are compute
nodes in the lsf partition. All nodes contain two cores, providing 20 cores for use by LSF-HPC jobs.
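On a configuration like this, you can observe both sides of the process after submitting a job from the login node (commands shown are a sketch; job output will vary by site):

```bash
# From the login node: submit a job, then view it from the
# LSF-HPC side.
bsub -n 4 srun hostname
bjobs

# View the same job from the SLURM side, where the corresponding
# allocation appears in the lsf partition.
squeue
```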