HP XC System 3.x Software User Manual
Page 87: Using LSF-HPC Integrated with SLURM in the HP XC Environment

Pseudo-parallel job
A job that requests only one slot but specifies any of these constraints:
• mem
• tmp
• nodes=1
• mincpus > 1
Pseudo-parallel jobs are allocated one node for their exclusive use.
NOTE: Do NOT rely on this feature to provide node-level allocation for small jobs in job scripts. Use the SLURM[nodes] specification instead, along with the mem, tmp, and mincpus allocation options.
LSF-HPC treats this job type as a parallel job because the job requests explicit node resources. LSF-HPC does not monitor these additional resources, so it cannot schedule any other jobs to the node without risking resource contention. Therefore, LSF-HPC allocates the appropriate whole node for exclusive use by the serial job, in the same manner as it does for parallel jobs; hence the name "pseudo-parallel".
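As an illustration, the following sketch contrasts a submission that would trigger pseudo-parallel handling with the explicit form the NOTE recommends. The command syntax is assumed from the bsub external-scheduler option, and my_app and the resource values are placeholders, not taken from this manual:

```shell
# Hypothetical examples; my_app and the values shown are placeholders.

# Requests one slot but adds a mem constraint, so LSF-HPC treats the job
# as pseudo-parallel and allocates a whole node for exclusive use:
bsub -n 1 -ext "SLURM[mem=2048]" ./my_app

# Recommended: request node-level allocation explicitly via SLURM[nodes],
# together with any mem/tmp/mincpus options the job needs:
bsub -n 1 -ext "SLURM[nodes=1,mem=2048]" ./my_app
```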
Parallel job
A job that requests more than one slot, regardless of any other constraints. Parallel jobs are allocated up to the maximum number of nodes specified by the following specifications:
• SLURM[nodes=min-max] (if specified)
• SLURM[nodelist=node_list] (if specified)
• bsub -n
Parallel jobs and serial jobs cannot run on the same node.
Small job
A parallel job that can potentially fit into a single node and does not explicitly request more than one node (via the SLURM[nodes] or SLURM[nodelist] specification). LSF-HPC tries to allocate a single node for a small job.
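A sketch of the allocation specifications above, with the command forms assumed from the bsub external-scheduler syntax; the application names and values are placeholders:

```shell
# Parallel job: 8 slots, spread over at least 2 and at most 4 nodes.
bsub -n 8 -ext "SLURM[nodes=2-4]" srun ./parallel_app

# Small job: 4 slots with no explicit node request; LSF-HPC tries to
# place all four tasks on a single node.
bsub -n 4 srun ./parallel_app
```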
10.5 Using LSF-HPC Integrated with SLURM in the HP XC Environment
This section provides additional information about using LSF-HPC in the HP XC environment.
10.5.1 Useful Commands
The following commands are useful when working with LSF-HPC integrated with SLURM:
• Use the bjobs -l and bhist -l commands to see the components of the actual SLURM allocation command.
• Use the bkill command to kill jobs.
• Use the bjobs command to monitor job status.
• Use the bqueues command to list the configured job queues.
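A short session sketch tying these commands together; the job ID 1234 is illustrative, so substitute the ID that bsub reports at submission time:

```shell
bjobs -l 1234        # full job details, including the SLURM allocation command
bhist -l 1234        # historical view of the same information
bqueues              # list the configured job queues
bkill 1234           # kill the job
```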
10.5.2 Job Startup and Job Control
When LSF-HPC starts a SLURM job, it sets SLURM_JOBID to associate the job with the SLURM allocation.
While a job is running, all LSF-HPC supported operating-system-enforced resource limits are supported,
including core limit, CPU time limit, data limit, file size limit, memory limit, and stack limit. If the user
kills a job, LSF-HPC propagates signals to the entire job, including the job file running on the local node and
all tasks running on remote nodes.
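For example, a job script can read SLURM_JOBID to derive per-job names. A minimal sketch; the scratch-directory convention here is hypothetical, not something LSF-HPC mandates:

```shell
#!/bin/sh
# LSF-HPC sets SLURM_JOBID when it starts the job; fall back to 0 so the
# script also runs outside an allocation (hypothetical convention).
JOBID="${SLURM_JOBID:-0}"
SCRATCH="/tmp/scratch.${JOBID}"      # per-job scratch path
mkdir -p "$SCRATCH"
echo "job ${JOBID}: scratch at ${SCRATCH}"
# ... the application would run here, typically launched with srun ...
rmdir "$SCRATCH"
```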
10.5.3 Preemption
LSF-HPC uses the SLURM "node share" feature to facilitate preemption. When a low-priority job is preempted, its processes are suspended on the allocated nodes, and LSF-HPC places the high-priority job on the same nodes. After the high-priority job completes, LSF-HPC resumes the suspended low-priority jobs.