Introduction, Getting information about queues, Getting information about resources – HP XC System 3.x Software User Manual

Introduction

As described in Run-Time Environment (page 24), SLURM and LSF-HPC cooperate to run and manage jobs
on the HP XC system, combining LSF-HPC's powerful and flexible scheduling functionality with SLURM's
scalable parallel job-launching capabilities.

SLURM is the low-level resource manager and job launcher, and performs core allocation for jobs. LSF-HPC
gathers information about the cluster from SLURM. When a job is ready to be launched, LSF-HPC creates a
SLURM node allocation and dispatches the job to that allocation.

Although you can launch jobs directly using SLURM, HP recommends that you use LSF-HPC to take advantage
of its scheduling and job management capabilities. You can add SLURM options to the LSF-HPC job launch
command line to further define job launch requirements. Use the HP-MPI mpirun command and its options
within LSF-HPC to launch jobs that require MPI's high-performance message-passing capabilities.
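
The launch styles described above might look like the following sketch. The job size (-n 4), the SLURM[nodes=2] request, and the ./hello_world program are illustrative assumptions, not values from this page. The -ext "SLURM[...]" form and the mpirun -srun option follow usage commonly shown for LSF-HPC and HP-MPI on HP XC; verify them against your installation. The lsf_run helper is hypothetical: it prints each command and runs it only when bsub is installed.

```shell
#!/bin/sh
# Sketch only: job sizes, the SLURM[nodes=2] request, and ./hello_world
# are illustrative assumptions, not values taken from the manual.

# Hypothetical helper: print the command, then run it only if bsub exists.
lsf_run() {
    echo "+ $*"
    if command -v bsub >/dev/null 2>&1; then
        "$@"
    fi
}

# Interactive 4-core job launched through LSF-HPC, using SLURM's srun.
lsf_run bsub -n 4 -I srun hostname

# Same job, with a SLURM option added to spread it across 2 nodes.
lsf_run bsub -n 4 -ext "SLURM[nodes=2]" -I srun hostname

# MPI program launched with the HP-MPI mpirun command inside LSF-HPC.
lsf_run bsub -n 4 -I mpirun -srun ./hello_world
```

On a machine without LSF-HPC the script only echoes the three command lines, which makes it safe to read through before trying it on a login node.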

When the HP XC system is installed, a SLURM partition of nodes is created to contain LSF-HPC jobs. This
partition is called the lsf partition.
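
One way to inspect this partition is with the SLURM sinfo command and its -p (partition) option. The sketch below wraps that call in a hypothetical helper, show_lsf_partition, which is not part of HP XC; it reports an error instead of failing obscurely when sinfo is not installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of HP XC): show the state of the SLURM
# partition named "lsf", or explain why that is not possible here.
show_lsf_partition() {
    if command -v sinfo >/dev/null 2>&1; then
        sinfo -p lsf
    else
        echo "sinfo not found; run this on an HP XC node" >&2
        return 1
    fi
}
```

Calling show_lsf_partition on a node where SLURM is installed should list the nodes assigned to the lsf partition and their current state.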

When a job is submitted to LSF-HPC, the LSF-HPC scheduler prioritizes the job and waits until the required
resources (compute nodes from the lsf partition) are available.

When the requested resources are available for the job, LSF-HPC creates a SLURM allocation of nodes on
behalf of the user, sets the SLURM JobID for the allocation, and dispatches the job with the LSF-HPC
JOB_STARTER script to the first allocated node.

A detailed explanation of how SLURM and LSF-HPC interact to launch and manage jobs is provided in
How LSF-HPC and SLURM Launch and Manage a Job (page 73).

Getting Information About Queues

The LSF bqueues command lists the configured job queues in LSF-HPC. By default, bqueues returns the
following information about all queues:

Queue name

Queue priority

Queue status

Job slot statistics

Job state statistics

To get information about queues, enter the bqueues command as follows:

$ bqueues

For more information about using this command and a sample of its output, see
Examining LSF-HPC System Queues (page 76).

Getting Information About Resources

The LSF bhosts, lshosts, and lsload commands are quick ways to get information about system
resources. LSF-HPC daemons run on only one node in the HP XC system, so the bhosts and lshosts
commands list a single host, which represents all the resources of the HP XC system. The total number of
cores for that host should equal the total number of cores assigned to the SLURM lsf partition.
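
The core-count check described above can be sketched as a small script. It assumes that the installed sinfo supports the %C output format (CPUs as allocated/idle/other/total), which older SLURM releases may lack, and that ncpus is the fifth column of lshosts output. The function names are hypothetical.

```shell
#!/bin/sh
# Sum the "total" field of sinfo's %C output (allocated/idle/other/total).
slurm_total_cores() {
    awk -F/ '{ total += $4 } END { print total + 0 }'
}

# Sum the ncpus column (assumed to be column 5) of lshosts output,
# skipping the header line.
lsf_total_cores() {
    awk 'NR > 1 { total += $5 } END { print total + 0 }'
}

# Run the comparison only where both tools are installed.
if command -v sinfo >/dev/null 2>&1 && command -v lshosts >/dev/null 2>&1; then
    slurm=$(sinfo -h -p lsf -o "%C" | slurm_total_cores)
    lsf=$(lshosts | lsf_total_cores)
    if [ "$slurm" -eq "$lsf" ]; then
        echo "core counts match: $slurm"
    else
        echo "mismatch: SLURM lsf partition=$slurm, LSF host=$lsf"
    fi
fi
```

Keeping the parsing in separate functions lets you check it against saved command output before relying on it on a live system.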

The LSF bhosts command provides a summary of the jobs on the system and information about the
current state of LSF-HPC.

$ bhosts

For more information about using this command and a sample of its output, see
Getting the Status of LSF-HPC (page 75).

The LSF lshosts command displays machine-specific information for the LSF execution host node.

$ lshosts

For more information about using this command and a sample of its output, see
Getting Information About LSF Execution Host Node (page 75).

The LSF lsload command displays load information for the LSF execution host node.
