
Getting information about jobs, Getting job allocation information, Job allocation information for a running job – HP XC System 3.x Software User Manual


Getting Information About Jobs

You can get information about a specific job in several ways after it has been submitted to LSF-HPC.
This section briefly describes some of the LSF-HPC commands for gathering information about a job.
It is not a complete reference. It introduces the commonly used commands and describes any
differences in how these commands operate in the HP XC environment. Refer to the LSF manpages
for full information about the commands described in this section.

The following LSF commands are described in this section:

bjobs    "Examining the Status of a Job"

bhist    "Viewing the Historical Information for a Job"

Getting Job Allocation Information

Before a job runs, LSF-HPC allocates SLURM compute nodes based on job resource requirements. After
LSF-HPC allocates nodes for a job, it attaches allocation information to the job. You can view job allocation
information through the bjobs -l and bhist -l commands. Refer to the LSF manpages for details about
using these commands.

A job allocation information string looks like the following:

slurm_id=slurm_jobid;ncpus=slurm_nprocs;slurm_alloc=node_list

This allocation string has the following values:

slurm_id       The SLURM_JOBID environment variable. This is the SLURM allocation ID, which associates the LSF-HPC job with the SLURM-allocated resources.

ncpus          The SLURM_NPROCS environment variable. This is the actual number of allocated cores. Under node-level allocation scheduling, this number may be larger than the number the job requests.

slurm_alloc    The allocated node list (comma separated).

When LSF-HPC starts a job, LSF-HPC sets the SLURM_JOBID and SLURM_NPROCS environment variables.
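
As an illustration of the string format described above, the following minimal Python sketch splits an allocation string into its three fields and reads the two environment variables from inside a job. The function names parse_allocation and job_environment are hypothetical helpers introduced here; they are not part of LSF-HPC or SLURM.

```python
import os

def parse_allocation(info):
    """Split 'slurm_id=...;ncpus=...;slurm_alloc=...' into a dict of strings."""
    fields = {}
    for part in info.strip().split(";"):
        if not part:
            continue  # skip the empty piece after a trailing ';'
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return fields

def job_environment(env=os.environ):
    """Return (SLURM_JOBID, SLURM_NPROCS); a value is None when the variable is unset."""
    return env.get("SLURM_JOBID"), env.get("SLURM_NPROCS")

# Sample string in the format shown above (values are illustrative).
alloc = parse_allocation("slurm_id=22;ncpus=8;slurm_alloc=n[5-8];")
print(alloc["slurm_id"], alloc["ncpus"], alloc["slurm_alloc"])

# Outside an LSF-HPC job both variables are normally unset.
print(job_environment())
```

The values are kept as strings because the node list is not numeric; convert slurm_id and ncpus with int() where a number is needed.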

Job Allocation Information for a Running Job

The following example shows bjobs -l output that includes job allocation information
for a running job:

$ bjobs -l 24

Job <24>, User , Project ,

Status , Queue ,

Interactive pseudo-terminal shell mode,

Extsched , Command

date and time stamp: Submitted from host , CWD <$HOME>,

4 Processors Requested, Requested Resources ;

date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>;

date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8];

SCHEDULING PARAMETERS:

           r15s  r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem

loadSched  -     -    -     -   -   -   -   -   -    -    -

loadStop   -     -    -     -   -   -   -   -   -    -    -

EXTERNAL MESSAGES:

MSG_ID  FROM      POST_TIME            MESSAGE         ATTACHMENT

0       -         -                    -               -

1       lsfadmin  date and time stamp  SLURM[nodes=4]  N

In particular, note the node and job allocation information provided in the above output:

date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>;

date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8];
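
The allocation line can also be picked out of saved bjobs -l output programmatically. The sketch below is one possible approach, not a method taken from the manual: find_allocation and expand_nodes are hypothetical names, and the node-list expansion assumes only simple bracketed ranges such as n[5-8] or n[1-2,7],n12.

```python
import re

# Matches the allocation string format described earlier in this section.
ALLOC_RE = re.compile(r"slurm_id=(\d+);ncpus=(\d+);slurm_alloc=([^;\s]+)")

def find_allocation(bjobs_output):
    """Return (slurm_id, ncpus, node_list) from bjobs -l text, or None if absent."""
    match = ALLOC_RE.search(bjobs_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)

def expand_nodes(node_list):
    """Expand simple node lists, e.g. 'n[5-8]' -> ['n5', 'n6', 'n7', 'n8']."""
    nodes = []
    # Each match is a name prefix, optionally followed by [ranges].
    for match in re.finditer(r"([^,\[\]]+)(?:\[([^\]]+)\])?", node_list):
        prefix, ranges = match.group(1), match.group(2)
        if ranges is None:
            nodes.append(prefix)
            continue
        for item in ranges.split(","):
            low, _, high = item.partition("-")
            high = high or low          # a single index such as '7'
            width = len(low)            # preserve zero padding, e.g. '01'
            for index in range(int(low), int(high) + 1):
                nodes.append(f"{prefix}{index:0{width}d}")
    return nodes

sample = "date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8];"
slurm_id, ncpus, node_list = find_allocation(sample)
print(slurm_id, ncpus, expand_nodes(node_list))
```

For the sample line this yields four node names, which agrees with the SLURM[nodes=4] message in the output, while ncpus=8 exceeds the 4 processors requested, as can happen under node-level allocation.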
