HP XC System 3.x Software User Manual
Example 9-3 Reporting on Failed Jobs in the Queue
$ squeue --state=FAILED
JOBID PARTITION     NAME USER ST TIME NODES NODELIST(REASON)
   59      amt1 hostname root  F 0:00     0
9.5 Terminating Jobs with the scancel Command
The scancel command cancels a pending or running job or job step. It can also be used to send a specified
signal to all processes on all nodes associated with a job. Only job owners or administrators can cancel
jobs.
Example 9-4 terminates job #415 and all its jobsteps.
Example 9-4 Terminating a Job by Its JobID
$ scancel 415
Example 9-5 cancels all pending jobs.
Example 9-5 Cancelling All Pending Jobs
$ scancel --state=PENDING
Example 9-6 sends the TERM signal to terminate jobsteps 421.2 and 421.3.
Example 9-6 Sending a Signal to a Job
$ scancel --signal=TERM 421.2 421.3
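Because scancel acts on every job or jobstep that matches its arguments, it can help to review a command before running it. The following is a minimal sketch of a dry-run wrapper. The function name safe_scancel and the DRY_RUN variable are hypothetical and are not part of SLURM or HP XC System Software; only the scancel invocation itself comes from the examples above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SLURM or HP XC): print the scancel
# command for review, and run it only when DRY_RUN is set to 0.
safe_scancel() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        scancel "$@"
    else
        echo "would run: scancel $*"
    fi
}

# Dry run (the default): nothing is cancelled, the command is only printed.
safe_scancel --signal=TERM 421.2 421.3
```

With this sketch, setting DRY_RUN=0 in the environment passes the same arguments to scancel unchanged.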
9.6 Getting System Information with the sinfo Command
The sinfo command reports the state of partitions and nodes managed by SLURM. It has a wide variety
of filtering, sorting, and formatting options. The sinfo command displays a summary of available partition
and node (not job) information, such as partition names, nodes per partition, and cores per node.
Example 9-7 Using the sinfo Command (No Options)
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf       up    infinite      3 down* n[0,5,8]
lsf       up    infinite     14 idle  n[1-4,6-7,9-16]
The node STATE codes in these examples may be followed by an asterisk (*), which indicates
that the reported node is not responding. See the sinfo(1) manpage for a complete listing and description
of STATE codes.
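The asterisk convention can be used to pick out non-responding nodes from a script. The following is a minimal sketch that assumes the default six-column layout shown in Example 9-7, with STATE in column 5 and NODELIST in column 6. The function name unresponsive_nodes is hypothetical, and the sample input is copied from that example rather than taken from a live system.

```shell
#!/bin/sh
# Hypothetical helper: read default `sinfo` output on stdin and print the
# STATE and NODELIST of every row whose state ends in an asterisk.
unresponsive_nodes() {
    # Skip the header row; column 5 is STATE and column 6 is NODELIST.
    awk 'NR > 1 && $5 ~ /\*$/ { print $5, $6 }'
}

# Sample input copied from Example 9-7.
unresponsive_nodes <<'EOF'
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf up infinite 3 down* n[0,5,8]
lsf up infinite 14 idle n[1-4,6-7,9-16]
EOF
```

On a system where sinfo is available, the same function could be used as `sinfo | unresponsive_nodes`. If the output format has been customized, the column numbers would need to be adjusted.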
Example 9-8 Reporting Reasons for Downed, Drained, and Draining Nodes
$ sinfo -R
REASON          NODELIST
Memory errors   n[0,5]
Not Responding  n8
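The REASON text in this report can contain spaces, so a script that processes it cannot simply split each row into two fields. The following is a minimal sketch that treats the last whitespace-separated field as the node list and everything before it as the reason. The function name down_reasons is hypothetical, and the sample input is copied from Example 9-8.

```shell
#!/bin/sh
# Hypothetical helper: read `sinfo -R` output on stdin and print one
# "nodelist: reason" line per row. The reason may contain spaces, so the
# node list is taken as the last space-separated field.
down_reasons() {
    awk 'NR > 1 {
        nodes = $NF
        sub(/ +[^ ]+ *$/, "")   # strip the trailing NODELIST field from $0
        print nodes ": " $0
    }'
}

# Sample input copied from Example 9-8.
down_reasons <<'EOF'
REASON NODELIST
Memory errors n[0,5]
Not Responding n8
EOF
```

On a system where sinfo is available, the same function could be used as `sinfo -R | down_reasons`.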
9.7 Job Accounting
HP XC System Software provides an extension to SLURM for job accounting. The sacct command displays
job accounting data in a variety of forms for your analysis. Job accounting data is stored in a log file; the
sacct command filters that log file to report on your jobs, jobsteps, status, and errors. See your system
administrator if job accounting is not configured on your system.
By default, only the superuser is allowed to access the job accounting data. To grant all system users read
access to this data, the superuser must change the permission of the jobacct.log file, as follows: