Application development environment, Parallel applications, Serial applications – HP XC System 3.x Software User Manual
Documentation CD contains XC LSF manuals from Platform Computing. LSF
manpages are available on the HP XC system.
SLURM commands
HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system
resource management and job scheduling. Standard SLURM commands are
available through the command line. SLURM functionality is described in
. Descriptions of SLURM commands are available
in the SLURM manpages. Invoke the man command with the SLURM command
name to access them.
HP-MPI commands
You can run standard HP-MPI commands from the command line. Descriptions of
HP-MPI commands are available in the HP-MPI documentation, which is supplied
with the HP XC system software.
Modules commands
The HP XC system uses standard Modules commands to load and unload
modulefiles, which are used to configure and modify the user environment. Modules
commands are described in
Application Development Environment
The HP XC system provides an environment that enables developing, building, and running applications
using multiple nodes with multiple cores. These applications can range from parallel applications using many
cores to serial applications using a single core.
Parallel Applications
The HP XC parallel application development environment allows parallel application processes to be started
and stopped together on a large number of application processors, along with the I/O and process control
structures to manage these kinds of applications.
Full details and examples of how to build, run, debug, and troubleshoot parallel applications are provided
in "Developing Parallel Applications".
Serial Applications
You can build and run serial applications under the HP XC development environment. A serial application
is a command or application that does not use any form of parallelism.
Full details and examples of how to build, run, debug, and troubleshoot serial applications are provided in
"Building Serial Applications".
Run-Time Environment
This section describes LSF-HPC, SLURM, and HP-MPI, and how these components work together to provide
the HP XC run-time environment. LSF-HPC focuses on scheduling and managing the workload, and SLURM
provides efficient and scalable resource management of the compute nodes.
An alternative HP XC environment uses Standard LSF, which does not interact with the SLURM resource manager.
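As a rough illustration of how LSF-HPC and SLURM divide the work, the following Python sketch assembles a job submission in which the LSF bsub command requests cores and the SLURM srun command launches the task on the allocated nodes. It is only a sketch under assumptions: the helper name build_bsub_argv is hypothetical, and it assumes that the bsub options -n (number of cores) and -I (interactive job) and the srun launcher are available on your system. The sketch prints the command line instead of running it.

```python
import shlex


def build_bsub_argv(num_cores, task_argv, interactive=True):
    """Build a bsub command line whose job step is started by srun.

    Hypothetical helper for illustration; it does not contact LSF or SLURM.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    # LSF side: request the cores (assumed option: -n).
    argv = ["bsub", "-n", str(num_cores)]
    if interactive:
        # Assumed option: -I keeps the job attached to the terminal.
        argv.append("-I")
    # SLURM side: srun starts the task on the nodes allocated to the job.
    argv += ["srun"] + list(task_argv)
    return argv


# Show the command a user would type; nothing is executed here.
print(shlex.join(build_bsub_argv(4, ["hostname"])))
# prints: bsub -n 4 -I srun hostname
```

Check the exact options against the LSF and SLURM manpages on your system before using a command line built this way.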
SLURM
Simple Linux Utility for Resource Management (SLURM) is a resource management system that is integrated
into the HP XC system. SLURM is suitable for use on large and small Linux clusters. It was developed by
Lawrence Livermore National Laboratory and Linux NetworX. As a resource manager, SLURM allocates exclusive
or nonexclusive access to resources (application and compute nodes) for users to perform work, and provides
a framework to start, execute, and monitor work (normally a parallel job) on the set of allocated nodes.
A SLURM system consists of two daemons, one configuration file, and a set of commands and APIs. The
central controller daemon, slurmctld, maintains the global state and directs operations. A slurmd daemon
is deployed to each computing node and responds to job-related requests, such as launching jobs, signalling,
and terminating jobs. End users and system software (such as LSF-HPC) communicate with SLURM by means
of commands or APIs to perform operations such as allocating resources, launching parallel jobs on
allocated resources, and terminating running jobs.
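The command-level operations mentioned above can be sketched in Python as follows. This is a minimal sketch, not part of the HP XC software: the function names are hypothetical, and it assumes the standard SLURM commands srun (with -n for the number of tasks and -N for the number of nodes), squeue (with -u to filter by user), and scancel (which takes a job ID). The functions only build the command lines; they do not contact slurmctld.

```python
import shlex


def srun_argv(num_tasks, task_argv, num_nodes=None):
    """Command line to launch a parallel job with srun (hypothetical helper)."""
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    argv = ["srun", "-n", str(num_tasks)]
    if num_nodes is not None:
        argv += ["-N", str(num_nodes)]
    return argv + list(task_argv)


def squeue_argv(user=None):
    """Command line to list queued and running jobs, optionally for one user."""
    argv = ["squeue"]
    if user:
        argv += ["-u", user]
    return argv


def scancel_argv(job_id):
    """Command line to terminate a job by its numeric job ID."""
    if int(job_id) < 1:
        raise ValueError("job_id must be a positive integer")
    return ["scancel", str(int(job_id))]


# Print the three command lines; nothing is executed here.
for argv in (srun_argv(8, ["./a.out"], num_nodes=2),
             squeue_argv("alice"),      # "alice" is a placeholder user name
             scancel_argv(1234)):       # 1234 is a placeholder job ID
    print(shlex.join(argv))
```

Each command sends its request to the slurmctld controller, which directs the slurmd daemons on the affected compute nodes.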
SLURM groups compute nodes (the nodes where jobs are run) together into “partitions”. The HP XC system
can have one or several partitions. When HP XC is installed, a single partition of compute nodes is created