Application development environment, Parallel applications, Serial applications – HP XC System 3.x Software User Manual

Documentation CD contains XC LSF manuals from Platform Computing. LSF
manpages are available on the HP XC system.

SLURM commands

HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system
resource management and job scheduling. Standard SLURM commands are
available through the command line. SLURM functionality is described in
Chapter 8, "Using SLURM". Descriptions of SLURM commands are available
in the SLURM manpages. Invoke the man command with the SLURM command
name to access them.
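
The manpage lookup described above can be scripted. The following is a minimal sketch, assuming Python is available on the node where you work. The helper names man_argv and show_manpage are hypothetical, and the command list is only a sample of the standard SLURM user commands.

```python
import shutil
import subprocess

# A few standard SLURM user commands; the list is illustrative, not exhaustive.
SLURM_COMMANDS = ["srun", "sinfo", "squeue", "scancel"]


def man_argv(command):
    """Return the argument list that opens the manpage for one command."""
    return ["man", command]


def show_manpage(command):
    """Run `man <command>` if man is installed; return True if it was invoked."""
    if shutil.which("man") is None:
        return False
    subprocess.run(man_argv(command), check=False)
    return True


if __name__ == "__main__":
    # Print the lookups instead of running them, so the sketch is safe anywhere.
    for name in SLURM_COMMANDS:
        print(" ".join(man_argv(name)))
```

Calling show_manpage("srun") on a system with the SLURM manpages installed is equivalent to entering man srun at the command line.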

HP-MPI commands

You can run standard HP-MPI commands from the command line. Descriptions of
HP-MPI commands are available in the HP-MPI documentation, which is supplied
with the HP XC system software.

Modules commands

The HP XC system uses standard Modules commands to load and unload
modulefiles, which are used to configure and modify the user environment. Modules
commands are described in "Overview of Modules".
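
As an illustration of loading and unloading modulefiles, the sketch below builds the corresponding Modules command lines. It assumes the common load, unload, list, and avail subcommands. The helper name module_command and the modulefile name example-module are hypothetical placeholders, not names taken from the HP XC system.

```python
import shlex

# Subcommands covered by this sketch; Modules offers others as well.
NEEDS_MODULEFILE = {"load", "unload"}
NO_ARGUMENT_REQUIRED = {"list", "avail"}


def module_command(action, modulefile=None):
    """Build a Modules command line such as `module load <modulefile>`."""
    if action not in NEEDS_MODULEFILE | NO_ARGUMENT_REQUIRED:
        raise ValueError(f"unsupported module action: {action}")
    if action in NEEDS_MODULEFILE and not modulefile:
        raise ValueError(f"'module {action}' needs a modulefile name")
    parts = ["module", action]
    if modulefile:
        parts.append(modulefile)
    # Quote each part so unusual modulefile names stay a single shell word.
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # "example-module" is a placeholder; run `module avail` to see real names.
    print(module_command("avail"))
    print(module_command("load", "example-module"))
    print(module_command("list"))
    print(module_command("unload", "example-module"))
```

The sketch returns strings rather than running them because module is typically defined as a shell function, so the resulting lines are meant to be entered in your login shell.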

Application Development Environment

The HP XC system provides an environment that enables developing, building, and running applications
using multiple nodes with multiple cores. These applications can range from parallel applications using many
cores to serial applications using a single core.

Parallel Applications

The HP XC parallel application development environment allows the processes of a parallel application to be
started and stopped together on a large number of application processors, and provides the I/O and process
control structures needed to manage these kinds of applications.

Full details and examples of how to build, run, debug, and troubleshoot parallel applications are provided
in "Developing Parallel Applications".

Serial Applications

You can build and run serial applications under the HP XC development environment. A serial application
is a command or application that does not use any form of parallelism.

Full details and examples of how to build, run, debug, and troubleshoot serial applications are provided in
"Building Serial Applications".

Run-Time Environment

This section describes LSF-HPC, SLURM, and HP-MPI, and how these components work together to provide
the HP XC run-time environment. LSF-HPC focuses on scheduling (and managing the workload) and SLURM
provides efficient and scalable resource management of the compute nodes.

An alternative HP XC configuration uses Standard LSF, which does not interact with the SLURM resource manager.

SLURM

Simple Linux Utility for Resource Management (SLURM) is a resource management system that is integrated
into the HP XC system. SLURM is suitable for use on large and small Linux clusters. It was developed by
Lawrence Livermore National Laboratory and Linux NetworX. As a resource manager, SLURM allocates exclusive
or nonexclusive access to resources (application and compute nodes) for users to perform work, and provides
a framework to start, execute, and monitor work (normally a parallel job) on the set of allocated nodes.

A SLURM system consists of two daemons, one configuration file, and a set of commands and APIs. The
central controller daemon, slurmctld, maintains the global state and directs operations. A slurmd daemon
is deployed to each compute node and responds to job-related requests, such as launching, signaling,
and terminating jobs. End users and system software (such as LSF-HPC) communicate with SLURM by means
of commands or APIs, for example to allocate resources, launch parallel jobs on allocated resources,
and terminate running jobs.
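
The command-based interaction described above can be sketched as follows. This is an illustration only: it assumes that direct use of the standard srun, squeue, and scancel commands is permitted at your site, and the helper names launch_argv, queue_argv, and cancel_argv are hypothetical. When run outside an existing allocation, srun requests the resources and launches the tasks in one step.

```python
def launch_argv(ntasks, program, *args):
    """Argument list for `srun -n <ntasks> <program> [args...]`."""
    if ntasks < 1:
        raise ValueError("ntasks must be at least 1")
    return ["srun", "-n", str(ntasks), program, *args]


def queue_argv():
    """Argument list for `squeue`, which lists jobs known to SLURM."""
    return ["squeue"]


def cancel_argv(job_id):
    """Argument list for `scancel <job_id>`, which terminates a job."""
    return ["scancel", str(job_id)]


if __name__ == "__main__":
    # The task count and job ID are made-up values for illustration.
    print(" ".join(launch_argv(4, "hostname")))
    print(" ".join(queue_argv()))
    print(" ".join(cancel_argv(1234)))
```

Each helper returns an argument list that could be passed to subprocess.run on a system where SLURM is installed. The sketch prints the command lines instead, so it does not submit or cancel anything.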

SLURM groups compute nodes (the nodes where jobs are run) together into “partitions”. The HP XC system
can have one or several partitions. When HP XC is installed, a single partition of compute nodes is created
