6 debugging applications, 7 monitoring node activity, 8 tuning applications – HP XC System 3.x Software User Manual

5.2 Submitting a Serial Job Using LSF-HPC.........................................................................................53

5.2.1 Submitting a Serial Job with the LSF bsub Command............................................................53
5.2.2 Submitting a Serial Job Through SLURM Only......................................................................54

5.3 Submitting a Parallel Job.................................................................................................................55

5.3.1 Submitting a Non-MPI Parallel Job.........................................................................................55
5.3.2 Submitting a Parallel Job That Uses the HP-MPI Message Passing Interface.........................56
5.3.3 Submitting a Parallel Job Using the SLURM External Scheduler...........................................57

5.4 Submitting a Batch Job or Job Script...............................................................................................60
5.5 Submitting Multiple MPI Jobs Across the Same Set of Nodes........................................................62

5.5.1 Using a Script to Submit Multiple Jobs...................................................................................62
5.5.2 Using a Makefile to Submit Multiple Jobs..............................................................................62

5.6 Submitting a Job from a Host Other Than an HP XC Host.............................................................65
5.7 Running Preexecution Programs....................................................................................................65

6 Debugging Applications.............................................................................................67

6.1 Debugging Serial Applications.......................................................................................................67
6.2 Debugging Parallel Applications....................................................................................................67

6.2.1 Debugging with TotalView.....................................................................................................68

6.2.1.1 SSH and TotalView..........................................................................................................68
6.2.1.2 Setting Up TotalView......................................................................................................68
6.2.1.3 Using TotalView with SLURM........................................................................................69
6.2.1.4 Using TotalView with LSF-HPC.....................................................................................69
6.2.1.5 Setting TotalView Preferences.........................................................................................69
6.2.1.6 Debugging an Application..............................................................................................70
6.2.1.7 Debugging Running Applications..................................................................................71
6.2.1.8 Exiting TotalView............................................................................................................71

7 Monitoring Node Activity............................................................................................73

7.1 Installing the Node Activity Monitoring Software.........................................................................73
7.2 Using the xcxclus Utility to Monitor Nodes....................................................................................73
7.3 Plotting the Data from the xcxclus Datafiles...................................................................................76
7.4 Using the xcxperf Utility to Display Node Performance................................................................77
7.5 Plotting the Node Performance Data..............................................................................................79
7.6 Running Performance Health Tests.................................................................................................80

8 Tuning Applications.....................................................................................................85

8.1 Using the Intel Trace Collector and Intel Trace Analyzer...............................................................85

8.1.1 Building a Program – Intel Trace Collector and HP-MPI......................................................85
8.1.2 Running a Program – Intel Trace Collector and HP-MPI.......................................................86

8.2 The Intel Trace Collector and Analyzer with HP-MPI on HP XC...................................................87

8.2.1 Installation Kit.........................................................................................................................87
8.2.2 HP-MPI and the Intel Trace Collector.....................................................................................87

8.3 Visualizing Data – Intel Trace Analyzer and HP-MPI....................................................................89

9 Using SLURM................................................................................................................91

9.1 Introduction to SLURM...................................................................................................................91
9.2 SLURM Utilities...............................................................................................................................91
9.3 Launching Jobs with the srun Command.......................................................................................91

9.3.1 The srun Roles and Modes......................................................................................................92

9.3.1.1 The srun Roles.................................................................................................................92
9.3.1.2 The srun Modes...............................................................................................................92

9.3.2 Using the srun Command with HP-MPI................................................................................92

Table of Contents
