
Using GPUs

This page provides generic information about how to access GPUs through the Slurm scheduler.

When to use a GPU

You should consider using a GPU for your work if:

  • Your job has GPU support/functionality, and
  • Your job is substantially large or will take a long time to run without a GPU,
  • Or you are performing a task that needs a GPU (e.g. working with large language models, or some machine learning methods such as neural networks).

Warning

Your first stop when looking into using GPUs should be the documentation of the application you are using.
Not every process can use a GPU, and how to use them effectively varies greatly!
There is a list of commonly used GPU-supporting software at the bottom of this page.

Request GPU resources using Slurm

To request a GPU for your Slurm job, add the following option in the header of your submission script:

#SBATCH --gpus-per-node=1

You can specify the type and number of GPUs you need using the following syntax:

#SBATCH --gpus-per-node=<gpu_type>:<gpu_number>

It is recommended to specify the exact GPU type required; otherwise, the job may be allocated to any available GPU at the time of execution.
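For example, a minimal sketch (assuming an A100 card, listed below, suits your work) requesting two A100 GPUs on one node:

#SBATCH --gpus-per-node=a100:2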

Note

Recall, memory associated with the GPUs is the VRAM, and is a separate resource from the RAM requested by Slurm. The memory values listed below are VRAM values.

Available GPU types and the Slurm header lines used to request them:

  • NVIDIA A100 (80GB VRAM, max 4):
    #SBATCH --partition=milan
    #SBATCH --gpus-per-node=a100:1
  • NVIDIA A100 (40GB VRAM, max 2):
    #SBATCH --partition=genoa
    #SBATCH --gpus-per-node=a100:1
  • NVIDIA H100 (96GB VRAM, max 2):
    #SBATCH --gpus-per-node=h100:1
  • NVIDIA L4 (24GB VRAM, max 4; no double precision floating point, fp64):
    #SBATCH --gpus-per-node=l4:1

You can also use the --gpus-per-node option in Slurm interactive sessions, with the srun and salloc commands. For example:

srun --job-name "InteractiveGPU" --gpus-per-node L4:1 --partition genoa --cpus-per-task 8 --mem 2GB --time 00:30:00 --pty bash

will request and then start a bash session with access to an L4 GPU, for a duration of 30 minutes.
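The salloc command accepts the same options. For example, a sketch of an equivalent allocation using the same hypothetical resource choices:

salloc --job-name "InteractiveGPU" --gpus-per-node L4:1 --partition genoa --cpus-per-task 8 --mem 2GB --time 00:30:00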

Warning

When you use the --gpus-per-node option, Slurm automatically sets the CUDA_VISIBLE_DEVICES environment variable inside your job environment, listing the indices of the allocated GPU cards on each node.

srun --job-name "GPUTest" --gpus-per-node=L4:2 --time 00:05:00 --pty bash
srun: job 20015016 queued and waiting for resources
srun: job 20015016 has been allocated resources
$ echo $CUDA_VISIBLE_DEVICES
0,1
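Inside a job script you can use this variable, for example, to report how many GPUs the job can see. A minimal bash sketch (it assumes the variable is set, i.e. the job was submitted with --gpus-per-node):

# count the comma-separated GPU indices Slurm exposed to this job
num_gpus=$(echo "${CUDA_VISIBLE_DEVICES}" | tr ',' '\n' | wc -l)
echo "Visible GPUs: ${num_gpus} (indices: ${CUDA_VISIBLE_DEVICES})"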

Load CUDA and cuDNN modules

To use an Nvidia GPU card with your application, you need to load the driver and the CUDA toolkit via the environment modules mechanism:

module load CUDA/11.0.2

You can list the available versions using:

module spider CUDA

Please Contact our Support Team if you need a version not available on the platform.

The CUDA module also provides access to additional command line tools:

  • nvidia-smi to directly monitor GPU resource utilisation,
  • nvcc to compile CUDA programs,
  • cuda-gdb to debug CUDA applications.
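As a quick sanity check after loading the module, these tools can be queried directly. A sketch (the CUDA version is only an example; use one reported by module spider CUDA):

module load CUDA/11.0.2
nvcc --version      # confirm the CUDA compiler is available
nvidia-smi          # confirm the driver can see the allocated GPU(s)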

In addition, the NVIDIA CUDA® Deep Neural Network library (cuDNN) is accessible via its dedicated module:

module load cuDNN/8.0.2.39-CUDA-11.0.2

which will automatically load the related CUDA version. Available versions can be listed using:

module spider cuDNN
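To confirm that the matching CUDA version was loaded alongside cuDNN, a quick sketch using the standard module command:

module load cuDNN/8.0.2.39-CUDA-11.0.2
module list         # CUDA/11.0.2 should appear together with cuDNN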

Example Slurm script

The following Slurm script illustrates a minimal example to request a GPU card, load the CUDA toolkit and query some information about the GPU:

#!/bin/bash -e

#SBATCH --job-name       GPUJob      # job name (shows up in the queue)
#SBATCH --account        nesi99991   # Your account
#SBATCH --time           00-00:10:00 # Walltime (DD-HH:MM:SS)
#SBATCH --partition      genoa       # This means the job will land on A100 with 40GB VRAM
#SBATCH --gpus-per-node  A100:1      # GPU resources required per node
#SBATCH --cpus-per-task  2           # number of CPUs per task (1 by default)
#SBATCH --mem            512MB       # amount of memory per node (1 by default)

# load CUDA module
module purge
module load CUDA/11.0.2

# display information about the available GPUs
nvidia-smi

# check the value of the CUDA_VISIBLE_DEVICES variable
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"

Save this in a test_gpu.sl file and submit it using:

sbatch test_gpu.sl

The content of the job output file will look something like this:

cat slurm-20016124.out
The following modules were not unloaded:
   (Use "module --force purge" to unload all):

  1) slurm   2) NeSI
Wed May 12 12:08:27 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  On   | 00000000:05:00.0 Off |                    0 |
| N/A   29C    P0    23W / 250W |      0MiB / 12198MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
CUDA_VISIBLE_DEVICES=0

Note

CUDA_VISIBLE_DEVICES=0 indicates that this job was allocated to CUDA GPU index 0 on this node. It is not a count of allocated GPUs.

Live monitoring your job's GPU(s)

It is possible to visually inspect your job's GPU usage live. To do this:

  1. Obtain the job id for your job of interest by typing squeue --me into the terminal.

    user.name@login03:$ squeue --me
    JOBID         USER     ACCOUNT   NAME        CPUS MIN_MEM PARTITI START_TIME     TIME_LEFT STATE    NODELIST(REASON)    
    1234567       user.nam nesi99999 Example_GPU_   8     24G genoa   Apr 30 17:36    23:58:08 RUNNING  g09               
    
  2. Jump onto the node your job is running on by typing jump_into <JobId>, replacing <JobId> with your job of interest.

    user.name@login03:$ jump_into 1234567
    Jumping to node: g09 (job 1234567)    
    
  3. Type nvtop into the terminal. This will open an interface that allows you to inspect how your job is running on the GPU.

    nvtop
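If nvtop is not available on the node, a rough alternative (a sketch, assuming nvidia-smi is on your PATH on the compute node) is to poll nvidia-smi:

# refresh the nvidia-smi report every 2 seconds; press Ctrl-C to stop
watch -n 2 nvidia-smi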

Measuring GPU efficiency after a job has finished

It is possible to measure your GPU's processing and memory efficiency in two ways:

Using seff

Once your job has finished, it is possible to use seff to get a measure of the GPU utilisation and GPU memory efficiency. To use this feature, type into the terminal

seff <JobID>

Where <JobID> is the job ID for the job of interest. For example:

user.name@login03$ seff 1234567
Cluster: hpc
Job ID: 1234567
State: TIMEOUT
Cores: 4
Tasks: 1
Nodes: 1
Job Wall-time:   100.4%  00:15:04 of 00:15:00 time limit
CPU Utilisation:  98.5%  00:59:20 of 01:00:16 core-walltime
Mem Utilisation:   1.2%  284.46 MB of 24.00 GB
GPU Utilisation:  43  %
GPU Memory:        2.2%  510.00 MB of 23 GB

Using Slurm Native Profiling

Before you submit your Slurm job, include the following line near the start of your Slurm submission file:

#SBATCH --profile task

Then allow your job to run. Once your job has finished, type into the terminal

profile_plot <JobID>

Where <JobID> is the job ID for the job of interest. This will create a file called <JobID>_profile.png, which will look something like this:

[Example profile plot: 5831962_profile.png]

See Slurm Native Profiling for more information on this feature.

How to determine which GPU is best for your job

The following flow diagram explains the steps you should take to test which GPU is right for your job.

[Flow diagram: GPU_What_GPU_is_right_for_your_job.png]

When running a 15-minute test job, add the following settings to your Slurm submission script:

#SBATCH --time=00:15:00
#SBATCH --gpus-per-node=<gpu-type>:1
#SBATCH --qos=debug
#SBATCH --profile=task # Only for testing
#SBATCH --acctg-freq=1 # Only for testing

To record the GPU utilisation and GPU memory, see Measuring GPU efficiency after a job has finished for more information.
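Putting these settings together, a minimal sketch of a 15-minute test job (the account code, CUDA version and GPU type below are placeholders; substitute your own):

#!/bin/bash -e

#SBATCH --job-name       GPUTypeTest      # hypothetical name for this test job
#SBATCH --account        nesi99991        # replace with your own project code
#SBATCH --time           00:15:00         # walltime for the test run
#SBATCH --gpus-per-node  <gpu-type>:1     # e.g. a100:1, h100:1 or l4:1
#SBATCH --qos            debug
#SBATCH --profile        task             # only for testing
#SBATCH --acctg-freq     1                # only for testing

module purge
module load CUDA/11.0.2

# run a short, representative piece of your real workload here, then check
# its efficiency with seff or profile_plot once the job finishes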

Application and toolbox specific support pages

See the Supported Applications page for more information on which software has GPU support, as well as on programming toolkits.