MPI

MPI stands for Message Passing Interface; it is a communication standard used to achieve distributed parallel computation.

MPI is similar in some ways to multi-threading, but it does not require shared memory and can therefore be used across multiple nodes, at the cost of higher communication and memory overheads.

For MPI jobs you need to set --ntasks to a value larger than 1, or, if you want every node to run the same number of tasks, set --ntasks-per-node and --nodes instead.
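
For example, a job wanting four MPI tasks could request them in either of these ways (the numbers are illustrative only):

#SBATCH --ntasks=4              # 4 tasks, placed wherever Slurm finds free CPUs

or, to spread the same four tasks evenly across two nodes:

#SBATCH --nodes=2               # 2 nodes
#SBATCH --ntasks-per-node=2     # 2 tasks on each node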

MPI programs require a launcher to start the ntasks processes on multiple CPUs, which may belong to different nodes. On Slurm systems like ours, the preferred launcher is srun rather than mpirun.
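
Inside a batch script the program is simply passed to srun, which starts one copy per task. The executable name below, my_mpi_program, is a hypothetical placeholder:

srun ./my_mpi_program           # srun launches ntasks copies across the allocated CPUs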

Since the distribution of tasks across nodes may be unpredictable, --mem-per-cpu (memory per allocated CPU) should be used instead of --mem (memory per node).

#!/bin/bash -e
#SBATCH --job-name=MPIJob       # job name (shows up in the queue)
#SBATCH --time=00:01:00         # Walltime (HH:MM:SS)
#SBATCH --mem-per-cpu=512MB     # memory per logical CPU (half the per-physical-core requirement, as each physical core provides two logical CPUs)
#SBATCH --cpus-per-task=4       # 4 logical CPUs, i.e. 2 physical cores, per task
#SBATCH --ntasks=2              # number of tasks (e.g. MPI)

srun pwd                        # Prints the working directory, once per task

With --ntasks=2, the expected output is one line per task:

/home/user001/demo
/home/user001/demo

Warning

For non-MPI programs, either set --ntasks=1 or do not use srun at all. Using srun in conjunction with --cpus-per-task=1 will cause --ntasks to default to 2.
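
For comparison, a minimal job script for a non-MPI (serial) program might look like the sketch below; the job name and resource values are illustrative only:

#!/bin/bash -e
#SBATCH --job-name=SerialJob    # job name (shows up in the queue)
#SBATCH --time=00:01:00         # Walltime (HH:MM:SS)
#SBATCH --mem=512MB             # total memory (per-node memory is fine for a single-node job)
#SBATCH --cpus-per-task=1       # a single CPU
#SBATCH --ntasks=1              # a single task, so srun is not needed

pwd                             # run the program directly, without srun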