
How to Run MPI Programs

mpirun instead of srun#

As explained in the SPOC textbook, before executing an MPI program you must already have obtained a container with the required resources using salloc or sbatch. The integration of MPI in SLURM allows the mpirun command to inherit the number of allocated cores, as srun does. The mpirun command acts as a "master" in charge of scattering "slave" processes over the allocated cores, so:

  • it is executed on the submit host
  • it must not be submitted through srun, because the number n of allocated cores would then be used twice:
      • once by srun to submit n mpirun commands,
      • once by mpirun to submit n MPI processes.

For MPI programs, mpirun replaces srun, but contrary to the latter, mpirun is not able to allocate resources such as cores or memory. That is why it must be launched inside a salloc or sbatch that creates the allocation.
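For illustration, here is a minimal sketch of how to launch an MPI program inside an allocation (the binary name ./my_mpi_app is a placeholder, not a program from the page above):

mpirun inside an allocation (bash)

login@maestro-submit ~ $ module load gcc/9.2.0 openmpi/4.0.5
# wrong: "salloc -n 4 srun mpirun ./my_mpi_app" would start 4 copies of mpirun,
#        each of which would in turn start 4 MPI processes
# right: mpirun alone inherits the 4 allocated cores from the allocation
login@maestro-submit ~ $ salloc -n 4 mpirun ./my_mpi_app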

Prefer sbatch to salloc to submit MPI programs#

With some MPI programs, the mpirun command itself can keep a whole core busy. Moreover, we have seen that the command passed to salloc runs directly on the submit node:

salloc without srun (bash)

login@maestro-submit ~ $ salloc --mem=100M hostname
salloc: job 5714201 has been allocated resources
salloc: Granted job allocation 5714201
salloc: Waiting for resource configuration
salloc: Nodes maestro-1027 are ready for job
maestro-submit
salloc: Relinquishing job allocation 5714201

As a consequence, if many users run mpirun inside a salloc, the submit node becomes overloaded.

Thus, to avoid overloading maestro.pasteur.fr, use sbatch instead of salloc, so that mpirun runs on one of the nodes of the container (the BatchHost). To keep the one-line-command style, use the --wrap option of sbatch:

sbatch mpi oneliner (bash)

login@maestro-submit ~ $ module load  gcc/9.2.0 openmpi/4.0.5
login@maestro-submit ~ $ sbatch -n 3 --tasks-per-node 1  --wrap="mpirun hostname"
Submitted batch job 5612375
login@maestro-submit ~ $ cat slurm-5612375.out
maestro-1003
maestro-1010
maestro-1015
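
The same job can also be submitted with a regular batch script instead of --wrap. The script below is only a sketch using the same options as the one-liner above; the file name mpi_job.sh is arbitrary:

sbatch mpi script (bash)

#!/bin/bash
#SBATCH -n 3
#SBATCH --tasks-per-node 1
# load the MPI environment inside the job
module load gcc/9.2.0 openmpi/4.0.5
# mpirun runs on the BatchHost and inherits the 3 allocated tasks
mpirun hostname

It is then submitted with sbatch mpi_job.sh and produces the same kind of slurm-*.out file.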

Beware of the quoting when using double quotes ", in particular if you need to use loop variables, as in:

use of double quote when using variables (bash)

login@maestro-submit ~ $ for i in *.fastq; do sbatch -c 4 --wrap="pigz -p \$SLURM_CPUS_PER_TASK $i"; done
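
To see why the backslash matters (a minimal illustration; file.fastq is a placeholder): $i must be expanded by your shell at submission time, whereas \$SLURM_CPUS_PER_TASK must be kept literal so that it is only expanded inside the job, where SLURM defines it:

escaped versus unescaped variables (bash)

# unescaped: expanded on the submit host, where SLURM_CPUS_PER_TASK is empty -> "pigz -p file.fastq"
login@maestro-submit ~ $ sbatch -c 4 --wrap="pigz -p $SLURM_CPUS_PER_TASK file.fastq"
# escaped: expanded inside the job, where SLURM_CPUS_PER_TASK=4 -> "pigz -p 4 file.fastq"
login@maestro-submit ~ $ sbatch -c 4 --wrap="pigz -p \$SLURM_CPUS_PER_TASK file.fastq"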

Bioinformatics MPI programs that need neither srun nor mpirun#

Some programs available through module use MPI. This is especially the case for programs from the ptools package:

* mGenomeAnalysisTK
* mblastall
* mbowtie2
* mbwa
* mcutadapt
* mpbayes
* mtaxoptimizer

These programs are MPI wrappers that already contain an invocation of mpirun. Thus, they just need to be launched inside a container using salloc or sbatch, as above.
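
As a purely illustrative sketch (the module name ptools, the bwa-style arguments and the file names are assumptions, not taken from the page above), such a wrapper could be submitted like this:

ptools wrapper inside sbatch (bash)

login@maestro-submit ~ $ module load ptools
login@maestro-submit ~ $ sbatch -n 4 --wrap="mbwa mem reference.fa reads.fastq"
# mbwa calls mpirun internally over the 4 allocated cores: no srun and no explicit mpirun needed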