
Open MPI / MPI / Infiniband Jobs

Outdated Article

This article is out of date and in the process of being updated for newer available versions of MPI. If you run into issues with MPI or have questions about its usage, please reach out to VACC Support!

An individual processor is fast, but the real power of supercomputing comes from putting many processors to work on a single task, a technique known as parallel processing.

Message Passing Interface (MPI) is a standardized protocol for communication between processes, both within one compute node and across multiple nodes. Open MPI is an implementation of this standard that many software packages use to distribute work across multiple processors or nodes.

Loading Open MPI

The currently supported version of Open MPI on the cluster is version 4. There are multiple modules available, depending on which network interface you would like Open MPI to use to communicate between nodes.

  • mpi/openmpi-4.1.4-slurm-ib (for InfiniBand)
  • mpi/openmpi-4.1.1-slurm (for Ethernet)

Use the module command to load the appropriate version of Open MPI for your task.

For InfiniBand, use:

module load mpi/openmpi-4.1.4-slurm-ib

For Ethernet, use:

module load mpi/openmpi-4.1.1-slurm
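
After loading a module, you can confirm which Open MPI installation is active. This is a minimal sanity check using standard module and Open MPI commands; the exact output depends on the cluster configuration:

module list        # show currently loaded modules
which mpirun       # confirm mpirun resolves to the loaded Open MPI install
mpirun --version   # report the Open MPI version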

Requesting Partition and Architecture

InfiniBand

On Bluemoon, the nodes that include InfiniBand cards are located in the ib partition. That partition should be used for jobs that use InfiniBand.

You can also, optionally, choose an architecture. Currently, each set of nodes with the same architecture is connected to a single InfiniBand switch. These switches are connected to each other via uplinks, so nodes in different sets can still communicate over InfiniBand, but they won't do so at full bandwidth.

The current architectures/sets of IB-connected nodes are:

  • epyc_7763 – node300-node338
  • epyc_9654 – node400-node408

The node400-node408 range is still in the testing phase, and is currently unavailable.

Slurm uses the --constraint flag to specify features. The following batch script directive sends the job to the node300-node338 range:

#SBATCH --partition=ib --constraint="epyc_7763"
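
For reference, here is a minimal sketch of a complete InfiniBand MPI batch script, assuming a hypothetical executable named my_mpi_program; the node count, tasks per node, and time limit are placeholders to adjust for your own job. With the Slurm-integrated Open MPI module, mpirun should detect the allocation automatically.

#!/bin/bash
#SBATCH --partition=ib
#SBATCH --constraint="epyc_7763"
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=01:00:00
#SBATCH --job-name=mpi_ib_example

# Load the InfiniBand build of Open MPI
module load mpi/openmpi-4.1.4-slurm-ib

# Start one MPI rank per allocated task
mpirun ./my_mpi_program

Submit the script with sbatch, for example: sbatch mpi_ib_example.sh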

The BlackDiamond and DeepGreen partitions also have InfiniBand cards. These can be used to parallelize GPU jobs using MPI.

Ethernet/TCP

Choose the partition that best suits your job. For partition information, see the section Partition in the article Submit a Job.

#SBATCH --partition=bigmem
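
An Ethernet/TCP job script follows the same pattern; in this sketch the partition, resource requests, and program name are again placeholders to adapt to your own job:

#!/bin/bash
#SBATCH --partition=bigmem
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=01:00:00

# Load the Ethernet build of Open MPI
module load mpi/openmpi-4.1.1-slurm

# Start the MPI ranks across the allocated nodes
mpirun ./my_mpi_program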
Updated on April 15, 2024
