Distributed memory parallelization


The Message Passing Interface (MPI) is a standardized and portable message-passing standard designed by a group of researchers from academia and industry to function on a wide variety of parallel computing architectures. MPI is a communication protocol for programming parallel computers; both point-to-point and collective communication are supported. Its goals are high performance, scalability, and portability, and it remains the dominant model used in high-performance computing today.

MPI is not sanctioned by any major standards body; nevertheless, it has become a de facto standard for communication among processes that model a parallel program running on a distributed-memory system, such as a computer cluster. Most MPI implementations consist of a specific set of routines directly callable from C, C++, and Fortran (i.e., an API), as well as from any language able to interface with such libraries, including C#, Java, and Python.
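For illustration, a minimal MPI program in C, along the lines of the hello_world example used below, might look as follows (a sketch, not the exact program referenced on this page):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank (ID) of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                         /* shut down the MPI runtime */
    return 0;
}

Each process prints one line with its own rank; with 4 processes, ranks 0 through 3 are reported.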

MPI on our clusters

Multiple MPI libraries are available on Euler; see Message Passing Interface. Before executing an MPI job, the corresponding modules must be loaded (compiler first, then MPI library):

module load compiler
module load mpi_library
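Once the modules are loaded, the MPI compiler wrappers become available. A C program such as the sketch above could, for example, be compiled with Open MPI's mpicc wrapper (the file name hello_world.c is hypothetical):

mpicc -O2 -o hello_world hello_world.c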

The command used to launch an MPI application is mpirun.
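For example, a compiled MPI program can be started with four processes like this (a sketch; on the clusters, production runs should go through the batch system as shown below):

mpirun -np 4 ./hello_world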

To execute hello_world with 4 MPI processes, load the versions of GCC and Open MPI that were used to compile it, then submit the job:

module load gcc/8.2.0
module load openmpi/4.1.4
sbatch --ntasks=4 --wrap="mpirun ./hello_world"
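Equivalently, the same job can be submitted with a job script instead of --wrap. A minimal sketch using standard Slurm directives (the file name job_script.sh is arbitrary):

#!/bin/bash
#SBATCH --ntasks=4
module load gcc/8.2.0
module load openmpi/4.1.4
mpirun ./hello_world

sbatch job_script.sh

With an Open MPI that is built with Slurm support, mpirun detects the number of allocated tasks automatically, so no explicit -np flag is needed inside the job.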