Shared memory parallelization
OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform shared memory multiprocessing programming in C, C++, and Fortran. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.
OpenMP uses a portable, scalable model that gives programmers a simple and flexible interface for developing parallel applications for platforms ranging from the standard desktop computer to the supercomputer.
If your application is parallelized using OpenMP or linked against a library using OpenMP (Intel MKL, OpenBLAS, etc.), the number of cores (or threads) that it can use is controlled by the environment variable OMP_NUM_THREADS. This variable must be set before you submit your job:
export OMP_NUM_THREADS=number_of_cores
sbatch --ntasks=1 --cpus-per-task=number_of_cores ...
Please note that for OpenMP, you request --ntasks=1 and then request the number of cores through the sbatch option --cpus-per-task.
NOTE: If OMP_NUM_THREADS is not set, your application will either use only one core or attempt to use all cores that it can find, 'stealing' them from other jobs if needed. In other words, your job will use either too few or too many cores.
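The thread count can also be derived inside the job script, so that it does not have to be kept in sync with the sbatch request by hand. The following is a minimal sketch, assuming that Slurm exports SLURM_CPUS_PER_TASK to the job when --cpus-per-task is requested; the core count of 4 and the program name my_openmp_program are placeholders.

```bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given.
# Fall back to 1 thread if it is missing (e.g. when run outside Slurm).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Running with OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# Placeholder: uncomment and replace with your own OpenMP executable.
# ./my_openmp_program
```

Submitted with sbatch, such a script requests the cores and sets the matching thread count in one place.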
Pthreads and other threaded applications
Pthreads and other threaded applications behave much like OpenMP applications, so it is important to limit the number of threads that the application spawns. There is no standard way to do this; check the application's documentation for the mechanism it supports. Usually a program supports at least one of four ways to limit itself to N threads:
- it understands the OMP_NUM_THREADS=N environment variable,
- it has its own environment variable, such as GMX_NUM_THREADS=N for Gromacs,
- it has a command-line option, such as -nt N (for Gromacs), or
- it has an input-file option, such as num_threads N.
If you are unsure about the program's behavior, please contact us and we will analyze it.
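For a first check of your own, you can count the threads of a running process. The sketch below assumes a Linux node, where the kernel reports the thread count in the Threads field of /proc/PID/status; the function name count_threads is made up for this example.

```bash
#!/bin/bash
# Print the number of threads of the process with the given PID (Linux only).
count_threads() {
    local pid="$1" key val
    # Each line of /proc/PID/status has the form "Key:<whitespace>value".
    while read -r key val; do
        if [ "$key" = "Threads:" ]; then
            echo "$val"
            return 0
        fi
    done < "/proc/${pid}/status"
    return 1
}

# Demonstration on the current shell; in practice pass your program's PID.
echo "Threads in this shell: $(count_threads $$)"
```

Run it on the node where the program is executing and compare the result with the number of cores you requested. A count far above that number suggests the thread limit is not being honored.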
Tips and Tricks
GNU libgomp
Virtually all OpenMP programs compiled with the GNU compilers use the GNU libgomp library. Some of the environment variables it honors may make your program run faster, but they may also make it run much slower. It is safer to leave them unset than to use them without testing whether they help.
- OMP_PROC_BIND
- Binds threads to cores. This option is relatively safe to use, although some programs may run much slower with it. To use it, set the OMP_PROC_BIND environment variable to true before submitting the job (or in a job script):
export OMP_PROC_BIND=true
bsub -n 4 my_program
- GOMP_CPU_AFFINITY
- Binds specific threads to specific cores. Some programs may run much slower with this option. This is strongly discouraged: in most cases the OMP_PROC_BIND option is sufficient. To use it anyway, set the GOMP_CPU_AFFINITY environment variable in a job script (e.g., my_script.sh) according to the cores assigned by LSF:
#!/bin/bash
export GOMP_CPU_AFFINITY="${LSB_BIND_CPU_LIST//,/ }"
my_program
and submit the script to LSF:
bsub -n 4 < my_script.sh
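The expression "${LSB_BIND_CPU_LIST//,/ }" in the job script is a bash parameter expansion that replaces every comma with a space, turning a comma-separated core list into a space-separated one. A small illustration follows; the value 0,2,4,6 is made up for this example and merely stands in for whatever LSF provides.

```bash
#!/bin/bash
# Hypothetical core list, standing in for the value set by LSF.
LSB_BIND_CPU_LIST="0,2,4,6"

# ${VAR//pattern/replacement} replaces ALL occurrences of the pattern,
# here every comma with a single space.
GOMP_CPU_AFFINITY="${LSB_BIND_CPU_LIST//,/ }"

echo "$GOMP_CPU_AFFINITY"   # prints: 0 2 4 6
```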
Intel OpenMP library
Use the KMP_AFFINITY environment variable to control thread affinity. For example:
export KMP_AFFINITY=compact
Refer to the Intel compiler documentation for details and other options.
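Since affinity settings should be tested before they are relied upon, it can help to inspect which cores each thread of a running program is allowed to use. The following is a rough sketch, assuming a Linux node where /proc/PID/task/TID/status contains a Cpus_allowed_list field; the function name show_thread_affinity is invented for this example.

```bash
#!/bin/bash
# For every thread of the given PID, print the cores it may run on (Linux only).
show_thread_affinity() {
    local pid="$1" task key val
    for task in /proc/"${pid}"/task/*; do
        # Skip threads that have exited since the directory was listed.
        [ -r "${task}/status" ] || continue
        while read -r key val; do
            if [ "$key" = "Cpus_allowed_list:" ]; then
                # ${task##*/} strips the directory part, leaving the thread ID.
                echo "thread ${task##*/}: cores ${val}"
            fi
        done < "${task}/status"
    done
}

# Demonstration on the current shell; in practice pass your program's PID.
show_thread_affinity $$
```

If binding is in effect, each thread should report a narrow core list rather than the full set of cores assigned to the job.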