GNU parallel

From ScientificComputing

Category

Development

Description

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
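The two input styles described above can be sketched as follows. This is only a minimal illustration, assuming that GNU parallel is available on the PATH. The function names per_line_jobs and count_lines_in_blocks are made up for this page. The replacement string {} and the options --pipe and -N come from the GNU parallel manual, not from this wiki page.

```shell
# Hypothetical helper: run one job per input line read from stdin.
# {} is replaced by the current input line.
per_line_jobs() {
    parallel echo "processing {}"
}

# Hypothetical helper: split stdin into blocks of 2 lines each and
# pipe every block into its own `wc -l` process.
count_lines_in_blocks() {
    parallel --pipe -N 2 wc -l
}
```

For instance, printf '%s\n' a.txt b.txt c.txt | per_line_jobs starts three echo jobs, one per file name, and seq 1 6 | count_lines_in_blocks starts three wc processes that each receive two lines. The order of the output lines is not guaranteed.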

Available versions (Euler, old software stack)

Legacy versions Supported versions New versions
20140622

Please note that this page refers to installations from the old software stack. There are two software stacks on Euler. Newer versions of software are found in the new software stack.

Environment modules (Euler, old software stack)

Version   Module load command                            Additional modules loaded automatically
20140622  module load gcc/4.8.2 gnu_parallel/20140622


How to submit a job

You can submit a GNU parallel job in batch mode with the following command:
sbatch [Slurm options] --wrap="parallel [GNU parallel options]"
Here you need to replace [GNU parallel options] with GNU parallel command line options and [Slurm options] with the Slurm parameters for the resource requirements of the job. The sbatch parameters are documented on the wiki page about the batch system.
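Instead of --wrap, the same kind of command can also be placed in a job script. The following is only a sketch: the helper name write_job_script and the resource values (1 task, 2 cores, 10 minutes) are assumptions for illustration, not recommendations of this wiki. --ntasks, --cpus-per-task and --time are standard sbatch options, and SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested.

```shell
# Hypothetical helper: write a Slurm job script to the path given as $1.
# The resource values below are placeholders, adapt them to your job.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --time=00:10:00

# Run as many jobs at once as cores were requested (fallback: 2).
parallel -j "${SLURM_CPUS_PER_TASK:-2}" echo ::: A B C ::: D E F
EOF
}
```

You could then call write_job_script parallel_job.sh and submit the result with sbatch parallel_job.sh, assuming that the gnu_parallel module is loaded in the shell from which you submit.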

Example

As an example, we print all combinations of the letters A, B and C with the letters D, E and F in parallel. Please note that the transcript below uses the LSF batch system commands bsub and bjobs.
[leonhard@euler07 ~]$ module load new gcc/4.8.2 gnu_parallel/20140622
[leonhard@euler07 ~]$ bsub -n 2 -W 0:10 -R "rusage[mem=20]" "parallel echo ::: A B C ::: D E F"
Generic job.
Job <35259068> is submitted to queue <normal.4h>.
[leonhard@euler07 ~]$ bjobs
JOBID      USER       STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
35259068   leonhard   PEND  normal.4h  euler07                 *::: D E F Jan  9 08:13
[leonhard@euler07 ~]$ bjobs
JOBID      USER       STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
35259068   leonhard   RUN   normal.4h  euler07     2*e2121     *::: D E F Jan  9 08:13
[leonhard@euler07 ~]$ bjobs
No unfinished job found
[leonhard@euler07 ~]$ tail -n 9 lsf.o35259068 
A D
B D
B F
C E
A E
C F
B E
A F
C D
You can find the resource usage summary in the LSF log file of the job (here lsf.o35259068).
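The nine lines in the log file above are not in input order, because each line is printed as soon as its job finishes. If you need a reproducible order, the -k (--keep-order) option of GNU parallel can be used. A minimal sketch, assuming GNU parallel is on the PATH; the function name combinations_in_order is made up for this page:

```shell
# Hypothetical helper: same combinations as in the example above, but
# -k (--keep-order) prints the results in the order of the input
# combinations instead of the order in which the jobs finish.
combinations_in_order() {
    parallel -k echo ::: A B C ::: D E F
}
```

Calling combinations_in_order prints A D, A E, A F, B D, B E, B F, C D, C E, C F, one combination per line. The last input source varies fastest.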

License information

GPLv3 or later

Links

https://www.gnu.org/software/parallel

https://en.wikipedia.org/wiki/GNU_parallel
https://www.biostars.org/p/63816