GNU parallel

From ScientificComputing

Category

Development

Description

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
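
As a minimal sketch of the two input styles described above, the first call below runs one job per input line and the second uses --pipe to split standard input into blocks that are piped into the command. The file names are placeholders, and the commands assume that GNU parallel is on the PATH (for example after loading the module listed under "Environment modules").

```shell
#!/bin/bash
# Skip the demo on systems where GNU parallel is not installed.
if command -v parallel >/dev/null 2>&1 &&
   parallel --version 2>/dev/null | grep -q 'GNU parallel'; then

    # One job per input line: {} is replaced by the line.
    # -k keeps the output in input order.
    per_line=$(printf '%s\n' sample1.txt sample2.txt sample3.txt |
        parallel -k echo "processing {}")
    echo "$per_line"

    # Pipe mode: stdin is split into blocks on line boundaries and
    # each block is piped into its own "wc -l". How many blocks are
    # created depends on the block size, so no output is shown here.
    seq 1 100000 | parallel --pipe wc -l
else
    echo "GNU parallel not found; load the module first" >&2
fi
```

In real use, echo would be replaced by the command that processes each file.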

Available versions

Legacy versions Supported versions New versions
20140622

Environment modules

Version    Module load command                           Additional modules loaded automatically
20140622   module load gcc/4.8.2 gnu_parallel/20140622

How to submit a job

You can submit a GNU parallel job in batch mode with the following command:
bsub [LSF options] "parallel [GNU parallel options]"
Replace [GNU parallel options] with the command-line options for GNU parallel and [LSF options] with the LSF parameters that specify the resource requirements of the job. The bsub parameters are documented on the wiki page about the batch system.
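
The same request can also be written as a job script, which avoids nested quoting on the command line. The sketch below assumes that LSF reads #BSUB directives from a script passed on standard input (bsub < script). The file name parallel_job.sh is a hypothetical choice, and the resource values are the ones used in the example on this page.

```shell
#!/bin/bash
# Write a small LSF job script; the file name is an illustrative choice.
job_script=parallel_job.sh

cat > "$job_script" <<'EOF'
#!/bin/bash
# Two cores, ten minutes of run time, and the memory reservation
# used in the example on this page.
#BSUB -n 2
#BSUB -W 0:10
#BSUB -R "rusage[mem=20]"
parallel echo ::: A B C ::: D E F
EOF

# Show what would be submitted.
cat "$job_script"

# On a login node where LSF is available, submit it with:
#   bsub < parallel_job.sh
```

This sketch assumes the gnu_parallel module has already been loaded in the submitting shell. If the job environment is not inherited on your system, add the module load line to the script before the parallel call.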

Example

As an example, we use GNU parallel to print all combinations of the letters A, B and C with D, E and F in parallel.
[leonhard@euler07 ~]$ module load new gcc/4.8.2 gnu_parallel/20140622
[leonhard@euler07 ~]$ bsub -n 2 -W 0:10 -R "rusage[mem=20]" "parallel echo ::: A B C ::: D E F"
Generic job.
Job <35259068> is submitted to queue <normal.4h>.
[leonhard@euler07 ~]$ bjobs
JOBID      USER       STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
35259068   leonhard   PEND  normal.4h  euler07                 *::: D E F Jan  9 08:13
[leonhard@euler07 ~]$ bjobs
JOBID      USER       STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
35259068   leonhard   RUN   normal.4h  euler07     2*e2121     *::: D E F Jan  9 08:13
[leonhard@euler07 ~]$ bjobs
No unfinished job found
[leonhard@euler07 ~]$ tail -n 9 lsf.o35259068 
A D
B D
B F
C E
A E
C F
B E
A F
C D
The resource usage summary for the job is in the corresponding LSF log file (lsf.o35259068 in this example).
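
The nine lines in the log above are not in input order, because each job prints its result as soon as it finishes. For comparison, the sketch below builds the same combinations with a serial nested loop. GNU parallel's -k (--keep-order) option should produce this order, and the comparison only runs where GNU parallel is installed. The variable names are illustrative.

```shell
#!/bin/bash
# Serial reference: every letter of the first list paired with every
# letter of the second list, in input order.
expected=$(
    for first in A B C; do
        for second in D E F; do
            echo "$first $second"
        done
    done
)
echo "$expected"

# If GNU parallel is available, -k (--keep-order) should reproduce the
# same order regardless of which job finishes first.
if command -v parallel >/dev/null 2>&1 &&
   parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    ordered=$(parallel -k echo ::: A B C ::: D E F)
    if [ "$ordered" = "$expected" ]; then
        echo "parallel -k matches the serial order"
    fi
fi
```

Without -k the set of lines is the same, but their order can differ from run to run.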

License information

GPLv3 or later

Links

https://www.gnu.org/software/parallel

https://en.wikipedia.org/wiki/GNU_parallel
https://www.biostars.org/p/63816