GNU Parallel
From ScientificComputing
Revision as of 13:34, 21 October 2022 by Sfux (→Using GNU Parallel on multiple nodes)
If you have many very short calculations to run, you can use GNU Parallel to run them within a single batch job. This job can be serial, parallel, or even distributed over several nodes.
Using GNU Parallel
Assume you have a file listing several commands to run:
[sfux@eu-login-01 ~]$ cat parcommands.txt
printf "Running 1 on $HOSTNAME\n"
printf "Running 2 on $HOSTNAME\n"
printf "Running 3 on $HOSTNAME\n"
printf "Running 4 on $HOSTNAME\n"
printf "Running 5 on $HOSTNAME\n"
printf "Running 6 on $HOSTNAME\n"
printf "Running 7 on $HOSTNAME\n"
printf "Running 8 on $HOSTNAME\n"
printf "Running 9 on $HOSTNAME\n"
printf "Running 10 on $HOSTNAME\n"
printf "Running 11 on $HOSTNAME\n"
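A command file like this does not have to be written by hand. The following is a minimal sketch of one way to generate it with a shell loop; the loop bounds and the file name simply mirror the example above and can be adapted as needed.

```shell
# Write 11 commands, one per line, into parcommands.txt.
# The single-quoted format string keeps $HOSTNAME literal, so it is
# expanded later on the compute node that runs the command.
for i in $(seq 1 11); do
    printf 'printf "Running %s on $HOSTNAME\\n"\n' "$i"
done > parcommands.txt
```

Each line of the resulting file is an independent command, which is the form GNU Parallel expects when it reads commands from standard input.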
GNU Parallel can take this list of commands and run them concurrently within a parallel job. For example,
[sfux@eu-login-01 ~]$ sbatch --ntasks=4 --wrap="parallel < parcommands.txt > paroutput.txt"
When the job has finished, the output file contains:
[sfux@eu-login-01 ~]$ cat paroutput.txt
Running 1 on eu-ms-001-01
Running 2 on eu-ms-001-01
Running 3 on eu-ms-001-01
Running 4 on eu-ms-001-01
Running 5 on eu-ms-001-01
Running 6 on eu-ms-001-01
Running 7 on eu-ms-001-01
Running 8 on eu-ms-001-01
Running 9 on eu-ms-001-01
Running 10 on eu-ms-001-01
Running 11 on eu-ms-001-01
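With a long command list it can be useful to check that nothing was skipped. The sketch below assumes, as in the example above, that every command prints exactly one line; the function name check_complete is hypothetical and not part of GNU Parallel. Note also that GNU Parallel prints each command's output when that command finishes, so the lines may not appear in input order unless the -k (--keep-order) option is used.

```shell
# Hypothetical helper: succeeds only if the output file has as many
# lines as the command file (assumes one output line per command).
check_complete() {
    cmds=$1
    out=$2
    expected=$(grep -c '' "$cmds")   # number of commands
    actual=$(grep -c '' "$out")      # number of output lines
    if [ "$expected" -eq "$actual" ]; then
        echo "all $expected commands produced output"
    else
        echo "expected $expected lines, found $actual" >&2
        return 1
    fi
}
```

It could be called as check_complete parcommands.txt paroutput.txt once the job has finished.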
Using GNU Parallel on multiple nodes
You can also use GNU Parallel to run the commands over several nodes. For example,
[sfux@eu-login-01 ~]$ sbatch --ntasks=48 --wrap='parallel --ssh "srun" -S "$(/cluster/apps/local/hostlist_parallel.sh)" < parcommands.txt'
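The contents of /cluster/apps/local/hostlist_parallel.sh are not shown on this page. For orientation, the -S (--sshlogin) option of GNU Parallel accepts a comma-separated list of hosts, each optionally prefixed with a number of job slots in the form slots/host. The sketch below only illustrates that format; make_sshlogin_list is a hypothetical helper, not the cluster's script, and it assumes the host names and slot counts are already known.

```shell
# Hypothetical helper, NOT the cluster's hostlist_parallel.sh.
# Reads "hostname slots" pairs on stdin and prints them as
# "slots/hostname,slots/hostname", the form accepted by parallel -S.
make_sshlogin_list() {
    awk '{ printf "%s%s/%s", (NR > 1 ? "," : ""), $2, $1 } END { print "" }'
}
```

For example, printf 'nodeA 4\nnodeB 2\n' | make_sshlogin_list prints 4/nodeA,2/nodeB, where nodeA and nodeB are placeholder host names.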