Job arrays

Introduction

Many cluster users are running embarrassingly parallel simulations consisting of hundreds or thousands of similar calculations, each one executing the same program but with slightly different — or random in the case of Monte-Carlo simulation — parameters. The usual approach is to submit each one as an independent job. This works fine, although keeping track of all these jobs is not easy, and can get quite complicated if these jobs must be executed in a coordinated fashion (e.g. master/slave). It would be much simpler if one could submit all these jobs at once, and manage them as a single entity. The good news is that it is indeed possible using a so-called job array. Jobs in an array have a common job-ID, plus a specific job-index ($SLURM_ARRAY_TASK_ID) corresponding to their position in the array.

Submitting a job array

Let's take for example a simulation consisting of 4 independent calculations. Normally, one would submit them as 4 individual jobs:

sbatch --job-name="calc 1" --wrap="./program [arguments]"
sbatch --job-name="calc 2" --wrap="./program [arguments]"
sbatch --job-name="calc 3" --wrap=./program"[arguments]"
sbatch --job-name="calc 4" --wrap=./program"[arguments]"

or

for ((n=1;n<=4;n++)); do
    sbatch --job-name="calc $n" --wrap="./program [arguments]"
done

Using a job array, however, one can submit these calculations all at once, using a single sbatch command:

sbatch --array=1-4 --wrap="./program [arguments]"

For example:

[sfux@eu-login-40 ~]$ sbatch --array=1-4 --wrap="echo \"Hello, I am an independent job\""
Submitted batch job 1189055
[sfux@eu-login-40 ~]$ squeue -u sfux
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     1189055_[1-4] normal.4h     wrap     sfux PD       0:00      1 (None) 

A job array creates a separate Slurm logfile for each element, named slurm-JOBID_ELEMENT.out:

[sfux@eu-login-40 ~]$ ls -ltr slurm*
-rw-r--r-- 1 sfux sfux-group 31 Oct 24 10:50 slurm-1189055_1.out
-rw-r--r-- 1 sfux sfux-group 31 Oct 24 10:50 slurm-1189055_2.out
-rw-r--r-- 1 sfux sfux-group 31 Oct 24 10:50 slurm-1189055_3.out
-rw-r--r-- 1 sfux sfux-group 31 Oct 24 10:50 slurm-1189055_4.out
[sfux@eu-login-40 ~]$ cat slurm-1189055_1.out
Hello, I am an independent job

Setting a range of 1-4 will submit 4 jobs (using the default step size 1).
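
Instead of --wrap, the same array can be submitted with a job script, setting the range via an #SBATCH directive. A minimal sketch (the script name job_array.sh is hypothetical):

#!/bin/bash
#SBATCH --array=1-4
#SBATCH --job-name=calc
# Each task runs the same command; $SLURM_ARRAY_TASK_ID identifies the element
./program [arguments]

and submit it with:

sbatch job_array.sh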

Limiting the number of jobs that are allowed to run at the same time

A job array allows a large number of jobs to be submitted with a single command, which can potentially flood the system. To limit the impact, you can restrict how many jobs of the array may run at the same time by appending %job_slot_limit to the range of the array:

sbatch --array=1-10000%10 --wrap="echo \"Hello, I am an independent job\""

In this example the array contains 10000 elements, and at most 10 of them are allowed to run at the same time.

Simulation parameters

Since all jobs in an array execute the same program (or script), you need to define specific parameters for each calculation. You can do this using different mechanisms:

  • create a different input file for each job
  • pass the job index as argument to the program
  • use a "commands" file with 1 command per line

Input and output files

One can use the special strings %A (job ID) and %a (task/element ID) as placeholders in the job's input file name. For example:

sbatch --job-name="testjob" --array=1-4 --input="param.%A.%a" --wrap="command [argument]"
sbatch --job-name="testjob" --array=1-4 --input="calc%A.%a.in" --wrap="command [argument]"

The same mechanism also applies to the output file:

sbatch --job-name="testjob" --array=1-4 --output="result.%A.%a" --wrap="command [argument]"
sbatch --job-name="testjob" --array=1-4 --output="calc%A.%a.out" --wrap="command [argument]"

or the error file:

sbatch --job-name="testjob" --array=1-4 --error="error.%A.%a" --wrap="command [argument]"
sbatch --job-name="testjob" --array=1-4 --error="%A.%a.err" --wrap="command [arguments]"

Program arguments

A common case is to pass the parameter value (the array index $SLURM_ARRAY_TASK_ID) as a command-line argument. Here is an example for a MATLAB function with the parameter as its sole argument:

sbatch --job-name="hello" --array=1-4 --wrap="matlab -nodisplay -singleCompThread -r my_function(\$SLURM_ARRAY_TASK_ID)"

It is important that the $ sign in front of SLURM_ARRAY_TASK_ID is escaped with a backslash, \$, so that the variable is evaluated at runtime inside each job rather than at submission time. This example is equivalent to submitting 4 jobs in a row:

sbatch --job-name="hello" --wrap="matlab -nodisplay -singleCompThread -r my_function(1)"
sbatch --job-name="hello" --wrap="matlab -nodisplay -singleCompThread -r my_function(2)"
sbatch --job-name="hello" --wrap="matlab -nodisplay -singleCompThread -r my_function(3)"
sbatch --job-name="hello" --wrap="matlab -nodisplay -singleCompThread -r my_function(4)"

You can specify the range for the job array by using the format

start-end:step

For example

sbatch --job-name="testjob" --array=10-20:2 --wrap="echo \$SLURM_ARRAY_TASK_ID"

would create a job array with 6 elements that would be equivalent to submitting the following six commands:

sbatch --job-name="testjob" --wrap="echo 10"
sbatch --job-name="testjob" --wrap="echo 12"
sbatch --job-name="testjob" --wrap="echo 14"
sbatch --job-name="testjob" --wrap="echo 16"
sbatch --job-name="testjob" --wrap="echo 18"
sbatch --job-name="testjob" --wrap="echo 20"

The table below gives an overview of the environment variables available inside job arrays in Slurm:

Environment variable       Description
$SLURM_ARRAY_TASK_COUNT    Number of tasks (elements) in the job array
$SLURM_ARRAY_TASK_ID       Array index of the current element
$SLURM_ARRAY_TASK_MIN      Minimum index in the job array
$SLURM_ARRAY_TASK_MAX      Maximum index in the job array
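
For example, the following one-liner prints these variables from inside each array task (a minimal sketch, using the same echo/--wrap pattern as above):

sbatch --array=1-4 --wrap="echo \"task \$SLURM_ARRAY_TASK_ID of \$SLURM_ARRAY_TASK_COUNT (range \$SLURM_ARRAY_TASK_MIN-\$SLURM_ARRAY_TASK_MAX)\""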

Using a "commands" file

Using the job index directly works well for a single parameter, or for a set of parameters that can be mapped to natural numbers (the actual parameter is then calculated from the job index). When multiple parameters cannot be mapped to natural numbers, an alternative is to create a text file, "commands", containing one command per line.

The variable $SLURM_ARRAY_TASK_ID then serves as a pointer that determines which line of the file each job executes.

sbatch --job-name="testjob" --array=1-4 --wrap="awk -v jindex=\$SLURM_ARRAY_TASK_ID 'NR==jindex' commands | bash"

The awk command extracts line number $SLURM_ARRAY_TASK_ID from the "commands" file and pipes it to bash, which executes it.

The first job then executes the first command from the "commands" file, the second job the second command, and so on.
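
As an illustration, a "commands" file for the four-element array above might contain (program names and arguments are hypothetical; one complete command per line):

./program --param 0.1 input1.dat
./program --param 0.2 input2.dat
./program --param 0.1 input3.dat
./program --param 0.2 input4.dat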

Group calculations into fewer jobs

Often the jobs within a job array are too short (anything below a few minutes) because every job in the array runs just one short calculation.

You can increase the throughput of your entire job array by grouping several calculations into fewer jobs instead of running a single calculation per job. Aim for each job to run for at least about half an hour, and never less than 5 minutes.

In the previous example, we showed how to run four MATLAB function calls (matlab -nodisplay -singleCompThread -r "my_function(\$SLURM_ARRAY_TASK_ID)") as a job array with four jobs. Now let us convert this to a job array with two jobs, each of which runs two of the function calls. In the first step we will put the MATLAB call into a script, run_my_function.sh:

#!/bin/bash
matlab -nodisplay -singleCompThread -r "my_function($SLURM_ARRAY_TASK_ID)"

which can be submitted by redirecting it to the sbatch command:

sbatch --job-name="hello" --array=1-4 < run_my_function.sh

So far nothing has changed except for how the command is passed to sbatch. Note that there is no backslash before $SLURM_ARRAY_TASK_ID in the script, because the variable is only evaluated when the script runs on the compute node. In the second step, change the run_my_function.sh script to run two MATLAB function calls using a for loop. Define the STEP variable to be the number of calculations to run in each loop. In our case this is 2:

#!/bin/bash
STEP=2
for ((i=1;i<=$STEP;i++)); do
    MY_JOBINDEX=$((($SLURM_ARRAY_TASK_ID-1)*$STEP + $i))
    matlab -nodisplay -singleCompThread -r "my_function($MY_JOBINDEX)"
done

Note that we now pass MY_JOBINDEX instead of SLURM_ARRAY_TASK_ID to the my_function call so that each calculation gets its own unique index. Submit this script, but tell Slurm to run just two jobs in the job array (4 calculations / (2 calculations/job) = 2 jobs):

sbatch --job-name="hello --array=1-2 < run_my_function.sh

If the number of calculations to run is not divisible by the number of calculations per job (let's say we want to run 3 calculations per job), then expand the script to be as follows:

#!/bin/bash
STEP=3
MAXINDEX=4
for ((i=1;i<=$STEP;i++)); do
    MY_JOBINDEX=$((($SLURM_ARRAY_TASK_ID-1)*$STEP + $i))
    if [ $MY_JOBINDEX -gt $MAXINDEX ]; then
        break
    fi
    matlab -nodisplay -singleCompThread -r "my_function($MY_JOBINDEX)"
done

Submit this script and set the ending value of the range to ceiling(MAXINDEX/STEP) = ceiling(4/3) = 2:

sbatch --job-name="hello" --array=1-2 < run_my_function.sh

Monitoring job arrays

You can monitor a job array with the squeue, scontrol or sacct command:

squeue -j JOBID                         # all jobs in an array
squeue -j JOBID_ELEMENT                 # specific job in an array
scontrol show jobid -dd JOBID           # all jobs in an array
scontrol show jobid -dd JOBID_ELEMENT   # specific job in an array
sacct -j JOBID --format JobID,User,State,AllocCPUS,Elapsed,NNodes,NTasks,TotalCPU,REQMEM,MaxRSS,ExitCode           # all jobs in an array
sacct -j JOBID_ELEMENT --format JobID,User,State,AllocCPUS,Elapsed,NNodes,NTasks,TotalCPU,REQMEM,MaxRSS,ExitCode   # specific job in an array

For instance:

scontrol show jobid -dd 1010910         # all jobs in 1010910
scontrol show jobid -dd 1010910_4       # fourth job in the array 1010910
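
The sacct command works the same way; a shortened format list is used here for readability:

sacct -j 1010910 --format JobID,State,Elapsed,ExitCode      # all jobs in 1010910
sacct -j 1010910_4 --format JobID,State,Elapsed,ExitCode    # fourth job in the array 1010910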

Rerunning failed jobs

If jobs in the job array fail due to a system failure (e.g., a node crash), Slurm will automatically try to rerun them.
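
If you need to rerun a specific failed element by hand, scontrol requeue can be used, assuming requeuing is permitted on the cluster:

scontrol requeue 1010910_4    # requeue the fourth element of array 1010910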

See also

  • Job array support in Slurm: https://slurm.schedmd.com/job_array.html
  • Setting dependencies on job arrays (see Job chaining)