Difference between revisions of "Job submission"

 
* [[Job chaining]]
 
* [[Using the batch system|The complete guide: Using the batch system]]
 
== Job array ==
 
Multiple similar jobs can be submitted at once using a so-called “job array”:

* All jobs in an array share the same JobID

* Use the job index in square brackets to distinguish individual jobs in an array

* LSF stores the job index and the array size in the environment variables LSB_JOBINDEX and LSB_JOBINDEX_END

* Each job can have its own standard output
 
 
Submit N jobs at once:

bsub -J "array_name[1-N]" ./program
 
 
Monitor jobs:

bjobs -J array_name             # all jobs in an array, selected by name

bjobs jobID                     # all jobs in an array, selected by job ID

bjobs -J "array_name[index]"    # specific job in an array, selected by name

bjobs "jobID[index]"            # specific job in an array, selected by job ID
 
 
=== Examples ===
 
 
[sfux@eu-login-03 ~]$ bsub -J "hello[1-8]"
 
bsub> echo "Hello, I am job $LSB_JOBINDEX of $LSB_JOBINDEX_END"
 
bsub> ctrl-D
 
Job array.
 
Job <29976045> is submitted to queue <normal.4h>.
 
 
[sfux@eu-login-03 ~]$ bjobs
 
JOBID     USER  STAT  QUEUE      FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME

29976045  sfux  PEND  normal.4h  euler03               hello[1]  Oct 10 11:03

29976045  sfux  PEND  normal.4h  euler03               hello[2]  Oct 10 11:03

29976045  sfux  PEND  normal.4h  euler03               hello[3]  Oct 10 11:03

29976045  sfux  PEND  normal.4h  euler03               hello[4]  Oct 10 11:03

29976045  sfux  PEND  normal.4h  euler03               hello[5]  Oct 10 11:03

29976045  sfux  PEND  normal.4h  euler03               hello[6]  Oct 10 11:03

29976045  sfux  PEND  normal.4h  euler03               hello[7]  Oct 10 11:03

29976045  sfux  PEND  normal.4h  euler03               hello[8]  Oct 10 11:03
 
 
[sfux@eu-login-03 ~]$ bjobs -J "hello[6]"
 
JOBID     USER  STAT  QUEUE      FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME

29976045  sfux  PEND  normal.4h  euler03               hello[6]  Oct 10 11:03
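
Inside each job of an array, the index stored by LSF can be used to pick that job's own input. The following is only a minimal sketch: the helper pick_input and the file naming scheme input_<index>.dat are hypothetical and not part of LSF, and the fallback index of 1 is there solely so that the script also runs outside the batch system.

```shell
#!/bin/sh
# Sketch of a script executed by every job of a job array.
# LSB_JOBINDEX is set by LSF inside an array job; the default of 1 is an
# assumption that only serves to make the script runnable outside LSF.

pick_input() {
    # Hypothetical naming scheme: input_<index>.dat
    printf 'input_%s.dat\n' "$1"
}

index="${LSB_JOBINDEX:-1}"
echo "Job index ${index} would process $(pick_input "${index}")"
```

Such a script could be passed to bsub -J "array_name[1-N]" like any other job. To give each job its own standard output, LSF expands %J (job ID) and %I (job index) in the file name given to -o, for example -o "hello.%J.%I.out" (the file name itself is just an illustration).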
 
 
== Lightweight jobs ==
 
Lightweight jobs are jobs that do not consume much CPU time, for example:
 
* Master process in some type of parallel jobs
 
* File transfer program
 
* Interactive shell
 
 
=== Example ===
 
Submit a 15-minute interactive bash shell and log out (type “logout” or “exit”) when you’re done.
 
[sfux@eu-login-03 ~]$ bsub -W 15 -Is -R light /bin/bash
 
Generic job.
 
Job <27877012> is submitted to queue <light.5d>.
 
<<Waiting for dispatch ...>>
 
<<Starting on eu-c7-133-05>>
 
 
[sfux@eu-c7-133-05 ~]$ pwd
 
/cluster/home/sfux
 
[sfux@eu-c7-133-05 ~]$ hostname
 
eu-c7-133-05
 
[sfux@eu-c7-133-05 ~]$ exit
 
exit
 
[sfux@eu-login-03 ~]$
 

Revision as of 10:51, 22 January 2021

Basic job submission

A basic bsub job submission command consists of three parts:

bsub [LSF options] job
  1. The bsub command itself
  2. LSF options that request resources and define job-related settings
  3. The job to be submitted

Here is an example:

bsub -n 1 -W 4:00 -R "rusage[mem=4096]" "python myscript.py"

When the job is submitted, LSF prints the job's information:

$ bsub -n 1 -W 4:00 -R "rusage[mem=4096]" "python myscript.py"
Generic job.
Job <8146539> is submitted to queue <normal.4h>
  1. Job type, e.g., Generic Job or MPI Job
  2. Job ID, e.g., 8146539
  3. The queue, e.g., normal.4h
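
When bsub is called from a script, the job ID often has to be picked out of this confirmation message. A minimal sketch, assuming the message has exactly the format shown above; parse_job_id is a hypothetical helper name, and the sample text is copied from the example so that no LSF installation is needed to try it:

```shell
#!/bin/sh
# Extract the job ID from the confirmation text printed by bsub.
# Reads the text on standard input and prints only the number between < and >.
parse_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted to queue <.*/\1/p'
}

# Sample output copied from the example above.
sample='Generic job.
Job <8146539> is submitted to queue <normal.4h>'

echo "${sample}" | parse_job_id
```

On a cluster the same function could be fed directly from a submission, for example bsub ... | parse_job_id.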

Job

A job can be one of the following:

Job                                       Command               Example of job submission command
a single Linux command                    cmd
a program with its path                   /path/to/myprogram    bsub ./bin/hello
a command or program with its arguments   cmd arg1 arg2         bsub echo hello
multiple commands (quoted)                "cmd1 ; cmd2"         bsub "date; pwd; ls -l"
piped commands (quoted)                   "cmd1 | cmd2"
a command with I/O redirection (quoted)   "cmd < in > out"      bsub "du -sk /scratch > du.out"
a here document, passed via "<<"          << EOF ... EOF
a shell script, passed via "<"            < script              bsub < hello.sh

LSF options

Requesting resources

Resources              Format                       Default value
Maximum run time       -W HH:MM                     04:00 (4 hours)
Number of processors   -n nprocs                    1 processor
Memory                 -R "rusage[mem=2048]"        1024 MB per core
Scratch space          -R "rusage[scratch=10000]"
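
The run-time limit appears on this page both as HH:MM (for example -W 4:00) and as plain minutes (for example -W 15). The sketch below converts either form to minutes, which can be handy when comparing limits in a script. walltime_minutes is a hypothetical helper, not an LSF command, and it assumes the value contains at most one colon.

```shell
#!/bin/sh
# Convert a -W run-time limit ("HH:MM" or plain minutes) to minutes.
walltime_minutes() {
    case "$1" in
        *:*) h=${1%%:*}; m=${1##*:} ;;   # split "HH:MM" at the colon
        *)   h=0;        m=$1 ;;         # plain minutes
    esac
    # Drop one leading zero so that values such as "08" are not read as octal.
    h=${h#0}; m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

walltime_minutes 4:00    # the default limit of 4 hours
walltime_minutes 15      # plain minutes, as in "bsub -W 15"
```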

Other LSF options

-o outfile          append job’s standard output to outfile
-e errfile          append job’s error messages to errfile
-R "rusage[...]"    advanced resource requirement (memory,...)
-J jobname          assign a jobname to the job
-w "depcond"        wait until dependency condition is satisfied
-Is                 submit an interactive job with pseudo-terminal
-B / -N             send an email when the job begins/ends
-u user@domain      use this address instead of username@ethz.ch

The LSF submission line advisor can help you find the LSF options you need.

Job script and #BSUB pragmas

Create a job script called job_script.bsub

#!/bin/bash
#BSUB -n 24                     # 24 cores
#BSUB -W 8:00                   # 8-hour run-time
#BSUB -R "rusage[mem=4000]"     # 4000 MB per core
#BSUB -J analysis1
#BSUB -o analysis1.out
#BSUB -e analysis1.err
#BSUB -N

module load gcc/6.3.0 openmpi/3.0.2
cd /path/to/execution/folder
mpirun myprogram arg1

Submit a job

bsub < job_script.bsub
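
Because the memory request in rusage[mem=...] is given per core, the total memory reserved by a job is the per-core value multiplied by the number of cores. A small sketch of that arithmetic for the script above; total_mem_mb is a hypothetical helper used only for illustration:

```shell
#!/bin/sh
# Total memory reserved by a job: cores x memory per core (in MB).
# Arguments: $1 = number of cores (-n), $2 = MB per core (rusage[mem=...]).
total_mem_mb() {
    echo $(( $1 * $2 ))
}

# Values taken from job_script.bsub: 24 cores, 4000 MB per core.
total_mem_mb 24 4000
```

For the example script this gives 96000 MB in total, which is worth checking against the memory available on the nodes before submitting.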

Interactive session on a compute node

To run a quick test or a benchmark, you can request an interactive session on a compute node with one of the bsub options -I, -Ip, or -Is, for example:

[jarunanp@eu-login-38 ~]$ bsub -n 4 -W 01:00 -Is bash
Generic job.
Job <161197292> is submitted to queue <normal.4h>.
<<Waiting for dispatch ...>>
<<Starting on eu-ms-001-15>>
[jarunanp@eu-ms-001-15 ~]$

Further reading