Difference between revisions of "Job submission"

From ScientificComputing

Revision as of 04:57, 22 January 2021

Basic job submission

A basic BSUB job submission command consists of three parts:

bsub [LSF options] job
  1. The bsub executable command
  2. LSF options requesting resources and defining job-related options
  3. A job to be submitted

Here is an example:

bsub -n 1 -W 4:00 -R "rusage[mem=4096]" "python myscript.py"

When the job is submitted, LSF prints the job's information:

$ bsub -n 1 -W 4:00 -R "rusage[mem=4096]" "python myscript.py"
Generic job.
Job <8146539> is submitted to queue <normal.4h>
  1. Job type, e.g., Generic Job or MPI Job
  2. Job ID, e.g., 8146539
  3. The queue, e.g., normal.4h
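
When submissions are scripted, it is often handy to keep the job ID for later use. The following is a minimal sketch, assuming bsub prints the confirmation line in the format shown above on standard output; the helper name parse_job_id is hypothetical and not part of LSF.

```shell
# Read bsub's output on standard input and print only the numeric job ID.
# parse_job_id is a hypothetical helper, not an LSF command.
parse_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted to .*$/\1/p'
}

# Demonstration with the output shown above (no job is submitted here).
printf 'Generic job.\nJob <8146539> is submitted to queue <normal.4h>\n' | parse_job_id
# prints: 8146539
```

In a real session the helper would be placed after the submission, for example: job_id=$(bsub -n 1 -W 4:00 "python myscript.py" | parse_job_id)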

Job

A job can be one of the following:

Job                                        Command               Example of job submission command
a single Linux command                     cmd
a program with its path                    /path/to/myprogram    bsub ./bin/hello
a command or program with its arguments    cmd arg1 arg2         bsub echo hello
multiple commands                          "cmd1 ; cmd2"         bsub "date; pwd; ls -l"
piped command                              "cmd1 | cmd2"
a command with I/O redirection, quoted     "cmd <in >out"        bsub "du -sk /scratch > du.out"
a here document, passed via "<<"           << EOF ... EOF
a shell script, passed via "<"             < script              bsub < hello.sh
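
The here-document form has no example in the table, so here is a small sketch of how it might look. The function name submit_here_doc, the resource values and the two commands are assumptions chosen for illustration. The submission is wrapped in a function so that nothing is sent to LSF until the function is called.

```shell
# Hypothetical wrapper: calling submit_here_doc submits a two-command job.
submit_here_doc() {
    # Quoting the delimiter ('EOF') stops the login shell from expanding
    # $HOSTNAME, so the text reaches bsub unchanged on standard input.
    bsub -n 1 -W 0:10 << 'EOF'
echo "Running on $HOSTNAME"
date
EOF
}
```

With an unquoted delimiter (<< EOF), variables would instead be expanded by the shell on the login node before bsub reads the text.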

LSF options

Requesting resources

Resources               Format                        Default values
Maximum run time        -W HH:MM                      04:00 (4 hours)
Number of processors    -n nprocs                     1 processor
Memory                  -R "rusage[mem=2048]"         1024 MB per core
Scratch space           -R "rusage[scratch=10000]"
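
Because memory is requested per core, the total memory a job reserves grows with the number of processors. The arithmetic can be sketched as follows; total_mem_mb is a hypothetical helper used only for illustration.

```shell
# Total memory reserved by a job = number of cores * memory per core (MB).
# total_mem_mb is a hypothetical helper, not an LSF command.
total_mem_mb() {
    echo $(( $1 * $2 ))
}

total_mem_mb 24 4000   # 24 cores at 4000 MB each: prints 96000
total_mem_mb 1 1024    # default of 1 core at 1024 MB: prints 1024
```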

Other LSF options

-o outfile          append job’s standard output to outfile
-e errfile          append job’s error messages to errfile
-R "rusage[...]"    advanced resource requirement (memory,...)
-J jobname          assign a jobname to the job
-w "depcond"        wait until dependency condition is satisfied
-Is                 submit an interactive job with pseudo-terminal
-B / -N             send an email when the job begins/ends
-u user@domain      use this address instead of username@ethz.ch

The LSF submission line advisor can help you find the LSF options you need.
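
The dependency condition passed to -w is a small expression. A common case is done(jobID), which holds once the given job has finished successfully, and several conditions can be combined with &&. The sketch below only builds such a string; the helper name dep_cond and the job IDs are assumptions for illustration.

```shell
# Compose a dependency condition for bsub -w from one or more job IDs.
# Each ID becomes done(ID); the parts are joined with && so that all
# listed jobs must have finished successfully.
# dep_cond is a hypothetical helper, not an LSF command.
dep_cond() (
    cond=""
    for id in "$@"; do
        cond="${cond:+${cond} && }done(${id})"
    done
    printf '%s\n' "$cond"
)

dep_cond 8146539            # prints: done(8146539)
dep_cond 8146539 8146540    # prints: done(8146539) && done(8146540)
```

The result could then be used as, for example: bsub -w "$(dep_cond 8146539)" -W 1:00 "python postprocess.py", where postprocess.py stands for any follow-up step.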

Job script and #BSUB pragmas

Create a job script called job_script.bsub

#!/bin/bash
#BSUB -n 24                     # 24 cores
#BSUB -W 8:00                   # 8-hour run-time
#BSUB -R "rusage[mem=4000]"     # 4000 MB per core
#BSUB -J analysis1
#BSUB -o analysis1.out
#BSUB -e analysis1.err
#BSUB -N

module load gcc/6.3.0 openmpi/3.0.2
cd /path/to/execution/folder
mpirun myprogram arg1

Submit the job script

bsub < job_script.bsub

Job array

Multiple similar jobs can be submitted at once using a so-called “job array”.

  • All jobs in an array share the same JobID
  • Use job index between brackets to distinguish between individual jobs in an array
  • LSF stores job index and array size in environment variables
  • Each job can have its own standard output
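
The points above can be sketched as a job script. The array name sweep, the range 1-8 and the input file naming scheme are assumptions for illustration. In the -o file name, %J is replaced by the job ID and %I by the array index, which gives each element its own output file.

```shell
#!/bin/bash
#BSUB -J "sweep[1-8]"          # hypothetical array name with 8 elements
#BSUB -W 1:00                  # 1-hour run-time per element
#BSUB -o sweep.%J.%I.out       # %J = job ID, %I = array index

# LSF sets LSB_JOBINDEX and LSB_JOBINDEX_END for each array element.
# The defaults of 1 let the script also be tried outside LSF.
index="${LSB_JOBINDEX:-1}"
last="${LSB_JOBINDEX_END:-1}"

# Assumed naming scheme: one input file per array element.
input_file="input_${index}.dat"

echo "Element ${index} of ${last} would process ${input_file}"
```

Saved as, say, sweep.bsub, the script would be submitted with bsub < sweep.bsub, in the same way as any other job script.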

Submit N jobs at once

bsub -J "array_name[1-N]" ./program

Monitor jobs

bjobs -J array_name             # all jobs in an array, selected by name
bjobs jobID                     # all jobs in an array, selected by job ID
bjobs -J "array_name[index]"    # specific job in an array, selected by name
bjobs "jobID[index]"            # specific job in an array, selected by job ID

Examples

[sfux@eu-login-03 ~]$ bsub -J "hello[1-8]"
bsub> echo "Hello, I am job $LSB_JOBINDEX of $LSB_JOBINDEX_END"
bsub> ctrl-D
Job array.
Job <29976045> is submitted to queue <normal.4h>.
[sfux@eu-login-03 ~]$ bjobs
JOBID      USER  STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
29976045   sfux  PEND  normal.4h  euler03                 hello[1]   Oct 10 11:03
29976045   sfux  PEND  normal.4h  euler03                 hello[2]   Oct 10 11:03
29976045   sfux  PEND  normal.4h  euler03                 hello[3]   Oct 10 11:03
29976045   sfux  PEND  normal.4h  euler03                 hello[4]   Oct 10 11:03
29976045   sfux  PEND  normal.4h  euler03                 hello[5]   Oct 10 11:03
29976045   sfux  PEND  normal.4h  euler03                 hello[6]   Oct 10 11:03
29976045   sfux  PEND  normal.4h  euler03                 hello[7]   Oct 10 11:03
29976045   sfux  PEND  normal.4h  euler03                 hello[8]   Oct 10 11:03
[leonhard@euler03 ~]$ bjobs -J hello[6]
JOBID      USER  STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
29976045   sfux  PEND  normal.4h  euler03                 hello[6]   Oct 10 11:03

Lightweight jobs

Lightweight jobs are jobs that do not consume much CPU time, for example:

  • Master process in some type of parallel jobs
  • File transfer program
  • Interactive shell

Example

Submit a 15-minute interactive bash shell and log out (type “logout” or “exit”) when you’re done.

[sfux@eu-login-03 ~]$ bsub -W 15 -Is -R light /bin/bash
Generic job.
Job <27877012> is submitted to queue <light.5d>.
<<Waiting for dispatch ...>>
<<Starting on eu-c7-133-05>>

[sfux@eu-c7-133-05 ~]$ pwd
/cluster/home/sfux
[sfux@eu-c7-133-05 ~]$ hostname
eu-c7-133-05
[sfux@eu-c7-133-05 ~]$ exit
exit
[sfux@eu-login-03 ~]$