Job submission
Basic job submission
A basic BSUB job submission command consists of three parts:
bsub | LSF options | job |
- The BSUB executable command
- LSF options requesting resources and defining job-related options
- A job to be submitted
Here is an example:
bsub | -n 1 -W 4:00 -R "rusage[mem=4096]" | "python myscript.py" |
When the job is submitted, LSF prints the job's information:
$ bsub -n 1 -W 4:00 -R "rusage[mem=4096]" "python myscript.py"
Generic job. Job <8146539> is submitted to queue <normal.4h>
- Job type, e.g., Generic Job or MPI Job
- Job ID, e.g., 8146539
- The queue, e.g., normal.4h
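When a submission is part of a larger script, it can be handy to keep the job ID from the confirmation line. The following is a minimal sketch: the helper name `parse_job_id` is hypothetical, and it assumes the confirmation line has the format shown above.

```shell
#!/bin/bash
# Extract the numeric job ID from the confirmation line printed by bsub.
# parse_job_id is a hypothetical helper, not part of LSF.
parse_job_id() {
    printf '%s\n' "$1" | sed -n 's/.*Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Sample line copied from the example above. On the cluster you would use:
#   confirmation=$(bsub -n 1 -W 4:00 -R "rusage[mem=4096]" "python myscript.py")
confirmation='Generic job. Job <8146539> is submitted to queue <normal.4h>'
job_id=$(parse_job_id "$confirmation")
echo "Submitted job $job_id"
```

The extracted ID can then be passed to other LSF commands, for example to monitor the job.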
Job
A job can be one of the following:
Job | Command | Examples of job submission command |
---|---|---|
a single Linux command | cmd | |
a program with its path | /path/to/myprogram | bsub ./bin/hello |
a command or program with its arguments | cmd arg1 arg2 | bsub echo hello |
multiple commands | "cmd1 ; cmd2" | bsub "date; pwd; ls -l" |
piped command | "cmd1 | cmd2" | |
a command with I/O redirection (quoted) | "cmd < in > out" | bsub "du -sk /scratch > du.out" |
a here document, passed via "<<" | << EOF ... EOF | |
a shell script, passed via "<" | < script | bsub < hello.sh |
LSF options
Requesting resources
Resources | Format | Default values |
---|---|---|
Maximum run time | -W HH:MM | 04:00 (4 hours) |
Number of processors | -n nprocs | 1 processor |
Memory | -R "rusage[mem=2048]" | 1024 MB per core |
Scratch space | -R "rusage[scratch=10000]" | |
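Because memory is requested per core, the total for a job is the number of processors times the per-core value. The sketch below assembles a submission line and computes that total; the helper names `bsub_line` and `total_mem_mb` are hypothetical, and the command is only printed, not run.

```shell
#!/bin/bash
# Print a bsub command line for the given cores, hours and MB per core.
# bsub_line and total_mem_mb are hypothetical helpers for illustration.
bsub_line() {
    cores=$1
    hours=$2
    mem_per_core=$3
    echo "bsub -n ${cores} -W ${hours}:00 -R \"rusage[mem=${mem_per_core}]\""
}

# Total memory requested by the job in MB: cores x MB per core.
total_mem_mb() {
    echo $(( $1 * $2 ))
}

bsub_line 4 8 2048
echo "Total memory requested: $(total_mem_mb 4 2048) MB"
```

For 4 cores at 2048 MB per core the job requests 8192 MB in total.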
Other LSF options
-o outfile | append job’s standard output to outfile |
-e errfile | append job’s error messages to errfile |
-R "rusage[...]" | advanced resource requirement (memory,...) |
-J jobname | assign a jobname to the job |
-w "depcond" | wait until dependency condition is satisfied |
-Is | submit an interactive job with pseudo-terminal |
-B / -N | send an email when the job begins/ends |
-u user@domain | use this address instead of username@ethz.ch |
The LSF submission line advisor can help you find the LSF options you need.
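Several of these options are often combined, for instance to name a job, separate its output and error files, and make a second job wait for the first. The sketch below prints two such command lines for review. The job names `prepare` and `analyse`, the script names, and the helper `named_job_opts` are hypothetical; `done(...)` is an LSF dependency condition that is satisfied when the named job has finished successfully.

```shell
#!/bin/bash
# Build the options for a named job with its own output and error files.
# If a second argument is given, the job waits for that job to finish successfully.
named_job_opts() {
    name=$1
    after=$2
    opts="-J $name -o $name.out -e $name.err"
    if [ -n "$after" ]; then
        opts="$opts -w \"done($after)\""
    fi
    printf '%s\n' "$opts"
}

# Print the two submission lines instead of running them.
echo "bsub $(named_job_opts prepare) ./prepare.sh"
echo "bsub $(named_job_opts analyse prepare) ./analyse.sh"
```

Printing the lines first makes it easy to check the quoting of the dependency condition before submitting anything.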
Job script and #BSUB pragmas
Create a job script called job_script.bsub
#!/bin/bash
#BSUB -n 24                      # 24 cores
#BSUB -W 8:00                    # 8-hour run-time
#BSUB -R "rusage[mem=4000]"      # 4000 MB per core
#BSUB -J analysis1
#BSUB -o analysis1.out
#BSUB -e analysis1.err
#BSUB -N
module load gcc/6.3.0 openmpi/3.0.2
cd /path/to/execution/folder
mpirun myprogram arg1
Submit a job
bsub < job_script.bsub
Further reading
Job array
Multiple similar jobs can be submitted at once using a so-called “job array”
- All jobs in an array share the same JobID
- Use job index between brackets to distinguish between individual jobs in an array
- LSF stores job index and array size in environment variables
- Each job can have its own standard output
Submit N jobs at once
bsub -J "array_name[1-N]" ./program
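Each element of the array typically uses its index to select its own piece of work. The following is a sketch of such a worker: the function name `describe_element` and the `input_<index>.dat` naming scheme are assumptions for illustration. The defaults of 1 let the script run outside LSF for testing.

```shell
#!/bin/bash
# Hypothetical worker for one array element.
describe_element() {
    index=${LSB_JOBINDEX:-1}       # set by LSF for each array element
    last=${LSB_JOBINDEX_END:-1}    # largest index in the array
    echo "element $index of $last reads input_${index}.dat"
}
describe_element

# Submission sketch, assuming the worker is saved as ./worker.sh:
#   bsub -J "chunks[1-8]" -o "chunks.%I.out" ./worker.sh
# %I in the -o file name is replaced by the array index,
# so each element writes its own output file.
```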
Monitor jobs
bjobs -J array_name            # all jobs in an array
bjobs jobID                    # all jobs in an array
bjobs -J "array_name[index]"   # specific job in an array
bjobs "jobID[index]"           # specific job in an array
Examples
[sfux@eu-login-03 ~]$ bsub -J "hello[1-8]"
bsub> echo "Hello, I am job $LSB_JOBINDEX of $LSB_JOBINDEX_END"
bsub> ctrl-D
Job array. Job <29976045> is submitted to queue <normal.4h>.
[sfux@eu-login-03 ~]$ bjobs
JOBID     USER  STAT  QUEUE      FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME
29976045  sfux  PEND  normal.4h  euler03               hello[1]  Oct 10 11:03
29976045  sfux  PEND  normal.4h  euler03               hello[2]  Oct 10 11:03
29976045  sfux  PEND  normal.4h  euler03               hello[3]  Oct 10 11:03
29976045  sfux  PEND  normal.4h  euler03               hello[4]  Oct 10 11:03
29976045  sfux  PEND  normal.4h  euler03               hello[5]  Oct 10 11:03
29976045  sfux  PEND  normal.4h  euler03               hello[6]  Oct 10 11:03
29976045  sfux  PEND  normal.4h  euler03               hello[7]  Oct 10 11:03
29976045  sfux  PEND  normal.4h  euler03               hello[8]  Oct 10 11:03
[leonhard@euler03 ~]$ bjobs -J hello[6]
JOBID     USER  STAT  QUEUE      FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME
29976045  sfux  PEND  normal.4h  euler03               hello[6]  Oct 10 11:03
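For larger arrays, a per-state summary is easier to read than the full listing. The sketch below counts array elements by the STAT column of `bjobs` output read from standard input. The helper name `count_states` is hypothetical, and the sample rows follow the column layout of the listing above.

```shell
#!/bin/bash
# Count jobs per state (third column), skipping the header line.
count_states() {
    awk 'NR > 1 { n[$3]++ } END { for (s in n) print s, n[s] }'
}

# Sample rows in the bjobs format. On the cluster use:
#   bjobs -J hello | count_states
count_states << 'EOF'
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
29976045 sfux PEND normal.4h euler03 hello[1] Oct 10 11:03
29976045 sfux PEND normal.4h euler03 hello[2] Oct 10 11:03
EOF
```

The state is always the third column, even when EXEC_HOST is empty for pending jobs, so the count does not depend on the remaining columns.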
Lightweight jobs
Lightweight jobs are jobs that do not consume much CPU time, for example:
- Master process in some type of parallel jobs
- File transfer program
- Interactive shell
Example
Submit a 15-minute interactive bash shell and log out (type “logout” or “exit”) when you’re done.
[sfux@eu-login-03 ~]$ bsub -W 15 -Is -R light /bin/bash
Generic job. Job <27877012> is submitted to queue <light.5d>.
<<Waiting for dispatch ...>>
<<Starting on eu-c7-133-05>>
[sfux@eu-c7-133-05 ~]$ pwd
/cluster/home/sfux
[sfux@eu-c7-133-05 ~]$ hostname
eu-c7-133-05
[sfux@eu-c7-133-05 ~]$ exit
exit
[sfux@eu-login-03 ~]$