Difference between revisions of "Job management with LSF"

From ScientificComputing
Line 10: Line 10:
  
 
<div style="width: 60%; background: #B2D9EA; height: 35px; border-radius: 10px; padding: 5px; margin:5px">
 
<div style="width: 60%; background: #B2D9EA; height: 35px; border-radius: 10px; padding: 5px; margin:5px">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[[Job submission|1. Submit a job]]
+
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[[Job submission|'''1. Submit a job''']]
 
</div>
 
</div>
  
 
<div style="width: 60%; background:#B2D9EA; height: 35px; border-radius: 10px; padding: 5px; margin:5px">
 
<div style="width: 60%; background:#B2D9EA; height: 35px; border-radius: 10px; padding: 5px; margin:5px">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[[Parallel job submission|2. Submit a parallel job]]
+
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[[Parallel job submission|'''2. Submit a parallel job''']]
 
</div>
 
</div>
  
 
<div style="width: 60%; background: #B2D9EA; height: 35px; border-radius: 10px; padding: 5px; margin:5px">
 
<div style="width: 60%; background: #B2D9EA; height: 35px; border-radius: 10px; padding: 5px; margin:5px">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[[Job monitoring|3. Monitor a job]]
+
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[[Job monitoring|'''3. Monitor a job''']]
 
</div>
 
</div>
  
 
<div style="width: 60%; background: #B2D9EA; height: 35px; border-radius: 10px; padding: 5px; margin:5px">
 
<div style="width: 60%; background: #B2D9EA; height: 35px; border-radius: 10px; padding: 5px; margin:5px">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[[Job output | 4. Job output]]
+
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[[Job output |'''4. Job output''']]
 
</div>
 
</div>
  
Line 35: Line 35:
 
</table>
 
</table>
  
== Quick examples ==
+
== Why should I use the LSF batch system? ==
=== Submit a job with a command line ===
+
Users can access the computing resources on the cluster solely through the batch system. On the ETH HPC clusters, we use the LSF batch system to manage computing jobs.
$ env2lmod
 
$ module load gcc/6.3.0 openmpi/4.0.2 python/3.8.5
 
$ bsub -n 4 -W 4:00 -R "rusage[mem=2048]" "python myscript.py"
 
  
=== Submit with a job script ===
+
== What are the steps to use the compute nodes on the cluster? ==
Create a job script called job_script.bsub
+
* Log in to a login node on the cluster
#!/bin/bash
+
* Transfer your data to the cluster
#BSUB -n 4                    # 4 cores
+
* Load necessary modules
#BSUB -W 4:00                  # 4-hour run-time
+
* Prepare a bsub command with LSF options that request the computing resources you need
#BSUB -R "rusage[mem=2048]"    # 2048 MB per core
+
* Submit the job with a bsub command line or a job script
+
* Wait for your job to run
source /cluster/apps/local/env2lmod.sh
+
* Your job runs on the compute nodes
module load gcc/6.3.0 openmpi/4.0.2 python/3.8.5
+
* Get your job results and output
python myscript.py
 
 
 
Submit the script
 
$ bsub < job_script.bsub
 
 
 
=== Monitor submitted job ===
 
Check the status of your submitted job
 
$ bjobs
 

Revision as of 12:33, 1 February 2021


     1. Submit a job

     2. Submit a parallel job

     3. Monitor a job

     4. Job output

Why should I use the LSF batch system?

Users can access the computing resources on the cluster solely through the batch system. On the ETH HPC clusters, we use the LSF batch system to manage computing jobs.

What are the steps to use the compute nodes on the cluster?

  • Log in to a login node on the cluster
  • Transfer your data to the cluster
  • Load the necessary modules
  • Prepare a bsub command with LSF options that request the computing resources you need
  • Submit the job with a bsub command line or a job script
  • Wait for your job to run
  • Your job runs on the compute nodes
  • Get your job results and output
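
The submission steps above can be sketched as a short shell session. This is a minimal sketch, not the definitive workflow: it reuses the module versions, resource requests and file names (job_script.bsub, myscript.py) from the quick examples in the previous revision of this page, and those values are placeholders to adapt to your own job. The check for bsub is an added assumption so that the sketch does nothing harmful when run outside the cluster.

```shell
#!/bin/bash
# Sketch only: resource values, module versions and file names are
# taken from the earlier quick examples and should be adapted.

# Write the job script; the #BSUB lines carry the LSF options.
cat > job_script.bsub <<'EOF'
#!/bin/bash
#BSUB -n 4                     # 4 cores
#BSUB -W 4:00                  # 4-hour run-time
#BSUB -R "rusage[mem=2048]"    # 2048 MB per core
source /cluster/apps/local/env2lmod.sh
module load gcc/6.3.0 openmpi/4.0.2 python/3.8.5
python myscript.py
EOF

# Submit and monitor only where LSF is available (a cluster login node).
if command -v bsub >/dev/null 2>&1; then
    bsub < job_script.bsub     # submit the job script to the batch system
    bjobs                      # check the status of your submitted jobs
else
    echo "bsub not found: run this on a cluster login node"
fi
```

Keeping the LSF options in the job script rather than on the command line records the requested resources alongside the commands that use them, which makes a run easier to repeat.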