Job monitoring


Revision as of 05:23, 22 January 2021

Check job status

bjobs

After submitting a job, the job will wait in a queue to be run on a compute node and has the PENDING status.

$ bjobs
JOBID      USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
161182423  jarunan PEND  normal.4h  eu-login-43             *cho hello Jan 22 06:01

When the job is running on a compute node, it has the RUNNING status.

$ bjobs
JOBID      USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
161182423  jarunan RUN   normal.4h  eu-login-43 eu-ms-005-0 *cho hello Jan 22 06:01
bjobs options   Description
(no option)     list all your jobs in all queues
-p              list only pending (waiting) jobs and indicate why they are pending
-r              list only running jobs
-d              list only done jobs (finished within the last hour)
-l              display status in long format
-w              display status in wide format
-o "format"     use custom output format (see LSF documentation for details)
-J jobname      show only job(s) called jobname
-q queue        show only jobs in a specific queue
job-ID(s)       list of job-IDs (this must be the last option)
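
As a rough illustration of how the default bjobs output shown above could be post-processed, the following shell sketch tallies jobs by the STAT column. The function name count_by_status is hypothetical (it is not part of LSF), and the sketch assumes the default layout: one header line, then one line per job with STAT in the third column.

```shell
#!/bin/sh
# Hypothetical helper (not an LSF command): count jobs per status.
# Assumption: default bjobs layout, i.e. a header line followed by
# one line per job, with the STAT value in the third column.
count_by_status() {
    awk 'NR > 1 && NF > 0 { count[$3]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Intended use on a cluster where bjobs is available:
#   bjobs | count_by_status
#   bjobs -q normal.4h | count_by_status
```

Reading from standard input keeps the helper independent of LSF itself, so it can be tried on saved output before being used on a live queue.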


bbjobs

bbjobs displays job information in a more human-readable format than bjobs. The examples below show the same job in PENDING and then in RUNNING status.

PENDING status

$ bbjobs
Job information
  Job ID                       : 161182479
  Status                       : PENDING
  User                         : jarunanp
  Queue                        : normal.4h
  Command                      : sleep 10; echo hello
  Working directory            : $HOME/-
Requested resources
  Requested cores              : 1
  Requested runtime            : 4 h 0 min
  Requested memory             : 1024 MB per core
  Requested scratch            : not specified
  Dependency                   : -
Job history
  Submitted at                 : 06:03 2021-01-22
  Queue wait time              : 18 sec

RUNNING status

$ bbjobs
Job information
  Job ID                        : 161182479
  Status                        : RUNNING
  Running on node               : eu-ms-025-27 
  User                          : jarunanp
  Queue                         : normal.4h
  Command                       : sleep 10; echo hello
  Working directory             : $HOME/-
Requested resources
  Requested cores               : 1
  Requested runtime             : 4 h 0 min
  Requested memory              : 1024 MB per core
  Requested scratch             : not specified
  Dependency                    : -
Job history
  Submitted at                  : 06:03 2021-01-22
  Started at                    : 06:03 2021-01-22
  Queue wait time               : 20 sec
Resource usage
  Updated at                    : 06:04 2021-01-22
  Wall-clock                    : 4 sec
  Tasks                         : 4
  Total CPU time                : 0 sec
  CPU utilization               : 0.0 %
  Sys/Kernel time               : 0.0 %
  Total resident Memory         : 2 MB
  Resident memory utilization   : 0.2 % 
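
Because bbjobs prints one "label : value" pair per line, a single field can be picked out with a small filter. The sketch below is only an illustration under that assumption. bbjobs_field is a hypothetical helper name, and it returns the first match only, so it is best suited to output that describes a single job.

```shell
#!/bin/sh
# Hypothetical helper (not part of bbjobs): print the value of one field.
# Assumption: each field is printed as "  <label>   : <value>".
# $1 is the label exactly as printed, e.g. "Status" or "Submitted at".
bbjobs_field() {
    awk -v key="$1" '
        {
            pos = index($0, ":")          # split on the first colon only
            if (pos == 0) next            # section headings have no colon
            label = substr($0, 1, pos - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", label)
            if (label == key) {
                value = substr($0, pos + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
                exit                      # first match only
            }
        }'
}

# Intended use on a cluster where bbjobs is available:
#   bbjobs | bbjobs_field "Status"
#   bbjobs | bbjobs_field "Resident memory utilization"
```

Splitting on the first colon rather than on every colon keeps time stamps such as "06:03 2021-01-22" intact.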


Job control commands   Description
busers                 user limits, number of pending and running jobs
bqueues                queues status (open/closed; active/inactive)
bjobs                  more or less detailed information about pending and running jobs, and recently finished jobs
bbjobs                 better bjobs
bhist                  info about jobs finished in the last hours/days
bpeek                  display the standard output of a given job
lsf_load               show the CPU load of all nodes used by a job
bjob_connect           log in to a node where your job is running
bkill                  kill a job


bkill

bkill options    Description
job-ID           kill job-ID
0                kill all jobs (yours only)
-J jobname       kill most recent job called jobname
-J jobname 0     kill all jobs called jobname
-q queue         kill most recent job in queue
-q queue 0       kill all jobs in queue
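
Since bkill acts immediately, it can help to preview which jobs would be affected before killing anything. The following sketch is a dry run under stated assumptions: pending_kill_commands is a hypothetical helper, and it expects the default bjobs layout (header line, job ID in column 1, STAT in column 3). It only prints the bkill commands for pending jobs and does not execute them.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not an LSF command): for every pending
# job in bjobs output, print the bkill command that would remove it.
# Assumption: default bjobs layout with JOBID in column 1, STAT in column 3.
pending_kill_commands() {
    awk 'NR > 1 && $3 == "PEND" { print "bkill", $1 }'
}

# Intended use on a cluster where bjobs is available:
#   bjobs | pending_kill_commands        # review the list first
```

Once the printed commands look right, they could be run by piping the output into sh. Printing first and executing second is a deliberate choice here, so that a parsing mistake cannot kill the wrong job.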