Difference between revisions of "Job monitoring"

Revision as of 05:08, 22 January 2021

Check job status

bjobs

After a job is submitted, it waits in a queue with the status PEND (pending).

$ bjobs
JOBID      USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
161182423  jarunan PEND  normal.4h  eu-login-43             *cho hello Jan 22 06:01

Once the job starts running on a compute node, its status changes to RUN (running) and EXEC_HOST shows the node.

$ bjobs
JOBID      USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
161182423  jarunan RUN   normal.4h  eu-login-43 eu-ms-005-0 *cho hello Jan 22 06:01
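
The status shown above can also be picked out of the listing in a script. The following is a minimal sketch, assuming the default bjobs column layout shown above (job ID in column 1, STAT in column 3); job_status is a hypothetical helper name, not an LSF command.

```shell
# Print the STAT column for one job ID, reading default `bjobs` output on stdin.
# job_status is a hypothetical helper, not part of LSF.
job_status() {
    # NR > 1 skips the header line; $1 is JOBID, $3 is STAT
    awk -v id="$1" 'NR > 1 && $1 == id { print $3 }'
}

# Possible use on a system where bjobs is available:
#   bjobs | job_status 161182423
```

If the job ID does not appear in the listing, the helper prints nothing.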

bjobs options   Description
(no option)     list all your jobs in all queues
-p              list only pending (waiting) jobs and indicate why they are pending
-r              list only running jobs
-d              list only done jobs (finished within the last hour)
-l              display status in long format
-w              display status in wide format
-o "format"     use custom output format (see LSF documentation for details)
-J jobname      show only job(s) called jobname
-q queue        show only jobs in a specific queue
job-ID(s)       list of job-IDs (this must be the last option)
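
As a rough illustration of how the listing can be summarised, the sketch below counts jobs per status. It assumes the STAT value is the third column, as in the example output on this page; count_by_status is a hypothetical helper name, not an LSF command.

```shell
# Count jobs per status from `bjobs` output read on stdin.
# count_by_status is a hypothetical helper, not part of LSF.
count_by_status() {
    # Skip the header line, tally column 3 (STAT), then sort for stable output
    awk 'NR > 1 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}

# Possible uses with options from the table:
#   bjobs | count_by_status              # all your jobs in all queues
#   bjobs -q normal.4h | count_by_status # only jobs in one queue
```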

Job control commands   Description
busers                 user limits, number of pending and running jobs
bqueues                queue status (open/closed; active/inactive)
bjobs                  more or less detailed information about pending and running jobs, and recently finished jobs
bbjobs                 better bjobs
bhist                  info about jobs finished in the last hours/days
bpeek                  display the standard output of a given job
lsf_load               show the CPU load of all nodes used by a job
bjob_connect           log in to a node where your job is running
bkill                  kill a job


bbjobs

bkill

bkill options    Description
job-ID           kill job-ID
0                kill all jobs (yours only)
-J jobname       kill most recent job called jobname
-J jobname 0     kill all jobs called jobname
-q queue         kill most recent job in queue
-q queue 0       kill all jobs in queue
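
Because a trailing 0 changes the meaning from "most recent job" to "all jobs", it can help to preview the command before running it. The sketch below only prints the bkill command line for a job name and never kills anything; bkill_by_name_cmd is a hypothetical helper name, and the job name is assumed to contain no spaces.

```shell
# Print (do not run) the bkill command for a job name.
# Usage: bkill_by_name_cmd JOBNAME [all]
# bkill_by_name_cmd is a hypothetical helper, not part of LSF.
bkill_by_name_cmd() {
    if [ "${2:-}" = "all" ]; then
        echo "bkill -J $1 0"   # trailing 0: all jobs called JOBNAME
    else
        echo "bkill -J $1"     # no 0: only the most recent job called JOBNAME
    fi
}

# Example: review the command first, then run it by hand if it looks right
#   bkill_by_name_cmd myjob all
```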