Difference between revisions of "Job monitoring"
From ScientificComputing
Revision as of 07:08, 22 January 2021
Check job status
bjobs
After a job is submitted, it is dispatched to a queue and has the PENDING status (shown as PEND).
    $ bjobs
    JOBID      USER     STAT  QUEUE      FROM_HOST    EXEC_HOST    JOB_NAME    SUBMIT_TIME
    161182423  jarunan  PEND  normal.4h  eu-login-43               *cho hello  Jan 22 06:01
Once the job starts running on a compute node, it has the RUNNING status (shown as RUN).
    $ bjobs
    JOBID      USER     STAT  QUEUE      FROM_HOST    EXEC_HOST    JOB_NAME    SUBMIT_TIME
    161182423  jarunan  RUN   normal.4h  eu-login-43  eu-ms-005-0  *cho hello  Jan 22 06:01
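In a script it is often enough to read just the STAT column for one job. The following is a minimal sketch, assuming the default column order of the listings above (JOBID first, STAT third). The helper name `job_status` is made up for this example and is not an LSF command, and the sample text stands in for real `bjobs` output so that the snippet also runs on a machine without LSF.

```shell
#!/bin/sh
# Print the STAT column for one job ID from bjobs-style output on stdin.
# Assumption: default bjobs layout, i.e. JOBID in column 1 and STAT in column 3.
job_status() {
    awk -v id="$1" '$1 == id { print $3 }'
}

# Sample output copied from the listing above. On the cluster this would be:
#   bjobs | job_status 161182423
sample='JOBID      USER     STAT  QUEUE      FROM_HOST    EXEC_HOST    JOB_NAME    SUBMIT_TIME
161182423  jarunan  RUN   normal.4h  eu-login-43  eu-ms-005-0  *cho hello  Jan 22 06:01'

printf '%s\n' "$sample" | job_status 161182423   # prints: RUN
```

If the job ID does not appear in the output, the helper prints nothing.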
bjobs options | Description |
---|---|
(no option) | list all your jobs in all queues |
-p | list only pending (waiting) jobs and indicate why they are pending |
-r | list only running jobs |
-d | list only done jobs (finished within the last hour) |
-l | display status in long format |
-w | display status in wide format |
-o "format" | use custom output format (see LSF documentation for details) |
-J jobname | show only job(s) called jobname |
-q queue | show only jobs in a specific queue |
job-ID(s) | list of job-IDs (this must be the last option) |
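The options in the table can be combined on one command line. The sketch below shows a few combinations. By default it only prints the commands, so it can be tried without LSF installed; setting `DRY_RUN=0` would run them for real. The names `DRY_RUN`, `run` and the job name `myjob` are illustrative assumptions, not part of LSF.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): print the command instead of
# executing it unless DRY_RUN is set to 0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run bjobs -p                 # pending jobs and the reason they are pending
run bjobs -r -q normal.4h    # running jobs in the normal.4h queue only
run bjobs -J myjob           # jobs called myjob (hypothetical job name)
run bjobs -l 161182423       # long format for one job; the job ID comes last
```

Keeping the job ID at the end of the line follows the rule in the last row of the table.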
Job control commands | Description |
---|---|
busers | user limits, number of pending and running jobs |
bqueues | queue status (open/closed; active/inactive) |
bjobs | more or less detailed information about pending and running jobs, and recently finished jobs |
bbjobs | better bjobs |
bhist | info about jobs finished in the last hours/days |
bpeek | display the standard output of a given job |
lsf_load | show the CPU load of all nodes used by a job |
bjob_connect | log in to a node where your job is running |
bkill | kill a job |
bbjobs
bkill
bkill options | Description |
---|---|
job-ID | kill job-ID |
0 | kill all jobs (yours only) |
-J jobname | kill most recent job called jobname |
-J jobname 0 | kill all jobs called jobname |
-q queue | kill most recent job in queue |
-q queue 0 | kill all jobs in queue |
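Because `bkill` cannot be undone, it can help to preview which jobs a command would affect. The following is a minimal sketch, assuming the default `bjobs` column order (JOBID first, STAT third). It prints one `bkill job-ID` line per job in a given state instead of killing anything. The helper name `kill_commands` and the second sample row (job 161182424) are made up for illustration.

```shell
#!/bin/sh
# Print one "bkill <job-ID>" line for every job in the requested state.
# Reads bjobs-style output on stdin and skips the header line.
# Assumption: JOBID is column 1 and STAT is column 3.
kill_commands() {
    awk -v state="$1" 'NR > 1 && $3 == state { print "bkill", $1 }'
}

# Sample data: the first job is from the listing in "Check job status",
# the second row is hypothetical.
sample='JOBID      USER     STAT  QUEUE      FROM_HOST    EXEC_HOST    JOB_NAME    SUBMIT_TIME
161182423  jarunan  PEND  normal.4h  eu-login-43               *cho hello  Jan 22 06:01
161182424  jarunan  RUN   normal.4h  eu-login-43  eu-ms-005-0  *cho hello  Jan 22 06:02'

printf '%s\n' "$sample" | kill_commands PEND   # prints: bkill 161182423
```

On the cluster, the sample text would be replaced by the output of `bjobs`, and the printed lines could be reviewed before running them.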