Job monitoring
Check job status
bjobs
After submission, a job is placed in a queue and has the PENDING status (shown as PEND in the STAT column).

    $ bjobs
    JOBID      USER     STAT  QUEUE      FROM_HOST    EXEC_HOST    JOB_NAME    SUBMIT_TIME
    161182423  jarunan  PEND  normal.4h  eu-login-43               *cho hello  Jan 22 06:01

Once the job starts on a compute node, it has the RUNNING status (shown as RUN), and EXEC_HOST names the node.

    $ bjobs
    JOBID      USER     STAT  QUEUE      FROM_HOST    EXEC_HOST    JOB_NAME    SUBMIT_TIME
    161182423  jarunan  RUN   normal.4h  eu-login-43  eu-ms-005-0  *cho hello  Jan 22 06:01
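The transition from pending to running can also be watched from a script. The following is a minimal sketch, assuming the default `bjobs` column layout shown above (one header line, STAT in the third column) and a POSIX shell. The helper names `job_status` and `wait_while_pending` are made up for this example and are not part of LSF.

```shell
# Print the STAT column (PEND, RUN, ...) for one job.
# Assumes the default bjobs layout: one header line, STAT in column 3.
job_status() {
    bjobs "$1" 2>/dev/null | awk 'NR == 2 { print $3 }'
}

# Poll until the job is no longer pending, then print its new status.
# $1 = job ID, $2 = seconds between polls (default 30).
wait_while_pending() {
    while [ "$(job_status "$1")" = "PEND" ]; do
        sleep "${2:-30}"
    done
    job_status "$1"
}
```

For example, `wait_while_pending 161182423 60` would check once a minute and return as soon as the job leaves the PEND state. A generous polling interval keeps the load on the batch system low.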
bjobs options | Description |
---|---|
(no option) | list all your jobs in all queues |
-p | list only pending (waiting) jobs and indicate why they are pending |
-r | list only running jobs |
-d | list only done jobs (finished within the last hour) |
-l | display status in long format |
-w | display status in wide format |
-o "format" | use custom output format (see LSF documentation for details) |
-J jobname | show only job(s) called jobname |
-q queue | show only jobs in a specific queue |
job-ID(s) | list of job-IDs (this must be the last option) |
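The options in this table can be combined. As a hedged sketch, the function below counts your running jobs in one queue. It assumes that `bjobs` prints a header line followed by one row per job with STAT in the third column, and that its "no job found" message goes to standard error. `count_running` is a hypothetical helper name, not an LSF command.

```shell
# Count running jobs in a given queue.
# $1 = queue name, e.g. normal.4h
count_running() {
    bjobs -r -w -q "$1" 2>/dev/null |
        awk 'NR > 1 && $3 == "RUN" { n++ } END { print n + 0 }'
}
```

Calling `count_running normal.4h` prints a single number, which makes it easy to use in a test such as `[ "$(count_running normal.4h)" -lt 10 ]` before submitting more work.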
Job control commands | Description |
---|---|
busers | user limits, number of pending and running jobs |
bqueues | queues status (open/closed; active/inactive) |
bjobs | more or less detailed information about pending and running jobs, and recently finished jobs |
bbjobs | better bjobs |
bhist | info about jobs finished in the last hours/days |
bpeek | display the standard output of a given job |
lsf_load | show the CPU load of all nodes used by a job |
bjob_connect | log in to a node where your job is running |
bkill | kill a job |
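Several of these commands work well together. As an illustration only, the sketch below prints the last lines of standard output for each running job by passing job IDs from `bjobs -r` to `bpeek`. It assumes the default `bjobs` layout (header line, numeric job ID in the first column). `peek_running` is a hypothetical name chosen for this example.

```shell
# Show the tail of standard output for every running job.
# $1 = number of lines to show per job (default 5).
peek_running() {
    # Keep only rows whose first column is a numeric job ID, once per ID.
    for id in $(bjobs -r 2>/dev/null | awk '$1 ~ /^[0-9]+$/ && !seen[$1]++ { print $1 }'); do
        echo "== job $id =="
        bpeek "$id" | tail -n "${1:-5}"
    done
}
```

Running `peek_running 20` would show the last 20 lines per job. The numeric filter skips the header and any continuation lines that do not start with a job ID.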
bbjobs
bkill
bkill options | Description |
---|---|
job-ID | kill job-ID |
0 | kill all jobs (yours only) |
-J jobname | kill most recent job called jobname |
-J jobname 0 | kill all jobs called jobname |
-q queue | kill most recent job in queue |
-q queue 0 | kill all jobs in queue |
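Because a trailing `0` makes `bkill` remove every matching job, it can be worth listing the matches first. The following is a cautious sketch that uses only the `-J` options documented in the tables on this page. The function name `kill_by_name` and its `--yes` argument are invented for this example and are not LSF features.

```shell
# Kill all jobs with a given name, but only after an explicit --yes.
# Without --yes, just list the jobs that would be killed.
# $1 = job name, $2 = optional --yes
kill_by_name() {
    if [ "${2:-}" = "--yes" ]; then
        bkill -J "$1" 0
    else
        echo "Jobs named '$1' that would be killed:"
        bjobs -w -J "$1"
        echo "Re-run with --yes to kill them."
    fi
}
```

A typical session would call `kill_by_name myjob` to review the list, then `kill_by_name myjob --yes` to remove the jobs.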