LSF mini reference
From ScientificComputing
Latest revision as of 12:26, 2 October 2018
Below is an overview of the most important LSF commands and their frequently used options.
bsub
Submit a job to the batch system.
Option | Description |
---|---|
-W HH:MM | Wall-clock time required by the job. Can also be expressed in minutes. |
-n N | Number of cores required by the job. |
-R "rusage[mem=X]" | Amount of memory (in MB per core) required by the job. |
-o outfile | Append the job's output (stdout) to outfile. The keyword "%J" is interpreted as the job's numerical ID. |
-e errfile | Append the job's error (stderr) to errfile. By default, stderr is merged with stdout. |
-oo outfile | Write the job's output (stdout) to outfile, overwriting it if it already exists. |
-eo errfile | Write the job's error (stderr) to errfile, overwriting it if it already exists. |
-I / -Ip / -Is | Run the job interactively. Input/output are redirected from/to your terminal. Use -Ip to create a pseudo-terminal, and -Is to enable shell support. |
-J jobname | Assign a (not necessarily unique) name to the job. Used to define job chains. To avoid confusion with numerical job IDs, jobname should contain at least one letter. |
-w "depcond" | Wait (do not start the job) until the specified dependency condition is satisfied. For example: "done(jobID)", "ended(jobname)". Quotes are recommended. |
-B / -N | Send an e-mail to the job's owner (username@ethz.ch) when the job begins / ends. |
-u user | Send e-mail to user instead of the job's owner. The recipient's address must be inside the ETH domain, because the firewall blocks e-mail sent to other addresses. (Note: This switch alone does not imply -B or -N.) |
-r | Indicate that the job is re-runnable. If the compute node where your job is running crashes, LSF will automatically re-run it from the beginning on a different node. |
-G share_name | Use the share_name shareholder share to run this job. |
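
As a rough illustration of how these options combine, the sketch below submits a two-step job chain. The script names ./prepare.sh and ./analyse.sh, the job names and all resource values are hypothetical placeholders, not site recommendations. Because rusage[mem=X] is given per core, the total reservation is the per-core value multiplied by the number of cores. The bsub calls run only if the command is found on the current machine.

```shell
#!/bin/sh
# Hypothetical resource request: adjust to your own job.
CORES=4
MEM_PER_CORE=2048                      # MB per core, as expected by rusage[mem=X]
TOTAL_MEM=$((CORES * MEM_PER_CORE))    # memory reserved by the whole job, in MB
echo "Requesting $CORES cores and $TOTAL_MEM MB in total"

# Hypothetical job names; they contain letters so they cannot be
# mistaken for numerical job IDs.
FIRST=prepare
SECOND=analyse
DEPCOND="done($FIRST)"                 # condition passed to -w for the second job

if command -v bsub >/dev/null 2>&1; then
    # Step 1: 4 hours, 4 cores, output overwritten in prepare.<jobID>.out
    bsub -W 04:00 -n "$CORES" -R "rusage[mem=$MEM_PER_CORE]" \
         -J "$FIRST" -oo "$FIRST.%J.out" ./prepare.sh
    # Step 2: held back until the dependency condition is satisfied
    bsub -W 01:00 -J "$SECOND" -w "$DEPCOND" \
         -oo "$SECOND.%J.out" ./analyse.sh
else
    echo "bsub not found; nothing submitted"
fi
```

With done(...), the second job stays pending until the first one has completed successfully. Using ended(...) instead would release it however the first job finished.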
bjobs
Monitor one or more batch jobs.
Option | Description |
---|---|
(no option) | List all your jobs — running, pending or suspended. |
-l / -w | Long / wide format (mutually exclusive). |
-r | Show only running jobs. |
-p | Show only pending jobs, and the reason why they are pending. |
-d | Show only jobs that finished recently (status DONE or EXIT). |
-x | Show jobs that have triggered an exception (e.g. "idle"). |
-q queue | Show jobs in the specified batch queue. |
-u user | Show jobs submitted by another user (or "all"). |
-J jobname | Show information about the job(s) with the specified name. |
jobID(s) | Show information about the specified job(s). This must be the last argument. |
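
A few typical invocations, sketched with a hypothetical job ID and job name. The only point that is easy to get wrong is the argument order: options come first and the job ID comes last. The commands run only if bjobs is available on the current machine.

```shell
#!/bin/sh
JOBID=12345          # hypothetical job ID, as reported by bsub at submission
JOBNAME=analyse      # hypothetical name assigned with bsub -J

if command -v bjobs >/dev/null 2>&1; then
    bjobs                     # all your running, pending or suspended jobs
    bjobs -p                  # pending jobs and the reason they are pending
    bjobs -w -J "$JOBNAME"    # wide format, selected by job name
    bjobs -l "$JOBID"         # long format; the job ID is the last argument
else
    echo "bjobs not found; run this on a machine with LSF installed"
fi
```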
bkill
Kill (or signal) one or more jobs.
Option | Description |
---|---|
jobID(s) | Kill the specified job(s). |
0 | Kill all jobs submitted by you (that's a zero, not the letter "O"). |
-J jobname | Kill the last job submitted under that name. |
-J jobname 0 | Kill all jobs submitted under that name. Used to kill a series of jobs at once. |
-s signal | Send a signal (e.g. URG, USR2) to the job. Your job must be designed to handle that signal, for example to save data. Sending the wrong signal may be fatal. |
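
The requirement that a job be designed to handle the signal can be sketched on the job-script side as follows. This is a minimal example, assuming the signal sent with bkill -s USR2 reaches the shell that runs the job script. The checkpoint file name and the three-step loop are hypothetical stand-ins for a real computation.

```shell
#!/bin/sh
# Job script sketch: save state when USR2 arrives, e.g. from
#   bkill -s USR2 <jobID>
CHECKPOINT=${CHECKPOINT:-checkpoint.txt}   # hypothetical output file
STEP=0

# Signal handler: record how far the computation has progressed.
save_state() {
    echo "step=$STEP" > "$CHECKPOINT"
}
trap save_state USR2

# Stand-in for the real computation: three short steps.
while [ "$STEP" -lt 3 ]; do
    STEP=$((STEP + 1))
    sleep 1
done
```

The shell runs the handler only after the current foreground command returns, so a long-running program started from the script has to handle the signal itself.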
bmod
Modify a job's parameters.
Option | Description |
---|---|
-w "depcond" | Modify a job's dependency condition. Use -wn to remove it. |
jobID | ID of the job to be modified. |
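
For example, the dependency of a pending job could be changed or dropped as sketched below. The job ID and the job name in the condition are hypothetical, and the command runs only if bmod is available on the current machine.

```shell
#!/bin/sh
JOBID=12345                 # hypothetical ID of a pending job
NEW_DEP="ended(prepare)"    # hypothetical: wait until the job named "prepare" has ended

if command -v bmod >/dev/null 2>&1; then
    bmod -w "$NEW_DEP" "$JOBID"    # replace the dependency condition
    # bmod -wn "$JOBID"            # alternative: remove the condition altogether
else
    echo "bmod not found; nothing modified"
fi
```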
bqueues
Show information about one or more batch queues.
Option | Description |
---|---|
(no option) | List all queues (name, priority, status, limits, number of pending/running jobs). |
-w | The same, with slightly more details. |
-l queue | Show a long description of one or more queues (all by default). |
lsload
Show processor load.
Option | Description |
---|---|
host(s) | Show load information for the specified hosts (all by default). Can be used in conjunction with bjobs to see how much memory a job is using, for example. |