GPU job submission
(Created page with "To use the GPUs for a job node you need to request the '''ngpus_excl_p''' resource. It refers to the number of GPUs '''per node'''. This is unlike other resources, which are r...") |
|||
Line 1: | Line 1: | ||
+ | __NOTOC__ | ||
+ | <table style="width: 100%;"> | ||
+ | <tr valign=top> | ||
+ | <td style="width: 30%; text-align:left"> | ||
+ | < [[Job submission | Submit a job]] | ||
+ | </td> | ||
+ | <td style="width: 35%; text-align:center"> | ||
+ | [[Sandbox Home | Home]] | ||
+ | </td> | ||
+ | <td style="width: 35%; text-align:right"> | ||
+ | [[Job monitoring | Monitor a job]] > | ||
+ | </td> | ||
+ | </tr> | ||
+ | </table> | ||
+ | |||
To use the GPUs for a job node you need to request the '''ngpus_excl_p''' resource. It refers to the number of GPUs '''per node'''. This is unlike other resources, which are requested '''per core'''. | To use the GPUs for a job node you need to request the '''ngpus_excl_p''' resource. It refers to the number of GPUs '''per node'''. This is unlike other resources, which are requested '''per core'''. | ||
Line 12: | Line 27: | ||
== Further reading == | == Further reading == | ||
* [[Getting started with GPUs]] | * [[Getting started with GPUs]] | ||
+ | |||
+ | <table style="width: 100%;"> | ||
+ | <tr valign=top> | ||
+ | <td style="width: 30%; text-align:left"> | ||
+ | < [[Job submission | Submit a job]] | ||
+ | </td> | ||
+ | <td style="width: 35%; text-align:center"> | ||
+ | [[Sandbox Home | Home]] | ||
+ | </td> | ||
+ | <td style="width: 35%; text-align:right"> | ||
+ | [[Job monitoring | Monitor a job]] > | ||
+ | </td> | ||
+ | </tr> | ||
+ | </table> |
Revision as of 10:06, 14 June 2021
To use GPUs for a job, you need to request the ngpus_excl_p resource. It specifies the number of GPUs per node. This is unlike other resources, which are requested per core.
For example, to run a serial job with one GPU,
bsub -R "rusage[ngpus_excl_p=1]" ./my_cuda_program
or on a full node with all 8 GeForce GTX 1080 Ti GPUs and up to 90 GB of RAM (memory is requested per core, so 20 cores × 4500 MB = 90,000 MB ≈ 90 GB),
bsub -n 20 -R "rusage[mem=4500,ngpus_excl_p=8]" -R "select[gpu_model0==GeForceGTX1080Ti]" ./my_cuda_program
or on two full nodes, where span[ptile=20] places exactly 20 cores on each node, so the 40 requested cores span two nodes:
bsub -n 40 -R "rusage[mem=4500,ngpus_excl_p=8]" -R "select[gpu_model0==GeForceGTX1080Ti]" -R "span[ptile=20]" ./my_cuda_program
Although your job may be able to see all GPUs in a node, LSF sets the CUDA_VISIBLE_DEVICES environment variable to the GPUs assigned to your job. CUDA programs honor this variable and use only the listed devices.
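To check which GPUs were assigned to a job, you can submit a command that simply prints the variable; the single quotes ensure it is expanded on the compute node rather than on the login node. A quick sanity check, using the same one-GPU request as above:

 bsub -R "rusage[ngpus_excl_p=1]" 'echo CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES'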
== Further reading ==
* [[Getting started with GPUs]]