GPU job submission
Figure: Here is an example of CPU & GPU system architecture. There are several system architectures on the cluster.
To use GPUs for a job, you need to request the ngpus_excl_p resource. It refers to the number of GPUs per node. This is unlike other resources, which are requested per core. For example, to run a serial job with one GPU:

$ bsub -R "rusage[ngpus_excl_p=1]" ./my_cuda_program

How to select GPU memory

If you know that you will need more memory on a GPU than some models provide, i.e., more than 8 GiB, then you can request that your job runs only on GPUs that have enough memory. Use the gpu_mtotal0 host selection to do this. For example, if you need 10 GB (= 10240 MB) per GPU:

$ bsub -R "rusage[ngpus_excl_p=1]" -R "select[gpu_mtotal0>=10240]" ./my_cuda_program

This ensures that your job will not run on GPUs with less than 10 GB of GPU memory.

How to select a GPU model

In some cases it is desirable or necessary to select the GPU model on which your job runs, for example if you know that your code runs much faster on a newer model. However, keep in mind that narrowing down the list of allowed GPUs may increase your job's waiting time. To select a certain GPU model, add the -R "select[gpu_model0==GPU_MODEL]" resource requirement to bsub:

$ bsub -R "rusage[ngpus_excl_p=1]" -R "select[gpu_model0==GeForceGTX1080]" ./my_cuda_program

While your jobs will see all GPUs of a node, LSF sets the CUDA_VISIBLE_DEVICES environment variable, which is honored by CUDA programs, so only the assigned GPUs are used.
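To verify which GPUs LSF assigned to your job, you can submit a short inline command. This is a minimal sketch; it assumes the compute nodes provide nvidia-smi, and the single quotes ensure the variable is expanded on the compute node rather than at submission time:

# print the device indices set by LSF, then list the GPUs enumerated by the driver
$ bsub -R "rusage[ngpus_excl_p=2]" 'echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"; nvidia-smi -L'

The job's output will show the device indices set by LSF, followed by all GPUs visible to the driver on that node.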
Available GPU node types
GPU Model | Specifier (GPU driver <= 450.80.02) | Specifier (GPU driver > 450.80.02) | GPU memory per GPU | CPU cores per node | CPU memory per node
---|---|---|---|---|---
NVIDIA GeForce GTX 1080 | GeForceGTX1080 | | 8 GiB | 20 | 256 GiB
NVIDIA GeForce GTX 1080 Ti | GeForceGTX1080Ti | | 11 GiB | 20 | 256 GiB
NVIDIA GeForce RTX 2080 Ti | GeForceRTX2080Ti | NVIDIAGeForceRTX2080Ti | 11 GiB | 36 | 384 GiB
NVIDIA GeForce RTX 2080 Ti | GeForceRTX2080Ti | NVIDIAGeForceRTX2080Ti | 11 GiB | 128 | 512 GiB
NVIDIA TITAN RTX | TITANRTX | NVIDIATITANRTX | 24 GiB | 128 | 512 GiB
NVIDIA Tesla V100-SXM2 32 GB | TeslaV100_SXM2_32GB | | 32 GiB | 48 | 768 GiB
NVIDIA Tesla A100 | A100_PCIE_40GB | | 40 GiB | 48 | 768 GiB
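Combining the table with the model selection described above: to request, say, a TITAN RTX node, use the specifier matching the driver version installed on the nodes. A minimal sketch, assuming nodes running the newer driver (> 450.80.02):

$ bsub -R "rusage[ngpus_excl_p=1]" -R "select[gpu_model0==NVIDIATITANRTX]" ./my_cuda_program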
Example
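A minimal sketch of a complete submission script, assuming standard LSF #BSUB directives; the script name gpu_job.sh and the executable ./my_cuda_program are placeholders, and the memory request assumes mem is specified per core:

#!/bin/bash
#BSUB -J gpu_job                          # job name (placeholder)
#BSUB -n 4                                # 4 CPU cores
#BSUB -W 04:00                            # wall-clock limit of 4 hours
#BSUB -R "rusage[mem=2048]"               # 2048 MB of CPU memory per core
#BSUB -R "rusage[ngpus_excl_p=1]"         # 1 GPU per node
#BSUB -R "select[gpu_mtotal0>=10240]"     # only GPUs with at least 10 GB of memory
./my_cuda_program

Submit it by letting bsub read the script from standard input:

$ bsub < gpu_job.sh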
Further reading