GPU job submission
Figure: An example of a CPU and GPU system architecture. The cluster contains several different system architectures.
To use GPUs in a job, request the ngpus_excl_p resource. It specifies the number of GPUs per node, unlike other resources, which are requested per core.
For example, to run a serial job with one GPU:
$ bsub -R "rusage[ngpus_excl_p=1]" ./my_cuda_program
How to select GPU memory
If you know that you will need more memory on a GPU than some models provide, for example more than 8 GB, you can request that your job runs only on GPUs that have enough memory. Use the gpu_mtotal0 host selection to do this. For example, if you need 10 GB (=10240 MB) per GPU:
$ bsub -R "rusage[ngpus_excl_p=1]" -R "select[gpu_mtotal0>=10240]" ./my_cuda_program
This ensures your job will not run on GPUs with less than 10 GB of GPU memory.
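The GB-to-MB conversion used above can be scripted so that the value passed to gpu_mtotal0 is always consistent. The following is a minimal sketch, not part of the cluster tooling: the function names gpu_mem_mb and gpu_mem_bsub_cmd are hypothetical, and the script only prints the bsub command instead of submitting it.

```shell
# Convert a per-GPU memory requirement in GB to the MB value
# expected by gpu_mtotal0 (1 GB is treated as 1024 MB, as in the example above).
gpu_mem_mb() {
    local gb="$1"
    echo $(( gb * 1024 ))
}

# Build, but do not run, the bsub command for a given memory requirement.
# ./my_cuda_program is the placeholder program used in the examples above.
gpu_mem_bsub_cmd() {
    local mb
    mb="$(gpu_mem_mb "$1")"
    echo "bsub -R \"rusage[ngpus_excl_p=1]\" -R \"select[gpu_mtotal0>=${mb}]\" ./my_cuda_program"
}

# Print the command for a 10 GB requirement.
gpu_mem_bsub_cmd 10
```

Printing the command first makes it easy to check the resource string before actually submitting the job.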
How to select a GPU model
In some cases it is desirable or necessary to select the GPU model on which your job runs, for example if you know your code runs much faster on a newer model. However, by narrowing down the list of allowable GPUs, you may increase the time your job has to wait in the queue.
To select a certain GPU model, add the -R "select[gpu_model0==GPU_MODEL]" resource requirement to bsub:
$ bsub -R "rusage[ngpus_excl_p=1]" -R "select[gpu_model0==GeForceGTX1080]" ./my_cuda_program
Although your job can see all GPUs on a node, LSF sets the CUDA_VISIBLE_DEVICES environment variable, which CUDA programs honor.
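To check which GPUs a job is meant to use, you can inspect CUDA_VISIBLE_DEVICES at the start of a job script. The sketch below assumes the variable holds a comma-separated list of device identifiers, which is the format CUDA expects. The helper name count_visible_gpus is hypothetical.

```shell
# Count the entries in CUDA_VISIBLE_DEVICES (a comma-separated list).
# Prints 0 if the variable is unset or empty.
count_visible_gpus() {
    local devices="${CUDA_VISIBLE_DEVICES:-}"
    if [ -z "$devices" ]; then
        echo 0
        return
    fi
    # Split the list on commas and count the resulting fields.
    local IFS=','
    set -- $devices
    echo "$#"
}

echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"
echo "GPUs assigned to this job: $(count_visible_gpus)"
```

Run outside a GPU job, this typically reports the variable as unset and a count of 0.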
Available GPU node types
|GPU Model||Specifier||GPU memory per GPU||CPU cores per node||CPU memory per node|
|NVIDIA GeForce GTX 1080||GeForceGTX1080||8 GiB||20||256 GiB|
|NVIDIA GeForce GTX 1080 Ti||GeForceGTX1080Ti||11 GiB||20||256 GiB|
|NVIDIA GeForce RTX 2080 Ti||GeForceRTX2080Ti||11 GiB||36||384 GiB|
|NVIDIA GeForce RTX 2080 Ti||GeForceRTX2080Ti||11 GiB||128||512 GiB|
|NVIDIA TITAN RTX||TITANRTX||24 GiB||128||512 GiB|
|NVIDIA Tesla V100-SXM2 32 GB||TeslaV100_SXM2_32GB||32 GiB||48||768 GiB|
|NVIDIA Tesla A100||A100_PCIE_40GB||40 GiB||48||768 GiB|
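A typo in the specifier would not match any of the models listed above, so it can help to validate it before submitting. The following sketch checks a specifier against the list copied from the table and prints the matching bsub command without running it. The names KNOWN_GPU_MODELS, is_known_gpu_model and gpu_model_bsub_cmd are hypothetical, and the list must be kept in sync with the table by hand.

```shell
# Specifiers copied from the table above.
KNOWN_GPU_MODELS="GeForceGTX1080 GeForceGTX1080Ti GeForceRTX2080Ti TITANRTX TeslaV100_SXM2_32GB A100_PCIE_40GB"

# Return 0 if the given specifier is in the list, 1 otherwise.
is_known_gpu_model() {
    local m
    for m in $KNOWN_GPU_MODELS; do
        [ "$m" = "$1" ] && return 0
    done
    return 1
}

# Print the bsub command for a known specifier,
# or an error message on stderr for an unknown one.
gpu_model_bsub_cmd() {
    if ! is_known_gpu_model "$1"; then
        echo "unknown GPU model specifier: $1" >&2
        return 1
    fi
    echo "bsub -R \"rusage[ngpus_excl_p=1]\" -R \"select[gpu_model0==$1]\" ./my_cuda_program"
}

# Print the command for one of the listed models.
gpu_model_bsub_cmd GeForceRTX2080Ti
```

The comparison is case-sensitive, so a specifier such as geforcegtx1080 is rejected.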