CUDA 10 on Leonhard
This page contains information about the Leonhard Open cluster, which is now obsolete: the cluster was integrated into the Euler cluster on 14/15 September 2021.
Introduction
Nvidia released CUDA 10 at the end of February 2019. This CUDA SDK release requires a sufficiently new GPU driver (>=410.48), which is why CUDA 10 had not been provided on Leonhard until now. We have now updated the GPU driver on all GPU nodes in Leonhard Open and installed the CUDA 10.0.130 SDK. To make use of the new CUDA 10 SDK, we provide a new Python installation (python_gpu/3.7.1) with the most recent versions of the common machine learning frameworks installed.
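The driver requirement above amounts to a version comparison. The following is a minimal sketch and not part of the cluster tooling: the helper names parse_version and driver_supports_cuda10 are hypothetical, and the driver version string is assumed to have been obtained separately, for instance with nvidia-smi --query-gpu=driver_version --format=csv,noheader on a GPU node.

```python
# Minimum GPU driver version required by the CUDA 10 SDK, as stated on this page.
MIN_DRIVER = (410, 48)


def parse_version(text):
    """Turn a dotted version string such as '410.48' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports_cuda10(driver_version):
    """Return True if the driver version string meets the CUDA 10 minimum.

    Hypothetical helper for illustration only.
    """
    return parse_version(driver_version) >= MIN_DRIVER


if __name__ == "__main__":
    # Example driver strings; these are illustrative inputs, not cluster data.
    for version in ("396.26", "410.48", "410.104"):
        print(version, driver_supports_cuda10(version))
```

Comparing tuples of integers rather than floats or raw strings keeps a version such as 410.104 ordered after 410.48.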
Modules
To use the new CUDA 10 release and cuDNN 7.5, please load the cuda/10.0.130 and cudnn/7.5 modules:
[sfux@lo-gtx-001 ~]$ module list
Currently Loaded Modules:
  1) StdEnv   2) gcc/4.8.5
[sfux@lo-gtx-001 ~]$ module load cuda/10.0.130 cudnn/7.5
[sfux@lo-gtx-001 ~]$ module list
Currently Loaded Modules:
  1) StdEnv   2) gcc/4.8.5   3) cuda/10.0.130   4) cudnn/7.5
If you load the python_gpu/3.7.1 module, it automatically loads cuda/10.0.130 and cudnn/7.5:
[sfux@lo-gtx-001 ~]$ module list
Currently Loaded Modules:
  1) StdEnv   2) gcc/4.8.5
[sfux@lo-gtx-001 ~]$ module load python_gpu/3.7.1
[sfux@lo-gtx-001 ~]$ module list
Currently Loaded Modules:
  1) StdEnv      3) openblas/0.2.19   5) cudnn/7.5      7) jpeg/9b         9) python_gpu/3.7.1
  2) gcc/4.8.5   4) cuda/10.0.130     6) nccl/2.3.7-1   8) libpng/1.6.27
Available frameworks
- Tensorflow 1.13.1
- Scikit-learn 0.20.3
- Keras 2.2.4
- Theano 1.0.4
- PyTorch 1.0.1
[sfux@lo-gtx-001 ~]$ module load python_gpu/3.7.1
[sfux@lo-gtx-001 ~]$ python
Python 3.7.1 (default, Mar 20 2019, 09:01:27)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
>>> tensorflow.__version__
'1.13.1'
>>> import sklearn
>>> sklearn.__version__
'0.20.3'
>>> import keras
Using TensorFlow backend.
>>> keras.__version__
'2.2.4'
>>> import theano
>>> theano.__version__
'1.0.4'
>>> import torch
>>> torch.__version__
'1.0.1.post2'
>>>
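The interactive check above can also be scripted. The following is a minimal sketch, assuming the python_gpu/3.7.1 module is loaded; the names EXPECTED, installed_version and matches are hypothetical and introduced only for illustration. The expected versions are copied from the list on this page.

```python
import importlib

# Framework versions listed on this page for python_gpu/3.7.1.
EXPECTED = {
    "tensorflow": "1.13.1",
    "sklearn": "0.20.3",
    "keras": "2.2.4",
    "theano": "1.0.4",
    "torch": "1.0.1",
}


def installed_version(module_name):
    """Return the module's __version__, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def matches(found, expected):
    """True if found equals expected or extends it with a suffix such as '.post2'."""
    return found is not None and (
        found == expected or found.startswith(expected + ".")
    )


if __name__ == "__main__":
    for name, expected in EXPECTED.items():
        found = installed_version(name)
        status = "ok" if matches(found, expected) else "MISMATCH"
        print(f"{name:12s} expected {expected:8s} found {found} [{status}]")
```

The prefix comparison in matches is a design choice so that a build tag like 1.0.1.post2 still counts as PyTorch 1.0.1, while 1.13.10 would not be mistaken for 1.13.1.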
Tensorflow 1.13.1
The precompiled wheels for tensorflow 1.13.1 provided on PyPI do not work on Leonhard, as they do not support CentOS 7.5 (they require a newer libc). We have therefore compiled tensorflow 1.13.1 ourselves, with support for CUDA 10.0.130 and cuDNN 7.5. The build is optimized for AVX2 on the CPU side and for GPU compute capabilities 6.1, 7.0 and 7.5 on the GPU side. It is therefore optimized for the GeForce GTX 1080 Ti, our new DGX-1 (Tesla V100) and the coming generation of GPUs (RTX 2080 and similar).
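The relation between the compiled compute capabilities and the GPUs named above can be written down as a small lookup. This is a sketch for illustration only: the names COMPILED_CAPABILITIES, EXAMPLE_GPUS and is_optimized_for are hypothetical, and the check says nothing about whether a GPU outside this set would still run the build.

```python
# Compute capabilities the tensorflow 1.13.1 build targets, as stated on this page.
COMPILED_CAPABILITIES = {(6, 1), (7, 0), (7, 5)}

# GPUs named on this page and their compute capabilities.
EXAMPLE_GPUS = {
    "GeForce GTX 1080 Ti": (6, 1),
    "Tesla V100": (7, 0),
    "GeForce RTX 2080": (7, 5),
}


def is_optimized_for(capability):
    """Return True if the build was compiled for this (major, minor) capability."""
    return tuple(capability) in COMPILED_CAPABILITIES


if __name__ == "__main__":
    for gpu, capability in EXAMPLE_GPUS.items():
        major, minor = capability
        print(f"{gpu}: compute {major}.{minor}, optimized: {is_optimized_for(capability)}")
```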