RTX 4090 Testing

Introduction

We have added the first nodes with RTX 4090 GPUs to Euler. During the beta test phase, users who already have access to GPUs on Euler can test the new GPU type.

Usage

To run a job on the new RTX 4090 GPUs, you need to specify the exact model:

sbatch --ntasks=1 --gpus=rtx_4090:1 --wrap="./my_gpu_program"

Other options, such as more system memory or CPU cores, can be included as well:

sbatch --ntasks=1 --cpus-per-task=16 --mem-per-cpu=3g --gpus=rtx_4090:1 --wrap="./my_gpu_program"

Please avoid using any other GPU options for this evaluation, such as selecting GPU memory. Note that during the test phase, we only allow jobs with a maximum runtime of 24 hours.
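The same request can also be written as a batch script instead of a --wrap one-liner. The following is a minimal sketch, not an official template: the file name rtx4090_job.sh is hypothetical, and the --time line is our assumption for staying within the 24-hour limit mentioned above (standard Slurm hours:minutes:seconds format). The resource options are copied from the examples above.

```shell
# Write a job script that requests one RTX 4090 GPU.
# The file name rtx4090_job.sh is a hypothetical example.
cat > rtx4090_job.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=3g
#SBATCH --gpus=rtx_4090:1
#SBATCH --time=24:00:00

# Replace with your own GPU program.
./my_gpu_program
EOF

# Submit the script from a login node with:
#   sbatch rtx4090_job.sh
```

Keeping the options in a script makes it easier to rerun the same job and to see exactly which resources were requested.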

Software

Support for the RTX 40XX series of GPUs was added in CUDA 11.8. Please therefore use CUDA 11.8.0 or 12.1.1, both of which are available as modules on Euler. Python 3.11.2 (module python_gpu/3.11.2) is also suitable for the new GPUs and loads CUDA 11.8.0 automatically:

GCC 8.2.0:

[sfux@eu-login-14 ~]$ module load gcc/8.2.0 python_gpu/3.11.2

The following have been reloaded with a version change:
  1) gcc/4.8.5 => gcc/8.2.0

[sfux@eu-login-14 ~]$ module list

Currently Loaded Modules:
  1) StdEnv   2) gcc/8.2.0   3) openblas/0.3.20   4) cuda/11.8.0   5) cudnn/8.8.1.3   6) nccl/2.11.4-1   7) python_gpu/3.11.2


[sfux@eu-login-14 ~]$

This Python installation provides PyTorch 2.0.1, which should support the RTX 4090.

When an older CUDA version is used with the RTX 4090 GPU, warning messages such as

W tensorflow/stream_executor/gpu/asm_compiler.cc:235] Your CUDA software stack is old. We fallback to the NVIDIA driver for some compilation. Update your CUDA version to get the best performance. The ptxas error was: ptxas fatal   : Value 'sm_86' is not defined for option 'gpu-name'

are to be expected.
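To catch an outdated CUDA version before submitting a job, the version can be compared against the 11.8 minimum stated above. The sketch below is only an illustration: cuda_supports_rtx40 is a hypothetical helper, not a tool provided on Euler, and it inspects only the major and minor parts of a version string that you pass in.

```shell
# Hypothetical helper (not an Euler tool): succeeds if the given CUDA
# version string is 11.8 or newer. The body runs in a subshell, so the
# variables it sets do not leak into the calling shell.
cuda_supports_rtx40() (
    version="$1"
    major="${version%%.*}"
    case "$version" in
        *.*) rest="${version#*.}"; minor="${rest%%.*}" ;;
        *)   minor=0 ;;   # no minor part given, e.g. "12"
    esac
    [ "$major" -gt 11 ] || { [ "$major" -eq 11 ] && [ "$minor" -ge 8 ]; }
)

# Example: classify a few version strings.
for v in 11.7.1 11.8.0 12.1.1; do
    if cuda_supports_rtx40 "$v"; then
        echo "CUDA $v: new enough for the RTX 40XX series"
    else
        echo "CUDA $v: too old for the RTX 40XX series"
    fi
done
```

The version string to check can be taken from the cuda module name shown by module list, for example 11.8.0 for cuda/11.8.0.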

Support

We can only provide limited support for running software on this evaluation GPU. However, do not hesitate to contact us in case of problems submitting jobs or accessing the GPU.