New CPU and GPU nodes in Euler (January 2022)

From ScientificComputing
Revision as of 11:25, 31 January 2022 by Byrdeo (talk | contribs)

Despite a very difficult market situation caused by the global semiconductor shortage, all the hardware that we ordered in 2021 was delivered just before Christmas:

  • 248 CPU nodes, each equipped with 128 cores (2 x 64-core AMD EPYC 7763) and 256 GB of memory
  • 20 GPU nodes, each equipped with 128 cores (2 x 64-core AMD EPYC 7742), 512 GB of memory and 8 GPUs (Nvidia Quadro RTX 6000)

The installation and testing of these nodes was completed in January 2022. All nodes are now operational, except for a few that had some hardware issues and are being repaired.

This major expansion increases the computing capacity of Euler by 536 CPUs (34,304 cores) and 160 GPUs (737,280 CUDA cores + 92,160 Tensor cores).
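The totals above follow directly from the node counts. The short sketch below reproduces the arithmetic. The per-GPU figures (4,608 CUDA cores and 576 Tensor cores per Quadro RTX 6000) are not stated on this page; they are assumed here and are consistent with the totals quoted above.

```python
# Figures taken from the hardware list in this announcement.
CPU_NODES = 248
GPU_NODES = 20
CPUS_PER_NODE = 2       # dual-socket nodes
CORES_PER_CPU = 64
GPUS_PER_NODE = 8

# Assumed per-GPU figures for the Quadro RTX 6000 (not stated on this page).
CUDA_CORES_PER_GPU = 4608
TENSOR_CORES_PER_GPU = 576


def expansion_totals():
    """Return the capacity added by the new CPU and GPU nodes."""
    cpus = (CPU_NODES + GPU_NODES) * CPUS_PER_NODE
    gpus = GPU_NODES * GPUS_PER_NODE
    return {
        "cpus": cpus,
        "cpu_cores": cpus * CORES_PER_CPU,
        "gpus": gpus,
        "cuda_cores": gpus * CUDA_CORES_PER_GPU,
        "tensor_cores": gpus * TENSOR_CORES_PER_GPU,
    }


for name, value in expansion_totals().items():
    print(f"{name}: {value:,}")
```

Note that the CPU count includes the host CPUs of the GPU nodes, which is why 268 nodes rather than 248 enter the calculation.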

Notes

  • We originally ordered nodes equipped with Nvidia Titan RTX GPUs. However, since Nvidia could not deliver this model, we had to switch to the Quadro RTX 6000. Both models are based on the same chip and have similar specifications (CUDA cores, Tensor cores, memory). The main difference is that the Quadro RTX 6000 is a professional GPU with longer-term support than the Titan RTX. Although the Quadro is significantly more expensive, the change was made at no additional cost to ETH.
  • Since a Quadro RTX 6000 is technically equivalent to a Titan RTX, the new GPU nodes are considered "high-end" nodes and are therefore included in the "gpuhe.*" queues of Euler.
  • Users who want to run jobs on these new GPUs can do so by requesting the GPU model "QuadroRTX6000" as shown here.
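As a rough illustration of the last note, the sketch below assembles a job submission command that selects the new GPU model. It assumes that Euler's batch system is LSF (`bsub`) and that GPUs are requested through the resource names `ngpus_excl_p` and `gpu_model0`; neither name appears on this page, so both should be treated as assumptions and checked against the linked instructions. The helper `build_gpu_bsub` and the program name `./my_cuda_program` are hypothetical.

```python
import shlex


def build_gpu_bsub(command, gpu_model="QuadroRTX6000", num_gpus=1, cores=1):
    """Build a bsub argument list that requests a specific GPU model.

    The resource strings follow an assumed LSF convention
    (ngpus_excl_p, gpu_model0); they are not taken from this page.
    """
    if num_gpus < 1 or cores < 1:
        raise ValueError("num_gpus and cores must be at least 1")
    return [
        "bsub",
        "-n", str(cores),
        "-R", f"rusage[ngpus_excl_p={num_gpus}]",     # number of GPUs (assumed resource name)
        "-R", f"select[gpu_model0=={gpu_model}]",     # GPU model (assumed resource name)
        *command,
    ]


# Print a shell-quoted command line for a hypothetical program.
print(shlex.join(build_gpu_bsub(["./my_cuda_program"])))
```

Building the command as an argument list, rather than as a single string, keeps the square brackets in the resource strings from being interpreted by the shell if the list is passed on to `subprocess.run`.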

Useful links