Latest revision as of 09:37, 8 September 2023
This page describes the hardware of the different generations (I-VIII) of Euler. For information on how to use the cluster, please check the tutorials page.
Introduction
Euler stands for Erweiterbarer, Umweltfreundlicher, Leistungsfähiger ETH-Rechner. It is an evolution of the Brutus concept. Euler also incorporates new ideas from the Academic Compute Cloud project in 2012–2013 as well as the Calculus prototype in 2013.
Euler has been regularly expanded since its inception in 2013. The first phase, Euler I, was purchased at the end of 2013 and was in operation from 2014 to 2018. The second phase, Euler II, was purchased at the end of 2014 and was in operation from 2015 to 2020. Euler III was purchased at the end of 2016 and was in operation from 2017 to 2022. Euler IV was purchased at the end of 2017 and was in operation from 2018 to 2022. Euler V, which replaced Euler I, was purchased in the fall of 2018 and has been in operation since the end of 2018. Euler VI was purchased at the end of 2019 and has been in operation since the beginning of 2020. Euler VII was purchased in two phases; the first has been operational since January 2021 and the second since January 2022. Euler VIII was installed at the end of 2022 and has been in operation since the beginning of 2023.
Specifications
Euler I
Euler I (2014-2018) contained 448 compute nodes — Hewlett-Packard BL460c Gen8 —, each equipped with:
- Two 12-core Intel Xeon E5-2697v2 processors (2.7 GHz nominal, 3.0–3.5 GHz peak)
- Between 64 and 256 GB of DDR3 memory clocked at 1866 MHz (64 × 256 GB; 32 × 128 GB; 352 × 64 GB)
All compute nodes of Euler I were decommissioned in August 2018 to make room for Euler V.
Euler II
Euler II (2015-2020) contained 768 compute nodes of a newer generation — BL460c Gen9 —, each equipped with:
- Two 12-core Intel Xeon E5-2680v3 processors (2.5-3.3 GHz)
- Between 64 and 512 GB of DDR4 memory clocked at 2133 MHz (32 × 512 GB; 32 × 256 GB; 32 × 128 GB; 672 × 64 GB)
Euler II also contained 4 very large memory nodes — Hewlett-Packard DL580 Gen9 —, each equipped with:
- Four 16-core Intel Xeon E7-8867v3 processors (2.5 GHz)
- 3072 GB of DDR4 memory clocked at 2133 MHz
Euler II was decommissioned in July 2020 to make room for new GPU nodes.
Euler III
Euler III (2016-2022) contained 1215 compute nodes — Hewlett-Packard m710x —, each equipped with:
- A quad-core Intel Xeon E3-1585Lv5 processor (3.0-3.7 GHz)
- 32 GB of DDR4 memory clocked at 2133 MHz
- Local scratch: 211,293.0 MB
All these nodes were connected to the rest of the cluster via 10G/40G Ethernet.
Euler III was decommissioned in May 2022 to make room for new GPU nodes.
Euler IV
Euler IV (2018-2022) contained 288 high-performance nodes — Hewlett-Packard XL230k Gen10 —, each equipped with:

- Two 18-core Intel Xeon Gold 6150 processors (2.7-3.7 GHz)

- 192 GB of DDR4 memory clocked at 2666 MHz

- Local scratch: 348,582.0 MB (a few nodes had 1,874,012.0 MB)

All these nodes were connected together via a 100 Gb/s InfiniBand EDR network.

Euler IV was decommissioned in November 2022 to make room for Euler VIII.
Euler V
Euler V contains 352 compute nodes — Hewlett-Packard BL460c Gen10 —, each equipped with:
- Two 12-core Intel Xeon Gold 5118 processors (2.3 GHz nominal, 3.2 GHz peak)
- 96 GB of DDR4 memory clocked at 2400 MHz
- Local scratch: 348,582.0 MB
Euler VI
Euler VI contains 216 compute nodes from Swiss company Dalco AG, each equipped with:
- Two 64-core AMD EPYC 7742 processors (2.25 GHz nominal, 3.4 GHz peak)
- 512 GB of DDR4 memory clocked at 3200 MHz
- Local scratch: 920,618.0 MB
All these nodes are connected together via a dedicated 100 Gb/s InfiniBand HDR network.
Euler VII — phase 1
The first phase of Euler VII contains 292 compute nodes — HPE ProLiant XL225n Gen10 Plus —, each equipped with:
- Two 64-core AMD EPYC 7H12 processors (2.6 GHz nominal, 3.3 GHz peak)
- 256 GB of DDR4 memory clocked at 3200 MHz
All these nodes are connected together via a dedicated 100 Gb/s InfiniBand HDR network.
Euler VII — phase 2
The second phase of Euler VII contains 248 compute nodes — HPE ProLiant XL225n Gen10 Plus —, each equipped with:
- Two 64-core AMD EPYC 7763 processors (2.45 GHz nominal, 3.5 GHz peak)
- 256 GB of DDR4 memory clocked at 3200 MHz
All these nodes share the same network as Euler VII phase 1.
Euler VIII
Euler VIII contains 192 compute nodes from Swiss company Dalco AG, each equipped with:
- Two 64-core AMD EPYC 7742 processors (2.25 GHz nominal, 3.4 GHz peak)
- 512 GB of DDR4 memory clocked at 3200 MHz
- Local scratch: 920,618.0 MB
All these nodes are connected to the cluster's 100 Gb/s Ethernet network.
GPU nodes
Euler contains dozens of GPU nodes equipped with different types of GPUs:
- 9 nodes with 8 × Nvidia GTX 1080 (formerly in Leonhard Open)

- 47 nodes with 8 × Nvidia GTX 1080 Ti (formerly in Leonhard Open)

- 4 nodes with 8 × Nvidia Tesla V100 (including some formerly in Leonhard Open)

- 93 nodes with 8 × Nvidia RTX 2080 Ti (including some formerly in Leonhard Open)

- 16 nodes with 8 × Nvidia Titan RTX

- 20 nodes with 8 × Nvidia Quadro RTX 6000

- 33 nodes with 8 × Nvidia RTX 3090

- 3 nodes with 8 × Nvidia Tesla A100 (40 GB PCIe)

- 3 nodes with 10 × Nvidia Tesla A100 (80 GB PCIe)
GPU nodes are not linked to a specific Euler expansion cycle but are purchased according to shareholders' needs. Their specifications (CPU, RAM, disk, networks) vary considerably, even between nodes equipped with similar GPUs.
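The list above can be tallied for a rough inventory of the GPU partition. The sketch below simply hardcodes the node counts as printed on this page; since GPU nodes are purchased continuously, treat the result as a snapshot, not an authoritative figure:

```python
# Snapshot of the GPU node list above: {gpu_model: (node_count, gpus_per_node)}.
# These counts drift as nodes are added or retired.
gpu_nodes = {
    "GTX 1080":        (9, 8),
    "GTX 1080 Ti":     (47, 8),
    "Tesla V100":      (4, 8),
    "RTX 2080 Ti":     (93, 8),
    "Titan RTX":       (16, 8),
    "Quadro RTX 6000": (20, 8),
    "RTX 3090":        (33, 8),
    "A100 40 GB":      (3, 8),
    "A100 80 GB":      (3, 10),
}

total_nodes = sum(nodes for nodes, _ in gpu_nodes.values())
total_gpus = sum(nodes * per_node for nodes, per_node in gpu_nodes.values())
print(f"{total_nodes} GPU nodes, {total_gpus} GPUs")  # → 228 GPU nodes, 1830 GPUs
```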
Storage
Euler contains two types of storage system:
- An enterprise-class NAS system (NetApp FAS 9000 & AFF A800) for long-term storage, such as home directories, applications, virtual machines, project data, etc.
- A high-performance Lustre parallel file system (DDN ES14KX) for short- and medium-term storage, such as scratch and work file systems
Home directories and other critical data are backed up daily; all other data (except scratch) are backed up at least once per week for disaster recovery.
Networks
Euler contains multiple networks:
- A common 100/25/10 Gb/s Ethernet network for data transfer between the storage systems and the cluster's compute and login nodes
- Three separate 56 Gb/s InfiniBand FDR networks for data transfer between the compute nodes themselves (e.g. MPI)
- A 100 Gb/s InfiniBand EDR network for data transfer within Euler IV (MPI) and between Euler IV and the Lustre high-performance storage system
- A 200/100 Gb/s InfiniBand HDR network for data transfer within Euler VI–VII, with 100 Gb/s links from the compute nodes to the switches and multiple 200 Gb/s links between switches
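To get a feel for these link speeds, one can compute the idealized line-rate transfer time. This is a back-of-envelope sketch only; real throughput is lower once protocol overhead, file-system latency, and congestion are taken into account:

```python
def transfer_time_s(size_bytes: float, link_gbps: float) -> float:
    """Idealized transfer time in seconds at full line rate.

    Ignores protocol overhead, so real transfers take longer.
    """
    return size_bytes * 8 / (link_gbps * 1e9)

# Moving 1 TB (10^12 bytes) over a 100 Gb/s EDR/HDR link:
print(transfer_time_s(1e12, 100))  # → 80.0 (seconds)
```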
Summary
| Clusters | #nodes | #cores/node | CPUs | Clock speed | Memory | Mem clock speed | Local scratch [MB] | Network |
|---|---|---|---|---|---|---|---|---|
| Euler IV | 288 | 36 | Intel Xeon Gold 6150 | 2.7-3.7 GHz | 192 GB | 2666 MHz | 348,582.0 | 100 Gb/s InfiniBand EDR |
| Euler V | 352 | 24 | Intel Xeon Gold 5118 | 2.3 GHz nominal, 3.2 GHz peak | 96 GB | 2400 MHz | 348,582.0 | 10 Gb/s Ethernet |
| Euler VI | 216 | 128 | AMD EPYC 7742 | 2.25 GHz nominal, 3.4 GHz peak | 512 GB | 3200 MHz | 920,618.0 | 100 Gb/s InfiniBand HDR |
| Euler VII p1 | 292 | 128 | AMD EPYC 7H12 | 2.6 GHz nominal, 3.3 GHz peak | 256 GB | 3200 MHz | 348,582.0 | 100 Gb/s InfiniBand HDR |
| Euler VII p2 | 248 | 128 | AMD EPYC 7763 | 2.45 GHz nominal, 3.5 GHz peak | 256 GB | 3200 MHz | 348,582.0 | 100 Gb/s InfiniBand HDR |
| Euler VIII | 192 | 128 | AMD EPYC 7742 | 2.25 GHz nominal, 3.4 GHz peak | 512 GB | 3200 MHz | 920,618.0 | 100 Gb/s Ethernet |
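The table lends itself to a quick aggregate. The sketch below hardcodes the node and per-node core counts from the table above to estimate the total size of the CPU partition (the figures change with each expansion, so this is a snapshot of this table only):

```python
# {cluster: (node_count, cores_per_node)}, taken from the summary table above.
clusters = {
    "Euler IV":     (288, 36),
    "Euler V":      (352, 24),
    "Euler VI":     (216, 128),
    "Euler VII p1": (292, 128),
    "Euler VII p2": (248, 128),
    "Euler VIII":   (192, 128),
}

total_nodes = sum(nodes for nodes, _ in clusters.values())
total_cores = sum(nodes * cores for nodes, cores in clusters.values())
print(f"{total_nodes} nodes, {total_cores} cores")  # → 1588 nodes, 140160 cores
```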
Service description
The official service description and the current price list are available on the IT service catalogue.