Power outage 2023-08-29
Due to a short power outage in the CSCS datacenter, hundreds of compute nodes went down around 11:15 today. All jobs running on these nodes were lost.
Many of these nodes rebooted and came back up when the power was restored, but some were left in a bad state. We are currently investigating this issue with CSCS.
Updates
- 2023-08-29 16:40
- While we investigate a network issue, we are keeping all Euler VII nodes, which represent almost ⅔ of all CPU nodes, closed for the time being.
- 2023-08-30 11:15
- We have resolved the network issue, and the cluster is fully operational again.