Power outage 2024-06-19
Due to a complete power outage in the CSCS datacenter at 1:35 AM, Euler went down and all running jobs were lost. Regular updates about this situation will be published on this wiki page.
We are sorry for the inconvenience.
Updates
- 2024-06-19 11:20
- There is no news from CSCS yet. We are still waiting for power to be restored in the datacenter.
- 2024-06-19 12:10
- Power and cooling at CSCS have been restored. We can now start powering up and testing the various components of Euler.
- 2024-06-19 14:30
- The login nodes are open again and the batch system accepts jobs (the queues are still closed, so jobs submitted now will stay pending until the queues are opened).
- 2024-06-19 17:00
- The 4h queues for CPU jobs are open in both the CentOS and Ubuntu parts of the cluster.
- 2024-06-20 07:45
- The 4h queues for GPU jobs are open in the CentOS part of the cluster.
- The powering up and testing of compute nodes is ongoing; most of them seem to have survived the power outage without significant problems.