Euler power outage (01 Oct 2018)

From ScientificComputing
Revision as of 12:01, 12 October 2018 by Sfux (talk | contribs)


Due to a power outage in the CSCS datacenter on Sunday, most of the Euler compute nodes went down around 17:45. All running jobs have been lost. Currently, the Euler cluster is partially back online, but the computing capacity is reduced due to network problems. We are in close contact with the ID Network group and our team is working on bringing Euler fully back online.

We are sorry for the inconvenience.

We will update this page as the situation evolves.

Updates

2018-10-01 14:50
Our system administrators are running health checks on the compute nodes that were powered down. Compute nodes that pass the health check are progressively being put back into production.
2018-10-02 11:30
The majority of compute nodes in Euler have passed the health check and are back in production.