Network issue (16 August 2023)


On 16 August 2023, we noticed that Euler was suffering from networking issues.

This causes some random side effects:

  • issues with home directory permissions
  • slow responses
  • issues with Slurm and LSF
  • issues with domain name resolution for jobs accessing the internet (working in only about 50% of cases)

We apologize for the inconvenience. Our system specialists are working to fix these issues as soon as possible.

Updates

2023-08-17 10:00
We have deactivated the queues to prevent jobs from failing and set the system status to orange. We will reopen the queues as soon as the network issues are resolved.
2023-08-17 11:00
Our system administrators made a change to stabilize the network. We have therefore reactivated the queues and set the system status back to green. We will continue to investigate the issue and closely monitor the network on Euler.
2023-08-17 15:30
JupyterHub is still suffering from the networking issues.
2023-08-18 09:00
A potential fix for JupyterHub is currently in development. We hope to release it around lunchtime.
2023-08-18 12:30
Job submission in JupyterHub is disabled for maintenance at 14:00. The maintenance should slightly improve network performance and shorten the downtime when the network fails.
2023-08-18 14:10
JupyterHub is back online.