Infiniband problems on Euler VII nodes (November 2021)

From ScientificComputing
Revision as of 13:35, 16 November 2021 by Sfux (talk | contribs)

We are currently experiencing a problem with the Infiniband network on Euler VII nodes. We are in close contact with the hardware vendors and are investigating the problem. In the meantime, we will change the job scheduling to prevent multi-node MPI jobs from starting on Euler VII nodes.

If you encounter problems with stuck jobs on nodes whose hostname does not start with eu-a2p, please report those cases to cluster support.