Reopening of Euler and Leonhard (May 2020)

From ScientificComputing
Revision as of 06:46, 18 September 2020 by Sfux (talk | contribs)

Introduction

On Friday 15 May the HPC Group found evidence of unauthorised access to Euler and Leonhard. This was not an isolated incident: dozens of other HPC sites throughout the world were attacked in a similar fashion. Since we did not know how the attacker had managed to break into our system, whether they were still present, or what they had done or intended to do, we had no choice but to completely lock down all our clusters. All other sites affected by this cyber-attack, including our colleagues at CSCS, did the same.

Attack vector

Apparently the attacker obtained a user's credentials (passwords / SSH keys) to get inside the ETH network and log in to our clusters. Once inside, they used an as yet unknown exploit to install a back-door in our clusters. There is no evidence that the attacker did anything else. However, this does not rule out the possibility that the attacker installed other back-doors or malicious software that evaded detection.

Reinstallation

Given these uncertainties, the only responsible course of action is to wipe clean every machine in our clusters and reinstall them from scratch. This includes an array of specialized servers necessary for the operation of the clusters, dozens of login nodes, and thousands (!) of compute nodes. We are also setting up new networks and firewalls, and taking additional measures to increase the security of our clusters.

New passwords and keys

Since the attacker is in possession of a user's credentials, all users are required to change their LDAP password before accessing Euler or Leonhard again. We also recommend that they change their VPN password as an extra precaution.

All SSH keys found in users' home directories have been blocked. (This action was approved by the Chief IT Security Officer of ETH.) If you are using SSH keys to log in, you must generate new keys protected with a strong passphrase. The procedure is explained in the following paragraphs.

This change is mandatory: only users who have changed their LDAP password after 15 May 2020 will be allowed to log in. For those using SSH keys, only new keys will be allowed. Old keys have been black-listed; any login attempt with an old key will be considered suspect and will be investigated.

How to change your ETH passwords

Please have a look at the documentation and the video of IT Services on how to change your password in the Identity and Access Management (IAM) system of ETH.

Note that you need to select the password for web applications, AAI (LDAP). Do not change your mail password (Active Directory)!

How to generate SSH keys (Linux & MacOS)

For good documentation on SSH, please have a look at the SSH website. It contains a general overview of SSH, instructions on how to create SSH keys, and instructions on how to copy an SSH key.

On your computer, please use

ssh-keygen -t ed25519

to generate a key pair with the ed25519 algorithm. By default the private key is stored as $HOME/.ssh/id_ed25519 and the public key as $HOME/.ssh/id_ed25519.pub.

For security reasons, we recommend that you use a different key pair for every computer you want to connect to. For instance, if you are using both Euler and Leonhard:

ssh-keygen -t ed25519 -f $HOME/.ssh/id_ed25519_euler            # please enter a strong, non-empty passphrase when prompted
ssh-keygen -t ed25519 -f $HOME/.ssh/id_ed25519_leonhard         # please enter a strong, non-empty passphrase when prompted

Once this is done, copy the public key to Euler or Leonhard using one of the commands:

ssh-copy-id -i $HOME/.ssh/id_ed25519_euler.pub    username@euler.ethz.ch
ssh-copy-id -i $HOME/.ssh/id_ed25519_leonhard.pub username@login.leonhard.ethz.ch

Here, username is your ETH username. You will need to enter your ETH (LDAP) password to connect to Euler / Leonhard.

PLEASE NOTE:

  • The key can be copied only when the corresponding cluster is open
  • You can check the status of Euler and Leonhard using the color codes in the top right corner of this page (green or orange: copy possible; red: copy not possible)
  • You should first verify that you can log in with your password before trying to copy the key to the cluster

How to generate SSH keys (Windows)

On Windows, third-party software (PuTTYgen, MobaXterm) is generally required to create SSH keys. Windows 10, however, contains an SSH implementation that allows you to use the same commands in the cmd shell as documented for Linux and Mac OS X, without installing third-party software. For good documentation on SSH, please have a look at the SSH website.

Please use either PuTTYgen or, in MobaXterm, the command

ssh-keygen -t ed25519

to generate a key pair with the ed25519 algorithm and store both the public and the private key on your local computer. For security reasons, we recommend that you use a different key pair for every computer you want to connect to. For instance, if you are using both Euler and Leonhard, save the public keys as id_ed25519_euler.pub and id_ed25519_leonhard.pub.

Afterwards, please log in to the cluster and create the hidden directory $HOME/.ssh, which needs to have the Unix permission 700:

mkdir -p -m 700 $HOME/.ssh

To set up passwordless access to a cluster, copy the public key from your workstation to the .ssh directory on the cluster, using for instance WinSCP or MobaXterm. This example uses the Euler cluster; to set up access to another cluster, use the corresponding hostname instead of euler.ethz.ch. The contents of the public key file must then be appended to the authorized_keys file, for example for Euler:

cd $HOME/.ssh
cat id_ed25519_euler.pub >> authorized_keys
chmod 600 authorized_keys

The chmod command ensures that the authorized_keys file cannot be read or modified by another user.
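The permissions set above can be verified with a short shell function. This is a minimal sketch, not part of the official procedure: check_mode is a hypothetical helper name, and it only assumes a POSIX shell with the standard find utility.

```shell
# check_mode PATH MODE
# Hypothetical helper: reports whether PATH has exactly the octal mode MODE.
check_mode() {
    # find prints the path only if its permission bits match MODE exactly;
    # -prune keeps find from descending when PATH is a directory.
    if [ -n "$(find "$1" -prune -perm "$2" 2>/dev/null)" ]; then
        echo "ok: $1 has mode $2"
    else
        echo "fix: run chmod $2 $1"
        return 1
    fi
}
```

On the cluster you would call it as check_mode $HOME/.ssh 700 and check_mode $HOME/.ssh/authorized_keys 600; both calls should print a line starting with "ok".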

How to use keys with non-default names

If you use different key pairs for different computers (as recommended above), you need to specify the right key when you connect, for instance:

ssh -i $HOME/.ssh/id_ed25519_euler username@euler.ethz.ch

To make your life easier, you can configure your ssh client to use this option automatically by adding the following lines to your $HOME/.ssh/config file:

Host login.leonhard.ethz.ch
IdentityFile ~/.ssh/id_ed25519_leonhard

Host euler.ethz.ch
IdentityFile ~/.ssh/id_ed25519_euler

Stay safe!

  • Always use a (strong) passphrase to protect your SSH key. Do not leave it empty!
  • Never share your private key with somebody else, or copy it to another computer. It must only be stored on your personal computer
  • Use a different key pair for each computer you want to connect to
  • Do not reuse the key pairs for Euler / Leonhard for other systems
  • Do not keep open SSH connections in detached screen sessions
  • Disable the ForwardAgent option in your SSH configuration and do not use ssh -A (or use ssh -a to disable agent forwarding)
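
To check the last point, you can search your SSH client configuration for an active ForwardAgent setting. The function below is a rough sketch under stated assumptions: agent_forwarding_enabled is a hypothetical helper name, it inspects only the single file you pass to it (it does not follow Include directives or read the system-wide configuration), and it only recognises the plain value "yes".

```shell
# agent_forwarding_enabled FILE
# Hypothetical helper: succeeds (exit status 0) if FILE contains an
# uncommented "ForwardAgent yes" line, and fails otherwise.
agent_forwarding_enabled() {
    # ssh_config keywords are case-insensitive; keyword and value may be
    # separated by whitespace or by "=". Commented lines start with "#"
    # and therefore do not match.
    grep -Eiq '^[[:space:]]*ForwardAgent[[:space:]=]+yes[[:space:]]*$' "$1" 2>/dev/null
}
```

For example, agent_forwarding_enabled $HOME/.ssh/config && echo "agent forwarding is enabled" prints the message only if the option is switched on in that file.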

Troubleshooting

No connection

Euler and Leonhard are only accessible from inside the ETH network. If you are connecting from outside, you must open a VPN connection to the ETH network.

Warning about host keys

Since the host keys of the login nodes have changed, you may see a warning when you try to log in for the first time or when you attempt to copy your keys to the cluster. You can check the signatures of the new host keys for Euler and Leonhard here. If the key signature shown by SSH matches the new key, you can proceed and accept the new host key.

Depending on how your SSH client is configured, you may need to remove the offending (old) host key using the commands:

ssh-keygen -R euler.ethz.ch
ssh-keygen -R login.leonhard.ethz.ch

Connection closed

If the SSH connection is closed immediately after you enter your password, please check the following:

  • Did you change your password as required above? If not, change it, wait one working day, and try again
  • Do you still have an old key pair in your $HOME/.ssh directory that you used to access the cluster(s) before 15 May? If yes, delete it and try to login again
  • Do you have 6 or more keys in your $HOME/.ssh directory? If yes, you need to specify the correct key with the option ssh -i $HOME/.ssh/key_name -o IdentitiesOnly=yes ...
  • Do you already have an account on the cluster? If not, you will not be able to login because automatic account creation has been disabled for security reasons
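
The number of key pairs mentioned in the checklist above can be counted with a few lines of shell. This is a minimal sketch: count_private_keys is a hypothetical helper name, and it assumes that each private key sits next to a public key with the same name plus the .pub suffix, which is how ssh-keygen stores key pairs by default.

```shell
# count_private_keys DIR
# Hypothetical helper: prints the number of key pairs found in DIR.
count_private_keys() {
    n=0
    for pub in "$1"/*.pub; do
        # Skip the unexpanded pattern when DIR contains no .pub file.
        [ -f "$pub" ] || continue
        # Count the pair only if the matching private key exists as well.
        if [ -f "${pub%.pub}" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Run count_private_keys $HOME/.ssh on your own computer; if the result is 6 or more, select the key explicitly with -i and -o IdentitiesOnly=yes as described above.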

Still not working?

If you still cannot log in, add the option -vv (very verbose) to your SSH command, try to log in again, copy the whole command and output (text only, no screenshots), and send it to Cluster Support.

Timeline for the reopening of our clusters

Euler

  • The login nodes have been open again since 2 June, 2:00 pm
  • All batch queues are active
  • Some services are still unavailable; see "Known issues" below for details

Leonhard

  • A data access node was provided from 22 May to 22 June
  • The login nodes have been open again since 22 June, 2:00 pm
  • All batch queues are active
  • Some services are still unavailable; see "Known issues" below for details

New security measures

Firewall

  • VPN is required for all incoming connections from outside ETH
  • Connections from the clusters to servers inside ETH are allowed
  • Euler: connections to servers outside ETH are blocked except when using the secure protocols SSH and HTTPS; connections using other protocols can be made via ETH's proxy server if the corresponding ports are open; for git, see below
  • Leonhard Open: connections to servers outside ETH are blocked except when using the secure protocols SSH and HTTPS; connections using other protocols can be made via ETH's proxy server if the corresponding ports are open; for git, see below
  • Euler: connections from the login nodes to servers outside ETH are allowed without restriction
  • Connections from the compute nodes to servers outside ETH can be made via ETH's proxy server
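
Many command-line tools read the proxy address from the http_proxy and https_proxy environment variables. The following is a minimal sketch, assuming that the tool you run honours these variables and that the proxy address is the one given for git further down this page. The names use_eth_proxy and unuse_eth_proxy are hypothetical helpers, not commands provided on the clusters.

```shell
# Assumption: same proxy address as in the git section of this page.
ETH_PROXY="http://proxy.ethz.ch:3128"

# Hypothetical helper: route HTTP(S) traffic of the current shell
# (and the jobs started from it) through the proxy.
use_eth_proxy() {
    # Tools differ in which spelling they read, so both are set.
    export http_proxy="$ETH_PROXY" https_proxy="$ETH_PROXY"
    export HTTP_PROXY="$ETH_PROXY" HTTPS_PROXY="$ETH_PROXY"
}

# Hypothetical helper: remove the proxy settings again.
unuse_eth_proxy() {
    unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY
}
```

Calling use_eth_proxy at the start of a job script and unuse_eth_proxy at the end keeps the setting limited to that job.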

Password

  • Users have been asked to change their passwords. It can take up to 1 working day until such a password change becomes visible on the cluster. Therefore, if you change your password today, wait until tomorrow morning (around 8:00) before you try to log in.

Primary group

  • For cluster users, the primary group has been changed from T0000 to each user's personal group, ${USER}-group:
[sfux@eu-login-13 ~]$ id -g -n
sfux-group
[sfux@eu-login-13 ~]$

When new files are created they will no longer be owned by ${USER}:T0000 but instead by ${USER}:${USER}-group.

[sfux@eu-login-13 ~]$ touch new_file
[sfux@eu-login-13 ~]$ ls -ltr new_file
-rw-r--r-- 1 sfux sfux-group 0 Jun 16 14:50 new_file
[sfux@eu-login-13 ~]$

Known issues / work in progress

Batch system

  • Only the 4h and 24h queues are active at this time. Longer queues will be reactivated Monday 8 June
  • Batch interactive jobs cannot be run at the moment. This will be fixed as soon as the queues are reactivated
  • The option to send a notification by email when a job starts or ends (bsub -B | -N) does not work on Euler and Leonhard
  • Huge-memory nodes are not available at this time. Jobs that require > 500 GB per node will be held in the queue until these nodes are available again
  • Euler II nodes have been decommissioned. Jobs that explicitly request "XeonE5_2680v3" CPUs will never run

User management

  • The automatic creation of new user accounts has been disabled until further notice. Shareholders who urgently need an account for a new team member should contact Cluster Support
  • It is not yet possible to authorise users for commercial software (Gaussian, ORCA, VASP, etc.). The corresponding support requests will be put on hold until this is possible again

Applications

  • Commercial software cannot contact external license servers. We are working on bringing this feature back
  • Singularity is no longer available on Euler, as we need to reevaluate its security
  • The CLC Genomics Server is off-line and needs to be reinstalled. At this point there is no estimate of when this service will be available again

Email

  • It is not possible to send email from the Euler and Leonhard Open clusters

Git protocol

Update 24 June 2020: git connections from the Euler login nodes are unrestricted and do not require the proxy anymore

You can configure git to use the proxy server of ETH:

git config --global http.proxy http://proxy.ethz.ch:3128
git config --global https.proxy http://proxy.ethz.ch:3128

To unset the git proxy configuration, please use the following commands:

git config --global --unset http.proxy
git config --global --unset https.proxy

The git protocol is blocked in the ETH proxy server. If you would like to access external repositories with a git+https:// URL, you need to configure git to always use https:

git config --global url.https://github.com/.insteadOf git://github.com/

After running this command once, git uses https for repositories on github.com even when a repository URL specifies the git protocol. Other hosts need their own insteadOf rule.
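
To see which URL git will actually contact, you can ask git to expand a URL without connecting to the server. This is a small sketch: show_effective_url is a hypothetical helper name, and the repository URL in the usage note is only an example.

```shell
# show_effective_url URL
# Hypothetical helper: prints the URL git would use for URL after applying
# any url.<base>.insteadOf rules. It does not talk to the remote server.
show_effective_url() {
    git ls-remote --get-url "$1"
}
```

With the insteadOf rule above in place, show_effective_url git://github.com/git/git.git should print the same address starting with https://, while a git:// URL on any other host is printed unchanged.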