Euler-tunnel

Introduction

Euler-tunnel is a tool for establishing SSH tunnels to running batch jobs. This is required, for instance, by the Remote SSH plugin for VSCode, which would otherwise connect only to the login nodes of Euler, where users cannot run computations.

Another use case is terminal multiplexers such as screen or tmux. Euler uses a load balancer that distributes users' login sessions among 50 login nodes, so detached tmux/screen sessions often become orphaned: a user who connects to euler.ethz.ch again ends up on a different login node and can no longer find the session.

Euler-tunnel allows users to reconnect to an existing session running in a batch job.

Initial setup

Using euler-tunnel requires passwordless access to Euler with SSH keys. Please follow the documentation and set up SSH keys for passwordless login.

Then you need to run euler-tunnel config on Euler:

sfux@eu-login-11:~$ euler-tunnel config
[INFO] First time running. Generating ssh server host key.
[INFO] Add this line to your ~/.ssh/known_hosts file on YOUR computer.

euler-tunnel ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBVOd5pk+UK9dzO/9/xQRreDyvRaYSVr7xAPc4oNtoBZ euler-tunnel

[INFO] Add this Host block to your ~/.ssh/config file on YOUR computer.

Host euler-tunnel
   User sfux
   ServerAliveInterval 10
   ServerAliveCountMax 10
   # Note: Windows users will have to remove the Control* settings
   #       as that feature uses sockets which don't exist on windows.
   ControlMaster auto
   ControlPath ~/.ssh/cs-%r@%h:%p
   ControlPersist 15
   ProxyCommand ssh sfux@euler.ethz.ch euler-tunnel connect

As shown above, the command prints instructions for setting up the SSH configuration on your local computer. Please follow those steps and edit your known_hosts and config files for SSH. Note that you might need additional settings for your setup, similar to those you have for euler.ethz.ch.

After changing the known_hosts and the config file, you are ready to use euler-tunnel.
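
The known_hosts step can also be scripted on your local computer. The following is a minimal sketch, not part of euler-tunnel: the helper name append_once and the variables HOSTKEY_LINE and KNOWN_HOSTS are hypothetical. It appends the host key line only if it is not already present, so it can be run repeatedly. Set HOSTKEY_LINE to the exact line printed by your own run of euler-tunnel config before running it.

```shell
#!/usr/bin/env bash

# append_once FILE LINE
# Append LINE to FILE unless an identical line is already there.
# (Hypothetical helper for illustration; not provided by euler-tunnel.)
append_once() {
    local file="$1" line="$2"
    mkdir -p "$(dirname -- "$file")"
    touch "$file"
    # -x: match whole lines only, -F: treat the line as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# HOSTKEY_LINE must hold the line printed by `euler-tunnel config` on Euler.
# Nothing is written if it is unset.
if [ -n "${HOSTKEY_LINE:-}" ]; then
    append_once "${KNOWN_HOSTS:-$HOME/.ssh/known_hosts}" "$HOSTKEY_LINE"
fi
```

The Host block for ~/.ssh/config spans several lines and is best pasted by hand, since you may need to adapt it to your setup.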

Commands available for euler-tunnel

$ euler-tunnel -h
Usage: euler-tunnel [OPTIONS] COMMAND [COMMAND_ARGS]

Manages ssh tunneling to a batch job.

Commands:
   start: [SLURM_JOB_ARGS]
      Submit a batch job which runs a ssh server.
      The given SLURM_JOB_ARGS are passed to the batch job.

   status:
      Show the status of the tunnel batch job.

   stop:
      Cancel the running tunnel batch job.

   server:
      Start the ssh server.
      Use this from your own sbatch script to handle more advanced use cases
      which can not be handled with the 'start' command.
      The environment in which the ssh server is started is preserved and
      reused for any ssh client sessions that it starts.

   config:
      Show ssh client configuration for users computer.

   connect:
      Connect to the ssh server running inside the tunnel batch job.
      Use this with a ProxyCommand in your ssh config (on your computer).
      e.g. ProxyCommand ssh euler.ethz.ch euler-tunnel connect
      Use the 'config' command to get a full configuration example.

   wait: TIMEOUT
      Wait for the ssh server inside the tunnel job to be usable.

   ensure: [SLURM_JOB_ARGS]
      Ensure that the tunnel job is running and usable.
      Uses an existing tunnel if available, starts a new tunnel with the
      optionally given SLURM_JOB_ARGS otherwise.
      Wait's for up to 60 seconds for the tunnel to be usable.
      If you have to wait longer, use the 'start' followed by the 'wait'
      commmand which allows you to set a custom timeout.

   reset:
      Delete all files related to euler-tunnel.
      This is safe to do, no data will be lost.
      Note that you will be prompted to re-configure your ~/.ssh/known_hosts
      file when using euler-tunnel the next time.

Options:
    -h    show this help message
    -x    run with 'set -x' set

Examples:
   # Submit tunnel batch job with default settings.
   euler-tunnel start

   # Submit tunnel batch job with custom settings.
   euler-tunnel start --time=4:00:00 --cpus-per-task=2 --mem-per-cpu=2G

   # Show the status of the tunnel batch job.
   euler-tunnel status

   # Shows you how to configure your ssh client.
   euler-tunnel config

   # Stop the tunnel batch job.
   euler-tunnel stop

   # Start the ssh server manually, e.g. from inside a sbatch script with
   # custom config and modules loaded.
   euler-tunnel server
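
The help text above suggests calling euler-tunnel server from your own sbatch script for more advanced use cases. The following is a minimal sketch of how such a script could be created. The file name tunnel-job.sh, the variable MY_PROJECT_DIR and the resource values are assumptions chosen for illustration, and the module line is a placeholder. How the other euler-tunnel commands interact with a job submitted this way has not been verified here.

```shell
#!/usr/bin/env bash

# Hypothetical file name for the job script.
JOB_SCRIPT="${JOB_SCRIPT:-tunnel-job.sh}"

# Write the job script. The quoted 'EOF' keeps $HOME unexpanded until the
# job actually runs.
cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#SBATCH --time=4:00:00
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=2G

# Prepare the environment first. According to the help text, the environment
# in which the ssh server is started is reused for its client sessions.
# module load <your modules>               # placeholder, fill in as needed
export MY_PROJECT_DIR="$HOME/my-project"   # hypothetical variable

# Start the ssh server inside the batch job.
euler-tunnel server
EOF

echo "Wrote $JOB_SCRIPT; submit it on a login node with: sbatch $JOB_SCRIPT"
```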

Using euler-tunnel

The command

euler-tunnel start

needs to be run on a login node of Euler. It accepts the same options as sbatch:

euler-tunnel start --time=4:00:00 --cpus-per-task=2 --mem-per-cpu=2G

but you do not pass a command with --wrap="". This sets up a batch job that you can then connect to.

As a second step, use the SSH client on your local computer to connect to euler-tunnel:

ssh euler-tunnel

This will create an SSH tunnel into the batch job that was started before.
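
Both steps can be combined into one call from your local computer. The following is a minimal sketch, assuming that euler-tunnel can be run on the login node through a non-interactive ssh call, as the ProxyCommand line in the SSH configuration does. The function name euler_tunnel_shell and the variables LOGIN_HOST and SSH are hypothetical. It uses the ensure command from the help text, which reuses an existing tunnel or starts a new one.

```shell
#!/usr/bin/env bash

# Login host, e.g. user@euler.ethz.ch if your user name is not set in
# ~/.ssh/config. (Hypothetical variable.)
LOGIN_HOST="${LOGIN_HOST:-euler.ethz.ch}"
# SSH client to use; set SSH=echo for a dry run that only prints the calls.
SSH="${SSH:-ssh}"

# euler_tunnel_shell [SLURM_JOB_ARGS]
# Make sure a tunnel job is running, then open a shell inside it.
euler_tunnel_shell() {
    # 'ensure' runs on the login node; extra arguments are passed on as
    # SLURM_JOB_ARGS and only matter if a new job has to be started.
    "$SSH" "$LOGIN_HOST" euler-tunnel ensure "$@" || return 1
    # Connect through the Host block named euler-tunnel.
    "$SSH" euler-tunnel
}
```

For example, euler_tunnel_shell --time=4:00:00 --cpus-per-task=2 would request those resources if no tunnel job exists yet.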

Running simple commands through SSH in a remote batch job

After starting a batch job with euler-tunnel

eu-login-42:~$ euler-tunnel start
Submitted batch job 3035916
eu-login-42:~$ euler-tunnel status
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           3035916 normal.4h euler-tu  asteven  R       0:04      1 eu-g5-047-2
eu-login-42:~$

you can run commands from your local computer through the tunnel:

eos:~% ssh euler-tunnel hostname
eu-g5-047-2
eos:~%

In this example, we run the command hostname in the batch job 3035916 that is running on the compute node eu-g5-047-2.
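
If the batch job may still be pending, the wait command from the help text can be used before running a command. The following is a minimal sketch for your local computer. The function name run_in_tunnel and the variables LOGIN_HOST and SSH are hypothetical, and TIMEOUT is assumed to be given in seconds, by analogy with the 60 seconds mentioned for ensure.

```shell
#!/usr/bin/env bash

# Login host, e.g. user@euler.ethz.ch. (Hypothetical variable.)
LOGIN_HOST="${LOGIN_HOST:-euler.ethz.ch}"
# SSH client to use; set SSH=echo for a dry run that only prints the calls.
SSH="${SSH:-ssh}"

# run_in_tunnel TIMEOUT COMMAND [ARGS...]
# Wait until the ssh server in the tunnel job is usable, then run COMMAND
# inside the batch job.
run_in_tunnel() {
    local timeout="$1"
    shift
    # 'wait' runs on the login node; the unit of TIMEOUT is assumed to be
    # seconds.
    "$SSH" "$LOGIN_HOST" euler-tunnel wait "$timeout" || return 1
    "$SSH" euler-tunnel "$@"
}
```

For example, run_in_tunnel 300 hostname would wait for the tunnel and then print the name of the compute node.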

VSCode

If you have set up the SSH config according to the description above, you can perform the following steps to connect VSCode to a batch job running on Euler:

  • Start a batch job with euler-tunnel
  • Start VSCode on your local computer
  • Start the Remote SSH plugin
  • Select "Connect to Host..." to see the list of hosts, among which you should see euler-tunnel
  • Choose the entry, and your local VSCode session will connect to the running batch job on Euler

If you lose the connection, you can start the Remote SSH plugin again and reconnect.