JupyterHub
Contents
- 1 Introduction
- 2 Prerequisites
- 3 Starting a session
- 4 Stopping your job
- 5 Debugging
- 6 Installing an Extension
- 7 Disabling an Extension
- 8 Other services
- 9 Known Issues
- 10 FAQ
- 10.1 I cannot login to the Jupyter service
- 10.2 My server is too slow to start
- 10.3 My server has been killed before starting
- 10.4 I cannot request a JupyterLab for more than 24h
- 10.5 My service crashes when starting (e.g. tensorboard)
- 10.6 Build Recommended but fails
- 10.7 My jupyterlab with 1 GPU is not starting
- 10.8 I lost all my settings when migrating from the script to the hub
- 10.9 I want to load a cluster module / I want to activate a virtualenv / Jupyterlab is missing some features
- 10.10 I wish to use custom arguments to jupyterhub-singleuser
Introduction
JupyterLab and Jupyter notebooks are widely used in the scientific community at ETH as they provide an easy way to run Python code (or to use other programming languages) in a browser window. We therefore developed a service that allows users to start a JupyterLab session in their browser without having to log in to Euler via an SSH client. It provides easy access to the computational resources of the Euler cluster, and you can use it to work interactively and to develop and test your code.
Prerequisites
The only prerequisite to use this service is that you have a local computer with a browser installed. Like the Euler cluster itself, the service can only be used from within the ETH network. If you are working from home, you first need to establish a VPN connection to the ETH network.
Please note that if you have never logged in to the Euler cluster before using this service, you first need to log in once with an SSH client to verify your ETH account and to accept the cluster's usage rules. Please check our wiki page about accessing the cluster. On this page you can find all the information required to log in to the Euler cluster with your SSH client. When you log in for the first time, an access code will be sent to your ETH email address. You need to enter this code and then accept the cluster's usage rules. After this initial procedure you can use the Jupyter service.
Starting a session
You can start a session by opening your favorite browser and entering the URL (FIXME: put URL here once the service is productive). You will then be asked to log in with your ETH credentials. After entering your ETH credentials and clicking on the Sign in button, you can choose the amount of resources that you request for your session. Please only request multiple cores if you are planning to run code that can make use of multiple cores. By clicking on the Start button, a batch job with your session will be started. It might take some time until the batch job has started, but then JupyterLab will start in your browser window.
Please note that the service is currently using our Python 3.10.4 (GCC 8.2.0) installation. It has several hundred packages preinstalled that you can use right away in your session when starting a Python kernel:
https://scicomp.ethz.ch/wiki/Python_on_Euler#python_gpu.2F3.10.4
You can find a comprehensive tutorial about JupyterLab on
https://jupyterlab.readthedocs.io/en/stable/getting_started/overview.html
Stopping your job
Please don't forget to kill your job when you are done with your JupyterLab session (or stop your server in JupyterHub).
If you just stop the current kernel or close the browser window, then the batch job on Euler will continue to run and waste resources that could be used by other cluster users. To properly stop a jupyter session, you need to click the File menu and choose the entry Hub Control Panel (see picture) and then click the Stop my server button. Afterwards you can close the browser window and your session is terminated.
If you don't have access to this menu (e.g. in tensorboard or other services), you can also access the hub by changing the URL. You just need to replace everything from /user onwards (including /user itself) with /hub/home.
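The URL rewrite described above can be sketched as a small shell function. The host name jupyter.example.org and the user name jdoe are placeholders for illustration, not the real address of the service.

```shell
# Turn a session URL into the hub home URL by replacing
# everything from /user onwards with /hub/home.
hub_home_url() {
    # ${1%%/user/*} strips the longest suffix that starts with "/user/"
    printf '%s/hub/home\n' "${1%%/user/*}"
}

# Placeholder URL, for illustration only
hub_home_url "https://jupyter.example.org/user/jdoe/lab"
```

Opening the printed address in the browser should lead to the page with the Stop my server button.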
Debugging
Before opening a ticket, please check the logs of your jupyterlab. They are available in your home directory under the following name ~/jupyterhub_slurmspawner*. If you are not able to debug it yourself, please add this file to the ticket.
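To locate the most recent log quickly, something like the following sketch can be run in a terminal on Euler. The function name latest_spawner_log and the number of lines shown are arbitrary choices; only the file name pattern comes from the description above.

```shell
# Print the most recently modified spawner log, or nothing if none exists.
latest_spawner_log() {
    # $1: directory to search (defaults to the home directory)
    ls -t "${1:-$HOME}"/jupyterhub_slurmspawner* 2>/dev/null | head -n 1
}

log=$(latest_spawner_log)
if [ -n "$log" ]; then
    # Show the last 50 lines of the newest log
    tail -n 50 "$log"
else
    echo "No jupyterhub_slurmspawner log found"
fi
```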
Installing an Extension
It is possible to extend the basic functionality of JupyterLab with extensions. We provide some preinstalled extensions for the users, but there are probably still some useful extensions missing. You cannot directly use the extension manager from JupyterLab, as this would require write permission in the central installation directory of JupyterLab, which users don't have. There is no easy way to configure JupyterLab to store the extensions in a user-writable directory. Some extensions, however, can be installed with pip.
For example, if you wish to install jupyterlab-slurm, you will need to run the following commands:
module load REQUIRED_MODULES
pip install --user jupyterlab_slurm
jupyter labextension enable jupyterlab_slurm
where REQUIRED_MODULES are the modules required by JupyterLab. To find the current configuration, please look at the top of your log files (~/jupyterhub_slurmspawner*).
If an extension for JupyterLab is useful for many users, you can also ask cluster support whether the extension can be installed centrally.
Disabling an Extension
If you are unhappy with an extension, you can disable it with:
jupyter labextension disable my-extension
Other services
By using a proxy on the server, we can provide other services within JupyterHub. Unfortunately, depending on the service, it might run only as the main server and not as a named server, which means that you cannot run more than one non-Jupyter service at a time. Therefore, if you plan to use another service in parallel to JupyterHub, please use a named server for JupyterHub.
Feel free to copy the settings of tensorboard to create your own web services.
Tensorboard
Tensorboard can be selected when starting the server in the option Software from SIS. It will load the data in $HOME/tensorboard_logs. You can either move or copy your data to this directory, create a link with this name that points to the correct directory, or write a bash script in $HOME/.config_tensorboard where you set the variable LOGDIR to the directory you wish to load.
WARNING: Tensorboard is not able to deal with HTTPS. Therefore anyone sharing the same compute node as you could extract all the data contained in tensorboard.
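As an illustration of the LOGDIR option, a $HOME/.config_tensorboard file might look like the sketch below. The log path is a hypothetical example, and the plain assignment assumes that the file is sourced by the start-up script, which the description suggests but does not state.

```shell
# $HOME/.config_tensorboard
# Sketch only: the directory below is a hypothetical example path.
LOGDIR="$HOME/projects/my_experiment/logs"
```

The link alternative would instead be a single command of the form ln -s /path/to/your/logs $HOME/tensorboard_logs, run once in a terminal on Euler.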
Known Issues
- Currently, plugins cannot be installed directly from the UI. Please use the command line to install them.
FAQ
I cannot login to the Jupyter service
If it is the first time that you are using Euler, you will need to connect first with SSH. Please read this page for more information on how to do it.
My server is too slow to start
We rely on the Slurm batch system to provide the JupyterLab instances. A slow start can therefore be caused either by a low amount of available resources on Euler or by your priority being too low (because you have already used too many resources or have too many jobs running at the same time).
My server has been killed before starting
JupyterHub relies on a timeout system to manage the starting jobs (currently around 10 minutes). If your job takes longer than that to start, it will be killed automatically. If you are unable to get a server after multiple tries, please check your queue by logging in with SSH and running squeue on Euler.
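A minimal sketch of such a queue check is shown below. The function name show_my_jobs is an arbitrary choice, and the guard only exists so that the snippet does not fail on a machine without Slurm.

```shell
# Show your own jobs in the Slurm queue (run on Euler after logging in with SSH).
show_my_jobs() {
    if command -v squeue >/dev/null 2>&1; then
        # -u limits the listing to the given user
        squeue -u "${USER:-$(id -un)}"
    else
        echo "squeue not found: run this on an Euler login node"
    fi
}

show_my_jobs
```

In the default squeue output, a job that is still waiting is listed with the state PD, and the NODELIST(REASON) column gives the reason why it has not started yet.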
I cannot request a JupyterLab for more than 24h
This service is aimed at cluster beginners, and we therefore chose to only allow short sessions of up to 24 hours. For jobs that run longer than 24 hours, we recommend submitting them directly to the batch queue rather than using JupyterLab.
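As a rough sketch of what submitting directly to the queue can look like, the snippet below writes a small Slurm job script and submits it. All resource values, the file name long_job.sh and the script name my_script.py are assumptions for illustration. REQUIRED_MODULES is a placeholder for the modules your code needs.

```shell
# Write a minimal Slurm job script (all values are example assumptions).
cat > long_job.sh <<'EOF'
#!/bin/bash
# Wall-clock limit beyond the 24 hours offered by the hub
#SBATCH --time=48:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=2G
#SBATCH --output=long_job_%j.out

# Placeholder: replace with the modules your code needs
module load REQUIRED_MODULES
# Hypothetical script name
python my_script.py
EOF

# Submit the script to the queue when running on Euler
if command -v sbatch >/dev/null 2>&1; then
    sbatch long_job.sh
fi
```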
My service crashes when starting (e.g. tensorboard)
Unfortunately, only jupyterhub can run as a named server. All the other services need to run as the main server.
Build Recommended but fails
JupyterLab tries to build all its files within the system directories, which users are not allowed to write to. You can safely ignore this message. We try to keep JupyterLab up to date, but we will not rebuild it for every minor release of a plugin.
My jupyterlab with 1 GPU is not starting
GPUs are only available to shareholders that purchased GPU resources in Euler. Please ensure that you indeed have access to GPUs on Euler before submitting a ticket to cluster support.
I lost all my settings when migrating from the script to the hub
With JupyterHub, we are using the directory ~/.jupyterlab and not ~/.jupyter to store all the configuration. Replacing the content of the new directory with that of the old one should be sufficient.
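One possible way to do this from a terminal on Euler is sketched below. The function name migrate_jupyter_settings and the backup naming are arbitrary choices. The existing ~/.jupyterlab is kept as a backup so that the change can be undone.

```shell
# Copy the old configuration into the directory used by the hub,
# keeping a backup of whatever is currently there.
migrate_jupyter_settings() {
    old_dir="${1:-$HOME/.jupyter}"
    new_dir="${2:-$HOME/.jupyterlab}"
    if [ ! -d "$old_dir" ]; then
        echo "Nothing to migrate: $old_dir does not exist" >&2
        return 1
    fi
    # Move the current hub settings aside instead of deleting them
    if [ -e "$new_dir" ]; then
        mv "$new_dir" "$new_dir.bak.$(date +%Y%m%d%H%M%S)"
    fi
    # Recreate the new directory as a copy of the old one
    cp -a "$old_dir" "$new_dir"
}
```

Calling migrate_jupyter_settings without arguments uses the two default directories. It is probably best to stop your server before running it.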
I want to load a cluster module / I want to activate a virtualenv / Jupyterlab is missing some features
You can add your own instructions by writing a bash script in ~/.jupyterlabrc.
This script will be sourced (. ~/.jupyterlabrc) before JupyterLab starts, so you can load modules, update environment variables, or replace JupyterLab with another service (advanced usage: see how tensorboard is set up).
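A ~/.jupyterlabrc along these lines might look like the following sketch. REQUIRED_MODULES stands for the modules you want to load, and the virtualenv path and the variable MY_PROJECT_DATA are hypothetical examples.

```shell
# ~/.jupyterlabrc (sourced before JupyterLab starts)
# Sketch only: module names, paths and variables are placeholders.

# Load cluster modules if the module command is available
if command -v module >/dev/null 2>&1; then
    module load REQUIRED_MODULES
fi

# Activate a virtualenv if it exists (hypothetical location)
VENV_DIR="$HOME/venvs/my_env"
if [ -f "$VENV_DIR/bin/activate" ]; then
    . "$VENV_DIR/bin/activate"
fi

# Set an environment variable for your notebooks (hypothetical name)
export MY_PROJECT_DATA="$HOME/data"
```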
I wish to use custom arguments to jupyterhub-singleuser
A few environment variables can be defined in your ~/.jupyterlabrc file:
- JUPYTER_DIR: Available directory for the users
- JUPYTER_HOME: Default directory
- JUPYTER_EXTRA_ARGS: any additional argument (e.g. '--debug')
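For example, the following lines could be placed in ~/.jupyterlabrc. The directory values are illustrative assumptions based on the short descriptions above, and whether export is strictly needed depends on how the start-up script reads these variables.

```shell
# Lines for ~/.jupyterlabrc; the values are illustrative assumptions.
# Directory available to the user
export JUPYTER_DIR="$HOME"
# Default directory (hypothetical path)
export JUPYTER_HOME="$HOME/notebooks"
# Additional argument for jupyterhub-singleuser
export JUPYTER_EXTRA_ARGS="--debug"
```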