Starting JupyterHub within a custom venv on Euler

Introduction

JupyterLab and Jupyter notebooks are widely used in the scientific community at ETH because they provide an easy way to run Python code in a browser window. We therefore developed a service that lets users start a JupyterLab session in their browser without having to log in to Euler via an SSH client. It provides easy access to the computational resources of the Euler cluster, and you can use it to work interactively and to develop and test your code. You can access it here.

By default, JupyterHub loads python/3.11.6 from stack/2024-05 (along with some additional modules). You can check this by opening a terminal tab in JupyterHub and typing "module list".

However, when working with Python, you often need to create and customise your own Python virtual environments (venv), which may or may not include centrally installed packages. This page describes the steps to use a custom venv and/or a different Python version with our JupyterHub service.


Steps to follow

1. Load the desired modules and create your venv

Connect to the cluster in your preferred way, load the desired Python module and create your custom venv. You can also read the instructions on how to create your venv here:

module load stack/2024-06 gcc/12.2.0 python/3.11.6
python -m venv env
source env/bin/activate
pip install <...your python packages...>

You will also need to make sure that the following packages are installed in your environment:

pip install jupyter ipykernel

2. Create and register a Jupyter kernel

With the venv still activated, run this command:

python -m ipykernel install --user --name env --display-name "Python (env)"

Replace the strings following the --name and --display-name flags with something that makes sense in your context.
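
To confirm that the kernel was registered, you can look for its kernel.json file. The sketch below assumes the default per-user Jupyter data directory on Linux ($HOME/.local/share/jupyter), which is where --user normally installs kernels, and the kernel name env from the command above. If JUPYTER_DATA_DIR or XDG_DATA_HOME is set in your session, the location may differ.

```shell
# Name passed to --name in the ipykernel command above
kernel_name="env"
# Assumed default per-user kernel location on Linux
kernel_dir="$HOME/.local/share/jupyter/kernels/$kernel_name"

if [ -f "$kernel_dir/kernel.json" ]; then
    # kernel.json records the Python interpreter the kernel will start
    cat "$kernel_dir/kernel.json"
else
    echo "kernel '$kernel_name' is not registered in $kernel_dir"
fi
```

The first entry of "argv" in kernel.json should point to the python executable inside your venv. With the venv activated, "jupyter kernelspec list" shows the registered kernels and their directories.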

3. Create/open the jupyterlabrc file

This file must be located at:

/cluster/home/<your_username>/.config/euler/jupyterhub/jupyterlabrc 

Do not add a suffix such as .txt or .dat to the file name. The path must match exactly: JupyterHub will not look for the file in any other location on the cluster. If the file already exists, you can edit it directly.
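
If the directory or the file does not exist yet, it can be created from a terminal. This is a minimal sketch that assumes $HOME expands to /cluster/home/<your_username> on Euler:

```shell
# Directory from the path given above, relative to the home directory
rc_dir="$HOME/.config/euler/jupyterhub"

# -p creates missing parent directories and does nothing if they already exist
mkdir -p "$rc_dir"

# touch creates an empty file if none exists; the contents of an existing file are kept
touch "$rc_dir/jupyterlabrc"

# Show the result; the file name must have no suffix
ls -l "$rc_dir/jupyterlabrc"
```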

4. Add the appropriate commands to the jupyterlabrc file

The commands should look like what you would type in a terminal when you want to use your venv. In this example, they would look something like:

module load stack/2024-06 gcc/12.2.0 python/3.11.6 eth_proxy hdf5/1.14.3
source env/bin/activate

We advise adding the eth_proxy module to give JupyterHub internet access (this is not enabled by default). hdf5 is a library commonly used from Python, and pre-loading it can avoid confusing errors later on.
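
A relative path such as env/bin/activate is resolved against the current working directory of the shell that sources the file, which may not be the directory where you created the venv. The variant below is a sketch that uses an absolute path instead. It assumes the venv from step 1 was created in your home directory; the location $HOME/env is hypothetical and should be adjusted to wherever your venv actually lives. The check for the module command is only there so that the snippet can be tried outside the cluster.

```shell
# Hypothetical location: assumes "python -m venv env" was run in the home directory
venv_dir="$HOME/env"

# Load the same modules as in the example above
if command -v module >/dev/null 2>&1; then
    module load stack/2024-06 gcc/12.2.0 python/3.11.6 eth_proxy hdf5/1.14.3
else
    echo "jupyterlabrc: module command not available" >&2
fi

# Activate the venv only if it exists, and report a clear message otherwise
if [ -f "$venv_dir/bin/activate" ]; then
    . "$venv_dir/bin/activate"
else
    echo "jupyterlabrc: no venv found at $venv_dir" >&2
fi
```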

5. Start JupyterHub in your browser

When it starts, JupyterHub sources the file that you have just created or modified. If there is a typo or any other error in the jupyterlabrc file, you can find the error logs in the /cluster/home/<your_username>/jupyterhub-logs folder and adjust jupyterlabrc accordingly.
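
To find the most recent logs from a terminal on the cluster, you can list the folder sorted by modification time. This sketch assumes $HOME expands to /cluster/home/<your_username>; it makes no assumption about the names of the log files.

```shell
# Log folder named in the paragraph above
log_dir="$HOME/jupyterhub-logs"

if [ -d "$log_dir" ]; then
    # -t sorts by modification time, newest first; show the five latest entries
    ls -lt "$log_dir" | head -n 5
else
    echo "no log folder at $log_dir yet"
fi
```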

Then, when you start a new Jupyter notebook, select the kernel at the top right of the notebook tab. The kernel name should be the one you set with --display-name in step 2.

6. Sanity check

You can double-check that the proper modules are loaded in the terminal tab in JupyterHub. The pip list command should reflect exactly what you have installed in your venv. You should be able to import the packages that you have installed from within the Jupyter notebook.
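
The checks above can be run in the JupyterHub terminal tab, for example as follows. This is a sketch, not an official procedure; it relies only on the VIRTUAL_ENV variable that a venv's activate script sets.

```shell
# Modules currently loaded (the module command exists only on the cluster)
if command -v module >/dev/null 2>&1; then
    module list
fi

# VIRTUAL_ENV is set by the activate script; "none" means no venv is active
echo "active venv: ${VIRTUAL_ENV:-none}"

# Interpreter found first on PATH; with the venv active it lives under the venv's bin directory
command -v python || echo "python not found on PATH"
```

If the active venv is reported as none, the activate line in jupyterlabrc was probably not executed, and the logs described in step 5 are the place to look.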