Difference between revisions of "Jupyter on Euler and Leonhard Open"
(→Installing additional Python and R packages locally)
Revision as of 06:54, 2 October 2019
Since Jupyter notebooks are becoming more widely used among the scientific community, the HPC group developed a script that you can run on your local computer. This shell script then starts a Jupyter notebook in a batch job on Euler/Leonhard Open (depending on which cluster you choose) and connects your local browser with it.
At the moment, the script can only be used on Linux and Mac computers. Windows is not supported. Windows users can try to run the script in the Windows Subsystem for Linux (WSL), but this has not been tested yet.
Please note that this script is aimed at beginners who are starting to use Jupyter notebooks on the cluster. It is not aimed at advanced users who need a wide range of additional features beyond simple Jupyter notebooks. Advanced users can take the script and adapt it so that it works with other Python versions (centrally installed or local installations), supports GPUs, adds new kernels, etc.
- 01 Oct 2019: The script has been updated so that the Jupyter notebooks now offer a bash kernel and an R kernel (3.6.0 on Euler, 3.5.1 on Leonhard Open) in addition to the Python 3.6 kernel. If you use an older version of the script and would like to use the newly added kernels, update your script from the GitLab repository with the command git pull origin master
samfux@bullvalene:~/Jupyter-on-Euler-or-Leonhard-Open$ git pull origin master
warning: redirecting to https://gitlab.ethz.ch/sfux/Jupyter-on-Euler-or-Leonhard-Open.git/
From https://gitlab.ethz.ch/sfux/Jupyter-on-Euler-or-Leonhard-Open
 * branch master -> FETCH_HEAD
Already up to date.
samfux@bullvalene:~/Jupyter-on-Euler-or-Leonhard-Open$
Getting the script
The script is available on the Gitlab instance of ETH Zurich:
In order to use this script, users need to make sure that they have set up SSH keys for passwordless access to the cluster:
Please note that the example on the wiki refers to the Euler cluster; for Leonhard Open, the hostname needs to be changed from
On Linux, please make sure that xdg-open is installed. It is part of the xdg-utils package and is used to automatically start your default browser. You can install it with one of the following commands, depending on your distribution (yum on Red Hat-based systems, apt-get on Debian-based systems):
yum install xdg-utils
apt-get install xdg-utils
Furthermore, the script requires a Python installation, which is usually included in Linux distributions and Mac OS.
Download the repository with the command
After downloading the script from gitlab.ethz.ch, you need to change its permissions to make it executable:
chmod 755 start_jupyter_nb.sh
Running the script
The start_jupyter_nb.sh script needs to be executed on your local computer:
./start_jupyter_nb.sh CLUSTER NETHZ_USERNAME NUM_CORES RUN_TIME MEM_PER_CORE
|CLUSTER||Name of the cluster (Euler or LeoOpen)|
|NETHZ_USERNAME||NETHZ username for which the notebook should be started|
|NUM_CORES||Number of cores to be used on the cluster (maximum: 36)|
|RUN_TIME||Run time limit for the jupyter notebook on the cluster (HH:MM)|
|MEM_PER_CORE||Memory limit in MB per core|
./start_jupyter_nb.sh Euler sfux 4 01:20 2048
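The parameter limits from the table can be checked before the script is called. The following is a minimal sketch of such a check, not code taken from start_jupyter_nb.sh. The function name validate_args is hypothetical, and the strict two-digit HH:MM pattern is an assumption based on the example above.

```shell
# Hypothetical pre-flight check; NOT part of start_jupyter_nb.sh.
# It only encodes the limits listed in the parameter table.
# Arguments: CLUSTER NETHZ_USERNAME NUM_CORES RUN_TIME MEM_PER_CORE
validate_args() {
    if [ "$#" -ne 5 ]; then
        echo "expected 5 arguments, got $#" >&2
        return 1
    fi

    # CLUSTER: Euler or LeoOpen
    case "$1" in
        Euler|LeoOpen) ;;
        *) echo "CLUSTER must be Euler or LeoOpen" >&2; return 1 ;;
    esac

    # NETHZ_USERNAME: must not be empty
    if [ -z "$2" ]; then
        echo "NETHZ_USERNAME must not be empty" >&2
        return 1
    fi

    # NUM_CORES: digits only, between 1 and 36
    case "$3" in
        ''|*[!0-9]*) echo "NUM_CORES must be an integer" >&2; return 1 ;;
    esac
    if [ "$3" -lt 1 ] || [ "$3" -gt 36 ]; then
        echo "NUM_CORES must be between 1 and 36" >&2
        return 1
    fi

    # RUN_TIME: HH:MM (assumption: two-digit hours, minutes 00-59)
    case "$4" in
        [0-9][0-9]:[0-5][0-9]) ;;
        *) echo "RUN_TIME must have the form HH:MM" >&2; return 1 ;;
    esac

    # MEM_PER_CORE: positive integer (MB)
    case "$5" in
        ''|*[!0-9]*) echo "MEM_PER_CORE must be an integer" >&2; return 1 ;;
    esac
    if [ "$5" -lt 1 ]; then
        echo "MEM_PER_CORE must be at least 1" >&2
        return 1
    fi
    return 0
}
```

One possible use is to chain it in front of the real call: validate_args Euler sfux 4 01:20 2048 && ./start_jupyter_nb.sh Euler sfux 4 01:20 2048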
Running multiple notebooks in a single Jupyter instance
If you run Jupyter on the Leonhard cluster using GPUs, you need to make sure that a notebook is correctly terminated before you start another one. (The default version of the script uses a python_cpu module, which does not support GPU usage; you would need to change the Python version in the script to enable GPU usage.)
If you don't properly close the first notebook and then run a second one, the first notebook will still occupy some GPU memory and have processes running, which will cause errors when executing the second notebook.
Therefore, please make sure that you stop the running kernels in the "Running" tab in the browser before starting a new notebook.
Terminate the Jupyter session
Please note that when you finish working with the Jupyter notebook, you need to click on the "Quit" or "Logout" button in your browser. "Quit" stops the batch job running on Euler. "Logout" only logs you out of the session and does not stop the batch job; in this case you need to log in to the cluster, identify the job with bjobs, and kill it with the bkill command, passing the job ID as parameter. Afterwards you also need to clean up the SSH tunnel that is running in the background.
samfux@bullvalene:~/Jupyter-on-Euler-or-Leonhard-Open$ ps -u | grep -m1 -- "-L" | grep -- "-N"
samfux 8729 0.0 0.0 59404 6636 pts/5 S 13:46 0:00 ssh firstname.lastname@example.org -L 51339:10.205.4.122:8888 -N
samfux@bullvalene:~/jupyter-on-Euler-or-Leonhard-Open$ kill 8729
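The tunnel in the output above is an ssh process started with -L (local port forwarding) and -N (no remote command), and its process ID is the second column of the ps output. As a rough sketch, the lookup can be wrapped in a small filter. The function name find_tunnel_pids is hypothetical, and the filter assumes the column layout shown above, where the command name is the 11th field.

```shell
# Hypothetical helper: reads `ps -u`-style lines on stdin and prints the PID
# (column 2) of every ssh process that was started with both -L and -N.
# Assumption: the command name is the 11th column, as in the output above.
find_tunnel_pids() {
    awk '$11 == "ssh" {
        has_L = 0; has_N = 0
        for (i = 12; i <= NF; i++) {
            if ($i == "-L") has_L = 1
            if ($i == "-N") has_N = 1
        }
        if (has_L && has_N) print $2
    }'
}

# Possible usage on the local computer (prints PIDs, does not kill anything):
#   ps -u | find_tunnel_pids
# Each printed PID can then be passed to kill, as in the example above.
```

Printing the PIDs instead of killing them directly leaves room to check that no unrelated SSH tunnel is terminated by accident.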
Installing additional Python and R packages locally
When you start a Jupyter notebook with this script, it uses a central Python and R installation:
- Euler: python/3.6.1, r/3.6.0
- Leonhard Open: python_cpu/3.6.4, r/3.5.1
Therefore, out of the box you can only use packages that are centrally installed. However, you can install additional packages locally in your home directory and use them afterwards.
To install a Python package from inside a Jupyter notebook, run the following command:
!pip install --user package_name
This will install package_name into $HOME/.local, as described on our wiki page about Python:
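To confirm where a --user installation ends up, Python's site module can print the per-user base directory and the per-user site-packages directory. The sketch below wraps the two calls. The function name show_user_dirs and the default interpreter name python3 are assumptions made for illustration; inside a notebook cell the same commands would be prefixed with !.

```shell
# Hypothetical helper: prints the directories used by `pip install --user`.
# Argument 1 (optional): name of the Python interpreter, default python3.
show_user_dirs() {
    py="${1:-python3}"
    # Base directory for per-user installations (e.g. $HOME/.local on Linux)
    "$py" -m site --user-base
    # site-packages directory below that base, where the packages are placed
    "$py" -m site --user-site
}
```

Calling show_user_dirs on the cluster after loading the Python module should print two paths, the first of which is expected to be $HOME/.local.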
The command to locally install an R package:
Then follow the instructions provided on our wiki: