Setting up your environment

Introduction

Most applications, compilers and libraries rely on environment variables to function properly. These variables are usually set by the operating system, the administrator, or by the user. Typical examples include:

  • PATH — location of system commands and user programs
  • LD_LIBRARY_PATH — location of the dynamic (=shared) libraries needed by these commands and programs
  • MANPATH — location of man (=manual) pages for these commands
  • Program specific environment variables

The majority of problems encountered by users are caused by incorrect or missing environment variables. People often copy initialization scripts — .profile, .bashrc, .cshrc — from one machine to the next, without verifying that the variables defined in these scripts are correct (or even meaningful!) on the target system.

If setting environment variables is difficult, modifying them at run-time is even more complex and error-prone. Changing the contents of PATH to use a different compiler than the one set by default, for example, is not for the casual user. The situation can quickly become a nightmare when one has to deal with multiple compilers and libraries (e.g. MPI) at the same time.
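To make the difficulty concrete, here is a minimal sketch of switching between two tools by editing PATH by hand. The directories compiler-a and compiler-b and the command mycc are hypothetical stand-ins created in a temporary directory; they are not real compilers and do not exist on the cluster.

```shell
# Two throwaway directories stand in for two compiler installations (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/compiler-a/bin" "$workdir/compiler-b/bin"
printf '#!/bin/sh\necho "compiler A"\n' > "$workdir/compiler-a/bin/mycc"
printf '#!/bin/sh\necho "compiler B"\n' > "$workdir/compiler-b/bin/mycc"
chmod +x "$workdir/compiler-a/bin/mycc" "$workdir/compiler-b/bin/mycc"

original_path=$PATH

# The shell searches PATH from left to right, so the first matching entry wins.
PATH="$workdir/compiler-a/bin:$original_path"
first_choice=$(command -v mycc)

# Switching by hand means rebuilding PATH without the old entry.
PATH="$workdir/compiler-b/bin:$original_path"
second_choice=$(command -v mycc)

echo "$first_choice"
echo "$second_choice"

# Restore the original PATH and clean up.
PATH=$original_path
rm -rf "$workdir"
```

The first line printed ends in compiler-a/bin/mycc and the second in compiler-b/bin/mycc. With several compilers and libraries, each of which may also need LD_LIBRARY_PATH and MANPATH entries, this bookkeeping quickly becomes error-prone.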

Environment modules (modules for short) offer an elegant and user-friendly solution to all these problems. Modules allow a user to load all the settings needed by a particular application on demand, and to unload them when they are no longer needed. Switching from one compiler to another, between different releases of the same application, or from one MPI library to another can be done in a snap with a single command: module.

Software stacks

On our clusters we provide multiple software stacks.

When in doubt, please use the most recent one. As of 08/2024, the available options are:

nmarounina@eu-login-18:~$ module avail stack

----------------------------- /cluster/software/lmods --------------------------------
  stack/2024-03-beta    stack/2024-04    stack/2024-05    stack/2024-06 (D)

 Where:
  D:  Default Module

The software available in each stack is documented on a separate page.

Module commands

Module spider

The module spider command lists all existing modules whose names match the given string.

nmarounina@eu-login-18:~$ module spider python

--------------------------------------------------
  python:
--------------------------------------------------
     Versions:
        python/3.8.18-c3ikxoi
        python/3.8.18-mcsql52
        python/3.8.18-zv6eekz
        python/3.9.18_cuda
        python/3.9.18_rocm
        python/3.9.18
        python/3.10.13_cuda
        python/3.10.13_rocm
        python/3.10.13
        python/3.11.6_cuda-oe7bpyk
        python/3.11.6_cuda
        python/3.11.6_rocm-oe7bpyk
        python/3.11.6_rocm
        python/3.11.6-m4n2ny4
        python/3.11.6-oe7bpyk
        python/3.11.6
     Other possible modules matches:
        py-python-dateutil  python_cuda  python_rocm

--------------------------------------------------
  To find other possible module matches execute:

      $ module -r spider '.*python.*'

--------------------------------------------------
  For detailed information about a specific "python" package (including how to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

     $ module spider python/3.11.6
--------------------------------------------------

This is an excellent tool for exploring all of the software available on the cluster. As noted at the end of the output, run `module spider` followed by the full module name (including the version) to list the modules that must be loaded before the desired module can be loaded.

When in doubt, please choose a module without a hash suffix in its name (for example python/3.11.6 rather than python/3.11.6-oe7bpyk).


Module show

The module show command shows which environment variables the module file sets or modifies.

nmarounina@eu-login-18:~$ module show matlab/R2024a
------------------------------------------------
  /cluster/software/lmods/matlab/R2024a.lua:
------------------------------------------------
whatis("Name : MATLAB")
whatis("Version : R2024a")
help(MATLAB)
setenv("MATLAB","/cluster/software/commercial/matlab/R2024a")
setenv("MATLAB_BASEDIR","/cluster/software/commercial/matlab/R2024a")
setenv("MKL_DEBUG_CPU_TYPE","5")
prepend_path("PATH","/cluster/software/commercial/matlab/R2024a/bin")
setenv("MATLAB_CLUSTER_PROFILES_LOCATION","/cluster/software/comm[...]
append_path("PATH","/cluster/software/commercial/matlab/support_package[...]
prepend_path("MATLABPATH","/cluster/software/commercial/matlab/support[...]

nmarounina@eu-login-18:~$
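The setenv, prepend_path and append_path directives in the output above correspond roughly to plain shell assignments. The sketch below illustrates that correspondence only; the prefix /opt/example/app/1.0 and the variable APP_HOME are hypothetical, and PATH is reset to a short fixed value purely so that the result is easy to read.

```shell
# Hypothetical installation prefix, used only for illustration.
prefix=/opt/example/app/1.0

# Start from a short, known PATH so the effect is easy to see.
saved_path=$PATH
PATH=/usr/bin:/bin

# setenv("APP_HOME", prefix): define the variable outright.
export APP_HOME="$prefix"

# prepend_path("PATH", prefix/bin): the new entry is searched first.
export PATH="$prefix/bin:$PATH"

# append_path("PATH", prefix/tools): the new entry is searched last.
export PATH="$PATH:$prefix/tools"

result_home=$APP_HOME
result_path=$PATH
echo "APP_HOME=$result_home"
echo "PATH=$result_path"

# Undo the changes by hand; unloading a module does this for you.
PATH=$saved_path
unset APP_HOME
```

The second line printed is PATH=/opt/example/app/1.0/bin:/usr/bin:/bin:/opt/example/app/1.0/tools. Unlike these manual assignments, a module file also records how to reverse each change when the module is unloaded.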

Module load

The module load command loads the corresponding module file and prepares the environment for using the application or library by applying the instructions that the module show command displays.

nmarounina@eu-login-18:~$ module load stack/2024-06  gcc/12.2.0 python/3.11.6
Many modules are hidden in this stack. Use "module --show_hidden spider SOFTWARE" if you are not able to find the required software
nmarounina@eu-login-18:~$ which python
/cluster/software/stacks/2024-06/spack/opt/spack/linux-ubuntu22.04-x86_64_v3/gcc-12.2.0/python-3.11.6-ukhwpjnwzzzizek3pgr75zkbhxros5fq/bin/python
nmarounina@eu-login-18:~$

Module list

The module list command displays the currently loaded module files.

nmarounina@eu-login-18:~$ module list

Currently Loaded Modules:
  1) stack/2024-06   2) gcc/12.2.0   3) python/3.11.6

 

nmarounina@eu-login-18:~$

Module purge

The module purge command unloads all currently loaded modules and cleans up the environment of your shell. In some cases, it might be better to log out and log in again in order to get a really clean shell.

nmarounina@eu-login-18:~$ module list

Currently Loaded Modules:
  1) stack/2024-06   2) gcc/12.2.0   3) python/3.11.6

 

nmarounina@eu-login-18:~$ 
nmarounina@eu-login-18:~$ module purge
nmarounina@eu-login-18:~$ module list
No modules loaded
nmarounina@eu-login-18:~$ 
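One possible way to combine these commands in a script is to purge first and then load exactly the modules you need, so that the result does not depend on what happened to be loaded before. The sketch below uses the module names from the examples on this page; the function name load_clean_environment is hypothetical, and the check for a missing module command is an assumption about how you might want the script to behave outside the cluster.

```shell
# Hypothetical helper: start from a clean environment, then load a fixed set of modules.
load_clean_environment() {
    # Outside the cluster the module command may not be defined at all.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found" >&2
        return 1
    fi
    module purge || return 1
    module load stack/2024-06 gcc/12.2.0 python/3.11.6 || return 1
    module list
}

load_clean_environment || echo "Skipping: environment left unchanged."
```

Listing explicit versions, rather than relying on defaults, keeps the loaded set the same even if the default stack changes later.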

Naming scheme

Please find the general naming scheme of module files below.

program_name/version(alias[:alias2])

Instead of specifying a version directly, it is also possible to use aliases.

program_name/alias == program_name/version

The special alias default indicates which version is taken by default (if neither version nor alias is specified).

program_name/default == program_name

If no default is specified for a particular software package, then the most recent version (i.e. the one with the largest version number) is taken by default.
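"Largest number" here means a comparison of version components, not of text. The sketch below illustrates the difference with sort -V from GNU coreutils, which is used only as a stand-in; the module system's own comparison rules may differ in detail. The version strings are taken from the module spider output shown earlier, without hash or GPU suffixes.

```shell
# Version strings from the module spider example, one per line.
versions="3.8.18
3.11.6
3.9.18
3.10.13"

# Version sort compares the numeric fields one by one, so 3.11 ranks above 3.9.
newest=$(printf '%s\n' "$versions" | sort -V | tail -n 1)

# A plain text sort compares character by character and ranks 3.9 above 3.11.
lexical=$(printf '%s\n' "$versions" | LC_ALL=C sort | tail -n 1)

echo "version sort: $newest"
echo "text sort:    $lexical"
```

The version sort selects 3.11.6, whereas the text sort would select 3.9.18. If the version matters, specify it explicitly instead of relying on the default.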

Hierarchical modules

Lmod allows defining a hierarchy of modules with three layers (Core, Compiler, MPI). The Core layer contains all module files that do not depend on any compiler or MPI library. The Compiler layer contains all modules that depend on a particular compiler, but not on any MPI library. The MPI layer contains modules that depend on a particular compiler/MPI combination.

When you log in to the Leonhard cluster, the standard compiler gcc/4.8.5 is loaded automatically. Running the module avail command then displays all modules that are available for gcc/4.8.5. To see the modules available for a different compiler, for instance gcc/6.3.0, load that compiler module and run module avail again. To check which modules are available for gcc/4.8.5 with openmpi/2.1.0, load the corresponding compiler and MPI modules and run module avail again.

As a consequence of the module hierarchy, you can never have two different versions of the same module loaded at the same time. This helps to avoid problems arising due to misconfiguration of the environment.