Difference between revisions of "FAQ"
Revision as of 13:51, 22 August 2016
- 1 General
- 2 Access
- 3 Software
- 3.1 Do you provide any software on your clusters ?
- 3.2 Why does my 32-bit executable not work on your clusters ?
- 3.3 Can I run Windows executables on the clusters ?
- 3.4 Can you please update GLIBC on the clusters ?
- 3.5 Is it necessary to recompile or can I just copy my application to a cluster ?
- 3.6 Are development tools available on the clusters ?
- 3.7 How do I set up my environment for these compilers?
- 3.8 How do I compile MPI applications ?
- 3.9 Can I use another implementation of MPI?
- 3.10 What about OpenMP applications?
- 3.11 What scientific libraries are available on the clusters ?
- 3.12 Can you please allow me to run sudo for installing my code ?
- 3.13 Why can't I install my application into /usr/bin and /usr/lib64 ?
- 3.14 Is there a license available for application XYZ ?
- 4 Environment modules
- 5 Submitting jobs
- 5.1 Can I run an application on the login nodes?
- 5.2 Can I access a compute node via ssh or rsh?
- 5.3 How do I execute a program on the cluster?
- 5.4 How do I submit a simple command ?
- 5.5 How do I submit a shell script ?
- 5.6 How do I submit a parallel job ?
- 5.7 What are the processor and time limits ?
- 5.8 What is the maximal amount of memory that I can use ?
- 5.9 Which queue should I choose ?
- 5.10 How many jobs can I submit ?
- 5.11 How much time should I request for my job ?
- 5.12 What happens when a job reaches its time limit ?
- 5.13 How do I submit a series of jobs (job chaining)?
- 6 Monitoring jobs
- 6.1 When does my job start ?
- 6.2 How can I check the status of my job(s) ?
- 6.3 Why is my job waiting for a long time in the queue ?
- 6.4 Where is my job's output ?
- 6.5 Can I see my job's output in real time ?
- 6.6 How do I know when my job is done ?
- 6.7 Can I see the resources used by my job(s) ?
- 6.8 How do I kill a job ?
- 7 Data management and file transfer
- 7.1 How much disk space is available on the clusters ?
- 7.2 How much space can I use ?
- 7.3 What happens when I reach my quota ?
- 7.4 What if I need more space ?
- 7.5 Why is there a limit for the number of files in my home/scratch directory ?
- 7.6 Why is storage in the cluster more expensive than cheap external USB 3 hard drives ?
- 7.7 How long can I keep files in the scratch directories ?
- 7.8 Why did you delete my files in scratch ?
- 7.9 Are my files backed up regularly ?
- 7.10 How do I restore a file from a backup ?
- 7.11 What is the fastest way to transfer files from/to the cluster ?
- 7.12 Why is file transfer very slow ?
- 8 Miscellaneous
Who is ID SIS HPC ?
What services are provided by ID SIS HPC ?
Do I need to pay or are the services for free ?
Where can I find more information ?
Who can use the services of ID SIS HPC ?
Anyone within ETH may use the services provided by ID SIS HPC. Professors and institutes who participated in the purchase of the system are guaranteed a share of the resources proportional to their investment. Other users must share the public resources allocated to the IT Services. Researchers from other Swiss and international institutions can use the services, as long as they have a collaboration with an institute of ETH Zurich.
Do I need an account ?
For most services, you can log in directly with the credentials of your NETHZ account. The only exceptions are the Brutus cluster and the CLC genomics service, for which you still need to apply for an account. If you are interested in using these services, please contact cluster support.
How do I log in ?
For security reasons, you can only access our services from within the ETH network. If you are outside the ETH network, you have to establish a VPN connection first. From a Linux or a Mac OS X computer, you can log in with
ssh username@hostname
or for graphical applications with
ssh -Y username@hostname
If you are using Windows, you need a third-party application, for instance PuTTY or Cygwin, that provides a console in which you can enter the ssh command in the same way as Linux users. For graphical applications, Windows users also need software called an X server, which is required for X11 forwarding. Common X servers are Cygwin/X, Xming, Exceed and XWin-32.
X11-forwarding with -X does not work, what am I doing wrong?
As described above, you have to use the -Y option for X11-forwarding. Log in with:
ssh -Y username@hostname
How can I change my password?
Since the NETHZ password is used for the log in to our services, the users can change their password at
Can I change my default shell?
Bash is the default shell for all users. The configuration of our services is complex and everything is tested extensively using bash. It is therefore the only shell that we fully support. You are free to use a different shell, but you do so at your own risk.
Do you provide any software on your clusters ?
On our clusters, we provide a wide range of centrally installed applications and libraries. Our software stack contains commercial as well as open source software. An overview of all centrally installed applications can be found in our application tables.
Why does my 32-bit executable not work on your clusters ?
Our clusters are pure 64-bit systems. Your 32-bit executable might run without problems in some cases, but there are certain limitations. A 32-bit executable can only use up to 3 GB of virtual memory. If you try to use more, this might result in a segmentation fault or an out-of-memory error message. The solution to this problem is to recompile your application for 64-bit.
Can I run Windows executables on the clusters ?
Windows executables do not run under Linux. In order to be able to run your application on our clusters, you need to make sure that it is a 64-bit binary for Linux.
Can you please update GLIBC on the clusters ?
The libc is part of the operating system. Updating libc is equivalent to updating the operating system on the cluster; therefore, we cannot simply update libc. If your executable requires a newer version of libc (GLIBC), please consider recompiling the source code directly on the cluster where you would like to run the executable.
Is it necessary to recompile or can I just copy my application to a cluster ?
Statically linked, single-processor executables built on standard x86 Linux platforms should run without any problem on our clusters. Recompiling may improve the performance, though. Dynamically linked executables will not run if the required shared libraries are either not available or not compatible (e.g. 32-bit executable and 64-bit library). Recompiling is recommended.
Are development tools available on the clusters ?
On our clusters we provide different versions of the standard compilers from gcc, intel and pgi. To identify the actual versions that are installed on the cluster, please use the module available command:
module available gcc
module available intel
module available pgi
Executables corresponding to the compilers:
gcc ← GNU C compiler
g++ ← GNU C++ compiler
gfortran ← GNU Fortran 90/95 compiler
icc ← Intel C compiler
icpc ← Intel C++ compiler
ifort ← Intel Fortran 90/95 compiler
pgcc ← PGI C compiler
pgCC ← PGI C++ compiler
pgf77 ← PGI Fortran 77 compiler
pgf90 ← PGI Fortran 90 compiler
pgf95 ← PGI Fortran 95 compiler
pghpf ← PGI High-Performance Fortran compiler
How do I set up my environment for these compilers?
On our clusters, we use environment modules to prepare the environment for applications and compilers. By loading the corresponding module with the module load command, e.g.,
module load gcc/4.8.2
environment variables such as PATH and LD_LIBRARY_PATH are adapted to the compiler you loaded.
How do I compile MPI applications ?
The compilation of parallel applications based on the Message Passing Interface (MPI) is slightly more complicated. Once you have loaded the compiler of your choice, you must also decide which MPI library you want to use. Two MPI libraries are available on Brutus:
- Open MPI (recommended)
- MVAPICH2
Applications compiled with Open MPI or MVAPICH2 run on nodes connected to the InfiniBand network.
Open MPI is recommended for all applications. MVAPICH2 is provided for applications that are not compatible (or do not run well) with Open MPI.
Two series of modules — open_mpi and mvapich2 — are available to configure your environment for a particular MPI library. In addition, these modules define wrappers — e.g. mpicc, mpif90 — that greatly simplify the compilation of MPI applications. These wrappers are compiler-dependent and invoke whichever compiler was active (loaded) when you loaded the MPI module. For this reason, the MPI module must absolutely be loaded after the compiler module.
To summarize, the compilation of an MPI application should look somewhat like this:
module load compiler
module load MPI library
mpicc program -o executable ← C program
mpiCC program -o executable ← C++ program
mpif77 program -o executable ← Fortran 77 program
mpif90 program -o executable ← Fortran 90 program
Can I use another implementation of MPI?
Yes. We provide MVAPICH2, but we do not compile all libraries with support for MVAPICH2. We strongly recommend using the centrally installed Open MPI library.
What about OpenMP applications?
You can use OpenMP, but do not forget to set OMP_NUM_THREADS=#threads and to submit the job with the bsub option -n #threads.
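As a minimal sketch of the rule above, the following helper prints the two commands with a matching thread count and core request. The function name omp_submit and the program name ./my_openmp_program are placeholders chosen for illustration; the helper only prints the commands so that they can be checked before they are run.

```shell
# Print (not run) the commands for an OpenMP job whose thread count
# matches the number of cores requested from LSF.
# omp_submit and ./my_openmp_program are hypothetical names.
omp_submit() {
    threads=$1
    shift
    echo "export OMP_NUM_THREADS=$threads"
    echo "bsub -n $threads $*"
}

omp_submit 4 ./my_openmp_program
```

Typing the two printed lines in the login shell sets the variable and submits the job; the variable reaches the job because LSF passes the environment of the submitting shell to it.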
What scientific libraries are available on the clusters ?
On our clusters, we provide a large range of scientific libraries and/or applications:
- AMD's Core Math Library (ACML), which comes with the PGI compilers
- Intel's Math Kernel Library (MKL)
- OpenBLAS (formerly GotoBLAS)
- Netlib BLAS and LAPACK
- Python packages: NumPy, SciPy
- GMP, MPFR
Can you please allow me to run sudo for installing my code ?
For security reasons, we cannot allow users to run sudo to install their applications of choice. The clusters are shared by more than 2000 people, and allowing them to use sudo could cause a lot of problems that would affect all other cluster users. We recommend that you install software in your home directory, so that you do not need sudo for the installation step.
Why can't I install my application into /usr/bin and /usr/lib64 ?
The directories /usr/bin and /usr/lib64 are primarily used by the operating system for installing packages through the package manager, and only our system administrators have write access to them. The centrally installed applications and libraries are located in /cluster/apps, and user software should be installed in the home directory.
Is there a license available for application XYZ ?
The ID SIS HPC team operates and maintains the HPC clusters and provides some additional services, but we do not provide any software licenses. Licenses for commercial applications are provided either by the central license administration of ETH or directly by a research group or an institute/department.
Can I automatically load modules on login ?
Is it possible to load modules in a script ?
Module load does not work properly, what am I doing wrong ?
In the application table version X is listed, why does module avail not list it ?
Version X is gone, why did you delete it ?
Can I run an application on the login nodes?
Login nodes are the gateway to the cluster. They are used to compile programs and submit job requests to the compute nodes, not to run applications. You are allowed to run really short programs interactively on the login nodes for testing and debugging purposes, or for pre- or post-processing. Anything else is prohibited, and if you overload the login nodes, your processes will be killed without prior notice.
Can I access a compute node via ssh or rsh?
You cannot access a compute node to run a program or a command directly, via ssh, rsh or any other means. From a user's point of view, compute nodes do not exist. If you have submitted a job through the batch system, you can access the node where the job is running (for advanced job monitoring) with the bjob_connect command, which expects the job ID as its argument.
How do I execute a program on the cluster?
Every command, program or script that you want to execute on the cluster must be submitted to the batch system (LSF). The command
bsub
is used to submit a batch job and to indicate what resources are required for that job. On Brutus, two types of resources must be specified: number of processors and computation time:
bsub -n #CPUs -W HH:MM
When LSF receives a job, it checks its requirements and either accepts or rejects it. If accepted, the job is dispatched to a batch queue that meets its requirements. The job will remain in the queue until enough resources are available to execute it.
LSF operates like a "black box". You do not need to know anything about the underlying queue structure to use it. Just tell LSF what you want, and you'll get it -- or not.
How do I submit a simple command ?
To execute a simple Unix command on one processor, use:
bsub [-W HH:MM] [-n 1] command [arguments]
The time limit can be expressed as HH:MM or in minutes; "-W 2:30" is equivalent to "-W 150". The default time limit is four hours. Since batch jobs are executed on one processor, the argument "-n 1" can be omitted. All environment variables defined in your current shell -- including the current working directory -- are automatically passed to your job by LSF.
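To illustrate the two equivalent notations for the time limit, here is a small sketch that converts either form to minutes. The helper name to_minutes is made up for this example and is not part of LSF.

```shell
# Convert a run-time limit given as HH:MM or as plain minutes to minutes.
# to_minutes is a hypothetical helper, not an LSF command.
to_minutes() {
    echo "$1" | awk -F: '{ if (NF == 2) print $1 * 60 + $2; else print $1 + 0 }'
}

to_minutes 2:30   # prints 150
to_minutes 150    # prints 150
```

Both calls print the same value, which is why "-W 2:30" and "-W 150" request the same limit.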
How do I submit a shell script ?
To execute a shell script, you can use either:
bsub [optional flags] ./script
bsub [optional flags] < script
These forms are not equivalent. In the first case, the script -- which must have "execute" permission -- is read only when the job starts; any change made to it between submission and execution will be "seen" by your job. In the second case, however, the script is read by LSF when you submit it and copied into a special "spool" directory; you cannot change it later on.
If your script contains "#BSUB" statements, you must use the second form.
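A script with "#BSUB" statements could look like the sketch below, which writes a small job script to a file. The file name job.sh, the job name and the limits are placeholders, not site defaults; only bsub options that appear elsewhere in this FAQ (-J, -n, -W, -o) are used.

```shell
# Write a small job script. The "#BSUB" lines are ordinary comments for the
# shell, but LSF reads them as bsub options when the script is submitted.
# job.sh, example_job and the limits are placeholders.
cat > job.sh <<'EOF'
#!/bin/bash
# Job name
#BSUB -J example_job
# Number of processors
#BSUB -n 1
# Wall-clock time limit (HH:MM)
#BSUB -W 00:10
# File that receives the job's output
#BSUB -o example_job.out

echo "job started in $(pwd)"
EOF

# Submit with the second form so that LSF parses the #BSUB lines:
#   bsub < job.sh
```

Because the "#BSUB" lines are plain comments to the shell, the script can also be run directly on a test machine to check the commands it contains.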
How do I submit a parallel job ?
To execute a parallel program on N processors, you need to specify the number of processors. In addition, the parallel code itself must be launched with the corresponding command, for instance "mpirun".
If you launch a script with the mpirun command, the whole script will be executed in parallel. Therefore you have to be careful to not use any command that can cause a race condition, such as "cp","mv", etc. For this reason, it is often preferable to execute the script on a single processor (without "mpirun") and place "mpirun" inside the script before each command that must be executed in parallel.
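The layout recommended above, with serial file handling around a single parallel step, can be sketched as follows. The file names and the program name ./my_mpi_program are hypothetical, and the processor count and time limit are arbitrary example values.

```shell
# Write a job script in which only the solver is started with mpirun.
# All file and program names are placeholders.
cat > mpi_job.sh <<'EOF'
#!/bin/bash
#BSUB -n 4
#BSUB -W 01:00

# Serial step: executed once, because the script itself is not run by mpirun.
cp input.dat work.dat

# Parallel step: only this command is launched in parallel.
mpirun ./my_mpi_program work.dat

# Serial step again: collect the result.
mv work.dat result.dat
EOF

# Submit with:
#   bsub < mpi_job.sh
```

Since cp and mv run only once, they cannot race against copies of themselves, which is the problem that arises when the whole script is started with mpirun.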
What are the processor and time limits ?
Because the processor limits are not fully static and may change over time, it is difficult to give hard numbers. However, the number of processors that you can use at any given time is limited. This "user limit" is defined on a group or individual basis. You can use the command busers or bqueues to see your own limit:
[leonhard@euler01 ~]$ busers -w
USER/GROUP  JL/P  MAX  NJOBS  PEND  RUN  SSUSP  USUSP  RSV  MPEND
leonhard    -     48   0      0     0    0      0      0    10000
The MAX value specifies an absolute maximum in terms of number of cores that you can use at the same time. It does not mean that you are entitled to get this amount of resources all the time.
[leonhard@euler01 ~]$ bqueues
QUEUE_NAME    PRIO  STATUS         MAX  JL/U  JL/P  JL/H  NJOBS  PEND   RUN    SUSP
clc           94    Open:Active    256  -     -     -     0      0      0      0
bigmem.4h     88    Open:Active    -    -     -     -     536    536    0      0
bigmem.24h    86    Open:Active    -    -     -     -     2845   2354   491    0
bigmem.120h   84    Open:Active    -    960   -     -     994    144    850    0
bigmem.fair   80    Closed:Inact   -    -     -     -     0      0      0      0
normal.4h     68    Open:Active    -    -     -     -     1810   1459   351    0
normal.24h    66    Open:Active    -    -     -     -     89369  78448  10921  0
normal.120h   64    Open:Active    -    -     -     -     10666  4845   5821   0
normal.30d    62    Open:Active    -    -     -     -     58     58     0      0
normal.fair   60    Closed:Inact   -    -     -     -     0      0      0      0
virtual.40d   58    Open:Active    -    -     -     -     0      0      0      0
filler.40d    11    Closed:Active  -    -     -     -     0      0      0      0
light.5d      10    Open:Active    -    -     -     -     0      0      0      0
purgatory     1     Open:Inact     -    -     -     -     0      0      0      0
Please note that the time limits are not based on CPU time, but on wall-clock time. Furthermore, it is possible to request more processors than we have in our cluster (you could even request one million processors for a job). LSF will not reject the job, but it will be pending in the queue forever (unless at some point the cluster has more than one million processors and enough resources are free).
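If you only want the MAX value from the busers output, a short filter is enough. The sketch below assumes the column layout of the busers -w example in this section (user name in column 1, MAX in column 3); the function name max_cores is made up for illustration.

```shell
# Print "user MAX" for each data row of "busers -w" output read from stdin.
# Assumes the column layout shown in this section; max_cores is a
# hypothetical helper name.
max_cores() {
    awk 'NR > 1 { print $1, $3 }'
}

# Sample rows copied from the example in this section.
# On the cluster you would run: busers -w | max_cores
printf '%s\n' \
    'USER/GROUP JL/P MAX NJOBS PEND RUN SSUSP USUSP RSV MPEND' \
    'leonhard - 48 0 0 0 0 0 0 10000' | max_cores
```

For the sample rows this prints "leonhard 48".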
What is the maximal amount of memory that I can use ?
The maximal amount of memory that guest users can use is 256 GB in total, and 128 GB for a single job. If you are a member of a shareholder group, the maximal amount of memory that you can use in a single-core job is 3 TB (although you might face very long queuing times if the share of your shareholder group does not explicitly contain so-called ultra-fat memory nodes). For parallel jobs, the theoretical limit is higher than 100 TB.
Which queue should I choose ?
In principle, you should not choose a queue at all. It is sufficient if you request the amount of resources that your job will require. The batch system will then take care of dispatching your job to the appropriate queue.
How many jobs can I submit ?
As described above, there is a limit on the number of jobs that you can run concurrently. In addition, there is a limit on the number of pending jobs that you can have. It amounts to 10000, as shown in the MPEND column of the busers -w output.
How much time should I request for my job ?
The time you request has a direct influence on the scheduling of your job. Short jobs have higher priority than long jobs. In addition, short jobs can use processors reserved by a large parallel job, if LSF determines that your job will finish before the expected start time of the large job.
Therefore, you have a pretty good reason to request as little time as possible. On the other hand, you want to make sure that your job has enough time to complete.
What happens when a job reaches its time limit ?
To give the application a chance to exit gracefully, LSF first sends a "friendly" signal (USR2) to all processes of a job when its time limit is about to expire. If the job is still running after a short grace period, LSF sends increasingly "unfriendly" signals (INT, QUIT, TERM and KILL). The last one cannot be caught or ignored; it effectively kills the job.
The KILL signal is brutal -- it may kill your job in the middle of a write operation, possibly causing data loss. Some applications do not like this at all. For this reason, letting an application run until it is killed by LSF is not recommended. If you use an iterative method, reduce the number of iterations per job. Alternatively, you can program your application to catch the USR2 signal and exit before its time is up. To receive this signal earlier, and thus have more time to exit, add the bsub arguments -wa USR2 -wt [hh:]mm.
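For a job driven by a shell script, catching USR2 could look like the following sketch. It assumes an iterative computation that can stop between two iterations; the function run_iterations and its placeholder loop body are invented for this example.

```shell
#!/bin/bash
# Sketch: stop an iterative job cleanly when LSF sends USR2.
# run_iterations and its loop body are hypothetical placeholders.

stop_requested=0
# The handler only sets a flag; the loop checks it between iterations.
trap 'stop_requested=1' USR2

run_iterations() {
    max_iterations=$1
    i=0
    while [ "$i" -lt "$max_iterations" ] && [ "$stop_requested" -eq 0 ]; do
        i=$((i + 1))
        :   # one iteration of the real computation would go here
    done
    # Save intermediate results here, then exit before the unfriendly signals.
    echo "finished $i of $max_iterations iterations"
}

run_iterations 3
```

Setting a flag in the handler, instead of exiting inside it, lets the current iteration finish so that the results written afterwards are consistent.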
How do I submit a series of jobs (job chaining)?
Job chaining can be used to split a very long computation into a series of jobs that fit within the allowed time limits. LSF offers the possibility to set dependency conditions, e.g. job2 should start only when job1 is done, job3 after job2, etc. This is done using bsub -w (wait):
bsub -J job1 command1
bsub -J job2 -w "done(job1)" command2
bsub -J job3 -w "done(job2)" command3
All jobs in a series may be submitted at once. Each job must be given a name (option -J) that will be used to define the dependency condition of the subsequent job. The condition "done(job1)" is true only if job1 completed successfully. If job1 crashed or was killed by LSF when it reached its run-time limit, the dependency condition becomes "invalid or never satisfied" and job2 will not be executed, ever. (Invalid jobs stay in the queue until they are deleted, which is done periodically.) Use the condition "ended(job1)" if job2 ought to be executed no matter what happened to job1.
If job1, job2 and job3 are merely iterations of the same program, it may be more convenient to use a single name for all jobs, such as "job_chain"; in that case the dependency is based on the order in which the jobs were submitted:
bsub -J job_chain command
bsub -J job_chain -w "done(job_chain)" command
bsub -J job_chain -w "done(job_chain)" command
In the example above, "command" is generally a shell script that will retrieve data from the previous job, check if there was any error, prepare the input for the current job and execute it.
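For longer chains, the submission lines can be generated in a loop. The sketch below prints the bsub commands for a chain of a given length under a single job name; the function name chain_commands and the script name ./run_step.sh are placeholders, and the helper only prints the commands so that they can be inspected first.

```shell
# Print the bsub commands for a chain of COUNT jobs that share one name.
# Usage: chain_commands NAME COUNT COMMAND [ARGS...]
# chain_commands and ./run_step.sh are hypothetical names.
chain_commands() {
    name=$1
    count=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        if [ "$i" -eq 1 ]; then
            # The first job has no dependency.
            echo "bsub -J $name $*"
        else
            # Later jobs wait for the previously submitted job of that name.
            echo "bsub -J $name -w \"done($name)\" $*"
        fi
        i=$((i + 1))
    done
}

chain_commands job_chain 3 ./run_step.sh
```

The printed lines can be pasted into the login shell, or piped to sh once they look right.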
When does my job start ?
It is very hard to give an accurate estimate of when a job will start. The start time of a job depends on two factors:
- Can the resource request of a job be fulfilled on a compute node in the cluster ?
- Is the priority of the user who submitted the job higher than that of all other users whose jobs have the same or very similar resource requirements ?
The batch system has a heartbeat of about 1 minute, which means that it checks the available resources every minute. The user priorities are calculated in real time and depend on how many resources a user has already used and how many resources the user is entitled to on average under the fairshare policy. The fairshare policy ensures that members of a shareholder group get, on average, an amount of resources that is proportional to their investment in the cluster.
How can I check the status of my job(s) ?
Use the LSF command bjobs to see all your jobs (in all states) with their unique job identifier. Additional details can be obtained with the option "-l" (lowercase "L").
The command bjobs -p lists only pending jobs and indicates why they are pending. The most common reasons are explained in the table below.
|New job is waiting for scheduling||Your job's requirements are being analyzed|
|Individual host based reasons||A complicated way to say that not enough processors are available (literally, all hosts are unable to run your job for various, individual reasons)|
|The user has reached his/her job slot limit||Don't you think you are using enough processors already?|
|Job dependency condition not satisfied||Your job is waiting for another job to complete|
|The queue is inactivated by its time windows||This queue is active only during pre-defined time windows; your job will be considered for execution when the next window is open|
|Dependency condition invalid or never satisfied||Your job's dependency condition is false or can not be determined (usually because the status of the previous job in the chain is unknown)|
In the last case the job will never run. The simplest solution is to kill it and resubmit it with the correct dependency condition (or none at all). Alternatively, you can remove the dependency condition using the command bmod -wn JOBID.
Why is my job waiting for a long time in the queue ?
Where is my job's output ?
By default, your job's output (and error) is stored in a file called lsf.oJOBID, where JOBID corresponds to the job id of the job. If you use the "-o" or "-e" argument for the bsub command, you can give the output and the error file different names.
bsub -o job.out -e job.err ...
Can I see my job's output in real time ?
You can check the output of a particular running job with the bpeek command. You can specify the job either via its job id, or via its job name:
bpeek JOBID
bpeek -J JOBNAME
How do I know when my job is done ?
You can instruct LSF to notify you by e-mail when your job begins and ends using bsub -B and bsub -N respectively. Notifications are sent to your official NETHZ e-mail address. The combination bsub -N -o job.out is valid and offers an elegant way to separate the job information written by LSF from your program's "real" output -- the former will be sent by e-mail while the latter will be stored into "job.out".
Can I see the resources used by my job(s) ?
You can display the load and resource usage (memory, swap, etc.) of any specific job with the command bbjobs JOBID.
[leonhard@euler01 ~]$ bbjobs 25445659
Job information
  Job ID                      : 25445659
  Status                      : RUNNING
  Running on node             : 8*e1374
  User                        : leonhard
  Queue                       : normal.4h
  Command                     : mpirun solve_Basel_problem -accuracy 10e-8
  Working directory           : $HOME/unsolved_problems/basel_problem
Requested resources
  Requested cores             : 8
  Requested memory            : 1024 MB per core
  Requested scratch           : not specified
  Dependency                  : -
Job history
  Submitted at                : 13:42 1735-08-22
  Started at                  : 13:43 1735-08-22
  Queue wait time             : 34 sec
Resource usage
  Updated at                  : 13:44 1735-08-22
  Wall-clock                  : 59 sec
  Tasks                       : 12
  Total CPU time              : 7 min
  CPU utilization             : 99.8 %
  Sys/Kernel time             : 0.1 %
  Total resident memory       : 8150 MB
  Resident memory utilization : 99.2 %
Affinity per Host
  Host                        : e1374
  Task affinity               : by core
  Cores                       : /0/0/0
  Memory affinity             : not defined
How do I kill a job ?
The LSF command bkill is used to kill pending or running jobs. For obvious reasons, you can kill only jobs that you own. Use bkill JOBID to kill a particular job, or bkill 0 (zero) to kill all your jobs (running and pending). You can kill a job by name using bkill -J jobname (this will kill the last job with that name) or a whole series of jobs with bkill -J jobname 0 (zero again).
You can use bkill to send a signal to a job without necessarily killing it. For example, if your application is programmed to save results when it receives a USR2 signal (i.e. the signal sent by LSF when the time limit is reached), you can trigger this action manually with the command bkill -s USR2 JOBID