Storage systems

From ScientificComputing
Jump to: navigation, search

Introduction

On our cluster, we provide multiple storage systems, which are optimized for different purposes. Since the available storage space on our clusters is limited and shared between all users, we set quotas in order to prevent single users from filling up an entire storage system with their data.

A summary of general questions about file systems, storage and file transfer can be found in our FAQ. If you have questions or encounter problems with the storage systems provided on our clusters or file transfer, then please contact cluster support.

Personal storage (everyone)

Home

On our clusters, we provide a home directory (folder) for every user that can be used for safe long term storage of important and critical data (program source, script, input file, etc.). It is created on your first login to the cluster and accessible through the path

/cluster/home/username

The path is also saved in the variable $HOME. The permissions are set that only you can access the data in your home directory and no other user. Your home directory is limited to 16 GB and a maximum of 100'000 files and directories (inodes). The content of your home is saved every hour and there is also a nightly backup (tape).

Scratch

We also provide a personal scratch directory (folder) for every user, that can be used for short-term storage of larger amounts of data. It is created, when you access it the first time through the path

/cluster/scratch/username

The path is also saved in the variable $SCRATCH. It is visible (mounted), only when you access it. If you try to access it with a graphical tool, then you need to specify the full path as it is might not visible in the /cluster/scratch top-level directory. Before you use your personal scratch directory, please carefully read the usage rules to avoid misunderstandings. The usage rules can also be displayed directly on the cluster with the following command.

cat $SCRATCH/__USAGE_RULES__

Your personal scratch directory has a disk quota of 2.5 TB and a maximum of 1'000'000 files and directories (inodes). There is no backup for the personal scratch directories and they are purged on a regular basis (see usage rules).

Group storage (shareholders only)

Project

Shareholder groups have the option to purchase additional storage inside the cluster. The project file system is designed for safe long-term storage of critical data (like the home directory). Shareholder groups can buy as much space as they need. The path for project storage is

/cluster/project/groupname

Access rights and restriction is managed by the shareholder group. We recommend to use NETHZ groups for this purpose. Backup (tape) is not included, but can be purchase optionally. If you are interested in getting more information and prices of the project storage, then please contact cluster support.

Work

Apart from project storage, shareholder groups also have the option to buy so-called work (high-performance) storage. It is optimized for I/O performance and can be used for short- or medium-term storage for large computations (like scratch, but without regular purge). Shareholders can buy as much space as they need. The path for work storage is

/cluster/work/groupname

Access rights and restriction is managed by the shareholder group. We recommend to use NETHZ groups for this purpose. The directory is visible (mounted), only when accessed. If you are interested in getting more information and prices of the work storage, then please contact cluster support.

Local scratch (on each compute node)

The compute nodes in our HPC clusters also have some local hard drives, which can be used for temporary storing data during a calculation. The main advantage of the local scratch is, that it is located directly inside the compute nodes and not attached via the network. This is very beneficial for serial, I/O-intensive applications. The path of the local scratch is

/scratch

You can either create a directory in local scratch yourself, as part of a batch job, or you can use a directory in local scratch, which is automatically created by the batch system. LSF creates a unique directory in local scratch for every job. At the end of the job, LSF is also taking care of cleaning up this directory. The path of the directory is stored in the environment variable

$TMPDIR

If you use $TMPDIR, then you need to request scratch space from the batch system.

External storage

Central NAS

Groups who have purchased storage on the central NAS of ETH can ask the storage group of IT services to export it to our HPC clusters. There are certain requirements that need to be fullfilled in order to use central NAS shares on our HPC clusters.

  • The NAS share needs to be mountable via NFS (shares that only support CIFS cannot be mounted on the HPC clusters).
  • The NAS share needs to be exported to the subnet of our HPC clusters (please contact ID Systemdienste and ask them for an NFS export of your NAS share).
  • Please carefully set the permissions of the files and directories on your NAS share if other cluster users should not have read/write access to your data.

NAS shares are then mounted automatically when you access them. The mount-point of such a NAS share is

/nfs/servername/sharename

A typical NFS export file to export a share to the Euler cluster would look like

# cat /etc/exports
/export 129.132.71.59/32(rw,root_squash,secure) 129.132.93.64/26(rw,root_squash,secure) 10.205.0.0/19(rw,root_squash,secure) 10.205.96.0/19(rw,root_squash,secure)

A typical NFS export file to export a share to the Leonhard cluster would look like

# cat /etc/exports
/export 129.132.248.224/27(rw,root_squash,secure) 10.204.0.0/19(rw,root_squash,secure)

It is also possible to use a single export for both clusters

# cat /etc/exports
/export 129.132.71.59/32(rw,root_squash,secure) 129.132.93.64/26(rw,root_squash,secure) 10.205.0.0/19(rw,root_squash,secure) 10.205.96.0/19(rw,root_squash,secure) 129.132.248.224/27(rw,root_squash,secure) 10.204.0.0/19(rw,root_squash,secure) 

If you ask the storage group to export your share to the Euler cluster, then please provide them the above-shown information. When a NAS share is mounted on our HPC clusters, then it is accessible from all the compute nodes in the cluster.

Local NAS

Groups who are operating their own NAS, can export a shared file system via NSF to our HPC clusters. In order to use an external NAS on our HPC clusters, the following requirements need to be fullfilled

  • NAS needs to support NFSv3 (this is currently the only NFS version that is supported from our side).
  • The user and group ID's on the NAS needs to be consistent with NETHZ user names and group.
  • The NAS needs to be exported to the subnet of our HPC clusters.
  • Please carefully set the permissions of the files and directories on your NAS share if other cluster users should not have read/write access to your data.

You external NAS can then be accessed through the mount-point

/nfs/servername/sharename

A typical NFS export file to export a share to the Euler cluster would look like

# cat /etc/exports
/export 129.132.71.59/32(rw,root_squash,secure) 129.132.93.64/26(rw,root_squash,secure) 10.205.0.0/19(rw,root_squash,secure) 10.205.96.0/19(rw,root_squash,secure)

A typical NFS export file to export a share to the Leonhard cluster would look like

# cat /etc/exports
/export 129.132.248.224/27(rw,root_squash,secure) 10.204.0.0/19(rw,root_squash,secure)

It is also possible to use a single export for both clusters

# cat /etc/exports
/export 129.132.71.59/32(rw,root_squash,secure) 129.132.93.64/26(rw,root_squash,secure) 10.205.0.0/19(rw,root_squash,secure) 10.205.96.0/19(rw,root_squash,secure) 129.132.248.224/27(rw,root_squash,secure) 10.204.0.0/19(rw,root_squash,secure) 

The share is automatically mounted, when accessed.

Comparison

In the table below, we try to give you an overview of the available storage categories/systems on our HPC clusters as well as a comparison of their features.

Category Mount point Life span Backup Purged Max. size Small files Large files
Home /cluster/home permanent yes no 16 GB + o
Scratch /cluster/scratch 4 years no yes (files older than 15 days) 2.5 TB o ++
Project /cluster/project 4 years optional no flexible + +
Work /cluster/work 4 years no no flexible o ++
Central NAS /nfs/servername/sharename flexible yes no flexible + +
Local scratch /scratch duration of job no end of job 800 GB ++ +

Choosing the optimal storage system

When working on an HPC cluster that provides different storage categories/systems, the choice of which system to use can have a big influence of the performance of your workflow. In the best case you can speedup your workflow by quite a lot, whereas in the worst case the system administrator has to kill all your jobs and has to limit the number of concurrent jobs that you can run because your jobs slow down the entire storage system and this can affect other users jobs. Please take into account a few recommendations that are listed below.

  • Use local scratch whenever possible. With a few exceptions this will give you the best performance in most cases.
  • For parallel I/O with large files, the high-performance (work) storage will give you the best performance.
  • Don't create a large number of small files (KB's) on project or work storage as this could slow down the entire storage system.
  • If your application does very bad I/O (opening and closing files multiple times per second and doing small appends on the order of a few bytes), then please don't use project and work storage. The best option for this use-case would be local scratch.

If you need to work with a large amount of small files, then please keep them grouped in a tar archive. During a job you can then untar the files to the local scratch, process them and group the results again in a tar archive, which can then be copied back to your home/scratch/work/project space.

File transfer

In order to run your jobs on a HPC cluster, you need to transfer some data or input files from/to the cluster. For smaller and medium amounts of data, you can use some standard command line/graphical tools. If you need to transfer very large amounts of data (on the order of several TB), then please contact the cluster support and we will help you to set up the optimal strategy to transfer your data in a reasonable amount of time.

Command line tools

For transferring files from/to the cluster, we recommend to use standard tools like secure copy (scp) or rsync. The general syntax for using scp is

scp [options] source destination

For copying a file from your PC to an HPC cluster (to your home directory), you need to run the following command on your PC:

scp file username@hostname:

Where username is your NETHZ username and hostname is the hostname of the cluster. Please note the colon after the hostname. For copying a file from the cluster to your PC (current directory), you need to run the following command on your PC:

scp username@hostname:file .

For copying an entire directory, you would need to add the option -r. Therefore you would use the following command to transfer a directory from your PC to an HPC cluster (to your home directory).

scp -r directory username@hostname:

The general sytnax for rsync is

rsync [options] source destination

In order to copy the content of a directory from your PC (home directory) to a cluster (home directory), you would use the following command.

rsync -Pav /home/username/directory/ username@hostname:/cluster/home/username/directory

The -P option enables rsync to show the progress of the file transfer. The -a option preserves almost all file attributes and the -v option gives you more verbose output.

Graphical tools

Graphical scp/sftp clients allow you to mount your Euler home directory on your workstation. These clients are available for most operating systems.

  • Linux + Gnome: Connect to server
  • Linux + KDE: Konqueror, Dolphin, Filezilla
  • Mac OS X: MacFUSE, Macfusion, Cyberduck, Filezilla
  • Windows: WinSCP, Filezilla'

WinSCP provides the user a Windows explorer like user interface with a split screen that allows to transfer files per drag-and-drop. After starting your graphical scp/sftp client, you need to specify the hostname of the cluster that you would like to connect to and then click on the connect button. After entering your NETHZ username and password, you will be connected to the cluster and can transfer files.

WinSCP
Winscp1.png Winscp2.png
Filezilla
Filezilla1.png Filezilla2.png


Quotas

The home and scratch directories on our clusters are subject to a strict user quota. In your home directory, the soft quota for the amount of storage that you can use is set to 16 GiB (17.18 GB) and the hard quota is set to 20 GiB (21.47 GB). Further more, you can store maximally 100'000 files and directories (inodes). The hard quota for your personal scratch directory is set to 2.5 TB. You can maximally have 1'000'000 files and directories (inodes). You can check your current usage with the lquota command.

[sfux@eu-login-13-ng ~]$ lquota
+-----------------------------+-------------+------------------+------------------+------------------+
| Storage location:           | Quota type: | Used:            | Soft quota:      | Hard quota:      |
+-----------------------------+-------------+------------------+------------------+------------------+
| /cluster/home/sfux          | space       |          8.85 GB |         17.18 GB |         21.47 GB |
| /cluster/home/sfux          | files       |            25610 |            80000 |           100000 |
+-----------------------------+-------------+------------------+------------------+------------------+
| /cluster/shadow             | space       |          4.10 kB |          2.15 GB |          2.15 GB |
| /cluster/shadow             | files       |                2 |            50000 |            50000 |
+-----------------------------+-------------+------------------+------------------+------------------+
| /cluster/scratch/sfux       | space       |        237.57 kB |          2.50 TB |          2.70 TB |
| /cluster/scratch/sfux       | files       |               29 |          1000000 |          1500000 |
+-----------------------------+-------------+------------------+------------------+------------------+
[sfux@eu-login-13-ng ~]$ 

If you reach 80% of your quota (number of files or storage) in your personal scratch directory, you will be informed via email to clean up.

Shareholders that own storage in /cluster/work or /cluster/project on Euler or Leonhard can check their quota also by using the lquota command:

[sfux@eu-login-11-ng ~]$ lquota /cluster/project/sis
+-----------------------------+-------------+------------------+------------------+------------------+
| Storage location:           | Quota type: | Used:            | Soft quota:      | Hard quota:      |
+-----------------------------+-------------+------------------+------------------+------------------+
| /cluster/project/sis        | space[B]    |          6.17 TB |                - |         10.41 TB |
| /cluster/project/sis        | files       |          1155583 |                - |         30721113 |
+-----------------------------+-------------+------------------+------------------+------------------+
[sfux@eu-login-11-ng ~]$
[sfux@eu-login-11-ng ~]$ lquota /cluster/work/sis
+-----------------------------+-------------+------------------+------------------+------------------+
| Storage location:           | Quota type: | Used:            | Soft quota:      | Hard quota:      |
+-----------------------------+-------------+------------------+------------------+------------------+
| /cluster/work/sis           | space       |          8.36 TB |         10.00 TB |         11.00 TB |
| /cluster/work/sis           | files       |          1142478 |         10000000 |         11000000 |
+-----------------------------+-------------+------------------+------------------+------------------+
[sfux@eu-login-11-ng ~]$

The lquota script requires the path to the top-level directory as parameter.