Difference between revisions of "Storage and data transfer"
(15 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
__NOTOC__ | __NOTOC__ | ||
+ | <table style="width: 100%;"> | ||
+ | <tr valign=top> | ||
+ | <td style="width: 30%; text-align:left"> | ||
+ | < [[Accessing the cluster|Accessing the cluster]] | ||
+ | </td> | ||
+ | <td style="width: 35%; text-align:center"> | ||
+ | [[Main_Page|Home]] | ||
+ | </td> | ||
+ | <td style="width: 35%; text-align:right"> | ||
+ | [[Modules and applications]] > | ||
+ | </td> | ||
+ | </tr> | ||
+ | </table> | ||
+ | |||
+ | |||
<table> | <table> | ||
<tr valign=top> | <tr valign=top> | ||
Line 22: | Line 37: | ||
Log in to the cluster and check your disk space quota | Log in to the cluster and check your disk space quota | ||
$ lquota | $ lquota | ||
+ | +-----------------------+-------------+------------+---------------+---------------+ | ||
+ | | Storage location: | Quota type: | Used: | Soft quota: | Hard quota: | | ||
+ | +-----------------------+-------------+------------+---------------+---------------+ | ||
+ | | /cluster/home/sfux | space | 8.85 GB | 17.18 GB | 21.47 GB | | ||
+ | | /cluster/home/sfux | files | 25610 | 160000 | 200000 | | ||
+ | +-----------------------+-------------+------------+---------------+---------------+ | ||
+ | | /cluster/shadow | space | 4.10 kB | 2.15 GB | 2.15 GB | | ||
+ | | /cluster/shadow | files | 2 | 50000 | 50000 | | ||
+ | +-----------------------+-------------+------------+---------------+---------------+ | ||
+ | | /cluster/scratch/sfux | space | 237.57 kB | 2.50 TB | 2.70 TB | | ||
+ | | /cluster/scratch/sfux | files | 29 | 1000000 | 1500000 | | ||
+ | +-----------------------+-------------+------------+---------------+---------------+ | ||
</td> | </td> | ||
</tr> | </tr> | ||
Line 33: | Line 60: | ||
* $HOME is a safe, long-term storage for critical data (program source, scripts, etc.) and is accessible only by the user (owner). This means other people cannot read its contents. | * $HOME is a safe, long-term storage for critical data (program source, scripts, etc.) and is accessible only by the user (owner). This means other people cannot read its contents. | ||
− | * There is a disk quota of 16/20 GB and a maximum of | + | * There is a disk quota of 16/20 GB and a maximum of 160’000/200’000 files (soft/hard quota). You can check the quota with the command lquota. |
− | * Its content is saved every hour/day | + | * Its content is saved every hour/day using snapshot, which is stored in the hidden .snapshot directory. |
<table> | <table> | ||
<tr valign=top> | <tr valign=top> | ||
Line 90: | Line 117: | ||
== External Storage == | == External Storage == | ||
+ | {{External_Storage}} | ||
<table> | <table> | ||
<tr valign=top> | <tr valign=top> | ||
<td style="width: 37%; background: white;"> | <td style="width: 37%; background: white;"> | ||
− | === Central NAS === | + | === Central NAS/CDS === |
− | Groups who have purchased storage on the central NAS of ETH provided ID Systemdienste can access it on our clusters. | + | Groups who have purchased storage on the central NAS/CDS of ETH provided by ID Systemdienste can access it on our clusters. |
</td> | </td> | ||
<td style="width: 3%; background: white;"> | <td style="width: 3%; background: white;"> | ||
Line 129: | Line 157: | ||
| Local /scratch || duration of job ||800 GB || - ||-|| ✓✓ || o | | Local /scratch || duration of job ||800 GB || - ||-|| ✓✓ || o | ||
|- | |- | ||
− | | Central NAS || flexible || flexible || ✓ || | + | | Central NAS || flexible || flexible || ✓ || ✓ || ✓ || ✓ |
|} | |} | ||
Retention time | Retention time | ||
− | * Snapshots: up to | + | * Snapshots: up to 7 days |
* Backup: up to 90 days | * Backup: up to 90 days | ||
Line 191: | Line 219: | ||
</tr> | </tr> | ||
</table> | </table> | ||
+ | |||
+ | === Globus for fast file transfer === | ||
+ | <br /> | ||
+ | [[File:Infographic-Globus-Universe-2020.png|560px|left|Infographic Globus universe|link=Globus for fast file transfer]] | ||
+ | <br clear=all> | ||
+ | see [[:Globus for fast file transfer]] | ||
== Further reading == | == Further reading == | ||
* [[Storage systems|User guide: Storage systems]] | * [[Storage systems|User guide: Storage systems]] | ||
+ | * [[Unified_quota_wrapper | Unified quota wrapper]] | ||
* [[Too_much_space_is_used_by_your_output_files|Too much space is used by your output files]] | * [[Too_much_space_is_used_by_your_output_files|Too much space is used by your output files]] | ||
* [[Best_practices_on_Lustre_parallel_file_systems|Best practices guide for Lustre file system]] | * [[Best_practices_on_Lustre_parallel_file_systems|Best practices guide for Lustre file system]] | ||
Line 201: | Line 236: | ||
<tr valign=top> | <tr valign=top> | ||
<td style="width: 30%; text-align:left"> | <td style="width: 30%; text-align:left"> | ||
− | < [[ | + | < [[Accessing the cluster|Accessing the cluster]] |
</td> | </td> | ||
<td style="width: 35%; text-align:center"> | <td style="width: 35%; text-align:center"> | ||
− | [[ | + | [[Main_Page|Home]] |
</td> | </td> | ||
<td style="width: 35%; text-align:right"> | <td style="width: 35%; text-align:right"> |
Latest revision as of 08:26, 21 October 2022
Once you can log in to the cluster, you can start setting up your calculation job, for which you need your data. Two questions therefore arise:

1. Where to store data?
2. How to transfer data?

Here, we explain the storage system on the cluster and give examples of how to transfer data between your local computer and the cluster.

Quick examples

Upload a directory from your local computer to /cluster/scratch/username ($SCRATCH) on Euler

 $ scp -r dummy_dir username@euler.ethz.ch:/cluster/scratch/username/

Log in to the cluster and check your disk space quota

 $ lquota
 +-----------------------+-------------+------------+---------------+---------------+
 | Storage location:     | Quota type: | Used:      | Soft quota:   | Hard quota:   |
 +-----------------------+-------------+------------+---------------+---------------+
 | /cluster/home/sfux    | space       |    8.85 GB |      17.18 GB |      21.47 GB |
 | /cluster/home/sfux    | files       |      25610 |        160000 |        200000 |
 +-----------------------+-------------+------------+---------------+---------------+
 | /cluster/shadow       | space       |    4.10 kB |       2.15 GB |       2.15 GB |
 | /cluster/shadow       | files       |          2 |         50000 |         50000 |
 +-----------------------+-------------+------------+---------------+---------------+
 | /cluster/scratch/sfux | space       |  237.57 kB |       2.50 TB |       2.70 TB |
 | /cluster/scratch/sfux | files       |         29 |       1000000 |       1500000 |
 +-----------------------+-------------+------------+---------------+---------------+
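lquota reports totals per storage location. To see which of your own directories contributes most to the space and file counts, a small helper along the following lines may be useful. This is only a sketch: usage_report is a hypothetical function name, not a command installed on the cluster, and the quota accounting on the cluster may count directories and links as well, so the numbers are an estimate.

```shell
#!/usr/bin/env bash
# usage_report is a hypothetical helper, not a cluster command.
# It prints the number of regular files and the disk usage of a directory.
usage_report() {
    local dir="${1:-.}"
    local files size_kb
    # Count regular files below dir (an estimate of what the file quota sees).
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    # Disk usage in units of 1 KiB (-s: one summary line, -k: kilobytes).
    size_kb=$(du -sk "$dir" | cut -f1)
    printf '%s files=%s size_kb=%s\n' "$dir" "$files" "$size_kb"
}
```

For example, `usage_report "$HOME"` prints one line for the home directory, and a loop over its subdirectories shows where most files are located.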
Personal storage for all users
$HOME
 $ cd $HOME
 $ pwd
 /cluster/home/username
- $HOME is a safe, long-term storage for critical data (program source, scripts, etc.) and is accessible only by the user (owner). This means other people cannot read its contents.
- There is a disk quota of 16/20 GB and a maximum of 160’000/200’000 files (soft/hard quota). You can check the quota with the command lquota.
- Snapshots of its content are taken every hour and every day and are stored in the hidden .snapshot directory.
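Restoring a file from a snapshot could look like the sketch below. It assumes that the hidden .snapshot directory contains one subdirectory per snapshot and that each of them mirrors the layout of $HOME; this layout, the snapshot name in the usage example, and the function name restore_from_snapshot are assumptions for illustration, so list the contents of .snapshot first to see what is actually there.

```shell
#!/usr/bin/env bash
# restore_from_snapshot SNAPSHOT_NAME RELATIVE_PATH [HOME_DIR]
# Hypothetical helper. Assumes the layout
#   HOME_DIR/.snapshot/SNAPSHOT_NAME/RELATIVE_PATH
# which may differ on the actual system.
restore_from_snapshot() {
    local snap="$1" rel="$2" home="${3:-$HOME}"
    local src="$home/.snapshot/$snap/$rel"
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # Restore next to the original with a .restored suffix so that the
    # current version is not overwritten.
    cp -a "$src" "$home/$rel.restored"
}
```

With a snapshot called hourly.0 (a made-up name), `restore_from_snapshot hourly.0 project/notes.txt` would create project/notes.txt.restored in the home directory.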
Global Scratch
 $ cd $SCRATCH
 $ pwd
 /cluster/scratch/username

Local Scratch
/scratch on each compute node ($TMPDIR)
Shareholders can buy as much space on Project and Work as they need and can manage the access rights themselves. The quota can be checked with lquota. The content is backed up multiple times per week.
Project
 $ cd /cluster/project/groupname
Similar to $HOME, but for groups: a safe, long-term storage for critical data.

Work
 $ cd /cluster/work/groupname
Similar to global scratch, but without purge: a fast, short- or medium-term storage for large computations. The folder only becomes visible once it is accessed, for example with the cd command above.
External Storage
Please note that external storage is convenient for bringing data into the cluster or for storing data for a longer time. However, we recommend not processing data directly from external storage systems in batch jobs on Euler, as this can be very slow and can put a high load on the external storage system. Instead, copy the data from the external storage system to cluster storage (home directory, personal scratch directory, project storage, work storage, or local scratch) before you process it in a batch job. After processing the data on the cluster storage system, you can copy the results back to the external storage system.
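The copy, process, copy back pattern described above can be sketched as a small shell function. The name stage_and_run, the input/ and output/ subdirectory convention, and the fallback to /tmp when $TMPDIR is not set are assumptions made for this illustration; the function is not part of the cluster software.

```shell
#!/usr/bin/env bash
# stage_and_run SRC_DIR DEST_DIR COMMAND [ARGS...]
# Hypothetical helper: copies SRC_DIR to a private directory on local
# scratch, runs COMMAND there, and copies the results to DEST_DIR.
stage_and_run() {
    local src="$1" dest="$2"
    shift 2
    local work rc
    # Private working directory; uses $TMPDIR if set, otherwise /tmp.
    work=$(mktemp -d "${TMPDIR:-/tmp}/stage.XXXXXX") || return 1
    mkdir -p "$work/input" "$work/output" "$dest"
    # Stage in: copy the contents of SRC_DIR (including hidden files).
    cp -r "$src/." "$work/input/" || { rm -rf "$work"; return 1; }
    # Run the command inside the working directory; it is expected to
    # read from input/ and write to output/.
    ( cd "$work" && "$@" )
    rc=$?
    # Stage out: copy the results back, then clean up local scratch.
    cp -r "$work/output/." "$dest/"
    rm -rf "$work"
    return "$rc"
}
```

Inside a batch job this could be called as `stage_and_run /nfs/servername/sharename/data "$SCRATCH/results" ./my_program`, where my_program is a placeholder for your own executable and the paths need to be adapted.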
Central NAS/CDS
Groups who have purchased storage on the central NAS/CDS of ETH provided by ID Systemdienste can access it on our clusters.

Other NAS
Groups who operate their own NAS can export a shared file system via NFS to Euler. The user and group IDs on the NAS need to be consistent with ETH user names and groups.

The NAS share needs to be mountable via NFSv3 (shares that only support CIFS cannot be mounted on the HPC clusters) and exported to the subnet of our HPC clusters. The NAS is then mounted automatically on our clusters under /nfs/servername/sharename
File system comparison
File system | Life span | Max size | Snapshots | Backup | Small files | Large files |
---|---|---|---|---|---|---|
$HOME | permanent | 16 GB | ✓ | ✓ | ✓ | o |
$SCRATCH | 2 weeks | 2.5 TB | - | - | o | ✓✓ |
/cluster/project | 4 years | flexible | optional | ✓ | ✓ | ✓ |
/cluster/work | 4 years | flexible | - | ✓ | o | ✓✓ |
Local /scratch | duration of job | 800 GB | - | - | ✓✓ | o |
Central NAS | flexible | flexible | ✓ | ✓ | ✓ | ✓ |
Retention time
- Snapshots: up to 7 days
- Backup: up to 90 days
Data transfer with command line tools
Using scp command

Upload dummy_file from your workstation to your home directory on Euler
 $ scp dummy_file username@euler.ethz.ch:

Download dummy_file from Euler to the current directory on your workstation
 $ scp username@euler.ethz.ch:dummy_file .

Copy a directory to Euler
 $ scp -r dummy_dir username@euler.ethz.ch:
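After copying a directory, it can be worth checking that the files arrived unchanged. The following sketch writes a checksum manifest before the transfer and verifies it afterwards. It assumes that the GNU coreutils tool sha256sum is available on both machines; the function names and the manifest file name SHA256SUMS are choices made for this example.

```shell
#!/usr/bin/env bash
# make_manifest DIR: write checksums of all files in DIR to DIR/SHA256SUMS.
# Assumes GNU coreutils sha256sum is installed.
make_manifest() {
    local dir="$1"
    ( cd "$dir" &&
      find . -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS )
}

# verify_manifest DIR: return 0 only if every file matches its checksum.
verify_manifest() {
    local dir="$1"
    # --quiet suppresses the OK lines, so only failures are printed.
    ( cd "$dir" && sha256sum -c --quiet SHA256SUMS )
}
```

Run `make_manifest dummy_dir` on the workstation before the scp command, then define the same function on the cluster and run `verify_manifest dummy_dir` there; a non-zero exit status indicates a missing or modified file.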
Example: upload a directory with rsync

Create two files in the dummy directory and use rsync to transfer the folder
 $ mkdir dummy_dir
 $ touch dummy_dir/dummy_file1 dummy_dir/dummy_file2
 $ rsync -av dummy_dir username@euler.ethz.ch:dummy_dir
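When a directory contains a very large number of small files, packing it into a single archive before the transfer can reduce the per-file overhead and keeps the file count on the target file system low. The sketch below uses tar for this; pack_dir is a hypothetical helper name, and whether packing pays off depends on your data.

```shell
#!/usr/bin/env bash
# pack_dir DIR [ARCHIVE]: pack DIR into a gzip-compressed tar archive and
# print the archive name. Hypothetical helper for illustration.
pack_dir() {
    local dir="${1%/}"
    local name
    name=$(basename "$dir")
    local archive="${2:-$name.tar.gz}"
    # -C changes to the parent directory first, so the archive contains
    # paths relative to DIR's parent (e.g. dummy_dir/dummy_file1).
    tar -czf "$archive" -C "$(dirname "$dir")" "$name" || return 1
    echo "$archive"
}
```

The archive can then be uploaded with `scp dummy_dir.tar.gz username@euler.ethz.ch:` and unpacked on the cluster with `tar -xzf dummy_dir.tar.gz`.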
Data transfer with graphical tools
Linux | macOS | Windows |
---|---|---|
FileZilla | FileZilla, Cyberduck | WinSCP, PSCP, FileZilla, Cyberduck |
WinSCP
Globus for fast file transfer
see Globus for fast file transfer
Further reading
- User guide: Storage systems
- Unified quota wrapper
- Too much space is used by your output files
- Best practices guide for Lustre file system