Euler VI Testing

From ScientificComputing
Revision as of 12:12, 31 January 2020 by Urbanb (talk | contribs)


The new Euler VI nodes are available for beta testing. Each node has 128 cores and 512 GB of memory, and the nodes are connected in a 200 Gbps EDR InfiniBand fabric.


Select or avoid Euler VI nodes

During the testing and transition period you can force your job to use or avoid these nodes.

To force your job to run on these nodes, add either the “-R beta” or the “-R "select[model==EPYC_7742]"” option to your bsub command:

bsub -R beta [other bsub options] ./my_command
bsub -R "select[model==EPYC_7742]" [other bsub options] ./my_command

To prevent your job from running on these nodes, add the “-R stable” option to your bsub command:

bsub -R stable [other bsub options] ./my_command

If you encounter any problems running your jobs on the new Euler VI nodes, please report them to cluster support.

While you can always use the -R stable option, the -R beta option will not work after the Euler VI nodes are put into production.

Changes in behavior

If you request Euler VI nodes, then the batch system will run jobs requesting up to 128 cores on a single node.

Threaded jobs

Non-threaded (multi-node MPI) jobs

You should use the “-R "span[ptile=128]"” option (or another appropriate value instead of 128) if you intend to run multi-node jobs.
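
As a rough illustration of how the ptile value relates to the number of nodes a job occupies, the sketch below prints (but does not submit) a bsub command line for a multi-node job. The function build_bsub_cmd is a hypothetical helper, not part of LSF or of the cluster tooling, and launching ./my_command through mpirun is an assumption about how the MPI program is started. The select and span sections are combined in a single -R string.

```shell
#!/bin/bash
# Hypothetical helper (not part of LSF): prints a bsub command line for a
# multi-node MPI job on Euler VI nodes instead of submitting it.
build_bsub_cmd() {
    local total_cores=$1      # value passed to bsub -n
    local cores_per_node=$2   # value for span[ptile=...], at most 128 on Euler VI

    # Number of nodes needed, rounded up.
    local nodes=$(( (total_cores + cores_per_node - 1) / cores_per_node ))
    echo "# ${total_cores} cores at ${cores_per_node} per node: ${nodes} node(s)" >&2

    # Assumption: the program is launched with mpirun.
    echo "bsub -n ${total_cores} -R \"select[model==EPYC_7742] span[ptile=${cores_per_node}]\" mpirun ./my_command"
}

# Example: 256 cores spread over two full Euler VI nodes.
build_bsub_cmd 256 128
```

With a smaller ptile value the same number of cores is spread over more nodes, for example 256 cores with ptile=64 would occupy four nodes.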

Known issues

Troubleshooting