Parallel Abaqus Jobs

From ScientificComputing
Revision as of 08:01, 25 April 2024 by Sfux (talk | contribs) (Jobs using MPI)


Introduction

Abaqus does not support the Slurm batch system. You therefore need to submit your jobs correctly to make sure that they use the resources allocated by the batch system.

Jobs using multiple threads

Most nodes in the Euler cluster have 128 cores; some have 192. When running Abaqus in parallel, you can use threads mode for jobs with up to 192 cores (jobs spend less time in the queue if you request at most 128 cores, since far more nodes have 128 cores than 192).

When using threads mode in Abaqus, you need to make sure that you request all cores on a single node. Please find below an example job script for Slurm:

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=08:00:00
#SBATCH --mem-per-cpu=4000
#SBATCH --tmp=50g

module load intel/2022.1.2 abaqus/2023

abaqus job=test cpus=8 input=my_abaqus_input_file scratch=$TMPDIR mp_mode=THREADS

Using --ntasks=1 and --cpus-per-task=8 makes sure that all 8 cores are allocated on a single node.
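The hard-coded cpus=8 in the script above has to match --cpus-per-task. As a sketch (not part of the official script), the two can be kept in sync by reading the standard SLURM_CPUS_PER_TASK environment variable; the fallback of 1 is only so the snippet also runs outside a Slurm job:

```shell
# Read the core count Slurm exports for --cpus-per-task; default to 1
# when the snippet is run outside of a Slurm job.
ncpus="${SLURM_CPUS_PER_TASK:-1}"

# Build the Abaqus command line so cpus= always matches the allocation.
abaqus_cmd="abaqus job=test cpus=${ncpus} input=my_abaqus_input_file scratch=\$TMPDIR mp_mode=THREADS"
echo "$abaqus_cmd"
```

With this variant, changing --cpus-per-task in the #SBATCH header is enough; the abaqus command line follows automatically.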

Jobs using MPI

Abaqus ships with several MPI implementations. We have tested Abaqus on Euler with the default Intel MPI implementation. Since Abaqus does not support the Slurm batch system, you need to pass the host list to the software in the format mp_host_list=[['host1',n1],['host2',n2],...,['hostx',nx]]. The host list needs to be written into an Abaqus environment file abaqus_v6.env. Abaqus checks three locations for environment files:

  • install_directory/os/SMA/site/abaqus_v6.env
  • $HOME/abaqus_v6.env
  • current_directory/abaqus_v6.env

The host list is specific to each job, so we recommend writing it into a file in the same directory as the input file.
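As an illustration, here is what such a per-job environment file could look like for a job that was allocated 4 tasks on each of two nodes (the hostnames are invented for this sketch):

```shell
# Write a minimal abaqus_v6.env into the job directory; the hostnames
# 'eu-a2-001' and 'eu-a2-002' are made up for this example.
cat > abaqus_v6.env << 'EOF'
mp_host_list=[['eu-a2-001',4],['eu-a2-002',4]]
EOF
cat abaqus_v6.env
```

In practice you do not write this file by hand; the job script below generates it from the actual allocation.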

Please find below an example Slurm job script to run Abaqus with MPI on multiple compute nodes:

#!/bin/bash
#SBATCH -n 8
#SBATCH --nodes=2
#SBATCH --tasks-per-node=4
#SBATCH --time=08:00:00
#SBATCH --mem-per-cpu=4000
#SBATCH --tmp=50g
#SBATCH --constraint=ib

module load intel/2022.1.2 abaqus/2023

unset SLURM_GTIDS

echo "mp_host_list=[$(srun hostname | uniq -c | awk '{print("[\047"$2"\047,"$1"]")}' | paste -s -d ",")]" > abaqus_v6.env 

abaqus job=test cpus=8 input=my_abaqus_input_file scratch=$TMPDIR mp_mode=MPI

Please note the following differences to running Abaqus in threads mode:

  • #SBATCH --constraint=ib makes sure that the job is allocated nodes with the fast InfiniBand interconnect, which handles internode communication efficiently
  • It is important to unset the variable SLURM_GTIDS, because the MPI implementations bundled with Abaqus do not support Slurm
  • The example above explicitly requests multiple nodes (--nodes=2) and specifies the number of cores per node (--tasks-per-node=4), but the commands for creating the host list also work when you only request a total number of cores (--ntasks=8) and let Slurm decide whether those are allocated on one or multiple hosts
  • The job script creates a file abaqus_v6.env in the current directory. If such a file already exists, it will be overwritten
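The srun hostname pipeline in the job script can be tried outside of a job. The sketch below substitutes a printf with made-up hostnames for srun, to show how uniq, awk, and paste turn the one-line-per-task output into the mp_host_list format:

```shell
# Simulate 'srun hostname' output for a job with 2 tasks on each of two
# (invented) hosts; inside a real job, srun prints one line per task.
hosts=$(printf 'eu-a2-001\neu-a2-001\neu-a2-002\neu-a2-002\n')

# Count tasks per host and format each host as ['host',n] (\047 is the
# octal escape for a single quote), then join the lines with commas.
list=$(echo "$hosts" | uniq -c | awk '{print("[\047"$2"\047,"$1"]")}' | paste -s -d ",")

echo "mp_host_list=[$list]"
# Prints: mp_host_list=[['eu-a2-001',2],['eu-a2-002',2]]
```

Note that uniq -c only merges adjacent identical lines, which is sufficient here because srun reports the tasks of a node in contiguous blocks.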