MATLAB/Parallel temp

From ScientificComputing
Revision as of 10:56, 7 February 2023 by Sfux (talk | contribs) (Configuration of the MATLAB client on the cluster)

MATLAB's Parallel Computing Toolbox lets you run suitably-written programs in parallel. Several cores calculate different parts of a problem at the same time to reduce the time-to-solution.

To run MATLAB in parallel, you can use a local parpool, use a SLURM parpool, or offload jobs from a MATLAB instance running on your own computer to the Euler cluster. This page documents the local parpool. Instructions for using a SLURM parpool and for submitting an asynchronous MATLAB job are provided on separate wiki pages.

Configuration of the MATLAB client on the cluster

After logging into the cluster, start MATLAB. Configure MATLAB to run parallel jobs on your cluster by calling configCluster:

[nmarounina@eu-login-28 ~]$ module load matlab/R2022b
[nmarounina@eu-login-28 ~]$ matlab -nodisplay
 
                                      < M A T L A B (R) >
                            Copyright 1984-2022 The MathWorks, Inc.
                       R2022b Update 2 (9.13.0.2105380) 64-bit (glnxa64)
                                        October 26, 2022

 
To get started, type doc.
For product information, visit www.mathworks.com.
>> 
>> configCluster
Complete.  Default cluster profile set to "euler R2022b".
>>  

configCluster needs to be called only once per version of MATLAB. Please be aware that running configCluster more than once per version will reset your cluster profile to its default settings and erase any saved modifications to the profile.

Use a local parpool

For suitable MATLAB programs (such as those containing parfor loops), using the Parallel Computing Toolbox requires two steps:

  1. request multiple cores from Euler's SLURM batch scheduler and then
  2. use a parpool in your MATLAB program

In this example, twelve cores are allocated for four hours in an interactive job that gives access to a bash terminal:

[nmarounina@eu-login-28 ~]$ srun --ntasks=1 --cpus-per-task=12 --time=04:00:00 --pty bash
srun: job 8937056 queued and waiting for resources
srun: job 8937056 has been allocated resources
[nmarounina@eu-g5-027-3 ~]$ 
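
Inside the allocated shell, Slurm exports the request in environment variables, so a planned pool size can be compared against the allocation before MATLAB is started. The following is a minimal sketch, assuming SLURM_CPUS_PER_TASK is set (Slurm sets it when --cpus-per-task is passed); pool_fits_allocation is a hypothetical helper name, not a Slurm or MATLAB command.

```shell
# pool_fits_allocation is a hypothetical helper, not part of Slurm or MATLAB.
# It succeeds if the requested pool size fits into the allocated cores.
pool_fits_allocation() {
    local pool_size="$1"
    # SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is passed;
    # fall back to 1 when running outside of a job.
    local allocated="${SLURM_CPUS_PER_TASK:-1}"
    [ "$pool_size" -le "$allocated" ]
}

# Example: check whether a pool of 4 workers fits into the current allocation.
if pool_fits_allocation 4; then
    echo "a pool of 4 workers fits into the allocation"
else
    echo "a pool of 4 workers exceeds the allocation" >&2
fi
```

With the twelve-core allocation shown above, the check for 4 workers would succeed.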

This terminal is then used to load the matlab module and start a MATLAB instance:

[nmarounina@eu-g5-027-3 ~]$ module load matlab/R2022b
[nmarounina@eu-g5-027-3 ~]$ matlab -nodisplay
 
                                      < M A T L A B (R) >
                            Copyright 1984-2022 The MathWorks, Inc.
                       R2022b Update 2 (9.13.0.2105380) 64-bit (glnxa64)
                                        October 26, 2022

 
To get started, type doc.
For product information, visit www.mathworks.com.
 
>> 

To use parpool on the allocated resources, first get a handle to the local cluster:

>> c=parcluster('local')

c = 

 Local Cluster

    Properties: 

                   Profile: Processes
                  Modified: false
                      Host: eu-g5-027-3
                NumWorkers: 12
                NumThreads: 1

        JobStorageLocation: /cluster/home/nmarounina/.matlab/local_cluster_jobs/R2022b
   RequiresOnlineLicensing: false

    Associated Jobs: 

            Number Pending: 0
             Number Queued: 0
            Number Running: 0
           Number Finished: 1
 
>>

As seen in this example, MATLAB detected that 12 cores had been requested and set the NumWorkers property to 12, which is the maximum number of workers that can be started with this SLURM resource allocation. A worker is a separate MATLAB instance that performs part of a parallel computation. By default, each worker handles a single thread. The next step is to create a pool of workers:

>> pool=parpool(c,4) 
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to the parallel pool (number of workers: 4).

pool = 

 ProcessPool with properties: 

            Connected: true
           NumWorkers: 4
                 Busy: false
              Cluster: Processes (Local Cluster)
        AttachedFiles: {}
    AutoAddClientPath: true
            FileStore: [1x1 parallel.FileStore]
           ValueStore: [1x1 parallel.ValueStore]
          IdleTimeout: 30 minutes (30 minutes remaining)
          SpmdEnabled: true

>>


A trivial parallel program (simulation.m) is shown below:

parfor i = 1:10
    squares(i) = i^2;
end
disp(squares)

Running it in the local parpool gives:

>> simulation
     1     4     9    16    25    36    49    64    81   100

>>

The pool can then be deleted using:

pool.delete()

Please note that several pools cannot be started at the same time, even if the current pool does not use the maximum number of workers available locally. All of the previous steps can be combined in a single MATLAB script (simulation2.m):

c=parcluster('local');
pool=parpool(c,4);
parfor i = 1:10
     squares(i) = i^2;
end
disp(squares)
pool.delete()

The expected output is:

>> simulation2  
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to the parallel pool (number of workers: 4).
     1     4     9    16    25    36    49    64    81   100

Parallel pool using the 'Processes' profile is shutting down.
>>

Note that the local parpool is limited to 12 cores in releases up to R2016a (8.7/9.0). From release R2016b (9.1) on, you can use all the cores of Euler nodes (effectively 24).

Older versions of MATLAB used a matlabpool instead of a parpool. To use a local pool, there is no need to load a cluster profile.

Submit a parallel job

The simulation2.m script can also be submitted as a batch job. Pass the number of cores (e.g., 4) to sbatch's --cpus-per-task argument. This number should be greater than or equal to the size of the pool requested in your MATLAB script.

sbatch --ntasks=1 --cpus-per-task=4 --time=1:00:00 --mem-per-cpu=2g --wrap="matlab -nodisplay -singleCompThread -r simulation2"

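Note that you must not use the -nojvm argument, because parallel pools require the Java Virtual Machine, but you should include the -singleCompThread argument. MATLAB is quite memory-hungry, so request at least 2 GB of memory (the command above requests 2 GB per core with --mem-per-cpu=2g).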
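
The same request can also be kept in a job script instead of a --wrap one-liner, which makes it easier to rerun and to change. The following is a sketch only: the file name run_simulation2.sh is a hypothetical choice, and it assumes that the module command is available inside batch jobs in the same way as in the interactive session above.

```shell
# Write a job script equivalent to the --wrap command above.
# run_simulation2.sh is a hypothetical file name.
cat > run_simulation2.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=1:00:00
#SBATCH --mem-per-cpu=2g

# Assumption: the module system is usable inside the batch job.
module load matlab/R2022b
matlab -nodisplay -singleCompThread -r simulation2
EOF

# Submit it from the directory that contains simulation2.m with:
#   sbatch run_simulation2.sh
```

If the pool size in simulation2.m is changed, the --cpus-per-task line in the script should be adjusted to match.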

Troubleshoot parallel jobs

Using parallel pools often results in hard-to-diagnose errors. Many of these errors are related to running several pools at the same time, which is not what MATLAB expects. If you encounter persistent problems starting pools, try one of the following steps. Before doing so, make sure that you do not have any MATLAB processes running.

  1. Remove the matlab_metadata.mat file in your current working directory.
  2. Remove the $HOME/.matlab/local_cluster_jobs directory.
  3. Remove the entire $HOME/.matlab directory. Warning: Your MATLAB settings on Euler will be lost.
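
The first two steps of the list above could be scripted roughly as follows. This is a sketch under assumptions: clean_matlab_pool_state is a hypothetical helper name, the metadata file is assumed to be named matlab_metadata.mat, and running MATLAB processes are assumed to show up under the process name MATLAB. Step 3 is deliberately left as a manual action because it discards all MATLAB settings.

```shell
# clean_matlab_pool_state is a hypothetical helper, not a MATLAB or Euler tool.
# Usage: clean_matlab_pool_state [working_dir] [matlab_settings_dir]
clean_matlab_pool_state() {
    local workdir="${1:-.}"
    local matlab_dir="${2:-$HOME/.matlab}"

    # Refuse to delete anything while one of your MATLAB processes is running.
    # Assumption: the processes appear under the name "MATLAB".
    if pgrep -u "$(id -u)" -x MATLAB >/dev/null 2>&1; then
        echo "MATLAB is still running; close it first." >&2
        return 1
    fi

    # Step 1: remove the metadata file in the working directory (assumed name).
    rm -f -- "$workdir/matlab_metadata.mat"

    # Step 2: remove the job storage directory of the local cluster.
    rm -rf -- "$matlab_dir/local_cluster_jobs"
}
```

Calling clean_matlab_pool_state without arguments acts on the current directory and on $HOME/.matlab.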

If this does not solve the problem, please contact cluster support.