# MATLAB PCT

## Revision as of 11:59, 9 August 2021

MATLAB's Parallel Computing Toolbox (PCT) lets you run suitably-written programs in parallel or as a set of independent jobs. Several cores calculate different parts of a problem, possibly at the same time, to reduce the total time-to-solution.

A trivial program that uses a *parpool* (a pool of workers) is shown below. It calculates the squares of the first ten integers in parallel and stores them in an array:

    squares = zeros(10,1);
    pool = parpool(4);
    parfor i = 1:10
        squares(i) = i^2;
    end
    disp(squares)
    pool.delete()

You can use the Parallel Computing Toolbox (PCT) on Euler in two ways; which one is better depends on the properties of your program. One is to submit a job that requests multiple cores to the batch system and use the **local** parpool. The parallel part of your program (for example, the parfor loop above) will run within your job. The other is to submit a single-core master job and use the **LSF** parpool. MATLAB will itself submit a parallel job to compute *just* the parallel part of your program.

## Use a local parpool

When you use the local parpool, you submit a multi-core job to LSF. MATLAB will run additional worker processes within your multi-core job to process the parallel part of your program. A diagram of this is shown to the right.

A trivial parallel program (`simulation.m`) is shown below:

    squares = zeros(10,1);
    local_job = parcluster('local');
    pool = parpool(local_job, 4);
    parfor i = 1:10
        squares(i) = i^2;
    end
    disp(squares)
    pool.delete()

To submit this program, pass the number of cores to bsub's `-n` argument. This should match the size of the pool requested in your MATLAB script (e.g., 4).

    bsub -n 4 -W "1:00" -R "rusage[mem=2048]" matlab -nodisplay -singleCompThread -r simulation

You must *not* use the `-nojvm` MATLAB argument but you *should* include the `-singleCompThread` MATLAB argument. MATLAB is quite memory-hungry, so request at least 2 GB of memory per core as shown above.
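
Because the `-n` value must match the pool size and the memory request is per core, it can help to derive the `bsub` command from one place. The following is a minimal sketch, not an official Euler script: the variable names `NCORES`, `MEM_PER_CORE_MB`, `TOTAL_MEM_MB` and `BSUB_CMD` are hypothetical, and the command is only printed so that you can inspect it before running it yourself. It still assumes that `simulation.m` requests a pool of the same size.

```shell
#!/bin/bash
# Hypothetical helper variables; adjust them to your job.
NCORES=4              # must equal the pool size requested in simulation.m
MEM_PER_CORE_MB=2048  # at least 2 GB per core, as recommended above

# Total memory the job asks for, since the rusage value is per core.
TOTAL_MEM_MB=$((NCORES * MEM_PER_CORE_MB))

# Build the submission command as a string and print it instead of running it.
BSUB_CMD="bsub -n ${NCORES} -W \"1:00\" -R \"rusage[mem=${MEM_PER_CORE_MB}]\" matlab -nodisplay -singleCompThread -r simulation"

echo "Total memory requested: ${TOTAL_MEM_MB} MB"
echo "${BSUB_CMD}"
```

With the values above, the printed command is the same as the one shown earlier in this section.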

The **local** parpool is limited to 12 cores in releases up to R2016a (8.7/9.0). From release R2016b (9.1) on, you can use all the cores of an Euler node (effectively up to 128).

Older versions of MATLAB used a matlabpool instead of a parpool.

## LSF parpool

### Set up MATLAB to use the LSF parpool

**One-time** preparation: Before using the LSF parpool for the first time, you need to import a cluster profile. We provide a settings template, `/cluster/apps/matlab/support/EulerLSF8h.settings`, that you can import into MATLAB. You can follow the instructions provided by MATLAB or import it from the MATLAB prompt:

    parallel.importProfile('/cluster/apps/matlab/support/EulerLSF8h.settings')

Do this only **once** (or every once in a while, to get possible updates). **Do not** import the profile in every job you run.
These settings allow you to use up to 144 cores, but the default pool size is 24 cores because guest users on Euler can use no more than 48 cores.

### Use an LSF parpool

When you use the LSF parpool, you submit a single-core job to LSF. MATLAB will submit an additional parallel job to run the MATLAB workers to process the parallel part of your program. A diagram of this is shown to the right.

A trivial parallel program (`simulation.m`) is shown below:

    squares = zeros(10,1);
    batch_job = parcluster('EulerLSF8h');
    pool = parpool(batch_job, 4);
    parfor i = 1:10
        squares(i) = i^2;
    end
    disp(squares)
    pool.delete()

To run this program, submit your MATLAB program (the master job) as a serial (single-core) job:

    bsub -R light -n 1 -W "120:00" -R "rusage[mem=2048]" matlab -nodisplay -singleCompThread -r simulation

The master job is assumed not to need much CPU power; however, it may need to run for a long time because it has to wait for the parallel pool job to start and run. This is why you should submit the master job with the `-R light` option and request ample runtime.

You must *not* use the `-nojvm` MATLAB argument but you *should* include the `-singleCompThread` MATLAB argument. MATLAB is quite memory-hungry, so request at least 2 GB of memory as shown above.

Older versions of MATLAB used a matlabpool instead of a parpool.

### Change the settings of an LSF parpool

You can change the settings of the LSF jobs that the LSF parpool will submit, such as requesting more time or memory. To do this, you must edit the LSF parameters MATLAB will use when submitting the parallel job. For example, if you want to request 24 hours and 4 GB of RAM per worker in the above example, then add the `batch_job.SubmitArguments` line shown below:

    squares = zeros(10,1);
    batch_job = parcluster('EulerLSF8h');
    batch_job.SubmitArguments = '-W 24:00 -R "rusage[mem=4000]"';
    pool = parpool(batch_job, 4);
    parfor i = 1:10
        squares(i) = i^2;
    end
    disp(squares)
    pool.delete()

## Troubleshoot parallel jobs

Using parallel pools often results in hard-to-diagnose errors. Many of these errors are related to running several pools at the same time, which is not what MATLAB expects. If you encounter persistent problems starting pools, try one of the following steps. Before you do, make sure that you do not have any MATLAB processes running.

- Remove the `matlab_metadata.mat` file in your current working directory.
- Remove the `$HOME/.matlab/local_cluster_jobs` directory.
- Remove the entire `$HOME/.matlab` directory. **Warning**: Your MATLAB settings on Euler will be lost and you will need to re-import the pool profile.
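
The steps above can be sketched as a small shell function. This is an illustration under stated assumptions, not a tool provided on Euler: the function name `matlab_pool_cleanup` and its arguments are hypothetical, the metadata file is assumed to be called `matlab_metadata.mat` (pass a different name as the second argument if yours differs), and the check for running processes assumes that MATLAB shows up under the process name `MATLAB`. If `pgrep` is not available, the check is silently skipped.

```shell
#!/bin/bash
# Sketch of the cleanup steps. The function name and its arguments are
# hypothetical; they are not part of MATLAB or of the Euler environment.
matlab_pool_cleanup() {
    local level="${1:-1}"
    # Assumption: default name of the metadata file; override with argument 2.
    local metadata_file="${2:-matlab_metadata.mat}"

    # Refuse to clean up while this user still has MATLAB running.
    # Assumption: the MATLAB process is named "MATLAB".
    if pgrep -u "$(id -u)" -x MATLAB >/dev/null 2>&1; then
        echo "MATLAB is still running; stop it first." >&2
        return 1
    fi

    case "$level" in
        1)  # Remove the metadata file in the current working directory.
            rm -f -- "./${metadata_file}" ;;
        2)  # Remove the directory that stores the local cluster jobs.
            rm -rf -- "${HOME}/.matlab/local_cluster_jobs" ;;
        3)  # Remove all MATLAB settings; the pool profile must be
            # re-imported afterwards.
            rm -rf -- "${HOME}/.matlab" ;;
        *)  echo "usage: matlab_pool_cleanup {1|2|3} [metadata_file]" >&2
            return 2 ;;
    esac
}
```

For example, `matlab_pool_cleanup 1` removes only the metadata file. It is probably sensible to try `2` and then `3` only if the problem persists, since level `3` discards all of your MATLAB settings.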