Setting up the MDCS

From ScientificComputing

Revision as of 09:02, 9 January 2020

You need to perform a one-time setup on your local workstation in order to use the MATLAB Distributed Computing Server (MDCS) on Euler.

Prerequisites

Install MATLAB 9.3 (R2017b) on your workstation

The MDCS service on Euler works only with specific MATLAB versions. The currently supported version is 9.3 (R2017b). It is also possible to use the service with versions 8.1–8.2 and 8.4–9.3 (releases R2013a, R2013b, and R2014b through R2017b). You can obtain recent versions from the ETH Zurich IT Shop.
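As a quick check before continuing, the release of the running MATLAB can be compared against the releases named above. This is only a minimal sketch; the variable names are illustrative and the list simply restates the releases mentioned in this section.

```matlab
% Releases named above as usable with the MDCS service on Euler.
supportedReleases = {'2013a', '2013b', '2014b', '2015a', '2015b', ...
                     '2016a', '2016b', '2017a', '2017b'};

% version('-release') returns the release without the leading "R", e.g. '2017b'.
currentRelease = version('-release');
isSupported = any(strcmp(currentRelease, supportedReleases));

if isSupported
    fprintf('MATLAB R%s can be used with the Euler MDCS service.\n', currentRelease);
else
    warning('MATLAB R%s is not in the list of supported releases.', currentRelease);
end
```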

Configure your firewall (optional but recommended)

To use matlabpool/parpool (i.e., parfor) functionality or pmode, you must open your firewall as described below. Using batch() or submit() does not require any special firewall rules.

Open your firewall to incoming TCP connections to ports 27370–27470 from the Euler IP ranges 10.205.0.0/19 and 10.205.96.0/19. In practice, opening just port 27370 instead of the whole 27370–27470 range may be sufficient.

Setting up your workstation

There are two parts to the local installation:

  1. Installing several supporting MATLAB function files and command scripts that interface with the Euler cluster.
  2. Importing the Euler cluster profile into MATLAB.

Choosing an installation directory

Before proceeding, choose the directory in which the supporting MATLAB function files and scripts will be installed.

The easiest way is to use the default MATLAB user directory, which is usually Documents\MATLAB (Windows) or ~/Documents/MATLAB (Linux). Issue the disp(userpath) command in the MATLAB Command Window to see which directory this is; for example:
disp(userpath);
shows
C:\Users\my_username\Documents\MATLAB

The recommended way is to create a new directory such as C:\Users\my_username\Documents\MATLAB\Euler. You must then add this directory to MATLAB's search path. For example, add the following line:
addpath('C:\Users\my_username\Documents\MATLAB\Euler');
to the startup.m file in MATLAB's default user directory. Refer to the MathWorks documentation for more details about the startup.m file.
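A more portable variant of the startup.m line builds the path from userpath instead of hard-coding it. The following is a sketch under the assumption that the support files live in a subdirectory called Euler inside the default user directory, as in the example above; adjust the name if yours differs.

```matlab
% Fragment for startup.m (sketch).
% Older releases return userpath with a trailing path separator,
% so keep only the first entry.
baseDir = strtok(userpath, pathsep);
eulerDir = fullfile(baseDir, 'Euler');   % assumed subdirectory name

if exist(eulerDir, 'dir') == 7
    addpath(eulerDir);
else
    warning('Euler support directory not found: %s', eulerDir);
end
```

The existence check avoids a path warning at every MATLAB start if the directory is later moved or renamed.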

Download and unpack the configuration files

  1. Download the required MATLAB functions and cluster profile as a .zip file.
  2. Unpack the Euler_MDCS_012.zip file into the directory that you chose above (it must be in your MATLAB path). You can use the free 7-Zip program to unpack .zip files under Windows.

To quickly check that these files are correctly installed and that the directory is in MATLAB's search path, start MATLAB and issue the which getSubmitString command. It should print the full path of the file:

which getSubmitString
C:\Users\my_username\Documents\MATLAB\getSubmitString.m

If the answer is instead

'getSubmitString' not found

then either the directory is not in the search path or the files were not unpacked into it. Double-check the steps in the Setting up your workstation section.
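The same check can be scripted, which is convenient when setting up several workstations. This is a minimal sketch; the variable name isInstalled is illustrative.

```matlab
% exist(..., 'file') returns 2 when the name resolves to a file on the search path.
isInstalled = (exist('getSubmitString', 'file') == 2);

if isInstalled
    fprintf('Found: %s\n', which('getSubmitString'));
else
    fprintf('getSubmitString is not on the search path.\n');
    fprintf('Current user directory: %s\n', userpath);
end
```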

Import the Euler cluster profile

Version      Release
8.1          R2013a
8.2          R2013b
8.4          R2014b
8.5          R2015a
8.6          R2015b
9.0 (8.7)    R2016a
9.1          R2016b
9.2          R2017a
9.3          R2017b

Import the Euler_R2017b_9.3.settings file from the above directory into MATLAB. If you are using another MATLAB version, then use the corresponding settings file instead. This step should only be done once for a given MATLAB version.

You can do this either

  • via the MATLAB command line, by calling parallel.importProfile with the full path to the settings file for your version, e.g.,
    parallel.importProfile('C:\Users\my_username\Documents\MATLAB\Euler_R2017b_9.3.settings'), or
  • via the GUI, following MathWorks' instructions.

If the profile name changes during the import, it is recommended to rename the profile to Euler.

If you switch to a newer version, you should delete the old profile and import the new one.
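Because the import should happen only once per MATLAB version, it can be guarded by a check for an existing profile. The sketch below assumes the Parallel Computing Toolbox is installed, that the profile is named Euler, and that the settings file was unpacked into the default user directory; change settingsFile if you chose a different directory or version.

```matlab
% Assumed location of the settings file; adjust to your installation directory.
settingsFile = fullfile(strtok(userpath, pathsep), 'Euler_R2017b_9.3.settings');

existingProfiles = parallel.clusterProfiles();   % cell array of profile names
if any(strcmp('Euler', existingProfiles))
    disp('A profile named Euler already exists; nothing imported.');
elseif exist(settingsFile, 'file') == 2
    importedName = parallel.importProfile(settingsFile);
    fprintf('Imported profile: %s\n', importedName);
else
    warning('Settings file not found: %s', settingsFile);
end
```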

Verifying your setup

Once you have performed the setup, you can validate the Euler cluster profile. Enter your ETH username and password when asked.

By default, MATLAB uses 48 cores for some test jobs, which may take a while to run if the cluster is busy. In this case, temporarily lower the number of workers used. Edit the Euler profile and change 48 to 4: select the profile, click on the Edit button, and edit the third entry. From version 9.1 (R2016b) on, you can specify the number of workers from the profile validation dialogue box.
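In addition to the built-in validation, a single small batch job makes a lightweight smoke test, since batch() needs no firewall rules. This is a sketch that assumes the imported profile is named Euler; it is not part of the official validation procedure.

```matlab
% Open the cluster described by the imported profile (assumed name: 'Euler').
cluster = parcluster('Euler');

% Run sum([1 2 3]) as a batch job with one output argument.
job = batch(cluster, @sum, 1, {[1 2 3]});
wait(job);                      % blocks until the job has finished on the cluster

results = fetchOutputs(job);    % cell array with one entry per output
disp(results{1});               % sum([1 2 3]) is 6
delete(job);                    % remove the finished job from the job storage
```

If the cluster is busy, the call to wait may block for a while, just as the validation tests do.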

Advanced Options

Networking configuration

Setting your workstation's hostname in MATLAB

Your computer must be reachable from the Euler IP ranges given in the firewall section above. In addition, your computer's hostname must be resolvable in the ETH network. If you get errors about your host not being found when opening a parpool (matlabpool), then an invalid hostname may be to blame.

We suggest adding the following lines to your startup.m file (create the file if it does not exist):

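% Determine the fully qualified hostname of this workstation.
hostname = char(java.net.InetAddress.getLocalHost().getCanonicalHostName());
% If the name has no domain part, append the ETH domain.
if isempty(strfind(hostname, '.'))
    hostname = [hostname, '.ethz.ch'];
end
% Tell MATLAB which hostname the cluster workers should connect back to.
pctconfig('hostname', hostname);
clear hostname;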

This may not work in all cases. Before doing this, check whether the MATLAB command

java.net.InetAddress.getLocalHost.getCanonicalHostName

returns a hostname ending in ethz.ch. If it does not, you may have to use a different command. For example, on a Linux laptop, use system('hostname -A') instead:

[status, hostname] = system('hostname -A');
% The output ends with a newline and may list several names; keep the first one.
hostname = strtok(strtrim(hostname));
pctconfig('hostname', hostname);
clear hostname status;

Changing incoming port range

The incoming port range may be changed with the pctconfig MATLAB command; for example

pctconfig('portrange',[27370 27371])

Like the hostname, this setting must be applied towards the beginning of a session, for example in startup.m as described in the previous subsection.
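To confirm which hostname and port range are currently in effect, the configuration can be read back. This is a minimal sketch that relies on pctconfig returning a structure with hostname and portrange fields when called without arguments.

```matlab
% Read the current configuration without changing it.
config = pctconfig();
fprintf('Hostname:   %s\n', config.hostname);
fprintf('Port range: %d-%d\n', config.portrange(1), config.portrange(2));
```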

Troubleshooting

General

Using the MDCS service, especially parpool, often results in hard-to-diagnose errors. Many of these errors are caused by running several pools at the same time, which MATLAB does not expect. If you encounter persistent problems starting pools, try the following steps, starting with the least invasive. Before starting, make sure that no MATLAB processes are running.

  1. Remove the matlab_metadata.mat file and the stale Job* files in your current working directory on your workstation.
  2. Remove the $HOME/.matlab/local_cluster_jobs directory on your workstation. The actual location may depend on your operating system or installation options.
  3. Remove the entire $HOME/.matlab directory on your workstation. Warning: Your MATLAB settings will be lost.

Resetting or Forgetting the Username

Your username is saved as a MATLAB preference in the ETHCalculus preference group. If you have mistyped it or want to delete the saved username, then issue the MATLAB command

rmpref('ETHCalculus')

to clear all the ETHCalculus preferences. Unlike the username, your password is not saved as a preference. It remains valid for the entire MATLAB session, but you will need to retype it every time you restart MATLAB.
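Since rmpref raises an error when the preference group does not exist, a guarded version may be more convenient in scripts. The sketch below first displays what is stored and then removes the group.

```matlab
% Show what is stored in the ETHCalculus group, then remove it.
if ispref('ETHCalculus')
    disp(getpref('ETHCalculus'));   % struct with the saved preferences
    rmpref('ETHCalculus');
else
    disp('No ETHCalculus preferences are stored.');
end
```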

JVM IOException: null

If you get an error message

MatlabPoolPeerInstance{fUuid=b524f061-3dba-4299-aebe-63bd3a6455e3, fGroupUuid=823ea93c-df6a-4972-817c-594370e08154, fLabIndex=1, fNumberOfLabs=240} was unable to connect to xbx-bburling-05/10.162.30.200:27370 due to a JVM IOException: null

then this indicates a firewall issue according to MathWorks support.

Please check whether your firewall is configured correctly to allow the cluster to connect to your local computer. If the firewall settings on your local computer are correct, then please ask the IT support of your institute or department whether there is any other firewall between your local computer and the cluster. If there is, this firewall also needs to have the required ports open.

JVM UnknownHostException: null

If you get an error message

MatlabPoolPeerInstance{fUuid=f28f24f8-c8bd-4c09-ac4a-19d1dec8e9dd, fGroupUuid=5451eae9-8043-4ada-a10c-92085c32827b, fLabIndex=1, fNumberOfLabs=8} was unable find the host for bowser:27371 due to a JVM UnknownHostException: null

then, according to MathWorks support, this indicates a problem with resolving the hostname of your local computer.

Please make sure that you have correctly set the hostname of your local computer in MATLAB, as described in the Setting your workstation's hostname in MATLAB section.