Setting up the MDCS
Latest revision as of 14:04, 22 April 2022
You need to perform a one-time setup on your local workstation in order to use the MATLAB Distributed Computing Server (MDCS) on Euler.
Prerequisites
Install MATLAB 9.3 (R2017b) on your workstation
The MDCS service on Euler works only with specific MATLAB versions. The currently supported version is 9.3 (R2017b). It is also possible to use the service with versions 8.1–8.2 and 8.4–9.3 (releases R2013a, R2013b, and R2014b through R2017b). You can obtain recent versions from the IT Shop of the ETH Zurich.
Configure your firewall (optional but recommended)
You must poke a hole through your firewall to use matlabpool/parpool (i.e., parfor) functionality or pmode. Using batch() or submit() does not require any special firewall rules.
Open your firewall to incoming TCP connections to ports 27370–27470 from the Euler IP ranges. In practice, opening just port 27370 instead of the whole 27370–27470 range may be sufficient.
Setting up your workstation
There are two parts to the local installation:
- Installing several supporting MATLAB function files and command scripts that interface with the Euler cluster.
- Importing the Euler cluster profile into MATLAB.
Choosing an installation directory
Before proceeding, you need to choose into which directory several supporting MATLAB function files and scripts will be installed.
- The easiest way is to use the default MATLAB user directory, which is usually Documents\MATLAB (Windows) or ~/Documents/MATLAB (Linux). Issue the disp(userpath) command in MATLAB to see which directory this is; for example,
disp(userpath);
shows
C:\Users\my_username\Documents\MATLAB
- The recommended way is to create a new directory such as C:\Users\my_username\Documents\MATLAB\Euler. You must then add this directory to MATLAB's search path. For example, add the following line
addpath('C:\Users\my_username\Documents\MATLAB\Euler');
to the startup.m file in MATLAB's default user directory. Refer to the MathWorks documentation for more details about the startup.m file.
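The recommended layout can be sketched as a startup.m fragment like the one below. It is only an illustration: the subdirectory name Euler follows the example above, and stripping a trailing path separator from userpath is a precaution for older releases, where userpath may return one. If userpath is empty, the fragment does nothing.

```matlab
% Sketch for startup.m: make sure the Euler support directory exists
% and is on MATLAB's search path.
base = strtok(userpath, pathsep);         % drop a trailing path separator if present
if ~isempty(base)
    eulerDir = fullfile(base, 'Euler');   % directory name taken from the example above
    if ~exist(eulerDir, 'dir')
        mkdir(eulerDir);                  % create the directory on first use
    end
    addpath(eulerDir);                    % make the support files visible to MATLAB
end
clear base;
```

Computing the directory from userpath rather than hard-coding it keeps the same startup.m usable on both Windows and Linux workstations.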
Download and unpack the configuration files
- Download the required MATLAB functions and cluster profile as a .zip file.
- Unpack the Euler_MDCS_012.zip file into the directory that you chose above (it must be in your MATLAB path). You can use the free 7-Zip program to unpack .zip files under Windows.
To quickly check that these files are correctly installed and that the directory is in MATLAB's search path, start MATLAB and issue the which getSubmitString command. It should print the full path of the file:
which getSubmitString
C:\Users\my_username\Documents\MATLAB\getSubmitString.m
If the answer is instead
'getSubmitString' not found
then either the directory is not in the search path or the file is not in that directory. Double-check the steps in the Setting up your workstation section.
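The unpack-and-check steps can also be done from within MATLAB. The following is a minimal sketch under stated assumptions: the archive has already been downloaded to the current directory, the target is the default user directory, and the archive unpacks its files directly into the target rather than into a subdirectory. Adjust zipFile and targetDir to your own choices.

```matlab
% Sketch: unpack the downloaded archive and confirm the support files are visible.
zipFile   = 'Euler_MDCS_012.zip';                % assumed to be in the current directory
targetDir = strtok(userpath, pathsep);           % or the directory you chose above
if exist(zipFile, 'file') == 2 && ~isempty(targetDir)
    unzip(zipFile, targetDir);                   % extract all files into targetDir
    rehash;                                      % refresh MATLAB's view of the path
end
found = exist('getSubmitString', 'file') == 2;   % true if the function file is on the path
if found
    disp(which('getSubmitString'));
else
    disp('getSubmitString not found: check the directory and the search path.');
end
```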
Import the Euler cluster profile
Version | Release |
8.1 | R2013a |
8.2 | R2013b |
8.4 | R2014b |
8.5 | R2015a |
8.6 | R2015b |
9.0 (8.7) | R2016a |
9.1 | R2016b |
9.2 | R2017a |
9.3 | R2017b |
Import the Euler_R2017b_9.3.settings file from the above directory into MATLAB. If you are using another MATLAB version, then use the corresponding settings file instead. This step should only be done once for a given MATLAB version.
You can do this either
- via the MATLAB command line, e.g., parallel.importProfile('Euler_R2017b_9.3.settings') (or the appropriate version) with the proper path:
parallel.importProfile('C:\Users\my_username\Documents\MATLAB\Euler_R2017b_9.3.settings')
- or by following MathWorks' instructions for the GUI.
It is recommended to name the profile Euler in case the name changes during the import.
If you switch to a newer version, you should delete the old profile and import the new one.
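Because the import should happen only once, it can help to check for an existing profile first. The sketch below assumes that the profile is named Euler and that the settings file sits in the default user directory; both are illustrative choices, not requirements of the service.

```matlab
% Sketch: import the Euler profile only if no profile named 'Euler' exists yet.
settingsFile = fullfile(strtok(userpath, pathsep), 'Euler_R2017b_9.3.settings');
profiles = parallel.clusterProfiles;             % cell array of existing profile names
if any(strcmp(profiles, 'Euler'))
    disp('A profile named Euler already exists; delete it first if it belongs to an older version.');
elseif exist(settingsFile, 'file') == 2
    imported = parallel.importProfile(settingsFile);   % returns the imported profile name
    fprintf('Imported profile: %s\n', imported);
else
    fprintf('Settings file not found: %s\n', settingsFile);
end
```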
Verifying your setup
Once you have performed the setup you can validate the Euler cluster profile. Enter your ETH username and password when asked.
By default MATLAB will use 48 cores for some test jobs, which may take a while to run if the cluster is busy. In this case, temporarily lower the number of workers used. Edit the Euler profile and change 48 to 4: select the profile, click on the Edit button, and edit the third entry. From version 9.1 (R2016b) on, you can specify the number of workers from the profile validation dialogue box.
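In addition to the validation dialogue, a single small batch job gives a quick end-to-end check that needs no firewall rules. This is only a sketch: it assumes the imported profile is named Euler, and the job may wait in the queue if the cluster is busy.

```matlab
% Sketch: submit one trivial job to the cluster and fetch its result.
c   = parcluster('Euler');            % profile name assumed to be 'Euler'
job = batch(c, @() 1 + 1, 1, {});     % function handle, one output, no inputs
wait(job);                            % block until the job has finished
out = fetchOutputs(job);              % cell array holding the single output
disp(out{1});
delete(job);                          % remove the job data from the job storage
```

You will be asked for your ETH username and password on submission, just as during validation.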
Advanced Options
Networking configuration
Setting your workstation's hostname in MATLAB
Your computer must be reachable from the Euler IP ranges mentioned above. In addition, your computer’s hostname must be resolvable in the ETH network. If you get errors about your host not being found when opening a parpool (matlabpool), then an invalid hostname may be to blame.
We suggest that you add the following lines to your startup.m file (create the file if it does not exist yet):
hostname = java.net.InetAddress.getLocalHost().getCanonicalHostName();
dotpos = hostname.indexOf('.');
if dotpos < 0
    hostname = [char(hostname), '.ethz.ch'];
end
clear dotpos;
pctconfig('hostname',char(hostname));
clear hostname;
This may not work in all cases. Before doing this, check whether the MATLAB command
java.net.InetAddress.getLocalHost.getCanonicalHostName
returns a hostname ending in ethz.ch. Otherwise, you may have to use a different command. For example, on a Linux laptop, use system('hostname -A') instead:
[status, hostname] = system('hostname -A');
% strtrim removes the trailing whitespace and newline that system() returns
pctconfig('hostname',strtrim(hostname));
clear hostname;
clear status;
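To see which hostname MATLAB will announce and whether it resolves at all, a check along the following lines may help. It is a sketch only: it tests name resolution from your own workstation, which is necessary but does not prove that the name is resolvable from the cluster.

```matlab
% Sketch: show the hostname currently configured for the Parallel Computing
% Toolbox and try to resolve it locally.
cfg  = pctconfig();                   % struct with fields 'hostname' and 'portrange'
name = cfg.hostname;
try
    addr = java.net.InetAddress.getByName(name);   % throws if the name is unknown
    fprintf('%s resolves to %s\n', name, char(addr.getHostAddress()));
    resolved = true;
catch
    fprintf('%s could not be resolved from this machine.\n', name);
    resolved = false;
end
```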
Changing incoming port range
The incoming port range may be changed with the pctconfig MATLAB command; for example
pctconfig('portrange',[27370 27371])
As with the hostname described in the previous subsection, this setting must be applied at the beginning of each session, before any Parallel Computing Toolbox features are used.
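Before narrowing the port range, you may want to confirm that the chosen ports are not already taken on your workstation, for example by another pool. The sketch below tries to bind each port briefly; the range is the one from the example above. It says nothing about the firewall, only about local availability.

```matlab
% Sketch: test whether the ports of the chosen range can be bound locally.
ports   = 27370:27371;                % range from the example above
canBind = false(size(ports));
for k = 1:numel(ports)
    try
        sock = java.net.ServerSocket(ports(k));   % fails if the port is in use
        sock.close();
        canBind(k) = true;
    catch
        % leave canBind(k) false: the port is occupied or not permitted
    end
end
fprintf('port %d: available = %d\n', [ports; canBind]);
```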
Troubleshooting
General
Using the MDCS service, especially parpool, often results in hard-to-diagnose errors. Many of these errors are related to running several pools at the same time, which is not what MATLAB expects.
If you encounter persistent problems starting pools, try one of the following steps. Before performing them, make sure that no MATLAB processes are running.
- Remove the matlab_metadata.mat file and the stale Job* files in your current working directory on your workstation.
- Remove the $HOME/.matlab/local_cluster_jobs directory on your workstation. The actual location may depend on your operating system or installation options.
- Remove the entire $HOME/.matlab directory on your workstation. Warning: Your MATLAB settings will be lost.
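Before deleting anything, it can be useful to list what the steps above would remove. The sketch below only reports candidates and deletes nothing; run it in a MATLAB session and then close MATLAB before doing the actual clean-up. It assumes that the local_cluster_jobs directory lives next to the release-specific preferences directory returned by prefdir, which may differ on your installation.

```matlab
% Sketch: list the files and directories named in the steps above.
jobFiles  = dir(fullfile(pwd, 'Job*'));                  % stale job files and directories
metaFiles = dir(fullfile(pwd, 'matlab_metadat*.mat'));   % job storage metadata file
localJobs = fullfile(fileparts(prefdir), 'local_cluster_jobs');   % assumed location
fprintf('%d Job* entries and %d metadata file(s) in %s\n', ...
        numel(jobFiles), numel(metaFiles), pwd);
if exist(localJobs, 'dir')
    fprintf('Local job directory found: %s\n', localJobs);
else
    fprintf('No local job directory at: %s\n', localJobs);
end
```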
Resetting or Forgetting the Username
Your username is saved as a MATLAB preference in the ETHCalculus preference group. If you have mistyped it or want to delete the saved username, then issue the
rmpref('ETHCalculus')
MATLAB command to clear all the Calculus preferences. Unlike the username, your password is not saved as a preference. It remains valid for the entire MATLAB session, but you will need to retype it every time you restart MATLAB.
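Since rmpref raises an error when the preference group does not exist, a guarded variant can be convenient. This sketch shows the stored preferences before removing them; only the group name ETHCalculus comes from the text above.

```matlab
% Sketch: inspect and remove the saved preferences, if there are any.
group = 'ETHCalculus';
if ispref(group)
    disp(getpref(group));     % struct with all preferences stored in the group
    rmpref(group);            % remove the whole group, including the username
else
    disp('No ETHCalculus preferences are stored.');
end
```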
JVM IOException: null
If you get an error message
MatlabPoolPeerInstance{fUuid=b524f061-3dba-4299-aebe-63bd3a6455e3, fGroupUuid=823ea93c-df6a-4972-817c-594370e08154, fLabIndex=1, fNumberOfLabs=240} was unable to connect to xbx-bburling-05/10.162.30.200:27370 due to a JVM IOException: null
then this indicates a firewall issue according to MathWorks support.
Please check whether your firewall is configured correctly to allow the cluster to connect to your local computer. If the firewall settings on your local computer are correct, then please ask the person responsible for IT at your institute/department whether there is any other firewall between your local computer and the cluster. If this is the case, then that firewall also needs to have the required ports open.
JVM UnknownHostException: null
If you get an error message
MatlabPoolPeerInstance{fUuid=f28f24f8-c8bd-4c09-ac4a-19d1dec8e9dd, fGroupUuid=5451eae9-8043-4ada-a10c-92085c32827b, fLabIndex=1, fNumberOfLabs=8} was unable find the host for bowser:27371 due to a JVM UnknownHostException: null
then, according to MathWorks support, this indicates a problem with resolving the hostname of your local computer.
Please make sure that you correctly set the hostname of your local computer in MATLAB.