Distributed computing in R with Rmpi
Revision as of 16:59, 6 October 2021
Load modules and install Rmpi
Change to the new software stack and load the required modules (here, the MPI and R libraries). Then start R and install Rmpi.
$ env2lmod
$ module load gcc/6.3.0 openmpi/2.1.1 r/4.0.2
$ R
> install.packages("Rmpi")
Run R in an interactive session
Rmpi assigns one processor to be the master and the other processors to be workers. Here we would like to use 4 processors for computation. Therefore, we request 5 processors: one master plus four workers.
$ bsub -n 5 -W 02:00 -I bash
Generic job.
Job <155200980> is submitted to queue <normal.4h>.
<<Waiting for dispatch ...>>
<<Starting on eu-c7-105-05>>
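The "workers plus one master" rule can be sketched in the shell as follows. The variable names (NWORKERS, NPROCS, WALLTIME, BSUB_CMD) are hypothetical and only serve this illustration; the bsub flags are the ones used above. The command is printed rather than executed, so the sketch can be tried anywhere.

```shell
#!/bin/bash
# Hypothetical variable names; only the bsub flags come from the tutorial.
NWORKERS=4                    # processors wanted for the actual computation
NPROCS=$((NWORKERS + 1))      # one extra processor for the Rmpi master
WALLTIME="02:00"              # run-time limit in hh:mm

# Build and print the request instead of submitting it.
BSUB_CMD="bsub -n ${NPROCS} -W ${WALLTIME} -I bash"
echo "${BSUB_CMD}"
```

Changing NWORKERS is then the only edit needed when a different number of workers is wanted.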
Define the total number of available processors with the environment variable MPI_UNIVERSE_SIZE.
$ export MPI_UNIVERSE_SIZE=5
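Instead of typing the number by hand, the value could be taken from the job itself. The following is a minimal sketch, assuming the LSF variable LSB_DJOB_NUMPROC (the number of slots allocated to the job) is set inside the session; the fallback of 5 and the name NWORKERS are assumptions of this sketch, chosen to match the request above.

```shell
#!/bin/bash
# LSB_DJOB_NUMPROC is set by LSF inside a job. The fallback value 5 is an
# assumption that mirrors the "bsub -n 5" request in this tutorial.
export MPI_UNIVERSE_SIZE="${LSB_DJOB_NUMPROC:-5}"

# One processor is reserved for the master, the rest are workers.
NWORKERS=$((MPI_UNIVERSE_SIZE - 1))
echo "universe size: ${MPI_UNIVERSE_SIZE}, workers: ${NWORKERS}"
```

This keeps the universe size consistent with the number of processors that was actually granted.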
Start R
$ R
>
Use Rmpi
1. Load Rmpi, which calls mpi.initialize()
> library(Rmpi)
2. Spawn the R slaves on the host. The number of slaves (nslaves) is the requested number of processors minus 1
> usize <- as.numeric(Sys.getenv("MPI_UNIVERSE_SIZE"))
> ns <- usize - 1
> mpi.spawn.Rslaves(nslaves=ns)
3. Set up a variable array with one value per worker
> var = c(11.0, 22.0, 33.0, 44.0)
4. Root sends the state variables and parameters to the other ranks
> mpi.bcast.data2slave(var, comm = 1, buffunit = 100)
5. Each rank stores its own rank number in id
> mpi.bcast.cmd(id <- mpi.comm.rank())
6. Check if each rank can use its own value
> mpi.remote.exec(paste("The variable on rank ",id," is ", var[id]))
7. Root orders the other ranks to calculate
> mpi.bcast.cmd(output <- var[id]*2)
8. Root orders the other ranks to take part in gathering the output (type 2 means double)
> mpi.bcast.cmd(mpi.gather(output, 2, double(1)))
9. Root gathers the output from the other ranks. Its own contribution, double(1), appears as a leading 0 in the result
> mpi.gather(double(1), 2, double(usize))
10. Close down and quit
> mpi.close.Rslaves(dellog = FALSE)
> mpi.quit()
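The interactive steps above can also be collected into a script file. The following is a minimal sketch that writes such a file from the shell with a here-document. The file name rmpi_example.R is hypothetical, and building var with one value per worker is an assumption of this sketch, made so that var[id] is defined on every rank. All Rmpi calls are the ones used in the steps above.

```shell
#!/bin/bash
# Write the interactive steps to a script file (hypothetical file name).
# The quoted 'EOF' keeps the shell from expanding anything inside the R code.
cat > rmpi_example.R <<'EOF'
library(Rmpi)                                   # calls mpi.initialize()

# Number of workers = universe size - 1 (one processor is the master)
usize <- as.numeric(Sys.getenv("MPI_UNIVERSE_SIZE"))
ns <- usize - 1
mpi.spawn.Rslaves(nslaves = ns)

# One value per worker (assumption of this sketch): 11, 22, 33, ...
var <- 11 * seq_len(ns)
mpi.bcast.data2slave(var, comm = 1, buffunit = 100)

# Each worker stores its rank, reports its value and computes its output
mpi.bcast.cmd(id <- mpi.comm.rank())
print(mpi.remote.exec(paste("The variable on rank", id, "is", var[id])))
mpi.bcast.cmd(output <- var[id] * 2)

# Workers and root take part in the same gather (type 2 = double)
mpi.bcast.cmd(mpi.gather(output, 2, double(1)))
print(mpi.gather(double(1), 2, double(usize)))

mpi.close.Rslaves(dellog = FALSE)
mpi.quit()
EOF
echo "wrote rmpi_example.R"
```

Inside the interactive session, with the modules loaded and MPI_UNIVERSE_SIZE exported as described earlier, the file could then be run with Rscript rmpi_example.R. If MPI_UNIVERSE_SIZE is not set, usize is NA and the script fails at the spawn step.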
Exercises
- Try using mpi.scatter.Robj() instead of mpi.bcast.data2slave() in step 4
- Create an R script using Rmpi and submit it as a batch job from the bsub command line
- Create a bsub job script and submit it as a batch job
Further reading
https://cran.r-project.org/web/packages/Rmpi/Rmpi.pdf