Distributed computing in R with Rmpi

From ScientificComputing

Load modules

 $ env2lmod
 $ module load gcc/6.3.0 openmpi/4.0.2 r/4.0.2

Request an interactive session with 5 processors, set MPI_UNIVERSE_SIZE to the same number, and start R

 $ bsub -n 5 -W 02:00 -I bash
 $ export MPI_UNIVERSE_SIZE=5
 $ R
 > 

Load Rmpi, which calls mpi.initialize()

 > library(Rmpi)

Spawn the R slaves on the allocated hosts. One processor is reserved for the master, so nslaves = requested number of processors - 1

 > usize <- as.numeric(Sys.getenv("MPI_UNIVERSE_SIZE"))
 > ns <- usize - 1
 > mpi.spawn.Rslaves(nslaves=ns)
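
The slave count above relies on MPI_UNIVERSE_SIZE being set to a sensible value. As a minimal sketch in base R, the same arithmetic can be guarded against an unset or too-small value. The helper name slave_count is hypothetical and not part of Rmpi.

```r
# Hypothetical helper (not part of Rmpi): derive the number of slaves
# from the universe size, reserving one process for the master.
slave_count <- function(universe_size) {
  usize <- suppressWarnings(as.integer(universe_size))
  if (is.na(usize) || usize < 2) {
    stop("MPI_UNIVERSE_SIZE must be an integer >= 2, got: ", universe_size)
  }
  usize - 1L
}

# Matches the session above: 5 requested processors give 4 slaves.
ns <- slave_count("5")
```

In the session it could be called as slave_count(Sys.getenv("MPI_UNIVERSE_SIZE")), since Sys.getenv() returns an empty string when the variable is unset.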

Set up a vector with one value per slave (4 slaves, so 4 values; a shorter vector would give NA on the last rank)

 > var <- c(11.0, 22.0, 33.0, 44.0)

Root sends state variables and parameters to other ranks

 > mpi.bcast.data2slave(var, comm = 1, buffunit = 100)

Have each slave store its own rank number in id

 > mpi.bcast.cmd(id <- mpi.comm.rank())

Check that each rank can access its own element of var

 > mpi.remote.exec(paste("The variable on rank ",id," is ", var[id]))

Root orders other ranks to calculate

 > mpi.bcast.cmd(output <- var[id]*2)

Root tells the other ranks to send their output in a gather (type 2 = double)

 > mpi.bcast.cmd(mpi.gather(output, 2, double(1)))

Root gathers the output from other ranks

 > mpi.gather(double(1), 2, double(usize))
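
To get a feel for what the root should receive, the gather step can be imitated in base R without MPI. This is only a sketch under stated assumptions: expected_gather is a hypothetical helper, there is exactly one value per slave, rank 0 is the master and contributes the placeholder double(1) (that is, 0), and the gathered values are ordered by rank.

```r
# Hypothetical helper: imitate what rank 0 receives from mpi.gather()
# when every slave with rank id sends values[id] * 2.
expected_gather <- function(values) {
  slave_ids <- seq_along(values)   # slave ranks 1..n
  slave_out <- vapply(slave_ids, function(id) values[id] * 2, double(1))
  c(double(1), slave_out)          # the master's placeholder comes first
}

# Illustrative input with three slaves:
expected_gather(c(11, 22, 33))     # returns c(0, 22, 44, 66)
```

The leading 0 is the master's own contribution, not a computed result, so it is usually dropped before further use.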

Shut down the slaves and quit R

 > mpi.close.Rslaves(dellog = FALSE)
 > mpi.quit()

Exercise

What happens when we use mpi.scatter.Robj() instead of mpi.bcast.data2slave()?