R

Various versions of R are available on POD using the module command. If you need any additional versions installed, please email pod@penguincomputing.com.

[penguin@podmt1 ~]$ module avail R

----------------------------- /public/modulefiles ------------------------------
R/3.0.0/gcc.4.4.7(default) Rmpi/0.6-3/R.3.0.0
R/3.0.2/gcc.4.4.7


Rmpi

Rmpi is available by loading the corresponding environment module:

[penguin@podmt1]$ module load Rmpi/0.6-3/R.3.0.0

The module load command will load all the dependencies:

[penguin@podmt1]$ module list
Currently Loaded Modulefiles:
  1) gcc/4.4.7                 3) openmpi/1.6.4/gcc.4.4.7
  2) R/3.0.0/gcc.4.4.7         4) Rmpi/0.6-3/R.3.0.0
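
A job script can check that the expected modules are loaded before it starts any work. The sketch below is one possible way to do this, not a POD-provided tool. It assumes the module command on POD maintains the colon-separated LOADEDMODULES environment variable, as Environment Modules normally does. The helper name module_loaded is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a loaded module name starts with the
# given prefix. Assumes LOADEDMODULES is a colon-separated list of
# loaded modules, as maintained by the module command.
module_loaded() {
    # An empty prefix would match anything, so reject it.
    [ -n "$1" ] || return 1
    case ":${LOADEDMODULES:-}" in
        *":$1"*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Demonstration in a subshell, using the module list shown above as
# sample data (not read from a live session).
(
    LOADEDMODULES="gcc/4.4.7:R/3.0.0/gcc.4.4.7:openmpi/1.6.4/gcc.4.4.7:Rmpi/0.6-3/R.3.0.0"
    if module_loaded Rmpi/; then
        echo "Rmpi module is loaded"
    fi
)
```

In a real job script the check would run against the live environment, for example: module_loaded Rmpi/ || { echo "Rmpi not loaded" >&2; exit 1; }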

Running Rmpi Programs

Rmpi programs are typically executed by starting a master task, which then spawns the MPI slave processes. The master task must be started with the mpirun command so that the MPI environment is initialized. However, only one MPI process needs to be started at this point, since the master task spawns the slave processes internally. The typical command to start an Rmpi program is therefore:

mpirun -np 1 R --slave CMD BATCH <my-Rmpi-program.R>

Spawning MPI Processes

The Rmpi function mpi.spawn.Rslaves() spawns the MPI slave tasks. It takes an optional argument, nslaves, specifying how many slaves to spawn. The following snippet starts as many tasks as the MPI environment allows: it reads the environment variable OMPI_UNIVERSE_SIZE and subtracts 1 to account for the master task:

# Spawn as many slaves as possible
NS <- type.convert(Sys.getenv("OMPI_UNIVERSE_SIZE")) - 1
mpi.spawn.Rslaves(nslaves=NS)
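
The same arithmetic can be sketched in the shell, for instance to log the expected number of slaves from a job script. This is an illustration only. It assumes the universe size equals the total number of cores requested (for example, 48 for a request of nodes=4:ppn=12), and the helper name rmpi_slave_count is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: number of Rmpi slaves for a given MPI universe size.
# One slot is reserved for the master task started by mpirun.
rmpi_slave_count() {
    local universe="$1"
    # Reject empty or non-numeric input, e.g. when the variable is unset
    # because the program was not started through mpirun.
    case "$universe" in
        ''|*[!0-9]*)
            echo "universe size is not set or not numeric" >&2
            return 1
            ;;
    esac
    # At least one slot beyond the master is needed to spawn a slave.
    if [ "$universe" -lt 2 ]; then
        echo "need at least 2 slots (1 master + 1 slave)" >&2
        return 1
    fi
    echo $(( universe - 1 ))
}

# Example: 4 nodes x 12 cores = 48 slots, so 47 slaves.
rmpi_slave_count 48    # prints 47
```

Unlike the R snippet, this version fails with an error message when the size is missing, rather than passing a non-numeric value on to the spawn call.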

Rmpi Examples

Rmpi examples and submission scripts are available in /public/Rmpi-examples. Here is an example submission script:

#PBS -S /bin/bash
#PBS -q M40
#PBS -N Rmpi
#PBS -j oe
#PBS -l nodes=4:ppn=12

module load Rmpi
cd $PBS_O_WORKDIR

echo "Job ID: $PBS_JOBID"
echo "Queue:  $PBS_QUEUE"
echo "Cores:  $PBS_NP"
echo "Nodes:  $(sort -u "$PBS_NODEFILE" | tr '\n' ' ')"
echo "mpirun: $(which mpirun)"
echo "R:      $(which R)"

mpirun -np 1 R --slave CMD BATCH example.R

exit $?

# vim: syntax=sh

The R program example.R reads:

# Load the R MPI package if it is not already loaded.
if (!is.loaded("mpi_initialize")) {
    library("Rmpi")
}

# Spawn as many slaves as possible
NS <- type.convert(Sys.getenv("OMPI_UNIVERSE_SIZE")) - 1
mpi.spawn.Rslaves(nslaves=NS)

# In case R exits unexpectedly, have it automatically clean up
# resources taken up by Rmpi (slaves, memory, etc...)
.Last <- function(){
    if (is.loaded("mpi_initialize")){
        if (mpi.comm.size(1) > 0){
            print("Please use mpi.close.Rslaves() to close slaves.")
            mpi.close.Rslaves()
        }
        print("Please use mpi.quit() to quit R")
        .Call("mpi_finalize")
    }
}

# Tell all slaves to return a message identifying themselves
mpi.remote.exec(paste("I am",mpi.comm.rank(),"of",mpi.comm.size()))

# Tell all slaves to close down, and exit the program
mpi.close.Rslaves()
mpi.quit()
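
Because the job script runs the program through R CMD BATCH, the messages returned by the slaves should end up in an R output file rather than in the PBS job output. By default, R CMD BATCH is expected to name this file after the input script, with the .R extension stripped and .Rout appended, and to write it to the current working directory. The sketch below derives that name. The helper name rout_name is hypothetical, and the naming rule should be checked against the installed R version.

```shell
#!/bin/bash
# Hypothetical helper: default output file name used by R CMD BATCH for a
# given script (directory and ".R" extension stripped, ".Rout" appended).
rout_name() {
    echo "$(basename "$1" .R).Rout"
}

rout_name example.R    # prints example.Rout
```

After the job finishes, the output can then be inspected from the submission directory with a command such as: cat "$(rout_name example.R)"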