December 11, 2014

We've already seen the `doParallel` package, which acts as an interface between `foreach` and `parallel`. Whereas `foreach` was a parallel analogue of a `for` loop, the `parallel` package provides handy analogues of the `apply` functions.

The `parallel` package merges `multicore` and `snow`:

- It provides drop-in replacements for most of their functionality, with integrated handling of random-number generation.
- By default, it uses `multicore` functionality on Unix-like systems and `snow` functionality on Windows. We can also use the `snow` functionality to execute on a cluster on both systems, as sketched below.
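As a quick illustration, here is a minimal sketch of the two flavours. The worker count and the toy `sqrt` computation are arbitrary choices for illustration:

```r
library(parallel)

# snow-style (PSOCK) cluster: spawns fresh R worker processes.
# Portable: works on both Unix-alikes and Windows.
cl.psock = makeCluster(2, type = "PSOCK")
parSapply(cl.psock, 1:4, sqrt)
stopCluster(cl.psock)

# multicore-style (FORK) cluster: forks the current R session,
# so workers share the parent's workspace. Unix-alikes only.
cl.fork = makeCluster(2, type = "FORK")
parSapply(cl.fork, 1:4, sqrt)
stopCluster(cl.fork)
```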

To get started, we must first make the desired number of cores available to R by registering a parallel backend. We can do this with the `doParallel` package.

```r
library(doParallel)

# Determine number of CPU cores in your machine
nCores = detectCores()

# Create cluster with desired number of cores
cl = makeCluster(nCores)

# Register cluster
registerDoParallel(cl)
```
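With the backend registered, `foreach` can now run in parallel as well. A minimal sketch (the toy computation here is just for illustration):

```r
library(foreach)

# %dopar% dispatches iterations to the registered backend
res = foreach(i = 1:4, .combine = c) %dopar% {
  mean(runif(1e+05, 0, i))
}
res
```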

We've seen how we can move from a traditional `for` loop to `foreach`. Similarly, we can speed up traditional `apply` functions by using their parallel analogues.

```r
N = 100
input = seq_len(N)  # same as 1:N but more robust
input.list = as.list(input)

testFun = function(i) {
  mu = mean(runif(1e+06, 0, i))
  return(mu)
}
```

```r
system.time(sapply(input, testFun))
```

```
##    user  system elapsed 
##   5.216   0.175   5.422
```

```r
system.time(parSapply(cl, input, testFun))
```

```
##    user  system elapsed 
##   0.004   0.000   2.455
```

```r
system.time(lapply(input.list, testFun))
```

```
##    user  system elapsed 
##   4.937   0.150   5.116
```

```r
system.time(parLapply(cl, input.list, testFun))
```

```
##    user  system elapsed 
##   0.002   0.000   2.363
```

```r
# mclapply uses forking, so it is not available on Windows
system.time(mclapply(input.list, testFun, mc.cores = nCores))
```

```
##    user  system elapsed 
##   5.324   0.344   2.150
```

We can also parallelize a vector map function using `pvec`, which splits the vector and submits each part to one core. It is often only worthwhile on very long vectors and for computationally intensive calculations.

```r
library(fields)
d = runif(5e+06, 0.1, 10)
system.time(Matern(d))
```

```
##    user  system elapsed 
##   3.439   0.103   3.603
```

```r
system.time(pvec(d, Matern, mc.cores = nCores))
```

```
##    user  system elapsed 
##   3.950   0.349   1.451
```

Once again, we need to be careful when parallelizing a process that generates (pseudo-)random numbers. We want the different processes to use independent (and preferably reproducible) random-number streams. We can do this using `clusterSetRNGStream`.

Let's go back and fix our first example:

```r
N = 100
input = seq_len(N)

testFun = function(i) {
  mu = mean(runif(1e+06, 0, i))
  return(mu)
}
```

```r
# RNGkind("L'Ecuyer-CMRG") is not needed here:
# clusterSetRNGStream automatically uses L'Ecuyer's algorithm
clusterSetRNGStream(cl, iseed = 0)
res1 = parSapply(cl, input, testFun)

clusterSetRNGStream(cl, iseed = 0)
res2 = parSapply(cl, input, testFun)

identical(res1, res2)
```

```
## [1] TRUE
```
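`clusterSetRNGStream` covers the cluster-based functions, but `mclapply` does not use the cluster object. Here is a minimal sketch of the analogous fix for the fork-based route, assuming the `input.list`, `testFun`, and `nCores` defined above (not available on Windows): switch to the L'Ecuyer-CMRG generator and seed before each run.

```r
# Reproducible streams for mclapply (forking; Unix-alikes only)
RNGkind("L'Ecuyer-CMRG")

set.seed(0)
res1 = mclapply(input.list, testFun, mc.cores = nCores)

set.seed(0)
res2 = mclapply(input.list, testFun, mc.cores = nCores)

identical(res1, res2)
```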

**Bootstrapping** is another example of an embarrassingly parallel task. In the example below, we use bootstrapping to generate a 95% confidence interval for the R-squared of a linear regression. Using `parLapply`, we can split the n × 1000 bootstrap replicates between the n cores.

```r
library(boot)

run1 = function(...) {
  library(boot)
  rsq = function(formula, data, indices) {
    d = data[indices, ]
    fit = lm(formula, data = d)
    return(summary(fit)$r.square)
  }
  boot(data = mtcars, statistic = rsq, R = 1000, formula = mpg ~ wt + disp)
}
```

```r
system.time(car.boot <- do.call(c, lapply(seq_len(nCores), run1)))
```

```
##    user  system elapsed 
##   8.216   0.075   8.498
```

```r
clusterSetRNGStream(cl, iseed = 123)
system.time(
  car.boot2 <- do.call(c, parLapply(cl, seq_len(nCores), run1))
)
```

```
##    user  system elapsed 
##   0.003   0.000   4.204
```

```r
boot.ci(car.boot2, type = "bca")
```

```
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 4000 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = car.boot2, type = "bca")
## 
## Intervals : 
## Level       BCa          
## 95%   ( 0.6510,  0.8564 )  
## Calculations and Intervals on Original Scale
```

Note that the `boot` package has built-in parallel support, so we can also simply run the following:

```r
rsq = function(formula, data, indices) {
  d = data[indices, ]
  fit = lm(formula, data = d)
  return(summary(fit)$r.square)
}

nBoot = nCores * 1000
set.seed(123, kind = "L'Ecuyer")

# parallel = "multicore" uses forking; use parallel = "snow" on Windows
system.time(
  car.boot3 <- boot(data = mtcars, statistic = rsq, R = nBoot,
                    formula = mpg ~ wt + disp,
                    parallel = "multicore", ncpus = nCores)
)
```

```
##    user  system elapsed 
##  11.460   0.383   3.602
```

It's a good idea to stop your cluster when you are done using it.

```r
stopCluster(cl)
```