I am trying to use a high-performance cluster at my institution for the first time, and I have hit a problem that I can't resolve.
The following code returns an error.
Both the foreach vignette and the foreach help page point out that the .packages argument must be supplied when the parallel code uses functions from packages that are not loaded by default. So the code in your first example should be:
ptime <- system.time({
  r <- foreach(z = seq_along(files),
               .combine = cbind,
               .packages = "raster") %dopar% {
    # some code
    # and more code
  }
})
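To try the .packages argument without the original data, here is a minimal sketch. It assumes foreach and doParallel are installed, uses a two-worker local cluster, and substitutes the base package tools for raster so that it runs anywhere; the file names are hypothetical.

```r
library(foreach)
library(doParallel)

# Assumption: a small local cluster stands in for the HPC nodes
cl <- makeCluster(2)
registerDoParallel(cl)

files <- c("a.tif", "b.grd", "c.asc")  # hypothetical file names

exts <- foreach(z = seq_along(files),
                .combine = c,
                .packages = "tools") %dopar% {
  # file_ext() lives in 'tools', which is not attached by default;
  # .packages = "tools" attaches it on every worker
  file_ext(files[z])
}

stopCluster(cl)
print(exts)
```

Dropping the .packages line should make the workers fail with a "could not find function" error, which is the same kind of failure the raster functions produce.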
Some more explanation
The foreach package does a lot of setting up behind the scenes. In principle, the following happens (the technical details are a bit more complicated):
foreach sets up a system of "workers", which you can think of as separate R sessions, each committed to a different core in a cluster.
The function that needs to be carried out is loaded into each "worker" session, together with the objects it needs.
Each worker calculates the result for a subset of the data.
The results from the different workers are combined and reported in the "master" R session.
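One way to see that the workers really are separate R sessions is to compare process IDs. This is a rough sketch, assuming doParallel with a two-worker local cluster:

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)   # assumption: two local workers
registerDoParallel(cl)

master_pid <- Sys.getpid()

# Each iteration reports the process ID of the session that ran it
worker_pids <- foreach(i = 1:4, .combine = c) %dopar% Sys.getpid()

stopCluster(cl)

print(master_pid)
print(unique(worker_pids))
```

The IDs returned by the loop belong to the worker processes, so none of them should match the ID of the master session.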
Because the workers are effectively separate R sessions, packages attached in the "master" session are not automatically loaded on them. You have to specify which packages should be loaded in the worker sessions, and that is what the .packages argument of foreach is for.
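The effect can be checked directly by asking each worker for its search path. The sketch below assumes a fresh two-worker doParallel cluster and again uses the base package tools as a stand-in for any package that is not attached by default.

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)   # assumption: fresh local workers
registerDoParallel(cl)

# Without .packages: ask each worker whether 'tools' is attached
without <- foreach(i = 1:2, .combine = c) %dopar% {
  "package:tools" %in% search()
}

# With .packages: foreach attaches 'tools' on the workers first
with_pkg <- foreach(i = 1:2, .combine = c, .packages = "tools") %dopar% {
  "package:tools" %in% search()
}

stopCluster(cl)

print(without)
print(with_pkg)
```

The order matters here: once a package has been attached on a worker it stays attached for later loops on the same cluster, so the check without .packages has to run first.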
Note that when you use other packages (e.g. parallel or snowfall), you have to set up these workers explicitly, and you also have to take care of passing objects and loading packages on the worker sessions yourself.
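For comparison, a minimal sketch of the same bookkeeping done by hand with the parallel package might look like this. The file names are hypothetical, and tools stands in for whatever package your code needs.

```r
library(parallel)

cl <- makeCluster(2)                    # 1. create the workers explicitly

files <- c("a.tif", "b.grd")            # hypothetical file names

invisible(clusterEvalQ(cl, library(tools)))          # 2. load packages on each worker
clusterExport(cl, "files", envir = environment())    # 3. copy objects to each worker

# 4. run the computation; file_ext() and 'files' are now available on the workers
exts <- parSapply(cl, seq_along(files), function(z) file_ext(files[z]))

stopCluster(cl)                         # 5. shut the workers down
print(exts)
```

With foreach, steps 2 and 3 are what the .packages and .export arguments (plus its automatic export of variables used in the loop body) take care of.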
I ran into the same problem. My solution was the following:
Function.R
f <- function(parameters...){Body...}
MainFile.R
library(foreach)
library(doParallel)  # also attaches parallel, which provides detectCores()
cores <- detectCores()
cl <- makeCluster(max(1, cores - 2))  # leave two cores free so the machine is not overloaded
registerDoParallel(cl)
clusterEvalQ(cl, .libPaths("C:/R/win-library/4.0"))  # point the workers at your R library path
output <- foreach(i = 1:5, .combine = rbind) %dopar% {
  source("~/Function.R")    # the key step: source your function file on the worker
  temp <- f(parameters...)  # call your custom function after sourcing it
  temp
}
stopCluster(cl)
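Sourcing the file inside the loop body re-reads it on every iteration. A possible variation, sketched here under the assumption that the function file is readable from every worker, is to source it once per worker with clusterEvalQ before the loop starts. The temporary file and the one-line f below are hypothetical stand-ins for Function.R.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-in for Function.R, written to a temporary file
fn_file <- tempfile(fileext = ".R")
writeLines("f <- function(x) x^2", fn_file)

cl <- makeCluster(2)
registerDoParallel(cl)

# Send the path to the workers, then source the file once on each of them
clusterExport(cl, "fn_file", envir = environment())
invisible(clusterEvalQ(cl, source(fn_file)))

output <- foreach(i = 1:5, .combine = rbind) %dopar% {
  f(i)  # f is already defined in each worker's global environment
}

stopCluster(cl)
print(output)
```

Whether this helps in practice depends on how large the sourced file is and how many iterations the loop runs; for a handful of iterations the original approach costs very little.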