Let me add another scoping problem in R, this time with the snowfall package. If I define a function in my global environment and then try to use it later in an sfApply() inside another function, the first function is no longer found:
#Runnable code. Don't forget to stop the cluster with sfStop()
require(snowfall)
sfInit(parallel = TRUE, cpus = 3)
func1 <- function(x){
  y <- x + 1
  y
}
func2 <- function(x){
  y <- sfApply(x, 2, function(i) func1(i))
  y
}
y <- matrix(1:10, ncol = 2)
func2(y)
sfStop()
This gives:
> func2(y)
Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: could not find function "func1"
If I nest my function inside the other function, however, it works. It also works when I use the sfApply() in the global environment. The thing is, I don't want to nest func1 inside func2, as that would mean func1 gets redefined many times (func2 is used in a loop-like structure).
I've already tried simplifying the code to get rid of the double looping, but that's quite impossible due to the nature of the problem. Any ideas?
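For reference, this is the nested variant I mean; it does work, but func1 is re-created on every call (func2_nested is just an illustrative name for this sketch):

#Nested variant: works, but func1 is redefined each time func2_nested runs
func2_nested <- function(x){
  func1 <- function(i){
    i + 1
  }
  sfApply(x, 2, function(i) func1(i))
}
func2_nested(y)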
I think you want sfExport("func1"), though I'm not sure if you need to do it in your .GlobalEnv or inside of func2. Hope that helps...
> y <- matrix(1:10,ncol=2)
> sfExport(list=list("func1"))
> func2(y)
     [,1] [,2]
[1,]    2    7
[2,]    3    8
[3,]    4    9
[4,]    5   10
[5,]    6   11
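And here is a quick, untested sketch of the other option, doing the export inside func2 rather than in .GlobalEnv; I believe sfExport() will still locate func1 in the global environment, but check with your snowfall version:

func2 <- function(x){
  sfExport("func1")  # push func1 to the worker nodes before the parallel apply
  sfApply(x, 2, function(i) func1(i))
}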
Methinks you are now confusing scoping with parallel computing. You are invoking new R sessions, and it is generally your responsibility to re-create your environment on the nodes.
An alternative would be to use foreach et al. There are examples in the foreach (or iterators?) docs that show exactly this. Oh, I see Josh has by now recommended the same thing.
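Something like this minimal sketch, here using the doParallel backend (any foreach backend should do; the .export argument is a belt-and-braces addition in case func1 is not picked up automatically, and func2_foreach is just an illustrative name):

require(foreach)
require(doParallel)
cl <- makeCluster(3)
registerDoParallel(cl)
func2_foreach <- function(x){
  # iterate over the columns, apply func1 to each, cbind the results back into a matrix
  foreach(i = seq_len(ncol(x)), .combine = cbind, .export = "func1") %dopar% func1(x[, i])
}
func2_foreach(y)
stopCluster(cl)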
Source: https://stackoverflow.com/questions/3856245/scoping-problem-when-sfapply-is-used-within-function-package-snowfall-r