Parallelization in R: how to “source” on every node?

花落未央 2021-02-09 21:51

I have created parallel workers (all running on the same machine) using:

MyCluster = makeCluster(8)

How can I make each of these 8 nodes sourc

2 Answers
  •  庸人自扰
    2021-02-09 22:37

    The following code serves your purpose:

    library(parallel)
    
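    # Start 4 local worker processes
    cl <- makeCluster(4)
    # Call the function once on every worker; each worker sources test.R
    clusterCall(cl, function() { source("test.R") })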
    
    ## do some parallel work
    
    stopCluster(cl)
    

    You can also use clusterEvalQ() to do the same thing:

    library(parallel)
    
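    # Start 4 local worker processes
    cl <- makeCluster(4)
    # Evaluate the literal expression source("test.R") on every worker
    clusterEvalQ(cl, source("test.R"))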
    
    ## do some parallel work
    
    stopCluster(cl)
    

    However, there is a subtle difference between the two methods. clusterCall() runs a function on each node, while clusterEvalQ() evaluates an expression on each node. clusterEvalQ(cl, expr) captures expr unevaluated and evaluates it on the workers, so a variable that exists only in the master session is not visible there. If the list of files to source is held in a variable, clusterCall() is easier to use because the variable can be passed as a function argument.
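
    To make the difference concrete, here is a minimal sketch of sourcing a variable list of files. The file names (a.R, b.R), the helper functions they define (add_one, double_it), and the temporary directory are hypothetical and exist only for illustration. The sketch also assumes the workers run on the same machine, so they can read the same paths.

    ```r
    library(parallel)

    # Hypothetical scripts for illustration: each defines one helper function.
    script_dir <- tempfile("scripts")
    dir.create(script_dir)
    files <- file.path(script_dir, c("a.R", "b.R"))
    writeLines("add_one <- function(x) x + 1", files[1])
    writeLines("double_it <- function(x) x * 2", files[2])

    cl <- makeCluster(2)

    # Option 1: clusterCall(). 'files' is passed as an argument, so its value
    # is shipped to every worker. source() evaluates in the worker's global
    # environment by default (local = FALSE).
    invisible(clusterCall(cl, function(paths) {
      for (p in paths) source(p)
      NULL
    }, files))

    # Option 2: clusterEvalQ(). The expression is evaluated on the workers,
    # so 'files' has to be copied there first with clusterExport().
    clusterExport(cl, "files", envir = environment())
    invisible(clusterEvalQ(cl, for (p in files) source(p)))

    # The sourced functions are now available on every worker.
    res <- parSapply(cl, 1:4, function(x) double_it(add_one(x)))

    stopCluster(cl)
    unlink(script_dir, recursive = TRUE)
    ```

    Either option alone is enough; both appear here only for comparison. Without the clusterExport() step, the clusterEvalQ() call would fail because files is not defined on the workers.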
