Combinatorial iterator like expand.grid

无人共我 2021-01-19 14:55

Is there a fast way to iterate through combinations like those returned by expand.grid or CJ (data.table)? These get too big to fit i

1 Answer
  • 2021-01-19 15:29

    I think you'll get better performance if you give each worker a chunk of one of the data frames, have each worker run the computation on its chunk, and then combine the results. Each worker then builds only a small grid for its own chunk instead of the full one, which makes the computation more efficient and reduces memory use on the workers.

    Here is an example that uses the isplitRows function from the itertools package:

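    library(doParallel)  # also attaches foreach and parallel
    library(itertools)   # provides isplitRows

    dim1 <- 10
    dim2 <- 100
    df1 <- data.frame(a = 1:dim1, b = 1:dim1)
    df2 <- data.frame(x = 1:dim2, y = 1:dim2, z = 1:dim2)
    f <- function(...) sum(...)

    # Start a cluster with one worker per chunk
    nw <- 4
    cl <- makeCluster(nw)
    registerDoParallel(cl)

    # Each worker receives one chunk of df2's rows and builds only a small grid
    res <- foreach(d2 = isplitRows(df2, chunks = nw), .combine = c) %dopar% {
      expgrid <- expand.grid(x = seq(dim1), y = seq(nrow(d2)))
      apply(expgrid, 1, function(i) f(df1[i[["x"]], ], d2[i[["y"]], ]))
    }

    stopCluster(cl)  # release the workers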

    I split df2 because it has more rows, but you could choose either.
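
To check that chunking gives the same values, in the same order, as one pass over the full grid, here is a minimal base-R sketch that runs sequentially with no cluster. The helper `pair_values`, the variable names, and the small sizes are assumptions made for illustration; they are not part of the code above.

```r
# Sequential, base-R sketch of the chunking idea (hypothetical helper names).
df1 <- data.frame(a = 1:3, b = 1:3)
df2 <- data.frame(x = 1:8, y = 1:8, z = 1:8)
f <- function(...) sum(...)

# Evaluate f on every (row of d1, row of d2) pair; only a grid of
# nrow(d1) * nrow(d2) index pairs is built, with d1's index varying fastest.
pair_values <- function(d1, d2) {
  grid <- expand.grid(i = seq_len(nrow(d1)), j = seq_len(nrow(d2)))
  mapply(function(i, j) f(d1[i, ], d2[j, ]), grid$i, grid$j)
}

# Assign the rows of df2 to contiguous chunks (chunk count chosen arbitrarily).
n_chunks <- 4
rows <- seq_len(nrow(df2))
chunks <- split(rows, ceiling(rows * n_chunks / nrow(df2)))

# Process one chunk at a time, then concatenate the per-chunk results.
chunked <- unlist(
  lapply(chunks, function(r) pair_values(df1, df2[r, , drop = FALSE])),
  use.names = FALSE
)

# Reference: a single pass over the full grid.
full <- pair_values(df1, df2)
```

Because the chunks are contiguous runs of df2's rows and df1's index varies fastest inside each chunk, concatenating the chunk results reproduces the ordering of the full grid. The same reasoning applies to the foreach version with `.combine = c`, as long as results are combined in chunk order.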
