Taking a data.table slice with a sequence of (row,col) indices

囚心锁ツ 2021-01-25 11:43

I have a data.table that resembles the one below.

tab <- data.table(a = c(NA, 42190, NA), b = c(42190, 42190, NA), c = c(40570, 42190, NA))
tab
          


        
2 Answers
  • 2021-01-25 11:58

    There is a faster way to do this than coercing to either a matrix or a data.frame. Just call the `[.data.frame` function directly.

    `[.data.frame`( tab,  cbind(ri,ci) )
    [1]    NA 42190    NA
    

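    This is the functional-call syntax for the `[.data.frame` method: the first argument is the table, and the second is a two-column matrix whose rows are the (row, col) pairs to extract.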
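
    As a minimal sketch of the call above on the question's `tab`: the original values of `ri` and `ci` are not shown in the thread, so the index vectors below are hypothetical and chosen only for illustration.

    ```r
    library(data.table)

    tab <- data.table(a = c(NA, 42190, NA),
                      b = c(42190, 42190, NA),
                      c = c(40570, 42190, NA))

    # Hypothetical indices (not from the original question):
    # pick (row 1, col 2), (row 2, col 3), (row 3, col 1).
    ri <- c(1, 2, 3)
    ci <- c(2, 3, 1)

    # cbind() builds a two-column matrix; each row is one (row, col) pair.
    picked <- `[.data.frame`(tab, cbind(ri, ci))
    print(picked)   # values: 42190 42190 NA
    ```

    The result is a plain vector with one element per (row, col) pair, in the order the pairs were given.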

  • 2021-01-25 12:05

    (UPDATE: @42-'s answer using `[.data.frame` is best, but here is my previous answer.)

    as.matrix(tab)[cbind(ri, ci)]
    

    is going to be faster and more memory-efficient than melt.
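
    A rough sketch of the matrix route, assuming you need several lookups against the same table: convert once, then reuse the matrix. The index pairs below are hypothetical. As far as I can tell, base R's `[.data.frame` also falls back to `as.matrix(x)[i]` when given a matrix index, so it is worth benchmarking both forms on your own data before relying on a speed difference.

    ```r
    library(data.table)

    tab <- data.table(a = c(NA, 42190, NA),
                      b = c(42190, 42190, NA),
                      c = c(40570, 42190, NA))

    # Convert once; this copies the data into a numeric matrix.
    m <- as.matrix(tab)

    # Two hypothetical sets of (row, col) pairs, not from the original question.
    first_pairs  <- cbind(c(1, 2), c(3, 1))   # (1,3) and (2,1)
    second_pairs <- cbind(c(3, 1), c(2, 2))   # (3,2) and (1,2)

    # Matrix indexing with a two-column matrix returns one value per pair.
    first_vals  <- m[first_pairs]    # 40570 42190
    second_vals <- m[second_pairs]   # NA 42190
    ```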

    I see no reason not to store your data as a matrix in the first place, as @thelatemail recommends. This is one case where data.table syntax is not as powerful as matrix indexing.

    (For memory efficiency with large DTs, data.table has the commands setDF/setDT, which convert to/from data.frame/data.table without copying, but I'm not aware of an equivalent for matrix. If that is something people do a lot, it might make a good enhancement request for data.table.

    For really big dimensions, you might look into the sparse-matrix formats in the Matrix package, or chunk your data, or use disk-backed data structures.)
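
    To illustrate the setDF/setDT round trip mentioned above, here is a small sketch. The indices `ri` and `ci` are again hypothetical, and the in-place conversion only avoids copying the table itself; the matrix-index lookup may still build a temporary matrix internally.

    ```r
    library(data.table)

    tab <- data.table(a = c(NA, 42190, NA),
                      b = c(42190, 42190, NA),
                      c = c(40570, 42190, NA))

    # Hypothetical (row, col) indices, not from the original question.
    ri <- c(1, 2, 3)
    ci <- c(2, 3, 1)

    setDF(tab)                          # convert to a plain data.frame by reference
    was_plain_df <- !is.data.table(tab)

    picked <- tab[cbind(ri, ci)]        # ordinary data.frame matrix indexing

    setDT(tab)                          # convert back to a data.table by reference
    ```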
