Using lists inside data.table columns

挽巷 2020-11-27 04:59

In data.table it is possible to have columns of type list, and I'm trying for the first time to benefit from this feature. I need to store for each ro

2 Answers
  • 2020-11-27 05:10

    Just to add more info, what list columns are really designed for is when each cell is itself a vector:

    > DT = data.table(a=1:2, b=list(1:5,1:10))
    > DT
       a            b
    1: 1    1,2,3,4,5
    2: 2 1,2,3,4,5,6,
    
    > sapply(DT$b, length)
    [1]  5 10 
    

    Notice the pretty printing of the vectors in the b column. Those commas are just for display; each cell is actually a vector (as shown by the sapply command above). Note also the trailing comma on the 2nd item of b. That indicates that the vector is longer than displayed (data.table just displays the first 6 items).
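
    To confirm that each cell really is a vector rather than a comma-separated string, you can pull cells out with `[[`. This is only a minimal sketch that rebuilds the same `DT`; the names `first_cell`, `cell_lengths` and `total` are illustrative and not part of the original example, and `lengths()` assumes base R 3.2.0 or later:

    ```r
    library(data.table)

    DT <- data.table(a = 1:2, b = list(1:5, 1:10))

    # Each cell of the list column is an ordinary integer vector
    first_cell <- DT$b[[1]]
    class(first_cell)

    # lengths() returns the length of every cell in one call
    cell_lengths <- lengths(DT$b)

    # Per-row computation on the cells: the sum of each vector
    DT[, total := sapply(b, sum)]
    ```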

    Or, more like your example :

    > DT = data.table(id=1:2, comment=list( c("michele", Sys.time(), "hello"),
                                            c("michele", Sys.time(), "world") ))
    > DT
       id                       comment
    1:  1 michele,1395330180.9278,hello
    2:  2 michele,1395330180.9281,world 
    

    What you're trying to do is not only have a list column, but put a list into each cell as well, which is why <list> is being displayed. Additionally, if you place named lists into each cell, then beware that all those names will use up space. Where possible, a list column of vectors may be easier.
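
    The two layouts can be compared side by side. The sketch below is an illustration under assumptions: the field names `user` and `text` and the table names `DT_vec` and `DT_lst` are made up for the example, and the sizes reported by `object.size()` will vary by platform and R version:

    ```r
    library(data.table)

    # List column of plain character vectors: one vector per cell
    DT_vec <- data.table(id = 1:2,
                         comment = list(c("michele", "hello"),
                                        c("michele", "world")))

    # List column whose cells are themselves named lists
    DT_lst <- data.table(id = 1:2)
    DT_lst[, comment := list(list(list(user = "michele", text = "hello"),
                                  list(user = "michele", text = "world")))]

    # Vector cells are indexed by position, named-list cells by name
    vec_text <- DT_vec$comment[[2]][2]
    lst_text <- DT_lst$comment[[2]]$text

    # Compare the memory footprint of the two columns
    object.size(DT_vec$comment)
    object.size(DT_lst$comment)
    ```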

  • 2020-11-27 05:21

    Using :=:

    dt = data.table(id = 1:2, comment = vector("list", 2L))
    
    # assign value 1 to just the first row of 'comment'
    dt[1L, comment := 1L]
    
    # assign value of 1 and "a" to rows 1 and 2
    dt[, comment := list(1, "a")]
    
    # assign value of "a","b" to row 1, and 1 to row 2 for 'comment'
    dt[, comment := list(c("a", "b"), 1)]
    
    # assign list(1, "a") to just 1 row of 'comment'
    dt[1L, comment := list(list(list(1, "a")))]
    

    For the last case, you'll need one more list because data.table uses list(.) to look for values to assign to columns by reference.
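
    One way to read the three levels of list() in that last case is sketched below. The labels "outer", "middle" and "inner" are my own reading of the behaviour, not terminology from the data.table documentation:

    ```r
    library(data.table)

    dt <- data.table(id = 1:2, comment = vector("list", 2L))

    # Outer list():  one element per column being assigned (only 'comment')
    # Middle list(): the column values, one element per selected row (1 row)
    # Inner list():  the actual contents of the cell
    dt[1L, comment := list(list(list(1, "a")))]

    # Row 1 now holds the two-element list; row 2 is still NULL
    str(dt$comment[[1L]])
    is.null(dt$comment[[2L]])
    ```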

    Using set:

    dt = data.table(id = 1:2, comment = vector("list", 2L))
    
    # assign value 1 to just the first row of 'comment'
    set(dt, i=1L, j="comment", value=1L)
    
    # assign value of 1 and "a" to rows 1 and 2
    set(dt, j="comment", value=list(1, "a"))
    
    # assign value of "a","b" to row 1, and 1 to row 2 for 'comment'
    set(dt, j="comment", value=list(c("a", "b"), 1))
    
    # assign list(1, "a") to just 1 row of 'comment'
    set(dt, i=1L, j="comment", value=list(list(list(1, "a"))))
    

    HTH


    I'm using the current development version, 1.9.3, but this should work fine on any other version.

    > sessionInfo()
    R version 3.0.3 (2014-03-06)
    Platform: x86_64-apple-darwin10.8.0 (64-bit)
    
    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     
    
    other attached packages:
    [1] data.table_1.9.3
    
    loaded via a namespace (and not attached):
    [1] plyr_1.8.0.99  reshape2_1.2.2 stringr_0.6.2  tools_3.0.3   
    