Best way to store variable-length data in an R data.frame?

日久生厌 2021-02-06 02:59

I have some mixed-type data that I would like to store in an R data structure of some sort. Each data point has a set of fixed attributes which may be 1-d numeric, factors, or

5 answers
  •  庸人自扰
    2021-02-06 03:59

    Since the R data frame structure is loosely based on the SQL table, it is uncommon for an element of a data frame to be anything other than an atomic data type. However, it can be done, as you've shown, and this linked post describes such an application implemented on a larger scale.
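
    For reference, a minimal sketch of a non-atomic column, assuming the approach in question is a list column (the column name token_lengths simply mirrors the examples in this answer):

    ```r
    # Sketch: a list column holds one variable-length vector per row.
    d <- data.frame(id = c(1, 2, 3))
    d$token_lengths <- list(c(5, 5), 9, c(4, 2, 2, 4, 6))  # one list element per row

    # Extract the vector stored in row 3.
    row3 <- d$token_lengths[[3]]

    # Count the tokens in each row.
    n_tokens <- sapply(d$token_lengths, length)
    ```

    Assigning the list after the data frame is created avoids data.frame() trying to split the list into separate columns.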

    An alternative is to store your data as a string and have a function that parses it on retrieval, or to keep the data inside a separate function (a closure) and extract it using indices stored in your data frame.

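    > ## alternative 1: store the values as a comma-separated string, parse on demand
    > ## as.character() keeps this working if the column was created as a factor
    > tokens <- function(x, i=TRUE) Map(as.numeric, strsplit(as.character(x[i]), ","))
    > d <- data.frame(id=c(1,2,3), token_lengths=c("5,5", "9", "4,2,2,4,6"))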
    > 
    > tokens(d$token_lengths)
    [[1]]
    [1] 5 5
    
    [[2]]
    [1] 9
    
    [[3]]
    [1] 4 2 2 4 6
    
    > tokens(d$token_lengths,2:3)
    [[1]]
    [1] 9
    
    [[2]]
    [1] 4 2 2 4 6
    
    > 
    > ## alternative 2: keep the data in a closure, store only indices in the data frame
    > retrieve <- local({
    +   token_lengths <- list(c(5,5), 9, c(4,2,2,4,6))
    +   function(i) token_lengths[i]
    + })
    > 
    > d <- data.frame(id=c(1,2,3), token_lengths=1:3)
    > retrieve(d$token_lengths[2:3])
    [[1]]
    [1] 9
    
    [[2]]
    [1] 4 2 2 4 6
    
