rapply to nested list of data frames in R

太阳男子 2020-12-20 14:48

I have a nested list whose fundamental elements are data frames, and I want to traverse this list recursively to do some computation on each data frame, finally to get a neste

3 answers
  • 2020-12-20 15:35

    1. wrap in proto

    When creating your list structure try wrapping the data frames in proto objects:

    library(proto)
    L <- list(a = proto(DF = BOD), b = proto(DF = BOD))
    rapply(L, f = function(.) colSums(.$DF), how = "replace")
    

    giving:

    $a
      Time demand 
        22     89 
    
    $b
      Time demand 
        22     89 
    

    Wrap the result of your function in a proto object too if you want to apply rapply to it again:

    f <- function(.) proto(result = colSums(.$DF))
    out <- rapply(L, f = f, how = "replace")
    str(out)
    

    giving:

    List of 2
     $ a:proto object 
     .. $ result: Named num [1:2] 22 89 
     ..  ..- attr(*, "names")= chr [1:2] "Time" "demand" 
     $ b:proto object 
     .. $ result: Named num [1:2] 22 89 
     ..  ..- attr(*, "names")= chr [1:2] "Time" "demand" 
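
    The wrapper works because rapply only descends into list-like objects, and a proto object is an environment, so it reaches f intact. A minimal sketch of the same idea, assuming a plain base R environment is an acceptable stand-in for proto (the helper name wrap_df is hypothetical and not part of any package):

    ```r
    # Hypothetical helper: put a data frame inside a fresh environment
    # so that rapply sees a non-list leaf and does not recurse into columns.
    wrap_df <- function(df) {
      e <- new.env()
      e$DF <- df
      e
    }

    # BOD is a built-in dataset; the nesting here is for illustration only.
    L <- list(a = wrap_df(BOD), b = list(c = wrap_df(BOD)))

    # f receives each environment and reads the data frame back out of it.
    out <- rapply(L, f = function(e) colSums(e$DF), how = "list")
    str(out)
    ```

    This avoids the extra dependency, but you lose proto's own features, so treat it as an illustration of why wrapping helps rather than a replacement.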
    

    2. write your own rapply alternative

    recurse <- function (L, f) {
        if (inherits(L, "data.frame")) f(L)
        else lapply(L, recurse, f)
    }
    
    L <- list(a = BOD, b = BOD)
    recurse(L, colSums)
    

    This gives:

    $a
      Time demand 
        22     89 
    
    $b
      Time demand 
        22     89 
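
    One caveat with recurse as written: a leaf that is neither a data frame nor a list (a non-empty character vector, say) is handed to lapply again and the recursion does not terminate. A possible variant, sketched here under the assumption that such leaves should simply be returned unchanged (the name recurse2 is hypothetical):

    ```r
    recurse2 <- function(L, f) {
      if (inherits(L, "data.frame")) {
        f(L)                    # data frame leaf: apply f to the whole object
      } else if (is.list(L)) {
        lapply(L, recurse2, f)  # plain list: descend one level, names are kept
      } else {
        L                       # any other leaf: assumed to be left untouched
      }
    }

    # Illustrative input mixing data frames with a non-data-frame leaf.
    nested <- list(a = BOD, b = list(c = BOD, note = "raw"))
    res <- recurse2(nested, colSums)
    str(res)
    ```

    The data frame check has to come first, because a data frame is itself a list and would otherwise be split into its columns.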
    


  • 2020-12-20 15:42

    Update June 2020:

    You can now also use rrapply from the rrapply package, an extended version of base rapply. rrapply takes an additional argument, dfaslist; when it is set to FALSE, data frames are not treated as list-like objects, so rrapply does not recurse into their individual columns and f is applied to each data frame as a whole:

    library(rrapply)
    
    L <- list(a = BOD, b = BOD)
    
    ## apply f to data.frames 
    rrapply(L, f = colSums, dfaslist = FALSE)
    #> $a
    #>   Time demand 
    #>     22     89 
    #> 
    #> $b
    #>   Time demand 
    #>     22     89
    
    ## apply f to individual columns of data.frames
    rrapply(L, f = function(x, .xname) if(.xname == "demand") scale(x) else x)
    #> $a
    #>   Time     demand
    #> 1    1 -1.4108974
    #> 2    2 -0.9789900
    #> 3    3  0.8998070
    #> 4    4  0.2519460
    #> 5    5  0.1655645
    #> 6    7  1.0725699
    #> 
    #> $b
    #>   Time     demand
    #> 1    1 -1.4108974
    #> 2    2 -0.9789900
    #> 3    3  0.8998070
    #> 4    4  0.2519460
    #> 5    5  0.1655645
    #> 6    7  1.0725699
    
  • 2020-12-20 15:49

    Handling list computation at a specific depth:

    recursive_lapply <- function (data, fun, depth = 1L) {
      stopifnot(inherits(data, "list"))
      stopifnot(depth >= 1)
      f <- function(data, fun, where = integer()) {
        if (length(where) == depth) {
          fun(data)
        } else {
          res <- lapply(seq_along(data), function(i) {f(data[[i]], fun, where = c(where, i))})
          names(res) <- names(data)
          res
        }
      }
      f(data, fun)
    }
    

    Example computation:

    d <- list(
      A = list(a = list(
        a1 = data.table::data.table(x = 11:15, y = 10:14),
        a2 = data.table::data.table(x = 1:5, y = 0:4)
      )),
      B = list(b = list(
        b1 = data.table::data.table(x = 7, y = 8),
        b2 = data.table::data.table(x = 9, y = 10)
      ))
    )
    
    > recursive_lapply(d, function(data) data[, "z":= x + y], 3)
    $A
    $A$a
    $A$a$a1
        x  y  z
    1: 11 10 21
    2: 12 11 23
    3: 13 12 25
    4: 14 13 27
    5: 15 14 29
    
    $A$a$a2
       x y z
    1: 1 0 1
    2: 2 1 3
    3: 3 2 5
    4: 4 3 7
    5: 5 4 9
    
    $B
    $B$b
    $B$b$b1
       x y  z
    1: 7 8 15
    
    $B$b$b2
       x  y  z
    1: 9 10 19
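
    The depth argument counts list levels from the top, so every leaf you want to reach must sit at exactly that depth. A smaller sketch with base data frames, assuming a two-level list built from the built-in BOD dataset (the list d2 is made up for illustration; the function is repeated so the snippet runs on its own):

    ```r
    recursive_lapply <- function(data, fun, depth = 1L) {
      stopifnot(inherits(data, "list"))
      stopifnot(depth >= 1)
      f <- function(data, fun, where = integer()) {
        if (length(where) == depth) {
          fun(data)  # reached the requested depth: apply fun here
        } else {
          res <- lapply(seq_along(data), function(i) {
            f(data[[i]], fun, where = c(where, i))
          })
          names(res) <- names(data)  # restore the names dropped by seq_along
          res
        }
      }
      f(data, fun)
    }

    # Data frames sit at depth 2: top-level list -> group -> data frame.
    d2 <- list(A = list(a1 = BOD, a2 = BOD), B = list(b1 = BOD))
    res <- recursive_lapply(d2, nrow, depth = 2)
    str(res)
    ```

    Note that the data.table example above uses :=, which modifies each table by reference; with plain data frames, fun has to return the modified object instead.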
    