Access lapply index names inside FUN

自闭症患者 2020-11-22 15:04

Is there a way to get the list index name in my lapply() function?

n = names(mylist)
lapply(mylist, function(list.elem) { cat("What is the name of this list

12 answers
  • 2020-11-22 15:24

    Just loop over the names.

    sapply(names(mylist), function(n) { 
        doSomething(mylist[[n]])
        cat(n, '\n')
    })
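
    As a runnable illustration of the same idea, here is a minimal sketch. The list contents are made up for the example, and the function returns a string instead of printing, so that sapply() has something to collect:

    ```r
    # Hypothetical example data; any named list works the same way.
    mylist <- list(a = 1:3, b = letters[1:2])

    # Iterate over the names and look each element up by name.
    res <- sapply(names(mylist), function(n) {
      paste(n, "has", length(mylist[[n]]), "elements")
    })

    print(res)
    ```

    Because sapply() is given a character vector, it uses the names themselves to label the result.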
    
  • 2020-11-22 15:25

    @ferdinand-kraft gave us a great trick and then told us we shouldn't use it, because it's undocumented and because of the performance overhead.

    I can't argue much with the first point, but I'd like to note that the overhead should rarely be a concern.

    Let's define active functions so that we can call .i() instead of the complex expression parent.frame()$i[]. We will also create .n() to access the name; it should work for both base and purrr functionals (and probably most others as well).

    .i <- function() parent.frame(2)$i[]
    # looks for X OR .x to handle base and purrr functionals
    .n <- function() {
      env <- parent.frame(2)
      names(c(env$X,env$.x))[env$i[]]
    }
    
    sapply(cars, function(x) paste(.n(), .i()))
    #>     speed      dist 
    #> "speed 1"  "dist 2"
    

    Now let's benchmark a simple function that pastes the items of a vector to their index, using different approaches (this operation can of course be vectorized using paste(vec, seq_along(vec)), but that's not the point here).

    We define a benchmarking function and a plotting function, and plot the results below:

    library(purrr)
    library(ggplot2)
    benchmark_fun <- function(n){
      vec <- sample(letters,n, replace = TRUE)
      mb <- microbenchmark::microbenchmark(unit="ms",
                                          lapply(vec, function(x)  paste(x, .i())),
                                          map(vec, function(x) paste(x, .i())),
                                          lapply(seq_along(vec), function(x)  paste(vec[[x]], x)),
                                          mapply(function(x,y) paste(x, y), vec, seq_along(vec), SIMPLIFY = FALSE),
                                          imap(vec, function(x,y)  paste(x, y)))
      cbind(summary(mb)[c("expr","mean")], n = n)
    }
    
    benchmark_plot <- function(data, title){
      ggplot(data, aes(n, mean, col = expr)) + 
        geom_line() +
        ylab("mean time in ms") +
        ggtitle(title) +
        theme(legend.position = "bottom",legend.direction = "vertical")
    }
    
    plot_data <- map_dfr(2^(0:15), benchmark_fun)
    benchmark_plot(plot_data[plot_data$n <= 100,], "simplest call for low n")
    

    benchmark_plot(plot_data,"simplest call for higher n")
    

    Created on 2019-11-15 by the reprex package (v0.3.0)

    The drop at the start of the first chart is a fluke; please ignore it.

    We see that the chosen answer is indeed faster, and that for a decent number of iterations our .i() solutions are indeed slower. Their overhead compared to the chosen answer is about 3 times the overhead of using purrr::imap(), and amounts to about 25 ms for 30k iterations, so I lose about 1 ms per 1000 iterations, or 1 second per million. That's a small cost for convenience, in my opinion.
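
    For comparison, the index-based baseline in the benchmark (the lapply(seq_along(vec), ...) expression) needs no frame inspection at all. A minimal sketch with a named list, assuming that this is the kind of approach the chosen answer refers to; the data are made up:

    ```r
    # Hypothetical example data.
    mylist <- list(a = 10, b = 20)

    # Iterate over positions; both the index and the name are then
    # available through ordinary lookups.
    res <- lapply(seq_along(mylist), function(i) {
      paste(i, names(mylist)[[i]], mylist[[i]])
    })

    # lapply() over an integer vector drops the names, so restore them.
    names(res) <- names(mylist)

    print(res)
    ```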

  • 2020-11-22 15:25

    Just write your own custom lapply function:

    lapply2 <- function(X, FUN){
      if( length(formals(FUN)) == 1 ){
        # No index passed - use normal lapply
        R = lapply(X, FUN)
      }else{
        # Index passed
        R = lapply(seq_along(X), FUN=function(i){
          FUN(X[[i]], i)
        })
      }
    
      # Set names
      names(R) = names(X)
      return(R)
    }
    

    Then use like this:

    lapply2(letters, function(x, i) paste(x, i))
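
    Note that lapply2() hands FUN the position, not the name. A minimal sketch of recovering the name from that position follows; lapply2() is repeated only so that the block runs on its own, and the scores list is a made-up example:

    ```r
    # Same helper as above, repeated so this block is self-contained.
    lapply2 <- function(X, FUN) {
      if (length(formals(FUN)) == 1) {
        R <- lapply(X, FUN)                      # no index requested
      } else {
        R <- lapply(seq_along(X), function(i) FUN(X[[i]], i))
      }
      names(R) <- names(X)
      R
    }

    # Hypothetical example data.
    scores <- list(math = 90, art = 75)

    # Two-argument FUN: use the index to look up the element's name.
    res <- lapply2(scores, function(x, i) paste(names(scores)[i], x))

    # One-argument FUN: behaves like plain lapply().
    plus_one <- lapply2(scores, function(x) x + 1)

    print(res)
    ```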
    
  • 2020-11-22 15:28

    Tommy's answer applies to named vectors, but I got the idea you were interested in lists. It also seems as though he was doing an end run, because he was referencing "x" from the calling environment. This function uses only the parameters that were passed to the function, and so makes no assumptions about the names of the objects that were passed:

    x <- list(a=11,b=12,c=13)
    lapply(x, function(z) { attributes(deparse(substitute(z)))$names  } )
    #--------
    $a
    NULL
    
    $b
    NULL
    
    $c
    NULL
    #--------
    names( lapply(x, function(z) { attributes(deparse(substitute(z)))$names  } ))
    #[1] "a" "b" "c"
    what_is_my_name <- function(ZZZ) return(deparse(substitute(ZZZ)))
    what_is_my_name(X)
    #[1] "X"
    what_is_my_name(ZZZ=this)
    #[1] "this"
    exists("this")
    #[1] FALSE
    
  • 2020-11-22 15:29

    Both @caracals's and @Tommy's answers are good solutions; this is an example that includes lists and data.frames.
    r is a list of lists and data.frames (dput(r[[1]]) is at the end).

    names(r)
    [1] "todos"  "random"
    r[[1]][1]
    $F0
    $F0$rst1
       algo  rst  prec  rorac prPo pos
    1  Mean 56.4 0.450 25.872 91.2 239
    6  gbm1 41.8 0.438 22.595 77.4 239
    4  GAM2 37.2 0.512 43.256 50.0 172
    7  gbm2 36.8 0.422 18.039 85.4 239
    11 ran2 35.0 0.442 23.810 61.5 239
    2  nai1 29.8 0.544 52.281 33.1 172
    5  GAM3 28.8 0.403 12.743 94.6 239
    3  GAM1 21.8 0.405 13.374 68.2 239
    10 ran1 19.4 0.406 13.566 59.8 239
    9  svm2 14.0 0.385  7.692 76.2 239
    8  svm1  0.8 0.359  0.471 71.1 239
    
    $F0$rst5
       algo  rst  prec  rorac prPo pos
    1  Mean 52.4 0.441 23.604 92.9 239
    7  gbm2 46.4 0.440 23.200 83.7 239
    6  gbm1 31.2 0.416 16.421 79.5 239
    5  GAM3 28.8 0.403 12.743 94.6 239
    4  GAM2 28.2 0.481 34.815 47.1 172
    11 ran2 26.6 0.422 18.095 61.5 239
    2  nai1 23.6 0.519 45.385 30.2 172
    3  GAM1 20.6 0.398 11.381 75.7 239
    9  svm2 14.4 0.386  8.182 73.6 239
    10 ran1 14.0 0.390  9.091 64.4 239
    8  svm1  6.2 0.370  3.584 72.4 239
    

    The objective is to unlist all the lists, putting the sequence of list names in a column to identify the case.

    r <- unlist(unlist(r, recursive = FALSE), recursive = FALSE)
    names(r)
    [1] "todos.F0.rst1"  "todos.F0.rst5"  "todos.T0.rst1"  "todos.T0.rst5"  "random.F0.rst1" "random.F0.rst5"
    [7] "random.T0.rst1" "random.T0.rst5"
    

    Unlist the lists, but not the data.frames.

    ra=Reduce(rbind,Map(function(x,y) cbind(case=x,y),names(r),r))
    

    Map puts the sequence of names in a column. Reduce joins all the data.frames.

    head(ra)
                case algo  rst  prec  rorac prPo pos
    1  todos.F0.rst1 Mean 56.4 0.450 25.872 91.2 239
    6  todos.F0.rst1 gbm1 41.8 0.438 22.595 77.4 239
    4  todos.F0.rst1 GAM2 37.2 0.512 43.256 50.0 172
    7  todos.F0.rst1 gbm2 36.8 0.422 18.039 85.4 239
    11 todos.F0.rst1 ran2 35.0 0.442 23.810 61.5 239
    2  todos.F0.rst1 nai1 29.8 0.544 52.281 33.1 172
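
    The same pattern on a much smaller input may be easier to follow. This is a minimal sketch with made-up data frames (only one level of nesting, so a single unlist() call is enough), not the data shown above:

    ```r
    # Hypothetical nested list: two groups, each holding one data.frame.
    r <- list(
      todos  = list(F0 = data.frame(algo = c("Mean", "gbm1"), rst = c(56.4, 41.8))),
      random = list(F0 = data.frame(algo = c("Mean", "gbm2"), rst = c(22.6, 3.4)))
    )

    # Flatten one level only; the data.frames themselves stay intact and
    # the nested names are joined with a dot.
    flat <- unlist(r, recursive = FALSE)

    # Map() pairs each name with its data.frame and adds it as a column;
    # Reduce(rbind, ...) stacks the pieces into one data.frame.
    ra <- Reduce(rbind, Map(function(x, y) cbind(case = x, y), names(flat), flat))

    print(ra)
    ```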
    

    P.S. r[[1]]:

        structure(list(F0 = structure(list(rst1 = structure(list(algo = c("Mean", 
        "gbm1", "GAM2", "gbm2", "ran2", "nai1", "GAM3", "GAM1", "ran1", 
        "svm2", "svm1"), rst = c(56.4, 41.8, 37.2, 36.8, 35, 29.8, 28.8, 
        21.8, 19.4, 14, 0.8), prec = c(0.45, 0.438, 0.512, 0.422, 0.442, 
        0.544, 0.403, 0.405, 0.406, 0.385, 0.359), rorac = c(25.872, 
        22.595, 43.256, 18.039, 23.81, 52.281, 12.743, 13.374, 13.566, 
        7.692, 0.471), prPo = c(91.2, 77.4, 50, 85.4, 61.5, 33.1, 94.6, 
        68.2, 59.8, 76.2, 71.1), pos = c(239L, 239L, 172L, 239L, 239L, 
        172L, 239L, 239L, 239L, 239L, 239L)), .Names = c("algo", "rst", 
        "prec", "rorac", "prPo", "pos"), row.names = c(1L, 6L, 4L, 7L, 
        11L, 2L, 5L, 3L, 10L, 9L, 8L), class = "data.frame"), rst5 = structure(list(
            algo = c("Mean", "gbm2", "gbm1", "GAM3", "GAM2", "ran2", 
            "nai1", "GAM1", "svm2", "ran1", "svm1"), rst = c(52.4, 46.4, 
            31.2, 28.8, 28.2, 26.6, 23.6, 20.6, 14.4, 14, 6.2), prec = c(0.441, 
            0.44, 0.416, 0.403, 0.481, 0.422, 0.519, 0.398, 0.386, 0.39, 
            0.37), rorac = c(23.604, 23.2, 16.421, 12.743, 34.815, 18.095, 
            45.385, 11.381, 8.182, 9.091, 3.584), prPo = c(92.9, 83.7, 
            79.5, 94.6, 47.1, 61.5, 30.2, 75.7, 73.6, 64.4, 72.4), pos = c(239L, 
            239L, 239L, 239L, 172L, 239L, 172L, 239L, 239L, 239L, 239L
            )), .Names = c("algo", "rst", "prec", "rorac", "prPo", "pos"
        ), row.names = c(1L, 7L, 6L, 5L, 4L, 11L, 2L, 3L, 9L, 10L, 8L
        ), class = "data.frame")), .Names = c("rst1", "rst5")), T0 = structure(list(
            rst1 = structure(list(algo = c("Mean", "ran1", "GAM1", "GAM2", 
            "gbm1", "svm1", "nai1", "gbm2", "svm2", "ran2"), rst = c(22.6, 
            19.4, 13.6, 10.2, 9.6, 8, 5.6, 3.4, -0.4, -0.6), prec = c(0.478, 
            0.452, 0.5, 0.421, 0.423, 0.833, 0.429, 0.373, 0.355, 0.356
            ), rorac = c(33.731, 26.575, 40, 17.895, 18.462, 133.333, 
            20, 4.533, -0.526, -0.368), prPo = c(34.4, 52.1, 24.3, 40.7, 
            37.1, 3.1, 14.4, 53.6, 54.3, 116.4), pos = c(195L, 140L, 
            140L, 140L, 140L, 195L, 195L, 140L, 140L, 140L)), .Names = c("algo", 
            "rst", "prec", "rorac", "prPo", "pos"), row.names = c(1L, 
            9L, 3L, 4L, 5L, 7L, 2L, 6L, 8L, 10L), class = "data.frame"), 
            rst5 = structure(list(algo = c("gbm1", "ran1", "Mean", "GAM1", 
            "GAM2", "svm1", "nai1", "svm2", "gbm2", "ran2"), rst = c(17.6, 
            16.4, 15, 12.8, 9, 6.2, 5.8, -2.6, -3, -9.2), prec = c(0.466, 
            0.434, 0.435, 0.5, 0.41, 0.8, 0.44, 0.346, 0.345, 0.337), 
                rorac = c(30.345, 21.579, 21.739, 40, 14.754, 124, 23.2, 
                -3.21, -3.448, -5.542), prPo = c(41.4, 54.3, 35.4, 22.9, 
                43.6, 2.6, 12.8, 57.9, 62.1, 118.6), pos = c(140L, 140L, 
                195L, 140L, 140L, 195L, 195L, 140L, 140L, 140L)), .Names = c("algo", 
            "rst", "prec", "rorac", "prPo", "pos"), row.names = c(5L, 
            9L, 1L, 3L, 4L, 7L, 2L, 8L, 6L, 10L), class = "data.frame")), .Names = c("rst1", 
        "rst5"))), .Names = c("F0", "T0"))
    
  • 2020-11-22 15:32

    Let's say we want to calculate the length of each element.

    mylist <- list(a=1:4,b=2:9,c=10:20)
    mylist
    
    $a
    [1] 1 2 3 4
    
    $b
    [1] 2 3 4 5 6 7 8 9
    
    $c
     [1] 10 11 12 13 14 15 16 17 18 19 20
    

    If the aim is just to label the resulting elements, then lapply(mylist,length) or the call below works.

    sapply(mylist,length,USE.NAMES=TRUE)
    
     a  b  c 
     4  8 11 
    

    If the aim is to use the label inside the function, then mapply() is useful: it loops over two objects, the list elements and the list names.

    fun <- function(x,y) paste0(length(x),"_",y)
    mapply(fun,mylist,names(mylist))
    
         a      b      c 
     "4_a"  "8_b" "11_c" 
    
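
    If a list is wanted back, as with lapply(), rather than a simplified vector, Map() follows the same two-argument pattern. A minimal sketch reusing the example list from this answer; the wording of the pasted string is just for illustration:

    ```r
    mylist <- list(a = 1:4, b = 2:9, c = 10:20)

    # Map() walks the elements and their names in parallel and always
    # returns a list, named after its first vector argument.
    res <- Map(function(elem, name) paste0(name, " has ", length(elem), " values"),
               mylist, names(mylist))

    print(res)
    ```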