How to find the last occurrence of a certain observation in grouped data in R?

前端 未结 2 1591
粉色の甜心
粉色の甜心 2021-01-20 22:05

I have data that is grouped using dplyr in R. I would like to find the last occurrence of observations (\'B\') equal to or greater than 1 (1, 2, 3 or 4) in each group (\'A\'

相关标签:
2条回答
  • 2021-01-20 22:32

    Here is a solution that does not require loading external packages:

    df <- structure(list(A = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
    2L, 3L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor"), 
    B = c(2L, 2L, 1L, 0L, 3L, 3L, 3L, 0L, 0L, 1L, 1L, 1L), day = c(1L, 
    2L, 5L, 8L, 1L, 4L, 6L, 7L, 9L, 2L, 3L, 4L)), .Names = c("A", 
    "B", "day"), class = "data.frame", row.names = c(NA, -12L))
    
    x <- split(df, df$A, drop = TRUE)
    
    tp <- lapply(x, function(k) {
      tmp <- k[k$B >0,]
      k$last <- tmp$day[length(tmp$day)]
      k
    })
    
    do.call(rbind, tp)
    
             A B day last
    #a.1  a 2   1    5
    #a.2  a 2   2    5
    #a.3  a 1   5    5
    #a.4  a 0   8    5
    #b.5  b 3   1    6
    #b.6  b 3   4    6
    #b.7  b 3   6    6
    #b.8  b 0   7    6
    #b.9  b 0   9    6
    #c.10 c 1   2    4
    #c.11 c 1   3    4
    #c.12 c 1   4    4
    
    0 讨论(0)
  • 2021-01-20 22:50

    We can try

    library(data.table)
    setDT(df1)[, last := day[tail(which(B>=1),1)] , A]
    df1
    #    A B day last
    # 1: a 2   1    5
    # 2: a 2   2    5
    # 3: a 1   5    5
    # 4: a 0   8    5
    # 5: b 3   1    6
    # 6: b 3   4    6
    # 7: b 3   6    6
    # 8: b 0   7    6
    # 9: b 0   9    6
    #10: c 1   2    4
    #11: c 1   3    4
    #12: c 1   4    4
    

    Or using dplyr

    library(dplyr)
    df1 %>%
       group_by(A) %>%
       mutate(last = day[max(which(B>=1))])
    

    Or use the last function from dplyr (as @docendo discimus suggested)

    df1 %>%
       group_by(A) %>%
       mutate(last= last(day[B>=1]))
    

    For the second question,

    setDT(df1)[, dayafter:= if(all(!!B)) NA_integer_  else 
                 day[max(which(B!=0))+1L] , A]
    #    A B day dayafter
    # 1: a 2   1        8
    # 2: a 2   2        8
    # 3: a 1   5        8
    # 4: a 0   8        8
    # 5: b 3   1        7
    # 6: b 3   4        7
    # 7: b 3   6        7
    # 8: b 0   7        7
    # 9: b 0   9        7
    #10: c 1   2       NA
    #11: c 1   3       NA
    #12: c 1   4       NA
    
    0 讨论(0)
提交回复
热议问题