Selecting top N rows for each group based on value in column

南方客 2021-01-15 00:49

I have a data frame like the one below:

x<-c(3,2,1,8,7,11,10,9,7,5,4)
y<-c("a","a","a", "b","b","c","c","c","c","c","c")
z<-c(2,2,2,1,1         


        
4 Answers
  • 2021-01-15 01:02

    A solution with base R:

    # Split df by y, sort each piece by x in decreasing order, keep its first z rows
    # (z is constant within a group), and rbind the pieces back together:
    do.call(rbind, 
            lapply(split(df, df$y), 
                   function(df1) df1[order(df1$x, decreasing=TRUE), ][1:unique(df1$z), ]))
    #     x y z
    #a.1  3 a 2
    #a.2  2 a 2
    #b    8 b 1
    #c.6 11 c 3
    #c.7 10 c 3
    #c.8  9 c 3
    

    EDIT:
    A much more direct way (still in base R), suggested in a comment by @mt1022:

    df[ave(1:nrow(df), df$y, FUN = seq_along) <= df$z, ]
    #   x y z
    #1  3 a 2
    #2  2 a 2
    #4  8 b 1
    #6 11 c 3
    #7 10 c 3
    #8  9 c 3
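
The `ave()` one-liner keeps the first `z` rows of each group in their existing order, which matches the top `z` values here because `x` is already sorted in decreasing order within each `y`. A minimal sketch of a variant that ranks `x` inside each group, so it should not depend on row order; the names `shuffled`, `rank_in_group` and `top` are made up for illustration, and the shuffle is only there to demonstrate the point:

```r
# Same example data as in the question.
df <- data.frame(
  x = c(3, 2, 1, 8, 7, 11, 10, 9, 7, 5, 4),
  y = c("a", "a", "a", "b", "b", "c", "c", "c", "c", "c", "c"),
  z = c(2, 2, 2, 1, 1, 3, 3, 3, 3, 3, 3)
)

# Shuffle the rows so the result cannot rely on the original order.
set.seed(1)
shuffled <- df[sample(nrow(df)), ]

# Rank x within each y group, largest first (ties broken by position).
rank_in_group <- ave(-shuffled$x, shuffled$y,
                     FUN = function(v) rank(v, ties.method = "first"))

# Keep rows whose rank is within the group's z, then sort for display.
top <- shuffled[rank_in_group <= shuffled$z, ]
top <- top[order(top$y, -top$x), ]
top
```

Ranking on `-x` puts the largest value first; `ties.method = "first"` is one possible way to break ties, chosen here so that exactly `z` rows survive per group.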
    
  • 2021-01-15 01:05

    One approach with data.table:

    library(data.table)
    setDT(df)
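    # Flag the first z rows of each y group, keep the flagged rows, then drop the flag column
    df[, .(inc = seq_len(.N) <= z, x, z), by = .(y)][inc == TRUE, -2]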
    #   y  x z
    #1: a  3 2
    #2: a  2 2
    #3: b  8 1
    #4: c 11 3
    #5: c 10 3
    #6: c  9 3
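
As with any approach built on `seq_len(.N)`, the result above depends on the rows already being sorted by `x` within each group. A possible alternative, sketched under the assumption that `z` is constant within each `y`, sorts first and then takes the head of `.SD`; the names `dt` and `top` are illustrative only:

```r
library(data.table)

# Same example data as in the question.
dt <- data.table(
  x = c(3, 2, 1, 8, 7, 11, 10, 9, 7, 5, 4),
  y = c("a", "a", "a", "b", "b", "c", "c", "c", "c", "c", "c"),
  z = c(2, 2, 2, 1, 1, 3, 3, 3, 3, 3, 3)
)

# Sort by x (largest first), then take the first z rows of each y group.
# z[1] is used because z is assumed to be constant within a group.
top <- dt[order(-x), head(.SD, z[1]), by = y]

# Groups come back in order of first appearance after sorting,
# so reorder by y for a stable display.
top <- top[order(y, -x)]
top
```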
    
  • 2021-01-15 01:10

    A solution with dplyr that uses do:

    library(dplyr)

    # For each y group, keep the first z rows (z is constant within a group)
    df %>%
       group_by(y) %>%
       do(head(., as.numeric(unique(.$z))))
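
Since `do()` is marked as superseded in recent dplyr releases, the same idea can probably be written with `arrange()` and `filter()`. This is a sketch rather than a drop-in replacement: it sorts explicitly by `x`, which the `do()` version does not, and `top` is an illustrative name:

```r
library(dplyr)

# Same example data as in the question.
df <- data.frame(
  x = c(3, 2, 1, 8, 7, 11, 10, 9, 7, 5, 4),
  y = c("a", "a", "a", "b", "b", "c", "c", "c", "c", "c", "c"),
  z = c(2, 2, 2, 1, 1, 3, 3, 3, 3, 3, 3)
)

top <- df %>%
  group_by(y) %>%
  arrange(desc(x), .by_group = TRUE) %>%  # largest x first within each y
  filter(row_number() <= z) %>%           # keep the first z rows per group
  ungroup()
top
```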
    
  • 2021-01-15 01:15

    I'm posting the dplyr solution I was looking for. It is based on @HNSKD's answer:

    library(dplyr)
    x<-c(3,2,1,8,7,11,10,9,7,5,4)
    y<-c("a","a","a", "b","b","c","c","c","c","c","c")
    z<-c(2,2,2,1,1,3,3,3,3,3,3)
    
    df<-data.frame(x,y,z)
    
    df %>% group_by(y) %>% slice(1:2)
    

    This returns the first two rows for each y:

    # A tibble: 6 x 3
    # Groups:   y [3]
          x y         z
      <dbl> <fct> <dbl>
    1     3 a         2
    2     2 a         2
    3     8 b         1
    4     7 b         1
    5    11 c         3
    6    10 c         3
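
Note that `slice(1:2)` picks rows by position, so it gives the two largest `x` values only because the data is already sorted within each group. If the rows might arrive unsorted, `slice_max()` (available in dplyr 1.0.0 and later) is one option; the sketch below assumes a fixed `n = 2` as in the answer above, and `top2` is an illustrative name:

```r
library(dplyr)

# Same example data as above.
df <- data.frame(
  x = c(3, 2, 1, 8, 7, 11, 10, 9, 7, 5, 4),
  y = c("a", "a", "a", "b", "b", "c", "c", "c", "c", "c", "c"),
  z = c(2, 2, 2, 1, 1, 3, 3, 3, 3, 3, 3)
)

# Take the two largest x values per y group, regardless of row order.
# with_ties = FALSE caps the result at two rows per group.
top2 <- df %>%
  group_by(y) %>%
  slice_max(x, n = 2, with_ties = FALSE) %>%
  ungroup()
top2
```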
    