R group by and aggregate - return relative rank within groups using plyr

时光说笑 2021-01-16 22:50

UPDATE: I have a data frame 'test' that looks like this:

        session_id  seller_feedback_score
    1            1                 282470
    2            1                 275258
    3            1                 275258
    4            1                 275258
    5


        
2 Answers
  • 2021-01-16 23:16

    One option:

    library(dplyr)
    df %>% group_by(session_id) %>% 
      mutate(rank = dense_rank(-seller_feedback_score))
    

    dense_rank is "like min_rank, but with no gaps between ranks". Both functions rank in ascending order, so I negated the seller_feedback_score column to give the highest score rank 1, which turns it into something like a max_rank (which doesn't exist in dplyr).

    If you want the ranks with gaps so that you reach 21 for the lowest in your case, you can use min_rank instead of dense_rank:

    library(dplyr)
    df %>% group_by(session_id) %>% 
        mutate(rank = min_rank(-seller_feedback_score))
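
    To make the difference between the two functions concrete, here is a minimal sketch on a made-up data frame. The name `toy` and its values are hypothetical and are not the asker's data; the extra `ungroup()` call is only there so the result is a plain, ungrouped table.

    ```r
    library(dplyr)

    # Hypothetical toy data: session 1 contains a tie, session 2 does not
    toy <- data.frame(
      session_id = c(1, 1, 1, 1, 2, 2),
      seller_feedback_score = c(282470, 275258, 275258, 100, 50, 70)
    )

    ranked <- toy %>%
      group_by(session_id) %>%
      mutate(
        # ties share a rank and the next rank follows immediately (no gaps)
        rank_dense = dense_rank(-seller_feedback_score),
        # ties share a rank and the next rank skips ahead (gaps)
        rank_min = min_rank(-seller_feedback_score)
      ) %>%
      ungroup()

    ranked$rank_dense  # 1 2 2 3 2 1
    ranked$rank_min    # 1 2 2 4 2 1
    ```

    In session 1 the lowest score gets rank 3 with dense_rank but rank 4 with min_rank, because min_rank counts both tied rows before it.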
    
  • 2021-01-16 23:22

    From data.table 1.9.5 on, the frank() function (for fast rank) is exported. Its interface is similar to base::rank, but it implements dense ranking in addition to all the ranking methods base::rank provides, and it works on lists as well as vectors. You can install it by following the instructions here.

    require(data.table) ## 1.9.5+
    setDT(df)[, 
        rank := frank(-seller_feedback_score, ties.method="dense"), 
    by=session_id]
    

    As @David points out, perhaps what you want is ties.method = "first" or "min"? I'm not sure.

    setDT(df)[, 
        rank := frank(-seller_feedback_score, ties.method="first"), ## or "min" or "max"
    by=session_id]
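
    If it helps to choose between them, the following sketch puts several ties.method values side by side on a made-up table. The name `toy`, its values, and the rank_* column names are hypothetical and only serve as illustration.

    ```r
    library(data.table)

    # Hypothetical toy data: one session with a tie on the second-highest score
    toy <- data.table(
      session_id = c(1, 1, 1, 1),
      seller_feedback_score = c(282470, 275258, 275258, 100)
    )

    # Add one rank column per ties.method, computed within each session
    toy[, `:=`(
      rank_dense = frank(-seller_feedback_score, ties.method = "dense"),
      rank_min   = frank(-seller_feedback_score, ties.method = "min"),
      rank_max   = frank(-seller_feedback_score, ties.method = "max"),
      rank_first = frank(-seller_feedback_score, ties.method = "first")
    ), by = session_id]

    toy$rank_dense  # 1 2 2 3
    toy$rank_min    # 1 2 2 4
    toy$rank_max    # 1 3 3 4
    toy$rank_first  # 1 2 3 4
    ```

    Only "first" breaks the tie (by order of appearance); the other three give tied rows the same rank and differ in which rank that is and what comes next.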
    

    In any case, it should be plenty fast. Here's a benchmark against base R:

    require(data.table)
    set.seed(45L)
    val = sample(1e4, 1e7, TRUE)
    system.time(ans1 <- rank(val, ties.method = "min"))
    #    user  system elapsed 
    #  16.771   0.199  17.035 
    system.time(ans2 <- frank(val, ties.method = "min"))
    #    user  system elapsed 
    #   0.532   0.013   0.550 
    identical(ans1, ans2) # [1] TRUE
    