R group by and aggregate - return relative rank within groups using plyr


UPDATE: I have a data frame 'test' that looks like this:

    session_id  seller_feedback_score
1   1   282470
2   1   275258
3   1   275258
4   1   275258
5


                      
2 Answers

粉色の甜心, 2021-01-16 23:16

One option:

library(dplyr)
df %>% group_by(session_id) %>% 
  mutate(rank = dense_rank(-seller_feedback_score))


dense_rank is "like min_rank, but with no gaps between ranks" so I negated the seller_feedback_score column in order to turn it into something like max_rank (which doesn't exist in dplyr).

If you want the ranks with gaps so that you reach 21 for the lowest in your case, you can use min_rank instead of dense_rank:

library(dplyr)
df %>% group_by(session_id) %>% 
    mutate(rank = min_rank(-seller_feedback_score))
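
To see the difference between the two rankings on a small case, here is a base-R sketch that needs no packages. The toy data frame and its values are made up for illustration, and the ave()/match() approach is only one way to mimic what dense_rank and min_rank return on the negated scores; it is not how dplyr implements them.

```r
# Hypothetical toy data: two sessions, with a tie in session 1.
toy <- data.frame(
  session_id            = c(1, 1, 1, 1, 2, 2),
  seller_feedback_score = c(50, 40, 40, 30, 10, 20)
)

# Negate the score so the highest score gets rank 1.
neg <- -toy$seller_feedback_score

# Ranks with gaps (the min_rank behaviour): ties share the lowest rank.
toy$rank_min <- ave(neg, toy$session_id,
                    FUN = function(x) rank(x, ties.method = "min"))

# Ranks without gaps (the dense_rank behaviour): position among the
# sorted distinct values of the group.
toy$rank_dense <- ave(neg, toy$session_id,
                      FUN = function(x) match(x, sort(unique(x))))

print(toy)
```

In session 1 the two tied scores of 40 both get rank 2 either way; the score of 30 then gets rank 4 with gaps and rank 3 without.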

                                                                        
                                                        
            
            
              
                
花落未央, 2021-01-16 23:22

From data.table 1.9.5 on, the frank() function (for "fast rank") is exported. Its interface is similar to base::rank, but it implements dense ranking in addition to all the ranking methods base::rank provides, and it works on lists as well as vectors. You can install it by following the instructions here.

require(data.table) ## 1.9.5+
setDT(df)[, 
    rank := frank(-seller_feedback_score, ties.method="dense"), 
by=session_id]


As @David points out, perhaps what you want is ties.method = "first" or "min"? I'm not sure.

setDT(df)[, 
    rank := frank(-seller_feedback_score, ties.method="first"), ## or "min" or "max"
by=session_id]
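
Going by the description above, frank() accepts the ranking methods that base::rank provides, so the effect of each ties.method option can be previewed with base R alone. A small sketch on made-up scores (the vector is hypothetical, and this uses rank(), not frank()):

```r
scores <- c(50, 40, 40, 30)   # hypothetical values with one tie

# Negate so that the highest score is ranked first.
r_first <- rank(-scores, ties.method = "first")  # ties broken by position
r_min   <- rank(-scores, ties.method = "min")    # ties share the lowest rank
r_max   <- rank(-scores, ties.method = "max")    # ties share the highest rank

print(rbind(first = r_first, min = r_min, max = r_max))
```

With "first" the two tied scores get ranks 2 and 3, with "min" both get 2, and with "max" both get 3; the lowest score is ranked 4 in all three cases.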


In any case, it should be plenty fast. Here's a benchmark against base R:

require(data.table)
set.seed(45L)
val = sample(1e4, 1e7, TRUE)
system.time(ans1 <- rank(val, ties.method = "min"))
#    user  system elapsed 
#  16.771   0.199  17.035 
system.time(ans2 <- frank(val, ties.method = "min"))
#    user  system elapsed 
#   0.532   0.013   0.550 
identical(ans1, ans2) # [1] TRUE

                                                                        
                                                        
            
            
              
                