Python pandas rank/sort based on another column that differs for each input


I would like to come up with the 4th column below based on the first three:

user  job    time  Rank
A     print  1559  2
A     print  1540  2
A     edit   1520


                      
2 answers

忘了有多久
2021-01-23 23:37

First, assign a new column which contains the minimum time for user-job pairs:

df['min_time'] = df.groupby(['user', 'job'])['time'].transform('min')


Then group by user and rank those minimum times (dense ranking, so jobs with the same minimum share a rank and no rank is skipped):

df.groupby('user')['min_time'].rank(method='dense').astype(int)
Out: 
0    2
1    2
2    1
3    1
4    3
5    2
6    2
7    2
8    1
9    1
Name: min_time, dtype: int64
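
Putting the two steps together, here is a minimal runnable sketch. The sample frame is copied from the ten-row table shown elsewhere in this thread, and the function name `rank_jobs_by_min_time` is hypothetical, used only for illustration.

```python
import pandas as pd

# Sample data copied from the ten-row table in this thread.
df = pd.DataFrame({
    "user": ["A"] * 5 + ["B"] * 5,
    "job": ["print", "print", "edit", "edit", "deliver",
            "edit", "edit", "edit", "deliver", "deliver"],
    "time": [1559, 1540, 1520, 1523, 9717, 1717, 1716, 1715, 1527, 1524],
})


def rank_jobs_by_min_time(frame):
    """Dense per-user rank of each job, ordered by the job's earliest time."""
    # Broadcast each (user, job) pair's minimum time back onto every row.
    min_time = frame.groupby(["user", "job"])["time"].transform("min")
    # Rank those minimum times within each user; ties share a rank.
    return min_time.groupby(frame["user"]).rank(method="dense").astype(int)


df["Rank"] = rank_jobs_by_min_time(df)
print(df)
```

Because the rank is computed from the per-job minimum rather than from row order, this does not require the frame to be sorted first.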

                                                                        
                                                        
            
            
              
                
无人及你
2021-01-23 23:47

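Alternatively, sort by user and time, flag the rows where the job differs from the previous row of the same user, and take a cumulative sum of those flags within each user. This assumes that, within a user, the times of one job do not interleave with those of another:

df1 = df1.sort_values(['user', 'time'])
# Previous job within the same user; a user's first row is compared with itself
prev_job = df1.groupby('user')['job'].shift().fillna(df1['job'])
# True where the job changes, then count the changes per user, starting at 1
df1['Rank'] = df1['job'] != prev_job
df1['Rank'] = df1.groupby('user')['Rank'].cumsum() + 1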


  user      job  time  Rank
0    A    print  1559   2.0
1    A    print  1540   2.0
2    A     edit  1520   1.0
3    A     edit  1523   1.0
4    A  deliver  9717   3.0
5    B     edit  1717   2.0
6    B     edit  1716   2.0
7    B     edit  1715   2.0
8    B  deliver  1527   1.0
9    B  deliver  1524   1.0
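
For comparison with the table above, here is a self-contained sketch of the sort-and-cumsum idea. It assumes that each job's times form one contiguous block within a user; the helper name `rank_by_job_changes` is hypothetical. The previous job is looked up per user, so every user's first row starts at rank 1 regardless of what the preceding user's last job was.

```python
import pandas as pd

# Sample data copied from the table above (without the Rank column).
df1 = pd.DataFrame({
    "user": ["A"] * 5 + ["B"] * 5,
    "job": ["print", "print", "edit", "edit", "deliver",
            "edit", "edit", "edit", "deliver", "deliver"],
    "time": [1559, 1540, 1520, 1523, 9717, 1717, 1716, 1715, 1527, 1524],
})


def rank_by_job_changes(frame):
    """Rank jobs per user by counting job changes in time order."""
    ordered = frame.sort_values(["user", "time"])
    # Previous row's job within the same user (missing for the first row).
    prev_job = ordered.groupby("user")["job"].shift()
    # A change is a row whose job differs from an existing previous job.
    changed = prev_job.notna() & (ordered["job"] != prev_job)
    # Running count of changes per user, shifted so ranks start at 1.
    rank = changed.astype(int).groupby(ordered["user"]).cumsum() + 1
    # Return the ranks in the caller's original row order.
    return rank.reindex(frame.index)


df1["Rank"] = rank_by_job_changes(df1)
print(df1)
```

If two jobs' times overlap for the same user, this approach assigns a new rank every time the job switches, so it will disagree with a ranking based on each job's minimum time. It is worth checking that assumption on real data before relying on it.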

                                                                        
                                                        
            
            
              
                