Using Pandas groupby to calculate many slopes


Some illustrative data in a DataFrame (MultiIndex) format:

|entity| year |value|
+------+------+-----+
|  a   | 1999 |   2 |
|      | 2004 |   5 |
|  b   | 20


                      
2 Answers

盖世英雄少女心 (2021-01-07 00:48)

A function can be applied to each group with the groupby object's apply method. Here the applied function is linregress from scipy.stats, whose first return value is the slope. Please see below:

In [4]: x = pd.DataFrame({'entity':['a','a','b','b','b'],
                          'year':[1999,2004,2003,2007,2014],
                          'value':[2,5,3,2,7]})

In [5]: x
Out[5]: 
  entity  value  year
0      a      2  1999
1      a      5  2004
2      b      3  2003
3      b      2  2007
4      b      7  2014


In [6]: from scipy.stats import linregress

In [7]: x.groupby('entity').apply(lambda v: linregress(v.year, v.value)[0])
Out[7]: 
entity
a    0.600000
b    0.403226
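
If more than the slope is needed, one possible variation is to have the applied function return a pd.Series, so that apply produces one row per entity. The sketch below assumes the same sample data; the names fit_line and fits are illustrative and not part of the original answer.

```python
import pandas as pd
from scipy.stats import linregress

# Same sample data as in the session above.
df = pd.DataFrame({'entity': ['a', 'a', 'b', 'b', 'b'],
                   'year': [1999, 2004, 2003, 2007, 2014],
                   'value': [2, 5, 3, 2, 7]})


def fit_line(group):
    """Fit value against year for one entity (hypothetical helper)."""
    result = linregress(group['year'], group['value'])
    # Returning a Series makes apply build one column per field.
    return pd.Series({'slope': result.slope, 'intercept': result.intercept})


# Selecting the data columns first keeps the grouping key out of the
# frame that is passed to fit_line.
fits = df.groupby('entity')[['year', 'value']].apply(fit_line)
print(fits)
```

The result is a DataFrame indexed by entity with slope and intercept columns, which is usually easier to join back onto other per-entity data than a bare Series of slopes.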

                                                                        
                                                        
            
            
              
                
耶瑟儿～ (2021-01-07 01:01)

You can do this by iterating over the groupby object. It seems easiest to reset the current index first, so that 'entity' and 'year' become ordinary columns, and then group by 'entity'.

A list comprehension is then a quick way to work through all the groups. Alternatively, use a dict comprehension to keep each entity label alongside its slope; the resulting dict can then be passed to pd.DataFrame.

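import pandas as pd
import scipy.stats

# This is your data, indexed by (entity, year)
test = pd.DataFrame({'entity': ['a', 'a', 'b', 'b', 'b'],
                     'year': [1999, 2004, 2003, 2007, 2014],
                     'value': [2, 5, 3, 2, 7]}).set_index(['entity', 'year'])

# This creates the groups. Grouping by the string 'entity' rather than the
# list ['entity'] keeps the group names as plain labels ('a', 'b') instead of
# 1-tuples in recent pandas versions.
groups = test.reset_index().groupby('entity')

# Process groups by list comprehension: slopes in group order
slope_list = [scipy.stats.linregress(group.year, group.value).slope for name, group in groups]

# Process groups by dict comprehension: entity label -> [slope]
# (each slope is wrapped in a list so the dict can go straight into pd.DataFrame)
slope_dict = {name: [scipy.stats.linregress(group.year, group.value).slope] for name, group in groups}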
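
To illustrate the last step mentioned above, here is a minimal sketch of turning a label-to-slope dict into a labelled table. It assumes the same sample data and stores plain scalars in the dict; the names slope_series and slope_frame are illustrative only.

```python
import pandas as pd
from scipy.stats import linregress

test = pd.DataFrame({'entity': ['a', 'a', 'b', 'b', 'b'],
                     'year': [1999, 2004, 2003, 2007, 2014],
                     'value': [2, 5, 3, 2, 7]}).set_index(['entity', 'year'])

# Dict comprehension over the groups: entity label -> slope (a scalar).
slopes = {name: linregress(group['year'], group['value']).slope
          for name, group in test.reset_index().groupby('entity')}

# A dict of scalars maps directly onto a Series keyed by entity...
slope_series = pd.Series(slopes, name='slope')
slope_series.index.name = 'entity'

# ...which converts to a one-column DataFrame if a frame is preferred.
slope_frame = slope_series.to_frame()
print(slope_frame)
```

Going through a Series avoids having to wrap each scalar in a list, which the plain pd.DataFrame constructor would otherwise require for a dict of scalars.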

                                                                        
                                                        
            
            
              
                