Add column to the end of Pandas DataFrame containing average of previous data


I have a DataFrame ave_data that contains the following:

ave_data

Time        F7           F8            F9  
00:00:00    43.005593    -56.509746


                      
4 Answers

[愿得一人] 2021-02-12 14:31

@LaangeHaare or anyone else who is curious, I just tested it and the copy part of the accepted answer seems unnecessary (maybe I am missing something...)

So you could simplify this to:

df['average'] = df.mean(numeric_only=True, axis=1)


I would have simply added this as a comment, but I don't have the reputation.
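
A minimal sketch of what copy() changes, using sample values from the rows shown in this thread. The names original, summary and untouched are made up for illustration. As far as I can tell, the copy only matters when the original frame has to stay as it was:

```python
import pandas as pd

# Sample values copied from the rows quoted in this thread.
original = pd.DataFrame({
    "Time": ["00:00:00", "01:00:00", "02:00:00"],
    "F7": [43.005593, 55.114918, 63.990762],
    "F8": [-56.509746, -59.173852, -64.699492],
    "F9": [25.271271, 31.849262, 52.426017],
})

# With copy(): the new column is added to the copy only.
summary = original.copy()
summary["average"] = summary.mean(numeric_only=True, axis=1)
untouched = "average" not in original.columns
print(untouched)  # the original frame has no 'average' column yet

# Without copy(): the same assignment adds the column to the frame itself.
original["average"] = original.mean(numeric_only=True, axis=1)
print(list(original.columns))
```

Both routes produce the same values; the only difference is which object ends up holding the new column.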
                                                                        
                                                        
            
            
              
                
孤城傲影 2021-02-12 14:36

df.assign is designed for exactly this purpose. It returns a new DataFrame, so the original is left unchanged and no SettingWithCopyWarning is raised. It works as follows:

data_with_ave = ave_data.assign(average=ave_data.mean(axis=1, numeric_only=True))


This function can also create multiple columns at the same time:

data_with_ave = ave_data.assign(
    average=ave_data.mean(axis=1, numeric_only=True),
    median=ave_data.median(axis=1, numeric_only=True),
)


As of pandas 0.23 (running on Python 3.6 or later), you can even reference a column just created to create another, by passing a callable that receives the DataFrame built so far:

data_with_ave = ave_data.assign(
    average=ave_data.mean(axis=1, numeric_only=True),
    isLarge=lambda df: df['average'] > 10,
)
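
For anyone who wants to try this, here is a self-contained sketch. The sample values are taken from the rows quoted elsewhere in this thread, and the threshold of 10 is the one used above; everything else is plain pandas.

```python
import pandas as pd

# Sample values copied from the rows quoted in this thread.
ave_data = pd.DataFrame({
    "Time": ["00:00:00", "01:00:00", "02:00:00"],
    "F7": [43.005593, 55.114918, 63.990762],
    "F8": [-56.509746, -59.173852, -64.699492],
    "F9": [25.271271, 31.849262, 52.426017],
})

data_with_ave = ave_data.assign(
    # Row-wise mean of the numeric columns (Time is skipped).
    average=ave_data.mean(axis=1, numeric_only=True),
    # The callable is evaluated after 'average' exists, so it can use it.
    isLarge=lambda df: df["average"] > 10,
)

print(data_with_ave[["average", "isLarge"]])
print(list(ave_data.columns))  # assign() did not modify ave_data
```

Only the last row has an average above 10, so isLarge is True there and False for the other two rows.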

                                                                        
                                                        
            
            
              
                
心在旅途 2021-02-12 14:50

You can take a copy of your df using copy(), then call mean with axis=1 and numeric_only=True so that the mean is calculated row-wise and non-numeric columns are ignored. A column assigned this way is always added at the end:

In [68]:

summary_ave_data = df.copy()
summary_ave_data['average'] = summary_ave_data.mean(numeric_only=True, axis=1)
summary_ave_data
Out[68]:
                 Time         F7         F8         F9    average
0 2015-07-29 00:00:00  43.005593 -56.509746  25.271271   3.922373
1 2015-07-29 01:00:00  55.114918 -59.173852  31.849262   9.263443
2 2015-07-29 02:00:00  63.990762 -64.699492  52.426017  17.239096
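
To reproduce the session above without the original data file, here is a sketch that rebuilds df from the three rows shown in the output. It assumes the Time column holds datetimes, which is how the output appears to display it.

```python
import pandas as pd

# Rebuild the frame from the three rows printed in the output above.
df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2015-07-29 00:00:00",
        "2015-07-29 01:00:00",
        "2015-07-29 02:00:00",
    ]),
    "F7": [43.005593, 55.114918, 63.990762],
    "F8": [-56.509746, -59.173852, -64.699492],
    "F9": [25.271271, 31.849262, 52.426017],
})

summary_ave_data = df.copy()
# numeric_only=True leaves the Time column out of the row-wise mean.
summary_ave_data["average"] = summary_ave_data.mean(numeric_only=True, axis=1)

print(summary_ave_data)
print(summary_ave_data.columns[-1])  # the new column is the last one
```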

                                                                        
                                                        
            
            
              
                
别那么骄傲 2021-02-12 14:52

More generally, if you want to average only specific columns, you can use:

df['average'] = df[['F7','F8']].mean(axis=1)


where axis=1 makes the calculation row-wise: for each row, the values of the selected columns are averaged and the result is stored in the 'average' column.

Then you may want to sort by this column:

df.sort_values(by='average', ascending=False, inplace=True)


where inplace=True sorts the DataFrame itself instead of returning a sorted copy.
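
Put together, the two steps might look like the sketch below. The sample values come from the rows quoted elsewhere in this thread, and the name result is only there to show what sort_values returns when inplace=True is used.

```python
import pandas as pd

# Sample values copied from the rows quoted in this thread.
df = pd.DataFrame({
    "F7": [43.005593, 55.114918, 63.990762],
    "F8": [-56.509746, -59.173852, -64.699492],
    "F9": [25.271271, 31.849262, 52.426017],
})

# Average only F7 and F8 for each row; F9 is left out.
df["average"] = df[["F7", "F8"]].mean(axis=1)

# inplace=True reorders df itself, and the call returns None.
result = df.sort_values(by="average", ascending=False, inplace=True)

print(result)             # None
print(df.index.tolist())  # [2, 1, 0]
```

If you would rather keep the original row order around, drop inplace=True and store the returned DataFrame in a new variable.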
                                                                        
                                                        
            
            
              
                