Specific pandas columns as arguments in new column of df.apply outputs

渐次进展 2021-01-24 18:37

Given a pandas DataFrame as below:

import pandas as pd
from sklearn.metrics import mean_squared_error

    df = pd.DataFrame.from_dict(
         {'row': ['a

2 answers

鱼传尺愫 (original poster) 2021-01-24 19:04

Approach #1

For performance, one approach is to use the underlying array data along with NumPy ufuncs, slicing the two blocks of columns so that those ufuncs operate on them in a vectorized manner, like so -

import numpy as np

a = df.values  # underlying 2-D NumPy array
rmse_out = np.sqrt(((a[:,0:3] - a[:,3:6])**2).mean(1))  # one RMSE per row
df['rmse_out'] = rmse_out
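
Since the original DataFrame is cut off above, here is a minimal self-contained sketch of the same idea on made-up data. The column values and the row_rmse helper are assumptions for illustration only. It checks the vectorized result against a plain row-wise df.apply.

```python
import numpy as np
import pandas as pd

# Hypothetical data: columns a, b, c form one block and d, e, y the other.
df = pd.DataFrame({
    'a': [1.0, 2.0], 'b': [2.0, 4.0], 'c': [3.0, 6.0],
    'd': [1.0, 2.0], 'e': [2.0, 1.0], 'y': [6.0, 6.0],
})

a = df.to_numpy()  # same array that df.values returns

# Vectorized: one RMSE per row, no Python-level loop over rows.
rmse_vec = np.sqrt(((a[:, 0:3] - a[:, 3:6]) ** 2).mean(axis=1))

# Row-wise reference via df.apply, used here only as a cross-check.
def row_rmse(row):
    left = row[['a', 'b', 'c']].to_numpy(dtype=float)
    right = row[['d', 'e', 'y']].to_numpy(dtype=float)
    return np.sqrt(np.mean((left - right) ** 2))

rmse_apply = df.apply(row_rmse, axis=1).to_numpy()

df['rmse_out'] = rmse_vec
```

Both paths should give the same numbers. The vectorized one avoids calling a Python function once per row, which is where the speedup would be expected to come from.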


Approach #2

An alternative, faster way to compute the RMSE values uses np.einsum to replace the square-and-sum step -

diffs = a[:,0:3] - a[:,3:6]
rmse_out = np.sqrt(np.einsum('ij,ij->i',diffs,diffs)/3.0)
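
To make the einsum subscripts concrete, the sketch below runs the same computation on a random array standing in for df.values (the array and its shape are assumptions) and compares it with the mean-based version. It divides by diffs.shape[1] rather than a hard-coded 3.0 so the block width can change.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(1000, 6))  # hypothetical stand-in for df.values

diffs = a[:, 0:3] - a[:, 3:6]

# 'ij,ij->i' multiplies diffs by itself elementwise and sums over j (the
# columns), which yields the per-row sum of squares.
sum_sq = np.einsum('ij,ij->i', diffs, diffs)
rmse_einsum = np.sqrt(sum_sq / diffs.shape[1])

# Reference: square, then take the row-wise mean.
rmse_mean = np.sqrt((diffs ** 2).mean(axis=1))
```

Whether einsum is actually faster depends on array size and NumPy version, so it is worth timing both on your own data.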


Approach #3

Another way to compute rmse_out uses the identity:


  (a - b)^2 = a^2 + b^2 - 2ab


Applied to each row, the sum of squared differences splits into three row-wise sums. First, extract the slices:

s0 = a[:,0:3]
s1 = a[:,3:6]


Then, rmse_out would be -

np.sqrt(((s0**2).sum(1) + (s1**2).sum(1) - (2*s0*s1).sum(1))/3.0)


which with einsum becomes -

np.sqrt((np.einsum('ij,ij->i',s0,s0) + \
         np.einsum('ij,ij->i',s1,s1) - \
       2*np.einsum('ij,ij->i',s0,s1))/3.0)
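
As a sanity check on the expansion, this sketch compares the expanded form with the direct difference on random data (s0 and s1 here are made-up arrays). The np.maximum clip is an addition not present in the answer above: when the two blocks are nearly equal, rounding in a^2 + b^2 - 2ab can leave a tiny negative value, and sqrt would then return nan.

```python
import numpy as np

rng = np.random.default_rng(1)
s0 = rng.normal(size=(500, 3))  # hypothetical first block of columns
s1 = rng.normal(size=(500, 3))  # hypothetical second block of columns
n = s0.shape[1]

# The three row-wise sums from (a - b)^2 = a^2 + b^2 - 2ab.
sq0 = np.einsum('ij,ij->i', s0, s0)
sq1 = np.einsum('ij,ij->i', s1, s1)
cross = np.einsum('ij,ij->i', s0, s1)

# Added safeguard (an assumption, not in the original): clip at zero so
# rounding error cannot produce a negative argument for sqrt.
sum_sq = np.maximum(sq0 + sq1 - 2 * cross, 0.0)
rmse_expanded = np.sqrt(sum_sq / n)

# Reference: subtract first, then square and average.
rmse_direct = np.sqrt(((s0 - s1) ** 2).mean(axis=1))
```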




Getting respective column indices 

If you are not sure whether the columns a, b, ... appear in that order, their positions can be looked up by name with a column_index helper. Note that column_index is a user-defined function, not part of pandas or NumPy.

Thus a[:,0:3] would be replaced by a[:,column_index(df, ['a','b','c'])] and a[:,3:6] by a[:,column_index(df, ['d','e','y'])].
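
The definition of column_index is not included in this post. One possible sketch, assuming unique column labels, wraps the pandas Index.get_indexer method. The function body and the sample frame below are illustrative assumptions, not the original helper.

```python
import numpy as np
import pandas as pd

def column_index(df, query_cols):
    """Return the integer positions of query_cols in df.columns.

    Hypothetical helper; assumes the column labels are unique.
    """
    idx = df.columns.get_indexer(query_cols)
    if (idx == -1).any():  # get_indexer marks missing labels with -1
        missing = [c for c, i in zip(query_cols, idx) if i == -1]
        raise KeyError(f"columns not found: {missing}")
    return idx

# Hypothetical frame whose columns are deliberately out of order.
df = pd.DataFrame({
    'y': [6.0, 6.0], 'a': [1.0, 2.0], 'e': [2.0, 1.0],
    'b': [2.0, 4.0], 'd': [1.0, 2.0], 'c': [3.0, 6.0],
})

a = df.to_numpy()
left = a[:, column_index(df, ['a', 'b', 'c'])]
right = a[:, column_index(df, ['d', 'e', 'y'])]
rmse_out = np.sqrt(((left - right) ** 2).mean(axis=1))
```

Looking positions up by name keeps the array-based approaches correct even if the DataFrame's column order changes.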
    
             
                                                        
            
            
              
                