How to compare two columns of the same dataframe?

后端未结

关注

 3  545

I have a dataframe like this:

 match_id inn1  bat  bowl  runs1 inn2   runs2   is_score_chased
    1     1     KKR  RCB    222  2      82          1
    2


                      
              相关标签:


      
      
        
          3条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  无人及你        
                
              
                            
                2021-01-05 07:27
              
            
            
                                                                       
This is a good case for using apply.

Here there is an example of using apply on two columns.

You can adapt it to your question with this:

def f(x):    
   return 'yes' if x['run1'] > x['run2'] else 'no'

df['is_score_chased'] = df.apply(f, axis=1)


However, I would suggest filling your column with booleans so you can make it more simple

def f(x):    
   return x['run1'] > x['run2']


And also using lambdas so you make it in one line

df['is_score_chased'] = df.apply(lambda x: x['run1'] > x['run2'], axis=1)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  深忆病人        
                
              
                            
                2021-01-05 07:30
              
            
            
                                                                       
You can more easily use np.where. 

high_scores1['is_score_chased'] = np.where(high_scores1['runs1']>=high_scores1['runs2'], 
                                           'yes', 'no')


Typically, if you find yourself trying to iterate explicitly as you were to set a column, there is an abstraction like apply or where which will be both faster and more concise. 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  盖世英雄少女心        
                
              
                            
                2021-01-05 07:45
              
            
            
                                                                       
You need to reference the fact that you are iterating through the dataframe, so;

for i in (high_scores1):
  if(high_scores1['runs1'][i]>=high_scores1['runs2'][i]):
      high_scores1['is_score_chased'][i]='yes'
  else:
      high_scores1['is_score_chased'][i]='no' 

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复