How to compute a new column based on the values of other columns in pandas - python

前端未结


Let's say my data frame contains these data:

>>> df = pd.DataFrame({'a':['l1','l2','l1','l2','l1','l2'],
                       'b'


                      
4 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  臣服心动        
                
              
                            
                2021-01-13 01:55
              
            
            
                                                                       
You can just use logical operators. I'm not sure why you're storing 1 and 2 as strings rather than ints, but here's a solution. The astype(int) at the end converts the boolean result to 0s and 1s.

df['c'] = (((df['a'] == 'l1')&(df['b']=='1'))|((df['a'] == 'l2')&(df['b']=='2'))).astype(int)
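As a minimal sketch of the line above on a small frame. The question's `b` column is cut off in the source, so the `b` values here are made up purely for illustration:

```python
import pandas as pd

# Hypothetical data: the original 'b' values are not shown in the question.
df = pd.DataFrame({
    "a": ["l1", "l2", "l1", "l2", "l1", "l2"],
    "b": ["1", "2", "2", "1", "1", "2"],
})

# Each comparison yields a boolean Series; & and | combine them element-wise.
# The inner parentheses are required because & and | bind tighter than ==.
match = ((df["a"] == "l1") & (df["b"] == "1")) | ((df["a"] == "l2") & (df["b"] == "2"))

# astype(int) maps True/False to 1/0.
df["c"] = match.astype(int)
print(df["c"].tolist())  # [1, 1, 0, 0, 1, 1]
```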
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  醉梦人生        
                
              
                            
                2021-01-13 01:56
              
            
            
                                                                       
df['c'] = (df.a.apply(lambda x: x[1:])==df.b).astype(int)
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  没有蜡笔的小新        
                
              
                            
                2021-01-13 01:59
              
            
            
                                                                       
You can also use the string methods.

df['c'] = (df.a.str[-1] == df.b).astype(int)
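Note that `.str[-1]` keeps only the last character, whereas slicing with `[1:]` drops the first character and keeps the rest. The two agree for labels like `l1` and `l2`, but would differ if a label had more than one digit. A small sketch, using a hypothetical `l10` label that is not in the question:

```python
import pandas as pd

# Hypothetical labels, including a two-digit one that is not in the question.
df = pd.DataFrame({"a": ["l1", "l2", "l10"], "b": ["1", "1", "10"]})

# Last character only: compares '1', '2', '0' against b.
last_char = (df["a"].str[-1] == df["b"]).astype(int)

# Everything after the first character: compares '1', '2', '10' against b.
after_first = (df["a"].str[1:] == df["b"]).astype(int)

print(last_char.tolist())    # [1, 0, 0]
print(after_first.tolist())  # [1, 0, 1]
```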

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  萌比男神i        
                
              
                            
                2021-01-13 02:01
              
            
            
                                                                       
import numpy
import pandas as pd
df = pd.DataFrame({'a': numpy.random.choice(['l1', 'l2'], 1000000),
                   'b': numpy.random.choice(['1', '2'], 1000000)})


A fast solution, assuming each column has only two distinct values:

%timeit df['c'] = ((df.a == 'l1') == (df.b == '1')).astype(int)


10 loops, best of 3: 178 ms per loop
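The one-liner above works because, with exactly two values per column, "both comparisons true or both false" is the same as "a and b match". A sketch that checks this against the explicit version and shows how the shortcut over-counts once a third value appears (the helper names and the `l3`/`3` values are hypothetical):

```python
import pandas as pd

def explicit(df):
    # Spell out both matching pairs.
    return (((df["a"] == "l1") & (df["b"] == "1")) |
            ((df["a"] == "l2") & (df["b"] == "2"))).astype(int)

def shortcut(df):
    # 1 when the two comparisons agree (both True or both False).
    return ((df["a"] == "l1") == (df["b"] == "1")).astype(int)

two_values = pd.DataFrame({"a": ["l1", "l1", "l2", "l2"],
                           "b": ["1", "2", "1", "2"]})
print(explicit(two_values).tolist())  # [1, 0, 0, 1]
print(shortcut(two_values).tolist())  # [1, 0, 0, 1]

# With a third value, "both False" no longer implies a match.
three_values = pd.DataFrame({"a": ["l2", "l3"], "b": ["3", "2"]})
print(explicit(three_values).tolist())  # [0, 0]
print(shortcut(three_values).tolist())  # [1, 1]
```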

@Viktor Kerkes:

%timeit df['c'] = (df.a.str[-1] == df.b).astype(int)


1 loops, best of 3: 412 ms per loop

@user1470788:

%timeit df['c'] = (((df['a'] == 'l1')&(df['b']=='1'))|((df['a'] == 'l2')&(df['b']=='2'))).astype(int)


1 loops, best of 3: 363 ms per loop

@herrfz:

%timeit df['c'] = (df.a.apply(lambda x: x[1:])==df.b).astype(int)


1 loops, best of 3: 387 ms per loop
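The `%timeit` lines above need IPython. Outside IPython, a comparable measurement could be sketched with the standard `timeit` module. The row count, seed, and repeat counts below are arbitrary choices, and absolute timings will differ by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

n = 100_000  # smaller than the 1,000,000 rows above so the script finishes quickly
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.choice(["l1", "l2"], n),
                   "b": rng.choice(["1", "2"], n)})

candidates = {
    "two-value shortcut": lambda: ((df.a == "l1") == (df.b == "1")).astype(int),
    "str[-1]": lambda: (df.a.str[-1] == df.b).astype(int),
    "logical operators": lambda: (((df.a == "l1") & (df.b == "1")) |
                                  ((df.a == "l2") & (df.b == "2"))).astype(int),
    "apply + slice": lambda: (df.a.apply(lambda x: x[1:]) == df.b).astype(int),
}

# Compute each result once so the approaches can be compared for agreement.
results = {name: fn().tolist() for name, fn in candidates.items()}

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats, 5 calls each, reported per call in milliseconds.
    timings[name] = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{name}: {timings[name] * 1e3:.1f} ms per call")
```

All four expressions should produce the same column on this two-value data, so the comparison is purely about speed.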
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         