Pandas boolean DataFrame selection ambiguity


EDIT: Fixed values in tables.

Let's say I have a pandas DataFrame df:

>>>df
                  a         b         c
        0  0.016367  0.


                      


      
      
        
3 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  轻奢々        
                
              
                            
                2021-01-15 23:23
              
            
            
                                                                       
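Because Python's logical operators (and, or, not) cannot be overridden, NumPy and pandas overload the bitwise operators (&, |, ~) to perform elementwise logical operations.

This means you need to use the bitwise-or operator, with each condition in parentheses: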

df[(df > 0.5) | (df < 0)]
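As a minimal sketch of the difference, the snippet below contrasts or with | on a small frame. The column names and values are hypothetical and chosen only for illustration; they are not the asker's data.

```python
import pandas as pd

# Hypothetical frame; the values are illustrative only.
df = pd.DataFrame({"a": [0.9, -0.2, 0.3], "b": [0.1, 0.7, -1.5]})

# `or` asks the left operand for a single truth value,
# which a DataFrame refuses to provide.
try:
    df[(df > 0.5) or (df < 0)]
    raised = False
except ValueError:
    raised = True

# `|` is applied element by element and returns a boolean DataFrame.
mask = (df > 0.5) | (df < 0)
selected = df[mask]
```

Here raised ends up True, while mask is a boolean frame of the same shape as df that can be used directly for selection.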

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  星月不相逢        
                
              
                            
                2021-01-15 23:30
              
            
            
                                                                       
Custom types cannot override the behavior of "and" and "or" in Python. That is, NumPy cannot make [0, 1, 1] and [1, 1, 0] evaluate to [0, 1, 0]. The reason is short-circuiting (see the documentation): "and" and "or" evaluate the truth value of the first operand on its own and then return one of the two operands unchanged. They never combine the operands using data from both at once, for instance by comparing elements componentwise, as would be natural for NumPy.
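The mechanism can be illustrated without NumPy. In the sketch below, Flags is a hypothetical stand-in for an array type (it is not how NumPy is implemented): & dispatches to a method that sees both operands, whereas "and" only asks the left operand for a single truth value.

```python
class Flags:
    """Hypothetical stand-in for an array type; not NumPy's implementation."""

    def __init__(self, values):
        self.values = list(values)

    def __and__(self, other):
        # Called for `x & y`: both operands are available,
        # so the result can be computed elementwise.
        return Flags(a & b for a, b in zip(self.values, other.values))

    def __bool__(self):
        # Called for `x and y`: Python only asks the left operand
        # for one truth value, which has no sensible answer here.
        raise ValueError("truth value of Flags is ambiguous")


x = Flags([0, 1, 1])
y = Flags([1, 1, 0])

combined = (x & y).values  # elementwise result

try:
    x and y
    and_failed = False
except ValueError:
    and_failed = True
```

There is no hook comparable to __and__ for the "and" keyword itself; the only thing a type can customize is its own truth value.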

The solution is to use the bitwise operators & and |.  However, you do have to be careful: & and | bind more tightly than comparison operators such as > and <, so each comparison must be wrapped in parentheses.
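The precedence issue can be seen with plain integers. This is only an illustrative sketch; the variable names and values are arbitrary.

```python
a, b = 5, 2

# Without parentheses, `|` is evaluated first, so this parses as the
# chained comparison  a > (3 | b) < 1,  i.e.  5 > 3 < 1.
unparenthesized = a > 3 | b < 1

# With parentheses, each comparison is evaluated first
# and `|` combines the two boolean results.
parenthesized = (a > 3) | (b < 1)
```

The two expressions give different answers (False and True respectively), which is why the conditions in a pandas selection should always be parenthesized.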
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  梦毁少年i        
                
              
                            
                2021-01-15 23:41
              
            
            
                                                                       
You need to use the bitwise-or operator | and wrap each condition in parentheses:

df[(df > 0.5) | (df < 0)]


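The error arises because the truth value of a whole array is ambiguous: some elements may satisfy the condition while others do not, so it cannot be reduced to a single True or False automatically.

Reducing the comparison explicitly with any() removes the ambiguity: it reports True wherever at least one value satisfies the condition.

The parentheses are required because | binds more tightly than the comparison operators.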

Example:

In [23]:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 5))
df
Out[23]:
          0         1         2         3         4
0  0.320165  0.123677 -0.202609  1.225668  0.327576
1 -0.620356  0.126270  1.191855  0.903879  0.214802
2 -0.974635  1.712151  1.178358  0.224962 -0.921045
3 -1.337430 -1.225469  1.150564 -1.618739 -1.297221
4 -0.093164 -0.928846  1.035407  1.766096  1.456888
In [24]:

df[(df > 0.5) | (df < 0)]
Out[24]:
          0         1         2         3         4
0       NaN       NaN -0.202609  1.225668       NaN
1 -0.620356       NaN  1.191855  0.903879       NaN
2 -0.974635  1.712151  1.178358       NaN -0.921045
3 -1.337430 -1.225469  1.150564 -1.618739 -1.297221
4 -0.093164 -0.928846  1.035407  1.766096  1.456888
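The NaN entries in the output above appear because indexing with a boolean DataFrame keeps the original shape and blanks out the cells that do not match, which is the same result DataFrame.where gives. The sketch below checks this on a small hypothetical frame (the column names and values are illustrative) and also shows the explicit any() reductions.

```python
import pandas as pd

# Hypothetical frame; the values are illustrative only.
df = pd.DataFrame({"x": [0.32, -0.62, 0.8], "y": [0.12, 0.13, 0.2]})
mask = (df > 0.5) | (df < 0)

# Indexing with a boolean DataFrame keeps the shape and
# replaces non-matching cells with NaN.
by_index = df[mask]
by_where = df.where(mask)

# Explicit reductions give unambiguous answers.
per_column = mask.any()        # one boolean per column
anywhere = mask.any().any()    # a single boolean for the whole frame
```

If whole rows should be dropped instead of masked cells, the boolean frame would first need to be reduced to one value per row, for example with mask.any(axis=1).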

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         