EDIT: Fixed values in tables.
Let's say I have a pandas dataframe df:
>>>df
a b c
0 0.016367 0.
Since the logical operators are not overridable in Python, NumPy and pandas override the bitwise operators instead.
This means you need to use the bitwise-or operator:
df[(df > 0.5) | (df < 0)]
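As a minimal sketch of the difference, assuming a small made-up frame (the column names and values below are illustrative, not the ones from the question), the logical or raises an error while the bitwise | combines the two masks elementwise:

```python
import pandas as pd

# Illustrative data only; not the frame from the question.
df = pd.DataFrame({"a": [0.9, -0.2, 0.3], "b": [0.1, 0.7, -0.5]})

try:
    # `or` asks pandas for a single truth value of (df > 0.5), which is ambiguous.
    mask = (df > 0.5) or (df < 0)
except ValueError as exc:
    error = str(exc)
    print("or failed:", error)

# `|` is overridden by pandas and works cell by cell.
mask = (df > 0.5) | (df < 0)
print(mask)
print(df[mask])  # cells where the mask is False become NaN
```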
It is not possible for custom types to override the behavior of and and or in Python. That is, it is not possible for NumPy to say that it wants [0, 1, 1] and [1, 1, 0] to be [0, 1, 0]. This is because of how the and operation short-circuits (see the documentation); in essence, the short-circuiting behavior of and and or means that these operations must work as two separate truth values on the two arguments; they cannot combine their two operands in some way that makes use of data in both operands at once (for instance, to compare the elements componentwise, as would be natural for NumPy).
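To make the short-circuiting point concrete, here is a small sketch using a hypothetical Flags class (invented purely for illustration; it is not part of NumPy or pandas). The & operator dispatches to __and__, which sees both operands, whereas and only asks the left operand for a single truth value via __bool__ and then returns one of the operands unchanged:

```python
class Flags:
    """Hypothetical container, used only to illustrate operator dispatch."""

    def __init__(self, values):
        self.values = list(values)

    def __and__(self, other):
        # Called for `x & y`: both operands are available, so the
        # result can be computed element by element.
        return Flags(a and b for a, b in zip(self.values, other.values))

    def __bool__(self):
        # Called for `x and y`: Python only wants one truth value for x.
        # The choice of any() here is arbitrary, which is the ambiguity.
        return any(self.values)


x = Flags([0, 1, 1])
y = Flags([1, 1, 0])

print((x & y).values)   # elementwise combination: [0, 1, 0]
print((x and y) is y)   # True: `and` just returned y, nothing was combined
```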
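The solution is to use the bitwise operators & and |. However, you do have to be careful with this, since the precedence is not what you might expect: & and | bind more tightly than comparison operators such as > and <, so each comparison must be wrapped in parentheses.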
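The precedence pitfall can be sketched as follows, assuming a small illustrative Series. Without parentheses, Python parses the expression as a chained comparison around 0.5 | s, which fails; the exact exception type may depend on the pandas version, so the sketch catches both likely candidates:

```python
import pandas as pd

# Illustrative values only.
s = pd.Series([-1.0, 0.2, 0.8])

# Parenthesized: each comparison is evaluated first, then the masks are combined.
good = (s > 0.5) | (s < 0)
print(good.tolist())

# Unparenthesized: parsed as  s > (0.5 | s) < 0, which is not what was meant.
try:
    bad = s > 0.5 | s < 0
except (TypeError, ValueError) as exc:
    failure = type(exc).__name__
    print("unparenthesized form failed with", failure)
```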
You need to use the bitwise or and put the conditions in parentheses:
df[(df > 0.5) | (df < 0)]
The reason is that the truth value of an array comparison is ambiguous: some of the values in the array may satisfy the condition while others do not, so there is no single True or False answer.
If you called the any method on the result, it would evaluate to True as long as at least one value satisfies the condition.
The parentheses are required due to operator precedence.
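A short sketch of resolving the ambiguity explicitly with any and all, again on made-up data. On a DataFrame these reduce per column and return a Series, so a second call is needed to get a single boolean:

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"a": [0.9, -0.2, 0.3], "b": [0.1, 0.7, -0.5]})
cond = df > 0.5

per_column_any = cond.any()              # Series: at least one True per column?
overall_any = bool(cond.any().any())     # single bool: any True anywhere?
overall_all = bool(cond.all().all())     # single bool: True everywhere?

print(per_column_any)
print(overall_any, overall_all)
```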
Example:
In [23]:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(5, 5))
df
Out[23]:
0 1 2 3 4
0 0.320165 0.123677 -0.202609 1.225668 0.327576
1 -0.620356 0.126270 1.191855 0.903879 0.214802
2 -0.974635 1.712151 1.178358 0.224962 -0.921045
3 -1.337430 -1.225469 1.150564 -1.618739 -1.297221
4 -0.093164 -0.928846 1.035407 1.766096 1.456888
In [24]:
df[(df > 0.5) | (df < 0)]
Out[24]:
0 1 2 3 4
0 NaN NaN -0.202609 1.225668 NaN
1 -0.620356 NaN 1.191855 0.903879 NaN
2 -0.974635 1.712151 1.178358 NaN -0.921045
3 -1.337430 -1.225469 1.150564 -1.618739 -1.297221
4 -0.093164 -0.928846 1.035407 1.766096 1.456888