Logical operation on two columns of a dataframe

Posted by £可爱£侵袭症+ on 2019-11-29 11:05:04


In pandas, I'd like to create a computed column that's a boolean operation on two other columns.

It's easy to add two numerical columns together in pandas. I'd like to do something similar with the logical AND operator. Here's my first try:

In [1]: import pandas; d = pandas.DataFrame([{'foo':True, 'bar':True}, {'foo':True, 'bar':False}, {'foo':False, 'bar':False}])

In [2]: d
     bar    foo
0   True   True
1  False   True
2  False  False

In [3]: d.bar and d.foo   ## can't
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

So I guess logical operators don't work quite the same way as numeric operators in pandas. I tried doing what the error message suggests and using bool():

In [258]: d.bar.bool() and d.foo.bool()  ## spoiler: this doesn't work either
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I found a workaround: cast the boolean columns to int, add them together, and compare the sum against a threshold.

In [4]: (d.bar.apply(int) + d.foo.apply(int)) > 0  ## Logical OR
0     True
1     True
2    False
dtype: bool

In [5]: (d.bar.apply(int) + d.foo.apply(int)) > 1  ## Logical AND
0     True
1    False
2    False
dtype: bool

This is convoluted. Is there a better way?


Yes, there is a better way! Just use the & operator, which performs an element-wise logical AND:

d.bar & d.foo

0     True
1    False
2    False
dtype: bool
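
The same idea extends to the other element-wise operators: | for OR and ~ for NOT. Here is a minimal sketch that stores the results as computed columns. The column names both, either and not_bar, and the numeric Series n, are made up for illustration and are not part of the original question.

```python
import pandas as pd

d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

# Element-wise AND, OR and NOT on boolean columns.
d["both"] = d.bar & d.foo      # True only where both columns are True
d["either"] = d.bar | d.foo    # True where at least one column is True
d["not_bar"] = ~d.bar          # flips each value

# & and | bind more tightly than comparison operators,
# so wrap each comparison in parentheses before combining.
n = pd.Series([1, 5, 10])      # hypothetical numeric column
in_range = (n > 2) & (n < 8)

print(d)
print(in_range.tolist())
```

Python's and / or keywords try to reduce each Series to a single truth value, which is why they raise the ValueError shown in the question, whereas &, | and ~ operate row by row.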

