Question
I have a df:
   dog1  dog2  cat1  cat2  ant1  ant2
0     1     2     3     4     5     6
1     1     2     3     4     0     0
2     3     3     3     3     3     3
3     4     3     2     1     1     0
I want to add a new column based on the following conditions:
if max(dog1, dog2) > max(cat1, cat2) > max(ant1, ant2) -----> 2
elif max(dog1, dog2) > max(cat1, cat2) -----> 1
elif max(dog1, dog2) < max(cat1, cat2) < max(ant1, ant2) -----> -2
elif max(dog1, dog2) < max(cat1, cat2) -----> -1
else -----> 0
So it should become this:
   dog1  dog2  cat1  cat2  ant1  ant2  new
0     1     2     3     4     5     6   -2
1     1     2     3     4     0     0   -1
2     3     3     3     3     3     3    0
3     4     3     2     1     1     0    2
I know how to do this with simple conditions, but not when the conditions involve max. What's the best way to do it?
Answer 1:
You can use the pandas .max(axis=1) method, which takes the maximum across each row, together with np.select:
import numpy as np

# Row-wise maximum of each pair of columns, computed once and reused
dog = df[['dog1', 'dog2']].max(axis=1)
cat = df[['cat1', 'cat2']].max(axis=1)
ant = df[['ant1', 'ant2']].max(axis=1)
# For each row, np.select uses the choice of the first condition that is True
conditions = [
    (dog > cat) & (cat > ant),
    dog > cat,
    (dog < cat) & (cat < ant),
    dog < cat,
]
choices = [2, 1, -2, -1]
df['new'] = np.select(conditions, choices, default=0)
Output:
   dog1  dog2  cat1  cat2  ant1  ant2  new
0     1     2     3     4     5     6   -2
1     1     2     3     4     0     0   -1
2     3     3     3     3     3     3    0
3     4     3     2     1     1     0    2
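The order of the conditions matters here, because every row that satisfies the "2" test also satisfies the "1" test. A minimal sketch of that behavior, rebuilding the sample data from the question (the names strict, loose, narrow_first and broad_first are made up for illustration):

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'dog1': [1, 1, 3, 4], 'dog2': [2, 2, 3, 3],
    'cat1': [3, 3, 3, 2], 'cat2': [4, 4, 3, 1],
    'ant1': [5, 0, 3, 1], 'ant2': [6, 0, 3, 0],
})

dog = df[['dog1', 'dog2']].max(axis=1)
cat = df[['cat1', 'cat2']].max(axis=1)
ant = df[['ant1', 'ant2']].max(axis=1)

strict = (dog > cat) & (cat > ant)  # the "2" case
loose = dog > cat                   # the "1" case, True whenever strict is True

# Narrow condition listed first: row 3 gets 2
narrow_first = np.select([strict, loose], [2, 1], default=0)
# Broad condition listed first: it shadows the narrow one, so row 3 gets 1
broad_first = np.select([loose, strict], [1, 2], default=0)

print(narrow_first)
print(broad_first)
```

So the narrower condition has to come before the broader one, which mirrors the if/elif order in the question.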
Answer 2:
You can use DataFrame.apply with axis=1, which calls a function once per row (see the pandas documentation for apply):
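def newrow(dog1, dog2, cat1, cat2, ant1, ant2):
    # Compare the per-group maxima in the same order as the conditions in the question
    if max(dog1, dog2) > max(cat1, cat2) > max(ant1, ant2):
        return 2
    elif max(dog1, dog2) > max(cat1, cat2):
        return 1
    elif max(dog1, dog2) < max(cat1, cat2) < max(ant1, ant2):
        return -2
    elif max(dog1, dog2) < max(cat1, cat2):
        return -1
    return 0

# *x unpacks each row by position, so df must hold exactly these six columns in this order
df['new'] = df.apply(lambda x: newrow(*x), axis=1)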
The new df will be:
   dog1  dog2  cat1  cat2  ant1  ant2  new
0     1     2     3     4     5     6   -2
1     1     2     3     4     0     0   -1
2     3     3     3     3     3     3    0
3     4     3     2     1     1     0    2
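Because newrow(*x) unpacks each row by position, the call fails once df has a seventh column, for example if it is run a second time after 'new' has been added. One possible variant reads the columns by name instead. This is only a sketch: the function name classify is made up for illustration, and the sample data is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'dog1': [1, 1, 3, 4], 'dog2': [2, 2, 3, 3],
    'cat1': [3, 3, 3, 2], 'cat2': [4, 4, 3, 1],
    'ant1': [5, 0, 3, 1], 'ant2': [6, 0, 3, 0],
})

def classify(row):
    """Return the category for one row, reading columns by name."""
    dog = max(row['dog1'], row['dog2'])
    cat = max(row['cat1'], row['cat2'])
    ant = max(row['ant1'], row['ant2'])
    if dog > cat > ant:
        return 2
    elif dog > cat:
        return 1
    elif dog < cat < ant:
        return -2
    elif dog < cat:
        return -1
    return 0

df['new'] = df.apply(classify, axis=1)
# Running it again still works, even though df now has a seventh column
df['new'] = df.apply(classify, axis=1)
print(df)
```

Like the original, this still calls a Python function once per row, so on large frames it will generally be slower than the vectorized np.select approach.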
Answer 3:
It seems you are looking for np.maximum(), which returns the element-wise maximum of two arrays. See the NumPy documentation for numpy.maximum. Hope it helps.
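This answer gives no code, so here is one way the np.maximum suggestion could be applied. It is a sketch only: pairing np.maximum with np.select is an assumption borrowed from Answer 1, and the sample data is rebuilt from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'dog1': [1, 1, 3, 4], 'dog2': [2, 2, 3, 3],
    'cat1': [3, 3, 3, 2], 'cat2': [4, 4, 3, 1],
    'ant1': [5, 0, 3, 1], 'ant2': [6, 0, 3, 0],
})

# np.maximum compares two columns element by element
dog = np.maximum(df['dog1'], df['dog2'])
cat = np.maximum(df['cat1'], df['cat2'])
ant = np.maximum(df['ant1'], df['ant2'])

# Narrower conditions come first so they are not shadowed by the broader ones
conditions = [
    (dog > cat) & (cat > ant),
    dog > cat,
    (dog < cat) & (cat < ant),
    dog < cat,
]
df['new'] = np.select(conditions, [2, 1, -2, -1], default=0)
print(df)
```

np.maximum takes exactly two arrays, so it fits pairs of columns like these; for groups of three or more columns, .max(axis=1) on a column subset is more convenient.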
Source: https://stackoverflow.com/questions/62526606/pandas-conditional-creation-of-a-dataframe-column-based-on-multiple-conditions