Question
I want to filter the rows of a dataframe that contain values less than, say, 10.
import numpy as np
import pandas as pd
from pprint import pprint
df = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))
df = df[df <10]
gives:
A B C D
0 5.0 NaN NaN NaN
1 NaN NaN NaN NaN
2 0.0 NaN 6.0 NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
5 6.0 NaN NaN NaN
6 NaN NaN NaN NaN
7 NaN NaN NaN 7.0
8 NaN NaN NaN NaN
9 NaN NaN NaN NaN
Expected:
0 5 57 87 95
2 0 80 6 82
5 6 33 74 75
7 71 44 60 7
Any suggestions on how to obtain the expected result?
Answer 1:
Use:
np.random.seed(21)
df = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))
To keep the rows where at least one value meets the condition, add DataFrame.any(axis=1),
which tests whether each row of the boolean DataFrame contains at least one True:
df1 = df[(df < 10).any(axis=1)]
print (df1)
A B C D
0 73 79 56 4
5 5 18 70 50
7 5 80 35 91
9 6 84 90 28
print (df < 10)
A B C D
0 False False False True
1 False False False False
2 False False False False
3 False False False False
4 False False False False
5 True False False False
6 False False False False
7 True False False False
8 False False False False
9 True False False False
print ((df < 10).any(axis=1))
0 True
1 False
2 False
3 False
4 False
5 True
6 False
7 True
8 False
9 True
dtype: bool
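The same idea can be checked on a small, hand-written frame where the result is easy to verify by eye. This is only a sketch: the sample values and the names mask, row_has_small, filtered and elementwise are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical sample data, chosen by hand so the outcome is easy to verify.
df = pd.DataFrame(
    {
        "A": [5, 40, 0, 71],
        "B": [57, 12, 80, 44],
        "C": [87, 33, 6, 60],
        "D": [95, 64, 82, 7],
    }
)

mask = df < 10                    # boolean DataFrame with the same shape as df
row_has_small = mask.any(axis=1)  # boolean Series: True if any value in the row is < 10
filtered = df[row_has_small]      # indexing with a boolean Series keeps whole rows

# Indexing with the boolean DataFrame itself masks element by element,
# which is why the attempt in the question produced NaN values.
elementwise = df[mask]

print(filtered)
```

Rows 0, 2 and 3 each contain at least one value below 10, so they are kept with their original values; row 1 is dropped. Replacing any with all would instead keep only rows where every column is below 10.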
Source: https://stackoverflow.com/questions/58128354/filtering-rows-of-a-dataframe-based-on-values-in-columns