Question
I have a DataFrame as follows:
    Name Age
0    Tom  20
1   nick  21
2
3  krish  19
4   jack  18
5
6   jill  26
7   nick
The desired output is:
    Name Age
0    Tom  20
1   nick  21
3  krish  19
4   jack  18
6   jill  26
7   nick
The index should not be changed, and ideally I would not have to convert empty strings to NaN. A row should be removed only if all of its columns contain '' (empty strings).
Answer 1:
You can do:
# df.eq('') compares every cell of `df` to ''
# .all(axis=1) checks whether every cell in a row is True
# ~ negates the result
mask = ~df.eq('').all(axis=1)
# equivalently, using `ne` ("not equal"):
# mask = df.ne('').any(axis=1)
# `mask` is a boolean Series with the same length as `df`.
# Boolean indexing (similar to NumPy's) keeps only
# the rows where the mask is True.
df = df[mask]
Or in one line:
df = df[~df.eq('').all(axis=1)]
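To check this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the mask. The construction of `df` is an assumption, since the question only shows the printed frame; the empty cells are taken to be empty strings.

```python
import pandas as pd

# Assumed reconstruction of the question's frame: the blank cells are
# empty strings, not NaN.
df = pd.DataFrame(
    {
        "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
        "Age": [20, 21, "", 19, 18, "", 26, ""],
    }
)

# True for rows where at least one cell is not the empty string
mask = ~df.eq("").all(axis=1)
result = df[mask]

print(result.index.tolist())  # [0, 1, 3, 4, 6, 7]
```

Rows 2 and 5 are dropped, the original index labels are kept, and row 7 still holds '' in Age.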
Answer 2:
If the empty cells were NaN, we could simply use dropna. Otherwise, we can replace the empty strings with NaN first:
df.mask(df.eq('')).dropna(thresh=1)
Out[151]:
    Name  Age
0    Tom   20
1   nick   21
3  krish   19
4   jack   18
6   jill   26
7   nick  NaN
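As the output shows, this turns the remaining empty strings into NaN. If they should stay as '', one possible variation (a sketch, not part of the original answer) is to use the NaN-masked copy only to decide which rows survive and then select those labels from the untouched frame. The frame construction and the names `kept` and `result` are assumptions for illustration.

```python
import pandas as pd

# Assumed reconstruction of the question's frame (blank cells are '')
df = pd.DataFrame(
    {
        "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
        "Age": [20, 21, "", 19, 18, "", 26, ""],
    }
)

# Temporary copy with '' replaced by NaN; drop rows that are entirely NaN.
# how="all" has the same effect here as thresh=1.
kept = df.mask(df.eq("")).dropna(how="all")

# Pick the surviving index labels from the original frame,
# so the remaining empty strings are preserved.
result = df.loc[kept.index]

print(result.index.tolist())  # [0, 1, 3, 4, 6, 7]
```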
Answer 3:
Empty strings are actually interpreted as False, so removing rows with only empty strings is as easy as keeping rows in which at least one field is not empty (i.e. interpreted as True):
df[df.any(axis=1)]
or, more briefly, in older pandas versions (recent versions require axis to be passed by keyword for any):
df[df.any(1)]
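One caveat worth checking before relying on truthiness: other falsy values, such as 0 or False, are treated the same way as empty strings. The sketch below uses a small hypothetical frame (not the question's data) to compare this approach with the explicit comparison against ''.

```python
import pandas as pd

# Hypothetical data: row 0 is entirely empty strings,
# row 1 has an empty name but a legitimate Age of 0.
df = pd.DataFrame({"Name": ["", ""], "Age": ["", 0]}, dtype=object)

# Truthiness test: 0 is falsy, so row 1 is dropped as well
by_truthiness = df[df.any(axis=1)]

# Explicit comparison: only rows where every cell equals '' are dropped
by_comparison = df[~df.eq("").all(axis=1)]

print(len(by_truthiness), len(by_comparison))  # 0 1
```

For the question's data, where ages are nonzero, both approaches give the same rows; the comparison against '' is the safer choice if zeros or False can appear.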
Source: https://stackoverflow.com/questions/61964116/delete-rows-from-pandas-dataframe-if-all-its-columns-have-empty-string