Question
I have the following pandas dataframe and I would like to fill the NaNs in columns A-C row by row with the value from column D. Is there an explicit way to specify that all the NaNs should be filled row-wise from column D? I couldn't find a way to do this explicitly with fillna().
Note that there are additional columns E-Z which have their own NaNs and may have other rules for filling in NaNs, and should be left untouched.
A B C D E
158 158 158 177 ...
158 158 158 177 ...
NaN NaN NaN 177 ...
158 158 158 177 ...
NaN NaN NaN 177 ...
Would like to have this for columns A-C only:
A B C D E
158 158 158 177 ...
158 158 158 177 ...
177 177 177 177 ...
158 158 158 177 ...
177 177 177 177 ...
Thanks.
Answer 1:
Using the fillna function:
df.fillna(axis=1, method='backfill')
will do if there are no NaNs in the other columns.
If there are, and you want to leave them untouched, I think the only option with this approach is to perform the fillna on a subset of your dataframe. With an example dataframe:
In [45]: df
Out[45]:
A B C D E F
0 158 158 158 177 1 10
1 158 158 158 177 2 20
2 NaN NaN NaN 177 3 30
3 158 158 158 177 NaN 40
4 NaN NaN NaN 177 5 50
In [48]: df[['A', 'B', 'C', 'D']] = df[['A', 'B', 'C', 'D']].fillna(axis=1, method='backfill')
In [49]: df
Out[49]:
A B C D E F
0 158 158 158 177 1 10
1 158 158 158 177 2 20
2 177 177 177 177 3 30
3 158 158 158 177 NaN 40
4 177 177 177 177 5 50
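For reference, the subset approach above can be reproduced as a self-contained script. This is only a sketch: it rebuilds the example dataframe by hand (D stored as floats, which is an assumption made here for simplicity) and uses DataFrame.bfill(axis=1), which in recent pandas versions is the recommended spelling of fillna(method='backfill'). Like the original, it relies on D being to the right of A-C.

```python
import numpy as np
import pandas as pd

# Hand-built copy of the example dataframe from the answer.
df = pd.DataFrame({
    "A": [158, 158, np.nan, 158, np.nan],
    "B": [158, 158, np.nan, 158, np.nan],
    "C": [158, 158, np.nan, 158, np.nan],
    "D": [177.0, 177.0, 177.0, 177.0, 177.0],
    "E": [1, 2, 3, np.nan, 5],
    "F": [10, 20, 30, 40, 50],
})

# Back-fill along each row, but only inside the A-D subset,
# so the NaN in column E is left untouched.
cols = ["A", "B", "C", "D"]
df[cols] = df[cols].bfill(axis=1)

print(df)
```

Because only the A-D slice is passed to bfill, columns E and F never take part in the fill.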
Update: If you don't want to depend on the column order, you can also specify the values to use to fill each row (like .fillna(value=df['D'])). The only problem is that this only works for a Series (when called on a dataframe, it tries to map the different fill values to the different columns, not the rows). So it works if you use apply to do it column by column:
In [60]: df[['A', 'B', 'C']].apply(lambda x: x.fillna(value=df['D']))
Out[60]:
A B C
0 158 158 158
1 158 158 158
2 177 177 177
3 158 158 158
4 177 177 177
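Note that the apply call above returns a new frame and does not modify df. A minimal sketch of writing the result back, wrapped in a helper, might look like the following. The name fill_from_column and its parameters are hypothetical and not part of pandas. The sample frame deliberately puts D first to show that this variant does not depend on column order.

```python
import numpy as np
import pandas as pd


def fill_from_column(df, targets, source):
    """Hypothetical helper: return a copy of df in which NaNs in the
    `targets` columns are replaced by the same-row value of `source`."""
    out = df.copy()
    # Series.fillna with a Series aligns on the index, so each NaN
    # receives the value of `source` from its own row.
    out[targets] = out[targets].apply(lambda col: col.fillna(out[source]))
    return out


# Sample data with D placed first (column order is irrelevant here).
df = pd.DataFrame({
    "D": [177.0, 177.0, 177.0, 177.0, 177.0],
    "A": [158, 158, np.nan, 158, np.nan],
    "B": [158, 158, np.nan, 158, np.nan],
    "C": [158, 158, np.nan, 158, np.nan],
    "E": [1, 2, 3, np.nan, 5],
})

filled = fill_from_column(df, ["A", "B", "C"], "D")
print(filled)
```

Other columns such as E keep their NaNs, so they can still be filled later under different rules.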
Source: https://stackoverflow.com/questions/24015379/row-by-row-fillna-with-respect-to-a-specific-column