Question
I have a data set with a column named Native Country that contains around 30000 records. Some values are missing (represented by NaN), so I decided to fill them with the mode() value. I wrote something like this:
data['Native Country'].fillna(data['Native Country'].mode(), inplace=True)
However, when I count the missing values:
for col_name in data.columns:
    print("column:", col_name, ".Missing:", sum(data[col_name].isnull()))
it still reports the same number of NaN values for the Native Country column.
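Answer 1:
mode() returns a Series rather than a single value, because several values can tie for most frequent. When fillna() receives a Series, it aligns it on the index, so only rows whose index label appears in the mode Series can be filled. Pass the first element of the Series instead: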
data['Native Country'].fillna(data['Native Country'].mode()[0], inplace=True)
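or you can do the same with assignment, which is the safer form because newer pandas versions warn about calling an inplace method on a column selection: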
data['Native Country'] = data['Native Country'].fillna(data['Native Country'].mode()[0])
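To see why the original call left the NaN count unchanged, here is a minimal sketch on a small made-up frame (the sample values are hypothetical, not taken from the real data set). It compares passing the whole mode() Series with passing its first element:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real data set.
data = pd.DataFrame(
    {"Native Country": ["US", np.nan, "US", "India", np.nan, "Mexico"]}
)

col = data["Native Country"]
mode_series = col.mode()  # a Series with a single entry at index label 0

# Passing the Series: fill values are looked up by index label, so the NaN
# rows at labels 1 and 4 find no matching entry and stay missing.
filled_by_series = col.fillna(mode_series)

# Passing the scalar: every NaN is replaced by the same value.
filled_by_scalar = col.fillna(mode_series[0])

missing_with_series = int(filled_by_series.isnull().sum())
missing_with_scalar = int(filled_by_scalar.isnull().sum())
print("missing after fillna(Series):", missing_with_series)
print("missing after fillna(scalar):", missing_with_scalar)
```

The Series version would only fill a NaN that happened to sit at index label 0, which is why the count usually does not move.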
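Answer 2:
Be careful: NaN can itself be the mode of your column if missing values are counted, for example with mode(dropna=False). In that case you are replacing NaN with another NaN. By default mode() ignores NaN, but a column that is entirely NaN then yields an empty result, and mode()[0] raises an error.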
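One way to guard against both cases is a small helper that excludes NaN when computing the mode and leaves the column alone when no non-null value exists. This is only a sketch: fill_with_mode is a hypothetical name, not a pandas function, and the sample Series are made up.

```python
import numpy as np
import pandas as pd

def fill_with_mode(series):
    """Return a copy of `series` with NaN replaced by its most frequent non-null value.

    Hypothetical helper for illustration, not part of pandas.
    """
    modes = series.mode(dropna=True)  # NaN is excluded from the count
    if modes.empty:
        # No non-null values at all, so there is nothing to fill with.
        return series.copy()
    return series.fillna(modes.iloc[0])

# A column where NaN is the most frequent entry.
mostly_missing = pd.Series([np.nan, np.nan, np.nan, "US"])
nan_is_mode = mostly_missing.mode(dropna=False)  # NaN is counted here
filled = fill_with_mode(mostly_missing)

# A column with no usable values.
all_missing = pd.Series([np.nan, np.nan], dtype=object)
untouched = fill_with_mode(all_missing)

print(nan_is_mode.isnull().all(), int(filled.isnull().sum()), int(untouched.isnull().sum()))
```

When several values tie for most frequent, iloc[0] simply takes the first one that mode() returns; whether that is an acceptable choice depends on the data.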
Source: https://stackoverflow.com/questions/42789324/pandas-fillna-mode