Pandas Dataframe nan values not replacing

Submitted by 痴心易碎 on 2021-02-19 05:20:22

Question


I'm trying to replace values in my data frame which are listed as 'nan' (note: not 'NaN').

I've read in an Excel file, then tried to replace the nan values like this:

All_items_df = ALL_df[df_items].fillna(' ')

In the end I get an output that still contains 'nan':

All_items_df['Colour'].head(10)
Out[]: 
7     nan
8     nan
9     nan
10    nan
13    nan
14    nan
15    nan
16    nan
18    nan
19    nan
Name: Colour, dtype: object

Checking the nan values using isna() or isnull().values.all() gives me False for the above values. Why are they not recognised as nan/na values?

All_items_df['Colour'].isnull().head(10)
Out[123]: 
7     False
8     False
9     False
10    False
13    False
14    False
15    False
16    False
18    False
19    False
Name: Minor Feats, dtype: bool

I'm then writing to a CSV file, and 'nan' still gets written to the file even though I specify that NaN values should be written out as empty strings:

All_items_df.to_csv(folderpath + "All_items.csv",encoding="UTF-8", index=False, na_rep='')

Answer 1:


Your nan values appear to be the string 'nan' rather than actual null values, which is why fillna() and isnull() ignore them. You can use this code to replace the 'nan' strings with actual null values before proceeding with whatever calculations you are planning on doing:

import numpy as np
# Assign the result back: replace(..., inplace=True) on df.Colour may modify a temporary copy
df['Colour'] = df['Colour'].replace('nan', np.nan)
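
To confirm that a column really holds the string 'nan' and not a missing value, it can help to compare against the string directly. The following is a minimal sketch on a made-up three-row frame (the data is hypothetical, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one literal 'nan' string, one real missing value, one colour
df = pd.DataFrame({"Colour": ["nan", np.nan, "Blue"]})

# True only where the cell is the text 'nan'
is_nan_string = df["Colour"] == "nan"

# True only where the cell is a real missing value
is_missing = df["Colour"].isna()

print(pd.DataFrame({"is_nan_string": is_nan_string, "is_missing": is_missing}))
```

Rows flagged by is_nan_string but not by is_missing are the ones that fillna() and na_rep will skip.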

Example:

>>> df
  Colour
0    nan
1    nan
2    nan
3   Blue
4    nan

df['Colour'] = df['Colour'].replace('nan', np.nan)
df = df.fillna('')

>>> df
  Colour
0       
1       
2       
3   Blue
4       
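
The same idea can be applied to the whole frame before writing the CSV, so that na_rep has real missing values to act on. This is only a sketch: the two-column frame and the in-memory buffer are assumptions for illustration, standing in for the asker's data and output file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical frame in which missing entries were stored as the string 'nan'
df = pd.DataFrame({"Colour": ["nan", "nan", "Blue"], "Size": ["S", "nan", "L"]})

# Turn every 'nan' string in every column into a real missing value
cleaned = df.replace("nan", np.nan)

# Write to an in-memory buffer instead of a file; na_rep now applies
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False, na_rep="")
csv_text = buffer.getvalue()
print(csv_text)
```

With the strings converted first, the missing cells come out empty in the CSV text.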



Answer 2:


Make sure you read your nan values as NaN. You can do this via the na_values parameter of pd.read_excel:

df = pd.read_excel('file.xlsx', na_values=['nan'])

Strangely, by default nan is not considered a NaN value in pd.read_excel:

na_values : scalar, str, list-like, or dict, default None

Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values. By default the following values are interpreted as NaN: ‘’, ‘#N/A’, ‘#N/A N/A’, ‘#NA’, ‘-1.#IND’, ‘-1.#QNAN’, ‘-NaN’, ‘-nan’,
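
The effect of na_values can be seen without an Excel file. The sketch below uses pd.read_csv on an in-memory string as a stand-in, since it accepts the same na_values and keep_default_na parameters. Passing keep_default_na=False is an assumption made here purely to switch off the built-in list so that the effect of na_values is isolated; the sample text is hypothetical.

```python
import io

import pandas as pd

# Hypothetical two-row input where a missing colour was written as 'nan'
raw = "Colour\nnan\nBlue\n"

# Built-in NA strings disabled: 'nan' stays an ordinary string
as_strings = pd.read_csv(io.StringIO(raw), keep_default_na=False)

# Same input, but 'nan' is explicitly declared as a missing-value marker
as_missing = pd.read_csv(io.StringIO(raw), keep_default_na=False, na_values=["nan"])

print(as_strings["Colour"].isna().tolist())
print(as_missing["Colour"].isna().tolist())
```

Only the second read treats the first row as missing, so fillna() and na_rep would work on it directly.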



Source: https://stackoverflow.com/questions/50685107/pandas-dataframe-nan-values-not-replacing
