pandas read_csv and setting na_values to any string in the csv file [duplicate]

Question

data.csv

1, 22, 3432

1, 23, \N

2, 24, 54335

2, 25, 3928

I have a CSV file of data collected from a device. Every now and then the device fails to relay information and outputs '\N' instead. I want to treat these values as NaN, and did so with

pd.read_csv('data.csv', na_values=['\\N'])

which worked fine. However, I would prefer to have not only this string converted to NaN but any string that appears in the CSV file, in case future data uses a different marker.

Is it possible to change the argument so that it covers all strings?

Answer 1:

na_values does not match arbitrary strings, so you have to manually pass every string that should count as missing, as a list or dict:

na_values : list-like or dict, default None
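As a rough illustration of the two forms, here is a minimal sketch. The column names a, b, c and the extra marker 'missing' are made up for the example and are not part of the original data.

```python
import io
import pandas as pd

# Hypothetical data: column names and the 'missing' marker are assumptions.
csv_text = "a,b,c\n1,22,3432\n1,23,\\N\n2,missing,54335\n2,25,3928\n"

# List form: each listed string becomes NaN wherever it appears.
df_list = pd.read_csv(io.StringIO(csv_text), na_values=["\\N", "missing"])

# Dict form: markers apply only to the named columns.
# 'missing' in column b is left alone, so b stays a column of strings.
df_dict = pd.read_csv(io.StringIO(csv_text), na_values={"c": ["\\N"]})

print(df_list)
print(df_dict)
```

Either way, the markers have to be known in advance, which is why this does not cover strings that only show up in future data.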

Alternatively, use pd.to_numeric and set errors to coerce to convert all values to numeric after reading the csv file.

sample input df:

    A   B        
0   1   2         
1   0  \N      
2  \N   8       
3  11   5       
4  11  Kud   

df = df.apply(pd.to_numeric, errors='coerce')

output:

      A    B
0   1.0  2.0
1   0.0  NaN
2   NaN  8.0
3  11.0  5.0
4  11.0  NaN
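Putting both steps together on the data from the question might look like the sketch below. The column names id, reading and value are invented here because the file has no header row, and skipinitialspace=True is used because the sample lines have a space after each comma.

```python
import io
import pandas as pd

# The four rows from the question, inlined so the example is self-contained.
csv_text = "1, 22, 3432\n1, 23, \\N\n2, 24, 54335\n2, 25, 3928\n"

# Column names are assumptions; the original file has no header.
df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["id", "reading", "value"],
    skipinitialspace=True,  # drop the space that follows each comma
)

# Anything that cannot be parsed as a number becomes NaN.
df = df.apply(pd.to_numeric, errors="coerce")

print(df)
print(df.dtypes)
```

Note that errors='coerce' turns every non-numeric value into NaN, including legitimate text. If the file also holds real string columns, it is probably safer to apply pd.to_numeric only to the columns expected to be numeric.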

Source: https://stackoverflow.com/questions/52229804/pandas-read-csv-and-setting-na-values-to-any-string-in-the-csv-file

Tags

python

pandas

dataframe

data-cleaning
