How to gracefully fallback to `NaN` value while reading integers from a CSV with Pandas?

拈花ヽ惹草 提交于 2019-12-05 07:29:08

Thanks to the comments i realised that there is no NaN for integers, which was very surprising to me. Thus i switched to converting to float:

import pandas as pd
import numpy as np


df = pd.read_csv('my.csv', dtype={ 'my_column': np.float64 })

This gave me an understandable error message with the value of the failing conversion, so that i could add the failing value to the na_values:

df = pd.read_csv('my.csv', dtype={ 'my_column': np.float64 }, na_values=['n/a'])

This way i could finally import the CSV in a way which works with visualisation and statistical functions:

>>>> df['session_planned_os'].dtype
dtype('float64')

Once you are able to spot the right na_values, you can remove the dtype argument from read_csv. Type inference will now happen correctly:

df = pd.read_csv('my.csv', na_values=['n/a'])
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!