pandas.read_csv can't import file with accent mark in path

和自甴很熟 posted on 2019-12-01 22:34:37

Looking deeper, this behavior results from the combination of Python 3.6 and pandas.read_csv, and it only occurs on Windows systems.

Python 3.6 changed the Windows filesystem encoding from "mbcs" to "UTF-8" (see PEP 529). Use sys.getfilesystemencoding() to check the current filesystem encoding.
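
A quick way to check the setting and to see why an accented path is sensitive to it is sketched below. The sample path is made up, and cp1252 is only an assumption standing in for a typical Western European "mbcs" code page; your system's code page may differ.

```python
import sys

# Encoding Python uses to convert str paths to bytes for the OS.
fs_encoding = sys.getfilesystemencoding()
print("filesystem encoding:", fs_encoding)

# Hypothetical path with an accent mark, for illustration only.
sample_path = "C:/datos/año/file.csv"

# The same path becomes different bytes under the two encodings.
# cp1252 is an assumption standing in for a Western European "mbcs" code page.
utf8_bytes = sample_path.encode("utf-8")
legacy_bytes = sample_path.encode("cp1252")
print(utf8_bytes)
print(legacy_bytes)
print("same bytes:", utf8_bytes == legacy_bytes)
```

Code that hands the path to the operating system as bytes can therefore end up looking for a different file name depending on which encoding was used.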

I found a couple of workarounds:

1.- Use this code to make the whole application use the legacy encoding ("mbcs") of Python <= 3.5:

import sys
sys._enablelegacywindowsfsencoding()

2.- Pass a file object to pandas.read_csv instead of a path:

import pandas as pd

with open(path2, 'r') as fp:  # open() resolves the accented path; pandas only receives the file object
    df2 = pd.read_csv(fp, delim_whitespace=True, dtype=object)
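
Because sys._enablelegacywindowsfsencoding() exists only on Windows builds of Python 3.6 and later, the first workaround can be guarded so the same script still runs elsewhere. This is a minimal sketch; the helper name enable_legacy_fs_encoding is hypothetical and not part of any library.

```python
import sys


def enable_legacy_fs_encoding():
    """Switch to the legacy 'mbcs' filesystem encoding where available.

    Returns True if the switch was made, False otherwise.
    (Hypothetical helper name, for illustration only.)
    """
    # The private function is only present on Windows builds of CPython 3.6+.
    if sys.platform == "win32" and hasattr(sys, "_enablelegacywindowsfsencoding"):
        sys._enablelegacywindowsfsencoding()
        return True
    return False


switched = enable_legacy_fs_encoding()
print(switched, sys.getfilesystemencoding())
```

Call it once, early in the program, before any file paths are opened. PEP 529 also describes the PYTHONLEGACYWINDOWSFSENCODING environment variable, which has the same effect without code changes.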

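Alternatively, you can try these lines in your notebook/IPython session before reading the file with UTF-8 encoding. Note that this applies to Python 2 only: in Python 3, reload is not a builtin and sys.setdefaultencoding does not exist.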

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

and then, when reading your file, use this line as suggested in the comment:

pd.read_csv(path1, delim_whitespace=True, dtype=object, encoding='utf-8')
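
Keep in mind that the encoding argument describes the bytes inside the file, not the file's path, so it may not help when the path itself is the problem. The standard-library sketch below separates the two concerns; the directory and file names are made up, and plain line splitting stands in for the pandas parser.

```python
import os
import tempfile

# Hypothetical directory and file names containing accent marks, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    accented_dir = os.path.join(tmp, "año_café")
    os.mkdir(accented_dir)
    path2 = os.path.join(accented_dir, "datos.txt")

    # Write a small whitespace-delimited table. encoding= here concerns the
    # file contents, not the path.
    with open(path2, "w", encoding="utf-8") as fp:
        fp.write("a b\n1 2\n3 4\n")

    # open() resolves the accented path itself, so whatever consumes the
    # handle (for example pandas.read_csv) never has to deal with the path.
    with open(path2, "r", encoding="utf-8") as fp:
        rows = [line.split() for line in fp]

print(rows)
```

If opening the file this way succeeds, the path is being handled correctly by Python itself, and passing the open handle to pandas.read_csv sidesteps the path issue.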