I can read a csv file in which there is a column containing Chinese characters (other columns are English and numbers). However, Chinese characters don\'t display correctly.
I just remembered that the source dataset was created using encoding='GBK'
, so I tried again using
data06_16 = pd.read_csv("../data/stocks1542monthly.csv", encoding="GBK")
Now, I can see all the Chinese characters.
Thanks guys!
Try this
df = pd.read_csv(path, engine='python', encoding='utf-8-sig')
I see here three possible issues:
1) You can try this:
import codecs
x = codecs.open("testdata.csv", "r", "utf-8")
2) Another possibility can be theoretically this:
import pandas as pd
df = pd.DataFrame(pd.read_csv('testdata.csv',encoding='utf-8'))
3) Maybe you should convert you csv file into utf-8 before importing with Python (for example in Notepad++)? It can be a solution for one-time-import, not for automatic process, of course.