Question
I cleaned 400 Excel files, read them into Python using pandas, and appended all the raw data into one big df.
Then when I try to export it to a csv:
df.to_csv("path",header=True,index=False)
I get this error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 20: ordinal not in range(128)
Can someone explain what this means and suggest a way to fix it?
Thanks
Answer 1:
You have unicode values in your DataFrame. Files store bytes, which means all unicode values have to be encoded into bytes before they can be stored in a file. You have to specify an encoding, such as utf-8. For example,
df.to_csv('path', header=True, index=False, encoding='utf-8')
If you don't specify an encoding, then the encoding used by df.to_csv defaults to ascii in Python 2, or utf-8 in Python 3.
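To see what the error means in isolation, here is a minimal sketch using only the standard library. The sample value is hypothetical; it just contains the same character (u'\xc7') that the traceback complains about. It shows that the ascii codec cannot represent that character, while utf-8 can:

```python
# Hypothetical cell value containing U+00C7, the character from the traceback.
text = u"\u00c7elik"

# The ascii codec only covers code points 0-127, so encoding fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# utf-8 can encode any unicode character; U+00C7 becomes two bytes.
utf8_bytes = text.encode("utf-8")

print(ascii_ok)
print(utf8_bytes)
```

Passing encoding='utf-8' to df.to_csv asks pandas to do this same utf-8 encoding step for every value when it writes the file.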
Answer 2:
Adding an answer to help myself google it later:
One trick that helped me is to encode a problematic series with unicode-escape first, then decode the resulting bytes back to a string as utf-8. Like:
df['crumbs'] = df['crumbs'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))
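This would get the dataframe to print without errors too, though non-ASCII characters then show up as escape sequences such as \xc7 rather than as the original characters.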
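As a rough illustration of what that lambda does to a single value, the sketch below applies the same encode/decode pair to a hypothetical string (not taken from the original data) and then reverses it. The reversal step is an addition for illustration, not part of the original answer:

```python
# Hypothetical value containing a non-ASCII character (U+00C7).
value = u"\u00c7elik"

# Same operation as the lambda above: the result is a pure-ASCII string
# in which the non-ASCII character is spelled out as a backslash escape.
escaped = value.encode("unicode-escape").decode("utf-8")

# The original text can be recovered by decoding the escapes again.
restored = escaped.encode("ascii").decode("unicode-escape")

print(escaped)
print(restored == value)
```

Because the escaped string contains only ASCII characters, it can be written with the ascii codec, but anyone reading the CSV has to undo the escaping to get the original text back. Writing with encoding='utf-8' avoids that extra step.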
Source: https://stackoverflow.com/questions/31331358/unicode-encode-error-when-writing-pandas-df-to-csv