Question
I work with large pandas dataframes on a daily basis. They are fed with data that we parse from a web API (the XML encoding is utf-8) local to our network.
After I fill the dataframe and export it as a csv file, I start getting encoding errors (the local machine uses cp1252), which I've had to deal with for the past few weeks.
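As far as I can tell, the errors come from characters that exist in the utf-8 data but have no cp1252 mapping. Here is a minimal sketch of that mismatch without pandas; the sample string and the `try_encode` helper are made up for illustration and are not taken from my real data:

```python
# Hypothetical sample value: U+03A9 (Greek capital omega) has no cp1252 mapping,
# while U+00E9 (e with acute accent) does.
sample = "caf\u00e9 \u03a9"


def try_encode(text, encoding):
    """Return the encoded bytes, or the exception name if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return type(exc).__name__


utf8_result = try_encode(sample, "utf-8")      # utf-8 can represent every code point
cp1252_result = try_encode(sample, "cp1252")   # fails on the omega

print(utf8_result)
print(cp1252_result)
```

The same kind of failure presumably happens when the csv is written with the machine's default encoding instead of an explicit one.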
The solution I finally found was [here][1], in tangfucious's response:
df['crumbs'] = df['crumbs'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))
This line takes each string in the column, encodes it with .encode('unicode_escape'), and then decodes the resulting bytes as utf-8.
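To make the question concrete, this is what the two steps appear to do to a single value. It is only a sketch of my current understanding, using a made-up sample string rather than a real dataframe column:

```python
# Hypothetical sample value, equivalent to the text "café Ω".
sample = "caf\u00e9 \u03a9"

# Step 1: str -> bytes. Non-ASCII characters are replaced by ASCII escape
# sequences such as \xe9 or \u03a9; backslashes and newlines are escaped too.
escaped_bytes = sample.encode("unicode_escape")

# Step 2: bytes -> str. The bytes are pure ASCII at this point, so decoding
# them as utf-8 (or ascii) simply turns them back into a str object.
escaped_text = escaped_bytes.decode("utf-8")

print(escaped_bytes)
print(escaped_text)

# The escaped text contains only ASCII, so cp1252 can now encode it.
cp1252_bytes = escaped_text.encode("cp1252")

# The original text can be recovered by reversing the escape step.
restored = escaped_text.encode("ascii").decode("unicode_escape")
```

So the original characters are no longer in the output as such: the csv ends up containing literal backslash sequences, which would explain why the encoding error disappears.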
Can someone explain to me how this code works? Unfortunately, I'm a noob and new to SO, so I wasn't able to comment on his response.
What is the purpose of unicode-escape under the hood (aside from the obvious, adding a \ to each unicode code point)? How does this affect decoding into utf-8? Why is this necessary? Isn't it always better to encode and decode using the same encoding?
Is there any other use for 'unicode_escape'?
Source: https://stackoverflow.com/questions/41967354/can-someone-explain-to-me-the-use-of-unicode-escape-as-an-encoding-argument-in-p