Question
I apologize if this question has been asked before. I am still not clear about encoding in Python 3.2.
I am reading a CSV file (encoded in UTF-8 without a BOM) that contains French accented characters.
Here is the code that opens and reads the CSV file:
import csv

csvfile = open(in_file, 'r', encoding='utf-8')
fieldnames = ("id", "locale", "message")
reader = csv.DictReader(csvfile, fieldnames, escapechar="\\")
# Pick the message whose id and locale match the requested ones
for row in reader:
    if row['id'] == id and row['locale'] == locale:
        out = row['message']
I return the message (out) as JSON:
jsonout = json.dumps(out, ensure_ascii=True)
return HttpResponse(jsonout,content_type="application/json; encoding=utf-8")
However, when I preview the result, the accented e (é) is replaced by \u00e9.
Can you please advise on what I am doing wrong, and what I should do so that the JSON output shows the proper accented e?
Thanks
Answer 1:
You're doing nothing wrong (and neither is Python).
Python's json module simply takes the safe route and escapes non-ascii characters. This is a valid way of representing such characters in json, and any conforming parser will resurrect the proper Unicode characters when parsing the string:
>>> import json
>>> json.dumps({'Crêpes': 5})
'{"Cr\\u00eapes": 5}'
>>> json.loads('{"Cr\\u00eapes": 5}')
{'Crêpes': 5}
Don't forget that JSON is just a representation of your data, and both "ê" and "\u00ea" are valid JSON representations of the string ê. Conforming JSON parsers should handle both correctly.
It is possible to disable this behaviour, though; see the json.dump documentation:
>>> json.dumps({'Crêpes': 5}, ensure_ascii=False)
'{"Crêpes": 5}'
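For the HTTP case in the question, a minimal sketch of how the response body could be built is shown here. The helper name to_json_bytes is hypothetical, and the sketch assumes the client decodes the body as UTF-8. Note also that the conventional parameter name in a Content-Type header is charset, not encoding.

```python
import json

def to_json_bytes(payload):
    # Hypothetical helper: keep non-ASCII characters literal in the JSON text,
    # then encode the text explicitly as UTF-8 for the response body.
    text = json.dumps(payload, ensure_ascii=False)
    return text.encode("utf-8")

body = to_json_bytes("Crêpes")
content_type = "application/json; charset=utf-8"

# The accented character is sent as its two-byte UTF-8 sequence.
print(body)
```

In a Django view, body and content_type could then be passed to HttpResponse in place of the escaped string.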
Answer 2:
With respect to this answer, setting ensure_ascii=False renders the special characters in your printouts. That said, marcelm's answer is still correct, as no information is lost in either representation.
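The claim that no information is lost can be checked with a short round trip. This is only an illustrative sketch; the sample dictionary is made up for the example.

```python
import json

original = {"message": "é"}  # made-up sample data

escaped = json.dumps(original)                      # non-ASCII written as an escape
literal = json.dumps(original, ensure_ascii=False)  # non-ASCII written as-is

# The two JSON texts differ, but both parse back to the same data.
assert escaped != literal
assert json.loads(escaped) == json.loads(literal) == original
```

Whichever form is sent, a conforming client ends up with the same string after parsing.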
Source: https://stackoverflow.com/questions/35582528/python-encoding-and-json-dumps