I have dictionary of dictionary which has utf8 encoded keys. I am dumping this dictionary to a file using json
module.
In the file keys are printed in utf8
You are writing JSON; the JSON standard allows for \uxxxx
escape sequences to encode non-ASCII characters. The Python json
module uses this by default.
Switch off the feature by using the ensure_ascii=False
switch when dumping the data:
json.dump(obj, yourfileobject, ensure_ascii=False)
This does mean that the output is no longer encoded to UTF-8 bytes as well; you'll need to use a codecs.open()
managed file for this:
import json
import codecs
with codecs.open('/path/to/file', 'w', encoding='utf8') as output:
json.dump(obj, output, ensure_ascii=False)
Now your unicode characters will be written to the file as UTF-8 encoded bytes instead. When opening the file with another program that decodes UTF-8 again, your codepoints should be displayed again as the same characters.