Saving utf-8 texts in json.dumps as UTF8, not as \u escape sequence

说谎 2020-11-21 23:25

sample code:

>>> import json
>>> json_string = json.dumps("ברי צקלה")
>>> print json_string
"\u05d1\u05e8\u05d9 \u05e6\u05


        
12 Answers
  •  渐次进展
    2020-11-22 00:17

    As Martijn pointed out, passing ensure_ascii=False to json.dumps is the right way to solve this problem. However, it may raise an exception:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 1: ordinal not in range(128)
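
    For reference, here is a minimal sketch of what ensure_ascii changes. It assumes the input is a unicode string (the u"" prefix), so no implicit ASCII decoding is involved; the variable names are only illustrative.

```python
# -*- coding: utf-8 -*-
import json

text = u"ברי צקלה"

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(text)

# ensure_ascii=False: the characters are kept as they are.
readable = json.dumps(text, ensure_ascii=False)

print(escaped)
# Both forms are valid JSON and decode to the same value.
print(json.loads(escaped) == json.loads(readable))
```

    Either form loads back to the same string, so the choice only affects how the JSON text looks on disk or on the wire.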
    

    You need extra settings in either site.py or sitecustomize.py so that sys.getdefaultencoding() returns the correct value. site.py is under lib/python2.7/ and sitecustomize.py is under lib/python2.7/site-packages.

    If you want to use site.py, change the first if 0: under def setencoding(): to if 1: so that Python uses your operating system's locale.

    If you prefer to use sitecustomize.py (which may not exist until you create it), simply put these lines in it:

    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')
    

    Then you can produce JSON output containing Chinese text in UTF-8, for example:

    import json
    name = {"last_name": u"王"}
    json.dumps(name, ensure_ascii=False)
    

    You will get a UTF-8 encoded string rather than a \u-escaped JSON string.
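
    If the goal is to save the result as UTF-8, one option that does not rely on the interpreter's default encoding is to state the encoding at write time. This is only a sketch under some assumptions: the output path is a hypothetical temporary file, and io.open is used because it accepts an encoding argument in both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import json
import os
import tempfile

name = {u"last_name": u"王"}

# ensure_ascii=False keeps non-ASCII characters instead of \u escapes.
text = json.dumps(name, ensure_ascii=False)

# Python 2 can return a byte str when nothing in the input is unicode;
# normalise to text so the write below behaves the same either way.
if isinstance(text, bytes):
    text = text.decode("utf-8")

# Hypothetical output location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "name.json")

# The encoding is given explicitly, so the default encoding is not consulted.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read the raw bytes back to see what actually landed on disk.
with io.open(path, "rb") as f:
    raw = f.read()
```

    Here raw holds the UTF-8 bytes of the JSON text, with the character stored as three bytes rather than as a \u738b escape.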

    To verify your default encoding:

    import sys
    print sys.getdefaultencoding()
    

    You should see "utf-8" or "UTF-8", which confirms that your site.py or sitecustomize.py settings took effect.

    Please note that you cannot call sys.setdefaultencoding("utf-8") at the interactive Python console, because site.py removes that function from sys during startup.
