Saving utf-8 texts in json.dumps as UTF8, not as \u escape sequence

说谎 2020-11-21 23:25

Sample code:

>>> import json
>>> json_string = json.dumps("ברי צקלה")
>>> print json_string
"\u05d1\u05e8\u05d9 \u05e6\u05


        
12 answers
  • 2020-11-22 00:17

    Passing ensure_ascii=False to json.dumps is the right approach, as Martijn pointed out. On Python 2, however, this may raise an exception:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 1: ordinal not in range(128)
    

    You need extra settings in either site.py or sitecustomize.py so that sys.getdefaultencoding() returns the correct value. site.py is under lib/python2.7/ and sitecustomize.py is under lib/python2.7/site-packages.

    If you want to use site.py, find def setencoding(): and change the first if 0: to if 1: so that Python uses your operating system's locale.

    If you prefer sitecustomize.py, which may not exist until you create it, simply put these lines in it:

    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')
    

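    Then you can produce JSON output containing Chinese text in UTF-8, for example:

    import json
    name = {"last_name": u"王"}
    json.dumps(name, ensure_ascii=False)
    

    You will get a string containing the actual characters rather than a \u-escaped JSON string. On Python 2 the result may be a unicode object, which you can then encode as UTF-8.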

    To verify your default encoding:

    print sys.getdefaultencoding()
    

    You should see "utf-8" or "UTF-8", which confirms that your site.py or sitecustomize.py settings took effect.

    Please note that you cannot call sys.setdefaultencoding("utf-8") in the interactive Python console.
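
    On Python 3 the site.py and sitecustomize.py changes described above should not be needed, because str is already Unicode and encoding to bytes is an explicit step. A minimal sketch, assuming Python 3 (the variable names are illustrative only):

```python
import json

name = {"last_name": "王"}

# ensure_ascii=False keeps non-ASCII characters as-is in the returned str.
text = json.dumps(name, ensure_ascii=False)

# Encoding to UTF-8 bytes is a separate, explicit step.
utf8_bytes = text.encode("utf-8")

# The default (ensure_ascii=True) escapes the character instead.
escaped = json.dumps(name)

print(utf8_bytes)
print(escaped)
```

    The byte 0xe7 that appears in the error message above is the first byte of the UTF-8 encoding of 王.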

  • 2020-11-22 00:21

    Peters' Python 2 workaround fails on an edge case:

    import io
    import json

    d = {u'keyword': u'bad credit  \xe7redit cards'}
    with io.open('filename', 'w', encoding='utf8') as json_file:
        data = json.dumps(d, ensure_ascii=False).decode('utf8')
        try:
            json_file.write(data)
        except TypeError:
            # Decode data to Unicode first
            json_file.write(data.decode('utf8'))
    
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 25: ordinal not in range(128)
    

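    It was crashing on the .decode('utf8') call that follows json.dumps. Here json.dumps returns a unicode object, so .decode('utf8') first encodes it implicitly with the ASCII codec, which fails on ç. I fixed the problem by simplifying the program: I dropped that step as well as the special-casing of ASCII: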

    with io.open('filename', 'w', encoding='utf8') as json_file:
        data = json.dumps(d, ensure_ascii=False, encoding='utf8')
        json_file.write(unicode(data))
    
    cat filename
    {"keyword": "bad credit  çredit cards"}
    
  • 2020-11-22 00:21

    As of Python 3.7 the following code works fine:

    from json import dumps
    result = {"symbol": "ƒ"}
    json_string = dumps(result, sort_keys=True, indent=2, ensure_ascii=False)
    print(json_string)
    
    

    Output:

    {
      "symbol": "ƒ"
    }
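
    To see what ensure_ascii actually changes, the following sketch (assuming Python 3; the variable names are illustrative) compares the default output with the ensure_ascii=False output and checks that both decode to the same object:

```python
import json

result = {"symbol": "ƒ"}

escaped = json.dumps(result)                        # ASCII-only output
readable = json.dumps(result, ensure_ascii=False)   # keeps "ƒ" literally

# Both forms are valid JSON and decode to the same Python object.
same = json.loads(escaped) == json.loads(readable) == result
print(same)
```

    The escaped form is not wrong, only harder to read; any JSON parser should turn it back into the same text.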
    
  • 2020-11-22 00:23

    To write to a file

    import codecs
    import json
    
    with codecs.open('your_file.txt', 'w', encoding='utf-8') as f:
        json.dump({"message":"xin chào việt nam"}, f, ensure_ascii=False)
    

    To print to stdout

    import json
    print(json.dumps({"message":"xin chào việt nam"}, ensure_ascii=False))
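
    On Python 3 the built-in open() accepts an encoding argument, so the codecs module is optional. A minimal sketch under that assumption; the temporary directory is used only to keep the example self-contained:

```python
import json
import os
import tempfile

payload = {"message": "xin chào việt nam"}

# A temporary path, used here only so the sketch can run anywhere.
path = os.path.join(tempfile.mkdtemp(), "your_file.txt")

# Built-in open() with an explicit encoding; no codecs module needed.
with open(path, "w", encoding="utf-8") as f:
    json.dump(payload, f, ensure_ascii=False)

# Read the raw bytes back to confirm that no \u escapes were written.
with open(path, "rb") as f:
    raw = f.read()

has_escapes = b"\\u" in raw
round_trip = json.loads(raw.decode("utf-8"))
print(has_escapes)
```

    Passing encoding="utf-8" explicitly avoids depending on the platform's default file encoding.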
    
  • 2020-11-22 00:26

    Here's my solution using json.dump() (Python 2, where json.dump accepts an encoding argument):

    import codecs
    import json

    def jsonWrite(p, pyobj, ensure_ascii=False, encoding=SYSTEM_ENCODING, **kwargs):
        with codecs.open(p, 'wb', 'utf_8') as fileobj:
            json.dump(pyobj, fileobj, ensure_ascii=ensure_ascii, encoding=encoding, **kwargs)
    

    where SYSTEM_ENCODING is set beforehand (it must exist when jsonWrite is defined) to:

    import locale
    locale.setlocale(locale.LC_ALL, '')
    SYSTEM_ENCODING = locale.getlocale()[1]
    
  • 2020-11-22 00:26

    If you are loading a JSON string from a file and the file contains Arabic text, this will work.

    Assume a file named arabic.json:

    { 
    "key1" : "لمستخدمين",
    "key2" : "إضافة مستخدم"
    }
    

    Get the Arabic contents from the arabic.json file:

    import json

    with open('arabic.json', encoding='utf-8') as f:
        # Deserialize the file contents into Python objects
        json_data = json.load(f)

    # JSON-formatted string with the Arabic text left unescaped
    json_data2 = json.dumps(json_data, ensure_ascii=False)
    

    To use the JSON data in a Django template, follow the step below:

    # To access the JSON keys in a Django template, decode the encoded string back into Python objects.
    
    json.JSONDecoder().decode(json_data2)
    

    Done! Now the values can be looked up by key, with the Arabic text intact.
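
    The decoding step can also be written with json.loads, which should give the same result as json.JSONDecoder().decode for a plain string. A small sketch, assuming Python 3 and using an in-memory dict in place of the arabic.json file:

```python
import json

# Stand-in for the data loaded from arabic.json.
json_data = {"key1": "لمستخدمين", "key2": "إضافة مستخدم"}
json_data2 = json.dumps(json_data, ensure_ascii=False)

# Two equivalent ways to turn the JSON string back into a dict.
via_decoder = json.JSONDecoder().decode(json_data2)
via_loads = json.loads(json_data2)

print(via_decoder == via_loads == json_data)
```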
