Converting Unicode sequences to a string in Python 3

半阙折子戏 2020-12-20 00:42

While parsing an HTML response to extract data with Python 3.4 on Kubuntu 15.10 (Bash CLI), print() gives me output that looks like

1 Answer
  • 2020-12-20 01:33

    It appears your input uses backslash as an escape character; you should unescape the text before passing it to json.loads():

    >>> foobar = '{\\"body\\": \\"\\\\u05e9\\"}'
    >>> import re
    >>> json_text = re.sub(r'\\(.)', r'\1', foobar) # unescape
    >>> import json
    >>> print(json.loads(json_text)['body'])
    ש
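
The same steps can be wrapped in a small helper for reuse. This is only a sketch: the name unescape_backslashes is hypothetical, and it assumes the input really carries exactly one extra level of backslash escaping, as in the session above. Because `.` does not match a newline, a backslash directly before a line break is left untouched.

```python
import json
import re


def unescape_backslashes(text):
    """Hypothetical helper: drop one level of backslash escaping.

    Each backslash plus the character after it is replaced by that
    character alone, so \\" becomes " and \\\\ becomes \\.
    """
    return re.sub(r'\\(.)', r'\1', text)


# Same doubly escaped input as in the session above.
foobar = '{\\"body\\": \\"\\\\u05e9\\"}'

json_text = unescape_backslashes(foobar)   # now ordinary JSON text
body = json.loads(json_text)['body']       # JSON decoding resolves \u05e9
print(body)
```

If the input is already valid JSON, applying the helper would strip escapes that JSON itself needs, so it is worth checking what the raw text looks like first, for example with print(repr(text)).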
    

    Don't use 'unicode-escape' encoding on JSON text; it may produce different results:

    >>> import json
    >>> json_text = '["\\ud83d\\ude02"]'
    >>> json.loads(json_text)
    ['😂']
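
To make the difference concrete, here is a minimal sketch that runs both approaches on the same text and compares string lengths. The variable names are illustrative only. The lengths are printed instead of the strings themselves because lone surrogates usually cannot be encoded for output.

```python
import json

json_text = '["\\ud83d\\ude02"]'

# json.loads joins the surrogate pair into one code point (U+1F602).
parsed = json.loads(json_text)[0]

# The unicode-escape codec decodes each \uXXXX escape on its own,
# so the result keeps two separate surrogate code points.
decoded = json_text.encode('ascii').decode('unicode-escape')
inner = decoded[2:-2]  # strip the surrounding [" and "]

print(len(parsed))  # 1
print(len(inner))   # 2
```

The two results are not equal, and the second one raises UnicodeEncodeError if you try to encode it as UTF-8, which is why json.loads() is the safer choice for JSON text.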