Decode an ENCODED unicode string in Python

醉话见心 2020-12-19 21:56

I need to decode a "UNICODE" encoded string:

>>> id = u'abcdß'
>>> encoded_id = id.encode('utf-8')
>>> encoded_id
'abcd\xc


        
1 Answer
  • You have UTF-8 encoded data (there is no such thing as UNICODE encoded data).

    Your unicode value holds UTF-8 bytes stored as individual code points. Encode it to Latin-1 to get those bytes back, then decode them as UTF-8:

    encoded_id.encode('latin1').decode('utf8')
    

    Latin-1 maps the first 256 Unicode code points (U+0000 through U+00FF) one-to-one to bytes.
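
The one-to-one mapping can be checked directly. The following is a minimal sketch in Python 3 syntax (an assumption; the session in this answer is Python 2), and the variable names are illustrative only. It decodes every possible byte value as Latin-1 and confirms that byte N becomes code point N, and that encoding back returns the original bytes:

```python
# Python 3 sketch: Latin-1 is a lossless bridge between bytes and code points 0-255.
all_bytes = bytes(range(256))

# Decoding as Latin-1 turns byte value N into the code point with the same number.
as_text = all_bytes.decode('latin1')
code_points = [ord(ch) for ch in as_text]

# Encoding back yields exactly the original bytes.
round_trip = as_text.encode('latin1')

print(code_points == list(range(256)))  # True
print(round_trip == all_bytes)          # True
```

This is why the Latin-1 step recovers the raw UTF-8 bytes without altering any of them.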

    Demo:

    >>> encoded_id = u'abcd\xc3\x9f'
    >>> encoded_id.encode('latin1').decode('utf8')
    u'abcd\xdf'
    >>> print encoded_id.encode('latin1').decode('utf8')
    abcdß
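
For Python 3, where every `str` is already Unicode, the same two-step fix might look like the sketch below. The helper name `repair_utf8_text` is hypothetical and not part of any library. Returning the input unchanged when either step fails is a design choice of this sketch, not something the answer prescribes:

```python
def repair_utf8_text(mangled):
    """Hypothetical helper: undo UTF-8 bytes that were mis-decoded as Latin-1."""
    try:
        # Recover the original bytes; each code point 0-255 becomes one byte.
        raw = mangled.encode('latin1')
    except UnicodeEncodeError:
        # The text contains code points above U+00FF, so it is not this kind of damage.
        return mangled
    try:
        # Interpret the recovered bytes as the UTF-8 they really are.
        return raw.decode('utf8')
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8; leave the text untouched.
        return mangled


encoded_id = 'abcd\xc3\x9f'
fixed = repair_utf8_text(encoded_id)
print(ascii(fixed))  # 'abcd\xdf', i.e. the string abcdß
```

Text that was never damaged, such as `'abcd\xdf'` itself, passes through unchanged because the lone byte `0xdf` is not valid UTF-8.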
    