Python: UnicodeEncodeError: 'latin-1' codec can't encode character

别跟我提以往 2021-01-04 05:34

I have a scenario where I call an API and, based on its results, query the database for each record the API returns. My API call returns strings, and when I make the database

3 Answers
  • 2021-01-04 05:57

    If you need Latin-1 encoding, you have several options to get rid of the en-dash or other code points above 255 (characters not included in Latin-1):

    >>> u = u'hello\u2013world'
    >>> u.encode('latin-1', 'replace')    # replace it with a question mark
    'hello?world'
    >>> u.encode('latin-1', 'ignore')     # ignore it
    'helloworld'
    

    Or do your own custom replacements:

    >>> u.replace(u'\u2013', '-').encode('latin-1')
    'hello-world'
    

    If you aren't required to output Latin-1, then UTF-8 is a common and preferred choice. It is recommended by the W3C and nicely encodes all Unicode code points:

    >>> u.encode('utf-8')
    'hello\xe2\x80\x93world'
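For readers on Python 3, here is a minimal sketch of the same options. It assumes the same sample string as above; the variable names are only illustrative. In Python 3, `str.encode` returns a `bytes` object, so the results print with a `b''` prefix. The `backslashreplace` handler is one more standard error handler, added here in case the original character needs to stay recoverable.

```python
# Python 3 sketch: str.encode returns bytes, so results carry a b'' prefix.
text = "hello\u2013world"  # contains U+2013 EN DASH, which Latin-1 lacks

replaced = text.encode("latin-1", "replace")            # en dash becomes b"?"
ignored = text.encode("latin-1", "ignore")              # en dash is dropped
escaped = text.encode("latin-1", "backslashreplace")    # en dash kept as an escape
custom = text.replace("\u2013", "-").encode("latin-1")  # manual substitution
utf8 = text.encode("utf-8")                             # UTF-8 can represent U+2013

print(replaced)  # b'hello?world'
print(ignored)   # b'helloworld'
print(escaped)   # b'hello\\u2013world'
print(custom)    # b'hello-world'
print(utf8)      # b'hello\xe2\x80\x93world'
```

Without an error handler, `text.encode("latin-1")` raises the `UnicodeEncodeError` from the question title.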
    
  • 2021-01-04 06:11

    The Unicode character u'\u2013' is the "en dash". It is contained in the Windows-1252 (cp1252) character set (encoded as byte 0x96), but not in the Latin-1 (ISO-8859-1) character set. Windows-1252 defines some additional characters in the range 0x80-0x9F, among them the en dash.

    The solution would be for you to choose a different target character set than Latin-1, such as Windows-1252 or UTF-8, or to replace the en dash with a simple "-".
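The difference between the three character sets can be checked directly. This is a small Python 3 sketch that uses only the built-in codecs; the variable names are illustrative.

```python
en_dash = "\u2013"

cp1252_bytes = en_dash.encode("cp1252")  # Windows-1252 maps U+2013 to byte 0x96
utf8_bytes = en_dash.encode("utf-8")     # UTF-8 uses three bytes for U+2013

try:
    en_dash.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    # Latin-1 has no en dash, so encoding fails
    latin1_ok = False

print(cp1252_bytes)  # b'\x96'
print(utf8_bytes)    # b'\xe2\x80\x93'
print(latin1_ok)     # False
```

Whichever target encoding is chosen, the consumer of the bytes (database connection, file, HTTP client) has to be configured to expect the same one.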

  • 2021-01-04 06:14

    u.encode('utf-8') converts the string to bytes, which can then be written to stdout using sys.stdout.buffer.write(bytes). See the sys.displayhook entry at https://docs.python.org/3/library/sys.html
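A minimal Python 3 sketch of that approach follows. The helper name `write_utf8` is hypothetical, not part of the standard library. The `getattr` guard is there because an object that replaces `sys.stdout` (some IDEs and test runners do this) may not expose a `buffer` attribute.

```python
import io
import sys


def write_utf8(text, binary_stream):
    """Encode text as UTF-8 and write the raw bytes to a binary stream."""
    data = text.encode("utf-8")
    binary_stream.write(data)
    binary_stream.flush()
    return data


# Demonstrate against an in-memory stream first, so the result is inspectable.
demo = io.BytesIO()
written = write_utf8("hello\u2013world\n", demo)

# On a real console, write to sys.stdout.buffer when it exists.
stdout_bytes = getattr(sys.stdout, "buffer", None)
if stdout_bytes is not None:
    sys.stdout.flush()  # keep ordering with any earlier text output
    write_utf8("hello\u2013world\n", stdout_bytes)
```

Writing bytes this way bypasses the encoding configured for `sys.stdout`, so the terminal still needs to interpret the output as UTF-8 to display the en dash correctly.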
