UnicodeEncodeError: 'ascii' codec can't encode character in position 0: ordinal not in range(128)

一整个雨季 2020-12-04 18:00

I'm working on a Python script that uses the scissor character (9986 - ✂) and I'm trying to port my code to Mac, but I'm running into this error.

The scissor char

4 answers
  • 2020-12-04 18:45

    When Python prints output, it automatically encodes it for the target medium. If the target is a file, UTF-8 is typically used by default (strictly, the locale's preferred encoding) and everyone will be happy, but if it is a terminal, Python works out which encoding the terminal is using and tries to encode the output with that one.

    This means that if your terminal uses ASCII as its encoding, Python tries to encode the scissor character to ASCII. ASCII cannot represent it, so you get a UnicodeEncodeError.
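
To check which encoding Python picked for your output, a minimal sketch along these lines may help. The helper names describe_output_encoding and can_encode are made up for illustration; sys.stdout.encoding and locale.getpreferredencoding come from the standard library.

```python
import locale
import sys


def describe_output_encoding(stream=None):
    """Report the encodings Python would use (hypothetical helper)."""
    if stream is None:
        stream = sys.stdout
    return {
        # Encoding attached to the text stream; may be None for odd streams.
        "stream_encoding": getattr(stream, "encoding", None),
        # Encoding derived from the locale settings (LANG, LC_ALL, ...).
        "locale_preferred": locale.getpreferredencoding(False),
    }


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


scissors = chr(9986)  # U+2702
print(describe_output_encoding())
print("ascii can encode it:", can_encode(scissors, "ascii"))
print("utf-8 can encode it:", can_encode(scissors, "utf-8"))
```

If stream_encoding comes back as an ASCII variant (for example ANSI_X3.4-1968 or US-ASCII), that would explain the error in the question.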

    This is why you should encode your output explicitly. Explicit is better than implicit, remember? To fix your code you could do:

    import sys
    sys.stdout.buffer.write(chr(9986).encode('utf8'))
    

    This seems a bit hacky. You can also set PYTHONIOENCODING=utf-8 before executing the script. I'm uncomfortable with both solutions: your console probably doesn't support UTF-8, so you will see gibberish, even though your program will be behaving correctly.

    What I strongly recommend, if you definitely need to show correct output on your console, is to set your console to use another encoding, one that supports the scissor character (UTF-8, perhaps). On Linux, that can be achieved with, for example: export LANG=en_US.UTF-8. On Windows, you change the console's code page with chcp. Just figure out how to set UTF-8 in yours; IMHO that'll be the best solution.


    You can't mix print and sys.stdout.write because they're basically the same. As for your code, the hacky way would look like this:

    sys.stdout.buffer.write(("|\t "+ chr(9986) +" PySnipt'd " + chr(9986)+" \t|").encode('utf8'))
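
A slightly less manual variant of the same idea, as a sketch only: wrap the binary stream in an io.TextIOWrapper that always encodes as UTF-8, then keep using print. The name make_utf8_writer is hypothetical, and a BytesIO stands in for sys.stdout.buffer so the result can be inspected without a terminal.

```python
import io


def make_utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    return io.TextIOWrapper(
        binary_stream,
        encoding="utf-8",
        newline="\n",        # keep "\n" as-is on every platform
        write_through=True,  # pass writes straight to the binary stream
    )


# Stand-in for sys.stdout.buffer (assumption made for this sketch).
fake_stdout_buffer = io.BytesIO()
out = make_utf8_writer(fake_stdout_buffer)

print("|\t " + chr(9986) + " PySnipt'd " + chr(9986) + " \t|", file=out)
out.flush()

encoded_line = fake_stdout_buffer.getvalue()
print(encoded_line)  # bytes repr, safe to show even on an ASCII terminal
```

In a real script you would pass sys.stdout.buffer instead of the BytesIO. On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") should have a similar effect without creating a second wrapper.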
    

    I suggest reading the docs to see what's going on under the hood with the print function and with sys.stdout: http://docs.python.org/3/library/sys.html#sys.stdin

    Hope this helps!

  • 2020-12-04 18:45

    test_io_encoding.py output suggests that you should change your locale settings e.g., set LANG=en_US.UTF-8.


    The first error might occur because you are trying to decode a string that is already Unicode. Python 2 first tries to encode it using the default character encoding ('ascii') before decoding it using a (possibly) different character encoding. The error happens at the encode step:

    >>> u"\u2702".decode() # Python 2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2702' in position 0: ordinal not in range(128)
    

    It looks like you are running your script using Python 2 instead of Python 3. Under Python 3 you would get a different error:

    >>> "\u2702".decode() # Python 3
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'str' object has no attribute 'decode'

    Just drop the .decode() call:

    print("|\t {0} PySnipt'd {0} \t|".format(snipper))
    

    The second issue is due to printing a Unicode string into a pipe:

    $ python3 -c'print("\u2702")'
    ✂
    $ python3 -c'print("\u2702")' | cat
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character '\u2702' in position 0: ordinal not in range(128)
    

    Set the PYTHONIOENCODING environment variable to a value appropriate for your purpose:

    $ PYTHONIOENCODING=utf-8 python3 -c'print("\u2702")' | cat
    ✂
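
The same experiment can be reproduced from Python, which may be handy in a test suite. This is a rough sketch: run_child_with_encoding is a made-up helper that starts a child interpreter with its stdout attached to a pipe (like | cat) and PYTHONIOENCODING set to the given value.

```python
import os
import subprocess
import sys


def run_child_with_encoding(code, io_encoding):
    """Run code in a child Python whose stdout is a pipe (hypothetical helper).

    Returns (returncode, raw stdout bytes).
    """
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode, result.stdout


child_code = 'print("\\u2702")'

ok_code, ok_bytes = run_child_with_encoding(child_code, "utf-8")
bad_code, bad_bytes = run_child_with_encoding(child_code, "ascii")

print("utf-8:", ok_code, ok_bytes)
print("ascii:", bad_code, bad_bytes)
```

With utf-8 the child should exit cleanly and emit the three UTF-8 bytes of the scissor character; with ascii it should fail with the UnicodeEncodeError from the question.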
    

    You report that the terminal is just displaying this: | b'\xe2\x9c\x82' PySnipt'd b'\xe2\x9c\x82' |

    If snipper is a bytes object, then keep the snipper.decode() calls:

    $ python3 -c"print(b'\xe2\x9c\x82'.decode())"
    ✂
    $ python3 -c"print(b'\xe2\x9c\x82'.decode())" | cat
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character '\u2702' in position 0: ordinal not in range(128)
    

    The fix is the same:

    $ PYTHONIOENCODING=utf-8 python3 -c"print(b'\xe2\x9c\x82'.decode())" | cat
    ✂
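
If you are not sure whether snipper arrives as str or bytes, a small normalizing helper avoids both the b'...' output and the AttributeError. This is only a sketch; as_text and banner are hypothetical names, and UTF-8 is assumed for any bytes input.

```python
def as_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes (assumes UTF-8)."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


def banner(snipper):
    """Build the banner line from the question for a str or bytes snipper."""
    return "|\t {0} PySnipt'd {0} \t|".format(as_text(snipper))


from_str = banner("\u2702")
from_bytes = banner(b"\xe2\x9c\x82")

# Both inputs produce the same text; show it in an ASCII-safe form.
print(from_str == from_bytes)
print(ascii(from_bytes))
```

Printing the banner itself still depends on the output encoding, so the PYTHONIOENCODING or locale fix is needed either way.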
    
  • 2020-12-04 18:54

    My locale is set to de_AT.UTF-8 but these lines in /etc/profile were missing:

    export LANG=de_AT.UTF-8
    export LANGUAGE=de_AT.UTF-8
    export LC_ALL=de_AT.UTF-8
    

    Log out and log back in, and your problem should be solved.

    To verify that all locale variables are set correctly, type locale in your terminal.

    The output should be similar to this:

    LANG=de_AT.UTF-8
    LANGUAGE=de_AT.UTF-8
    LC_CTYPE="de_AT.UTF-8"
    LC_NUMERIC="de_AT.UTF-8"
    LC_TIME="de_AT.UTF-8"
    LC_COLLATE="de_AT.UTF-8"
    LC_MONETARY="de_AT.UTF-8"
    LC_MESSAGES="de_AT.UTF-8"
    LC_PAPER="de_AT.UTF-8"
    LC_NAME="de_AT.UTF-8"
    LC_ADDRESS="de_AT.UTF-8"
    LC_TELEPHONE="de_AT.UTF-8"
    LC_MEASUREMENT="de_AT.UTF-8"
    LC_IDENTIFICATION="de_AT.UTF-8"
    LC_ALL=de_AT.UTF-8
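
To check the same thing from inside a script, here is a rough sketch that applies the POSIX precedence for the character-type locale (LC_ALL, then LC_CTYPE, then LANG). Both function names are made up for illustration, and the UTF-8 check is a plain string comparison on the codeset part, not a query to the C library.

```python
import os


def effective_ctype_setting(env):
    """Return (variable, value) that decides the character encoding locale.

    Precedence follows POSIX: LC_ALL, then LC_CTYPE, then LANG.
    Falls back to the "C" locale when none of them is set.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return name, value
    return None, "C"


def looks_like_utf8(locale_value):
    """Heuristic: does a locale name such as de_AT.UTF-8 specify UTF-8?"""
    codeset = locale_value.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"


variable, value = effective_ctype_setting(os.environ)
print(variable, value, "UTF-8" if looks_like_utf8(value) else "not UTF-8")
```

If this reports the "C" locale or a value without a UTF-8 codeset, the exports shown above are probably not reaching the shell that starts your script.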
    
  • 2020-12-04 19:03

    On the first line of your .py file, add this declaration:

    # -*- coding: utf-8 -*-
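
Note that the coding declaration tells Python how to decode the source file itself; it does not change the encoding used for printing, and Python 3 already assumes UTF-8 source by default. You can check what Python detects with tokenize.detect_encoding from the standard library. The sketch below uses the made-up helper declared_source_encoding and in-memory byte strings instead of real files.

```python
import io
import tokenize


def declared_source_encoding(source_bytes):
    """Return the source encoding Python would detect for these file contents."""
    encoding, _first_lines = tokenize.detect_encoding(
        io.BytesIO(source_bytes).readline
    )
    return encoding


with_utf8_cookie = b"# -*- coding: utf-8 -*-\nsnipper = 'x'\n"
with_latin1_cookie = b"# -*- coding: latin-1 -*-\nsnipper = 'x'\n"
without_cookie = b"snipper = 'x'\n"

print(declared_source_encoding(with_utf8_cookie))
print(declared_source_encoding(with_latin1_cookie))
print(declared_source_encoding(without_cookie))
```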

    You can also try this (use chr in Python 3; unichr exists only in Python 2):

    print("|\t ", chr(9986), "PySnipt'd", chr(9986), "\t|")
