Python 3: How to specify stdin encoding

一整个雨季 2020-11-27 17:04

While porting code from Python 2 to Python 3, I ran into a problem when reading UTF-8 text from standard input. In Python 2, this works fine:

for line in          


        
1 answer
  • 2020-11-27 17:44

    Python 3 does not expect ASCII from sys.stdin. It opens stdin in text mode and picks an encoding based on the environment, typically the locale settings. That choice may come down to ASCII, but that is not a given. See the sys.stdin documentation on how the codec is selected.

    Like other file objects opened in text mode, the sys.stdin object derives from the io.TextIOBase base class; it has a .buffer attribute pointing to the underlying buffered IO instance (which in turn has a .raw attribute).
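
To make the layering concrete, here is a minimal sketch that opens a text-mode stream on an OS pipe and inspects the same three layers. The pipe is only a stand-in for standard input, and the names `text_stream` and `layers` are made up for illustration. On a regular pipe or file the classes should be the ones checked below; the raw layer behind a real console (for example on Windows) may be a different class.

```python
import io
import os

# Create an OS-level pipe so there is a real file descriptor to open,
# similar in kind to the descriptor behind standard input.
read_fd, write_fd = os.pipe()
os.close(write_fd)  # nothing is written in this sketch

# open() in text mode builds the same stack of layers described above.
text_stream = open(read_fd, "r", encoding="utf-8")

layers = (
    type(text_stream).__name__,             # text layer: decodes bytes to str
    type(text_stream.buffer).__name__,      # buffered binary layer
    type(text_stream.buffer.raw).__name__,  # unbuffered raw layer
)
is_text_base = isinstance(text_stream, io.TextIOBase)
print(layers, is_text_base)

text_stream.close()
```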

    Wrap the sys.stdin.buffer attribute in a new io.TextIOWrapper() instance to specify a different encoding:

    import io
    import sys
    
    input_stream = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8')
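
As a self-contained illustration of the same wrapping technique, the sketch below uses an in-memory `io.BytesIO` as a stand-in for `sys.stdin.buffer`, so it runs without piping anything in. The helper `open_text` and the name `fake_stdin_buffer` are hypothetical, not part of any library.

```python
import io

def open_text(binary_stream, encoding="utf-8"):
    """Wrap a binary stream in a text layer that decodes with `encoding`."""
    return io.TextIOWrapper(binary_stream, encoding=encoding)

# Stand-in for sys.stdin.buffer: the UTF-8 bytes of two lines of text.
fake_stdin_buffer = io.BytesIO("naïve\ncafé\n".encode("utf-8"))

# Iterating the wrapper yields decoded str lines, as iterating sys.stdin would.
lines = [line.rstrip("\n") for line in open_text(fake_stdin_buffer)]
print(lines)
```

In a real script you would pass `sys.stdin.buffer` instead of the in-memory stream and read from the returned wrapper rather than from `sys.stdin` itself.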
    

    Alternatively, set the PYTHONIOENCODING environment variable to the desired codec when running python.
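
One way to see the effect of the variable is to launch a child interpreter with different values and check how it decodes the same bytes. This is only a sketch: the helper `count_chars` and the one-line child program are invented for the demonstration. It assumes `sys.executable` points at a usable Python 3 interpreter.

```python
import os
import subprocess
import sys

# Child program: decode all of stdin and report how many characters it saw.
CHILD_CODE = "import sys; print(len(sys.stdin.read()))"

def count_chars(payload, io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set to `io_encoding`
    and return how many characters it decoded from the bytes in `payload`."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=payload,
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return int(result.stdout.decode("ascii").strip())

payload = "é".encode("utf-8")  # one character, two bytes in UTF-8

utf8_count = count_chars(payload, "utf-8")      # decoded as one character
latin1_count = count_chars(payload, "latin-1")  # each byte becomes a character
print(utf8_count, latin1_count)
```

From a shell, the equivalent is to set the variable for a single run, for example `PYTHONIOENCODING=utf-8 python script.py`, where `script.py` is a placeholder name.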

    From Python 3.7 onwards, you can also reconfigure the existing std* wrappers, provided you do it at the start (before any data has been read):

    # Python 3.7 and newer
    sys.stdin.reconfigure(encoding='utf-8')
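
The same call can be tried without touching the real standard streams. In this sketch an `io.TextIOWrapper` over an in-memory buffer stands in for a std* wrapper that started out with the wrong codec; the variable names are illustrative only.

```python
import io

# Stand-in for a std* wrapper that was created with the wrong codec.
stream = io.TextIOWrapper(io.BytesIO("héllo\n".encode("utf-8")), encoding="ascii")

# Switch the codec before the first read; requires Python 3.7 or newer.
stream.reconfigure(encoding="utf-8")

text = stream.read()
print(repr(text), stream.encoding)
```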
    