Question
I want to print a set of Unicode characters to my command prompt terminal. Even when I force the encoding to UTF-8, the terminal prints garbage.
$python -c "import sys; print sys.stdout.write(u'\u2044'.encode('UTF-8'))"
ΓüäNone
$python -c "import sys; print sys.stdout.encoding"
cp437
My default terminal encoding is cp437 and I am trying to override that. The expected output here is the fraction slash ( ⁄ ), U+2044:
http://www.fileformat.info/info/unicode/char/2044/index.htm
The same piece of code works flawlessly in my Mac terminal, which uses UTF-8 as its default encoding. Is there a way to display this on Windows as well? The font I use in the Windows command prompt is Consolas.
I want my code to work with any Unicode character, not just this particular example, since the input is a web query result and I have no control over it.
Answer 1:
For UTF-8 encoded text to display correctly, the console has to use a UTF-8 code page (cp65001).
Python 3.3 claims to support code page 65001 (UTF-8) on Windows.
C:\>chcp 65001
Active code page: 65001
C:\>python
Python 3.3.0rc1 (v3.3.0rc1:8bb5c7bc46ba, Aug 25 2012, 13:50:30) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print('\u2044')
⁄
Although it is buggy:
>>> print('\u2044')
⁄
>>> print('\u2044'*8)
⁄⁄⁄⁄⁄⁄⁄⁄
��⁄⁄⁄⁄
⁄⁄
��
>>> print('1\u20442 2\u20443 4\u20445')
1⁄2 2⁄3 4⁄5
⁄5
Answer 2:
Python cannot control the encoding used by your terminal; you'll have to change that somewhere else.
In other words, just because you force Python to output UTF-8 encoded text to the terminal does not mean your terminal will magically start to interpret that output as UTF-8 as well.
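To see where the garbage in the question comes from, here is a minimal Python 3 sketch. It assumes the console interprets incoming bytes as cp437, as reported in the question; the variable names are only illustrative.

```python
# U+2044 FRACTION SLASH becomes three bytes when encoded as UTF-8.
utf8_bytes = '\u2044'.encode('utf-8')

# A cp437 console treats each of those bytes as a separate character,
# which is what decoding the same bytes as cp437 simulates.
garbled = utf8_bytes.decode('cp437')

print(utf8_bytes)      # the raw UTF-8 bytes
print(ascii(garbled))  # escaped form of the three cp437 characters
```

The three characters in `garbled` are Γ, ü and ä, which matches the `Γüä` printed in the question: the bytes are correct UTF-8, and only the terminal's interpretation of them is wrong.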
The Mac OS X terminal has already been configured to work with UTF-8.
On Windows, you can switch the console code page with the chcp command:
chcp 65001
where 65001 is the Windows code page for UTF-8. See also the question "Unicode characters in Windows command line - how?".
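If switching the code page is not an option, one possible fallback is to degrade gracefully rather than emit bytes the console will misread. The following is a rough Python 3 sketch under that assumption; `console_safe` is a hypothetical helper name, not something from the standard library, and it replaces unrepresentable characters with `?` instead of displaying them.

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    # Fall back to the stream's reported encoding, then to ASCII if none is reported.
    encoding = encoding or sys.stdout.encoding or 'ascii'
    # Round-trip through the encoding; errors='replace' substitutes '?' on encode.
    return text.encode(encoding, errors='replace').decode(encoding)

# U+2044 has no cp437 equivalent, so it is replaced.
print(console_safe('1\u20442', 'cp437'))
```

With a UTF-8 encoding the text passes through unchanged, so the same call behaves sensibly on both kinds of terminal. It does not make the fraction slash appear on a cp437 console; for that, the code page still has to change.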
Source: https://stackoverflow.com/questions/12330184/printing-unicode-characters-to-stdout-in-python-prints-wrong-glyphs