Question
I used to think I had encodings pretty much figured out. Apparently I was wrong, because I can't explain what's happening here.
I was trying to use the tabulate module to print a nicely formatted table using
from tabulate import tabulate
s = tabulate([[1,2],[3,4]], ["x","y"], tablefmt="fancy_grid")
print(s)
in IPython 3.5.0's interactive console under Windows 10. I expected the result to be
╒═════╤═════╕
│   x │   y │
╞═════╪═════╡
│   1 │   2 │
├─────┼─────┤
│   3 │   4 │
╘═════╧═════╛
but instead, I got a
UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>
Puzzled, I tried to find out where the problem was and looked at the repr of the string:
In [15]: s
Out[15]: '╒═════╤═════╕\n│   x │   y │\n╞═════╪═════╡\n│   1 │   2 │\n├─────┼─────┤\n│   3 │   4 │\n╘═════╧═════╛'
Hmm, all the characters can be displayed by the terminal (even the first one that triggered the error).
Just checking some details:
In [16]: sys.stdout.encoding
Out[16]: 'cp850'
In [17]: s.encode("cp850")
[...]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>
So which encoding is the terminal using? Python says that it's cp850, and it tells me that cp850 doesn't have a ╒ character (which is true, it's one of the characters from cp437 that had to make room for accented letters), but I can see it in the terminal window!
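The difference between the two code pages can be checked directly with Python's built-in codecs. The following is only a small sketch; the helper names unencodable_chars and as_code_points are made up for illustration, and the string lists the border characters that appear in the fancy_grid table above.

```python
def unencodable_chars(text, codec):
    """Return the distinct characters of `text` that `codec` cannot encode,
    in order of first appearance (hypothetical helper)."""
    missing = []
    for ch in text:
        if ch in missing:
            continue
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


def as_code_points(chars):
    """Format characters as ASCII-only 'U+XXXX' labels so they print anywhere."""
    return ["U+%04X" % ord(ch) for ch in chars]


# Border characters used by the fancy_grid table, written as escapes.
border_chars = (
    "\u2552\u2550\u2564\u2555\u2502\u255e\u256a\u2561"
    "\u251c\u2500\u253c\u2524\u2558\u2567\u255b"
)

for codec in ("cp437", "cp850"):
    print(codec, as_code_points(unencodable_chars(border_chars, codec)))
```

cp437 encodes every character in the list, while cp850 rejects the nine that mix single and double lines (such as U+2552), which matches the error message.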
To complicate things further, when using the native Python console instead of IPython, the error seems more understandable:
>>> s
'\u2552═══\u2564═══\u2555\n│ 1 │ 2 │\n├───┼───┤\n│ 3 │ 4 │\n\u2558═══\u2567═══\u255b'
>>> sys.stdout.encoding
'cp850'
>>> print(s)
Traceback (most recent call last):
[...]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>
So at least Python is consistent, but what's happening with IPython?
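The partly escaped repr in the native console is consistent with the documented behaviour of Python's default sys.displayhook, which falls back to the backslashreplace error handler when the text cannot be encoded for sys.stdout. A rough sketch of that fallback, assuming a cp850 console (write_like_displayhook is a hypothetical name, not a real API):

```python
def write_like_displayhook(text, encoding):
    """Return the text a console with `encoding` would end up showing.

    Hypothetical helper that mimics the fallback only: try a strict
    encode first, and on failure escape just the characters the codec lacks.
    """
    try:
        return text.encode(encoding, "strict").decode(encoding)
    except UnicodeEncodeError:
        return text.encode(encoding, "backslashreplace").decode(encoding)


# Top border of the table from the native console session, as escapes.
top_border = "\u2552\u2550\u2550\u2550\u2564\u2550\u2550\u2550\u2555"

# U+2550 exists in cp850 and survives; the three corner/junction
# characters do not, so they come back as literal backslash escapes.
escaped = write_like_displayhook(top_border, "cp850")
```

print(s) has no such fallback: it writes to sys.stdout with the stream's default strict error handling, which is why it raises instead of escaping.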
Answer 1:
IPython uses the OEM code page in interactive mode, like any other Python console program:
In [1]: '\u2552'
ERROR - failed to write data to stream: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='cp850'>
Out[1]:
In [2]: !chcp
Active code page: 850
The result changes if pyreadline is installed (it enables colors in the IPython console, among other things):
In [1]: '\u2552'
Out[1]: '╒'
In [2]: import sys
In [3]: sys.stdout.encoding
Out[3]: 'cp850'
In [4]: !chcp
Active code page: 850
Once pyreadline has been installed, IPython's sys.displayhook writes the result to readline's console object, which uses the WriteConsoleW() Windows Unicode API. That API can print Unicode characters even when they cannot be encoded in the current code page (to see them, you might need to configure a TrueType font such as Lucida Console in the Windows console).
Source: https://stackoverflow.com/questions/33960660/which-character-encoding-is-the-ipython-terminal-using