Oftentimes when I'm working interactively in IDLE, I'd like to paste a Unicode string into the IDLE window. It appears to paste properly but generates an error immediately.
I finally figured out a way. Since the sources to IDLE are part of the distribution, you can make a couple of quick edits to enable the capability. The files will typically be found in C:\Python27\Lib\idlelib.
The first step is to prevent IDLE from trying to encode all those nice Unicode characters into a character set that can't handle them. This is controlled by IOBinding.py. Edit the file, find the section after if sys.platform == 'win32': and comment out this line:
#encoding = locale.getdefaultlocale()[1]
Now add this line after it:
encoding = 'utf-8'
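To see why the encoding matters, here is a minimal sketch of what happens when a Unicode string is encoded with a narrow code page versus UTF-8. The cp1252 code page and the try_encode helper are my own assumptions for illustration; your Windows locale may report a different code page.

```python
# A sample string with characters outside most Windows code pages,
# written with escapes so the snippet itself stays pure ASCII.
text = u"\u4f60\u597d"


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec can't represent text.

    try_encode is a hypothetical helper, not part of IDLE.
    """
    try:
        return text.encode(encoding)
    except UnicodeError:
        return None


# cp1252 is an assumed example of a locale code page; yours may differ.
narrow = try_encode(text, "cp1252")
wide = try_encode(text, "utf-8")

print(narrow is None)      # the narrow code page rejects the text
print(wide is not None)    # UTF-8 handles it
```

This is the same kind of failure that the except UnicodeError branch in PyShell.py reports as "Unsupported characters in input".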
I was hoping that there would be a way to override this with an environment variable or something, but getdefaultlocale calls directly into a Win32 function that gets the globally configured Windows mbcs encoding.
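If you want to check what encoding your own system reports before editing anything, something like the sketch below may help. Note that it uses locale.getpreferredencoding rather than getdefaultlocale, which is a swap on my part, and both the reported_encoding helper and the "ascii" fallback are assumptions for illustration.

```python
import codecs
import locale


def reported_encoding():
    """Return the locale's preferred encoding name, or "ascii" as a fallback.

    reported_encoding is a hypothetical helper, not an IDLE function.
    """
    # False means: don't call setlocale(), just report the current setting.
    name = locale.getpreferredencoding(False)
    try:
        codecs.lookup(name)  # make sure Python has a codec by this name
    except LookupError:
        name = "ascii"       # assumed fallback for this sketch
    return name


print(reported_encoding())
```

On a typical Windows install I'd expect this to print a code page name rather than utf-8, but the exact value depends on your configuration.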
That's half the battle. Now we need to get the command line interpreter to recognize that the input bytes are UTF-8 encoded. It didn't appear that there was a way to pass an encoding into the interpreter, so I came up with the mother of all hacks. Maybe someone with a little more patience can come up with a better way, but this works for now. The input is processed in PyShell.py, in the runsource function. Change the following:
if isinstance(source, types.UnicodeType):
    from idlelib import IOBinding
    try:
        source = source.encode(IOBinding.encoding)
    except UnicodeError:
        self.tkconsole.resetoutput()
        self.write("Unsupported characters in input\n")
        return
To:
from idlelib import IOBinding  # line moved
if isinstance(source, types.UnicodeType):
    try:
        source = source.encode(IOBinding.encoding)
    except UnicodeError:
        self.tkconsole.resetoutput()
        self.write("Unsupported characters in input\n")
        return
source = "#coding=%s\n%s" % (IOBinding.encoding, source)  # line added
We're taking advantage of PEP 263 to specify the encoding for each line of input provided to the interpreter.
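The trick can be tried outside IDLE with a standalone sketch like the one below. It prepends a coding line to encoded source bytes and hands the result to compile, which is roughly what the added line in runsource does. The run_with_cookie helper, the "<pasted>" filename, and the sample string are all hypothetical names of mine, not anything from idlelib.

```python
def run_with_cookie(source_text, encoding="utf-8"):
    """Encode source_text, prepend a PEP 263 coding line, then run it.

    run_with_cookie is a hypothetical helper, not part of IDLE.
    Returns the namespace the code was executed in.
    """
    # The coding line itself is plain ASCII.
    cookie = ("#coding=%s\n" % encoding).encode("ascii")
    source_bytes = cookie + source_text.encode(encoding)

    # compile() reads the coding line to decide how to decode the bytes.
    code = compile(source_bytes, "<pasted>", "exec")
    namespace = {}
    exec(code, namespace)
    return namespace


# The pasted text contains a real non-ASCII character (e with acute accent).
ns = run_with_cookie(u"s = u'caf\u00e9'\n")
print(repr(ns["s"]))
```

One side effect to keep in mind: because an extra line is prepended, line numbers the interpreter sees for the pasted source are shifted by one.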
Update: In Python 2.7.10 it is no longer necessary to make the change in PyShell.py; it already works properly if the encoding is set to utf-8. Unfortunately I haven't found a way to bypass the change in IOBinding.py.