How to get IDLE to accept paste of Unicode characters?

臣服心动 2020-12-17 05:19

Oftentimes when I'm working interactively in IDLE, I'd like to paste a Unicode string into the IDLE window. It appears to paste properly but generates an error immediately.

1 Answer
  • 2020-12-17 06:02

    I finally figured out a way. Since the sources to IDLE are part of the distribution you can make a couple of quick edits to enable the capability. The files will typically be found in C:\Python27\Lib\idlelib.
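If Python is installed somewhere other than the typical path, the idlelib directory can be located from Python itself. The following is a minimal sketch written for Python 3; the helper name `find_idlelib_dir` is made up for illustration. On Python 2.7, running `import idlelib; print(idlelib.__file__)` should give the same information.

```python
import importlib.util
import os


def find_idlelib_dir():
    """Return the directory holding the idlelib package, or None if it is absent.

    Hypothetical helper: some Python installations ship without idlelib,
    so a missing package is reported as None instead of raising.
    """
    spec = importlib.util.find_spec("idlelib")
    if spec is None or not spec.submodule_search_locations:
        return None
    # A regular package has exactly one search location: its own directory.
    return os.path.abspath(list(spec.submodule_search_locations)[0])


if __name__ == "__main__":
    print(find_idlelib_dir())
```

Making a backup copy of any file in that directory before editing it makes the change easy to undo.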

    The first step is to prevent IDLE from trying to encode all those nice Unicode characters into a character set that can't handle them. This is controlled by IOBinding.py. Edit the file, find the section after if sys.platform == 'win32': and comment out this line:

    #encoding = locale.getdefaultlocale()[1]
    

    Now add this line after it:

    encoding = 'utf-8'
    

    I was hoping that there would be a way to override this with an environment variable or something, but getdefaultlocale calls directly into a Win32 function that gets the globally configured Windows mbcs encoding.
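To see why the locale-derived encoding is the problem, it helps to try encoding a pasted string by hand. This is a small sketch in Python 3 syntax; `can_encode` is a hypothetical helper, and `cp1252` is only an assumed stand-in for whatever code page Windows reports on a given machine.

```python
def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Three CJK characters, written as escapes so the file itself stays ASCII.
sample = u"\u65e5\u672c\u8a9e"

if __name__ == "__main__":
    # cp1252 is an assumption: a common Western European Windows code page.
    print("cp1252:", can_encode(sample, "cp1252"))
    print("utf-8: ", can_encode(sample, "utf-8"))
```

A single-byte code page can only represent a couple of hundred characters, while UTF-8 covers all of Unicode, which is why hard-coding `'utf-8'` avoids the encoding error.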

    That is half the battle. Next, the command-line interpreter has to recognize that the input bytes are UTF-8 encoded. There didn't appear to be a way to pass an encoding into the interpreter, so I came up with the mother of all hacks. Maybe someone with a little more patience can come up with a better way, but this works for now. The input is processed in PyShell.py, in the runsource method. Change the following:

        if isinstance(source, types.UnicodeType):
            from idlelib import IOBinding
            try:
                source = source.encode(IOBinding.encoding)
            except UnicodeError:
                self.tkconsole.resetoutput()
                self.write("Unsupported characters in input\n")
                return
    

    To:

        from idlelib import IOBinding  # line moved
        if isinstance(source, types.UnicodeType):
            try:
                source = source.encode(IOBinding.encoding)
            except UnicodeError:
                self.tkconsole.resetoutput()
                self.write("Unsupported characters in input\n")
                return
        source = "#coding=%s\n%s" % (IOBinding.encoding, source)  # line added
    

    We're taking advantage of PEP 263: prepending a coding declaration to each block of input tells the interpreter how to decode the bytes it receives.
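The effect of the prepended coding line can be tried outside IDLE. The sketch below is written for Python 3, where `compile` accepts bytes and honors a PEP 263 declaration; it is not the IDLE code itself, and the names `with_coding_cookie` and `run_source` are invented for illustration. It uses `latin-1` as well as `utf-8` because Python 3 already assumes UTF-8 when no declaration is present.

```python
def with_coding_cookie(text, encoding):
    """Encode text and prepend a PEP 263 declaration naming the same encoding."""
    header = "#coding=%s\n" % encoding
    return header.encode("ascii") + text.encode(encoding)


def run_source(text, encoding):
    """Compile and execute the encoded source, returning its namespace."""
    namespace = {}
    code = compile(with_coding_cookie(text, encoding), "<pasted>", "exec")
    exec(code, namespace)
    return namespace


if __name__ == "__main__":
    # The non-ASCII character survives the round trip through latin-1 bytes
    # because the declaration tells the compiler how to decode them.
    result = run_source(u"s = 'caf\u00e9'", "latin-1")
    print(result["s"] == u"caf\u00e9")
```

PEP 263 only recognizes the declaration on the first or second line of the source, which is why it is placed in front of the pasted text and not after it.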

    Update: In Python 2.7.10 the change in PyShell.py is no longer necessary; pasted input already works properly once the encoding is set to utf-8. Unfortunately, I haven't found a way to avoid the change in IOBinding.py.
