Printing utf8 strings in Sublime Text's console with Windows

自闭症患者 2021-01-13 07:10

When running this code with python myscript.py from Windows console cmd.exe (i.e. outside of Sublime Text), it works:

# co         


        
3 Answers
  • 2021-01-13 07:27

    This is a long answer full of gory details, but the TL;DR version is that this appears to be a bug in Sublime Text 2 (in particular in its exec command).

    If upgrading to Sublime Text 3 (which has an enhanced exec command) is not an option, the instructions below describe how to patch Sublime Text 2. The patch solved the problem in all of my tests, at least.


    Something to note is that the error you're seeing in the form of:

    [Decode error - output not utf-8]

    is generated by Sublime as it's adding data to the output panel and not by Python. Even with the fix outlined below, it may still be necessary (based on system setup and/or platform in use) to include the env setting as mentioned in your question, since that tells Python to generate its output in UTF-8 regardless of what it thinks it should do.


    For the purposes of the following tests, I installed Sublime Text 2 and Python 2.7.14 on my Windows 7 machine. This machine already has Python 3 installed on it and added to the PATH, so I installed this version into C:\Python27-64 as indicated in your sample build file and left it out of the path.

    With the exception of installing PackageResourceViewer and bumping up the default font size, Sublime is otherwise stock.

    The test script is the following, slightly modified from the version outlined in your question:

    # coding: utf8
    import sys
    
    print(sys.version)
    print("Café")
    

    Since everything is stock, the Build System in Tools > Build System is set to Automatic, and trying to run the build with Ctrl+B produces the following output:

    3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)]
    [Decode error - output not utf-8]
    [Finished in 0.1s]
    

    This makes sense because, as mentioned above, Python 3 is on my path but Python 2 is not, and so it's picking Python 3.

    The default Python.sublime-build is the following:

    {
        "cmd": ["python", "-u", "$file"],
        "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
        "selector": "source.python"
    }
    

    Using PackageResourceViewer, I opened up the file and modified it to invoke the Python 2 interpreter directly:

    {
        "cmd": ["C:\\Python27-64\\python.exe", "-u", "$file"],
        "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
        "selector": "source.python"
    }
    

    With this in place, the build results look like this:

    2.7.14 (v2.7.14:84471935ed, Sep 16 2017, 20:25:58) [MSC v.1500 64 bit (AMD64)]
    Café
    [Finished in 0.1s]
    

    Notice that it's running Python 2, but it's also properly displaying the data now, without having to modify anything.

    That's somewhat curious and I must admit I went down a few rabbit holes on this because it seemed to work right off the bat. However, if you comment out the print of sys.version:

    # coding: utf8
    import sys
    
    #print(sys.version)
    print("Café")
    

    It stops working:

    [Decode error - output not utf-8]
    [Decode error - output not utf-8]
    [Finished in 0.1s]
    

    Alternatively, if you slightly modify the text that's being printed so that it doesn't end on the accented character:

    # coding: utf8
    import sys
    
    # print(sys.version)
    print("Café au lait")
    

    Now it works as you might expect:

    Café au lait
    [Finished in 0.1s]
    

    I believe this to be a bug in the exec command that ships with Sublime Text in the Default package. In particular, it decodes data just prior to it being inserted into the build results, and so is potentially sensitive to where the buffer cutoffs happen when the data is being read.
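To illustrate why per-chunk decoding is fragile, here is a minimal Python 3 sketch. It is not Sublime's code: the split point and the helper name `decode_each_chunk` are assumptions chosen for illustration, and the error text is simply copied from the build output above.

```python
# "Café" encoded as UTF-8; the character "é" occupies two bytes (0xC3 0xA9).
data = "Café\n".encode("utf-8")

# Hypothetical chunk boundary that falls between the two bytes of "é".
chunks = [data[:4], data[4:]]


def decode_each_chunk(chunks):
    """Decode every chunk on its own, with no state carried between chunks."""
    results = []
    for chunk in chunks:
        try:
            results.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            # Neither half of a split multibyte sequence is valid by itself.
            results.append("[Decode error - output not utf-8]")
    return results


per_chunk = decode_each_chunk(chunks)
whole = data.decode("utf-8")

print(per_chunk)    # both chunks fail to decode
print(repr(whole))  # the same bytes decode fine when taken as a whole
```

The bytes are valid UTF-8 as a whole; only the position of the read boundary makes the decode fail.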

    Conversely, Sublime Text 3 has a modified version of the exec command which (among other enhancements) uses an incremental decoder at the point where the data is read from the pipe, and doesn't exhibit this issue.

    Modifying the exec command in Sublime 2 to also use incremental decoding appears to fix the problem, although I will admit that I didn't do any exhaustive testing of this.
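As a rough sketch of what incremental decoding means (again, not the actual exec.py code, and with a hypothetical split point), Python's standard `codecs` module provides a decoder that keeps an incomplete byte sequence buffered until the rest arrives:

```python
import codecs

data = "Café\n".encode("utf-8")

# Hypothetical chunk boundary inside the two-byte sequence for "é".
chunks = [data[:4], data[4:]]

# One decoder instance is reused for the whole stream, so a trailing
# partial character is held back until the next chunk completes it.
decoder = codecs.getincrementaldecoder("utf-8")()

pieces = [decoder.decode(chunk) for chunk in chunks]
# Signal end of stream; this would raise if bytes were still pending.
pieces.append(decoder.decode(b"", final=True))

text = "".join(pieces)
print(pieces)
print(repr(text))
```

The first call returns only `Caf` and holds back the dangling byte, and the second call emits `é` once the sequence is complete.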

    I have created a public gist that contains a modified version of the exec.py file that provides the exec command used by the build system, along with instructions on how to apply it.

    If you use that, your existing build system (and even the default) should work fine for you. As mentioned above, you may still need the env setting in the build to force the Python interpreter to output UTF-8 in case it doesn't do so by default.

  • 2021-01-13 07:29

    I have found a possible fix: add the encoding parameter in the Python.sublime-build file:

    {
    "cmd": ["python", "-u", "$file"],
    "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
    "selector": "source.python",
    "encoding": "cp1252",
    ...
    

    Note: "encoding": "latin1" seems to work as well, but - I don't know why - "encoding": "utf8" does not work, even if the .py file is UTF8, even if Python 3 uses UTF8, etc. Mystery!
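One plausible explanation, offered as an assumption rather than something this answer confirms: if Python writes its piped output in the Windows ANSI code page (cp1252 on many Western systems), the byte for "é" is not valid UTF-8 but is valid in both cp1252 and latin1. The helper name `try_decode` below is hypothetical.

```python
# Assumption: the interpreter encoded its output as cp1252.
raw = "Café".encode("cp1252")  # "é" becomes the single byte 0xE9


def try_decode(raw_bytes, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw_bytes.decode(encoding)
    except UnicodeDecodeError:
        return None


results = {enc: try_decode(raw, enc) for enc in ("cp1252", "latin1", "utf8")}
print(results)
```

Under that assumption, the "encoding" setting in the build file has to match whatever Python actually emits, which is why forcing Python's output encoding matters.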


    Edit: This works now:

    {
      "cmd": ["python", "-u", "$file"],
      "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
      "selector": "source.python",
      "encoding": "utf8",
      "env": {"PYTHONIOENCODING": "utf-8", "LANG": "en_US.UTF-8"},
    }
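The effect of the PYTHONIOENCODING entry can be checked outside Sublime with a small Python 3 sketch. The helper name `run_child` is hypothetical. It launches a child interpreter with stdout piped, roughly as a build system would, and returns the raw bytes:

```python
import os
import subprocess
import sys


def run_child(extra_env):
    """Run a child Python that prints "Café" and return its raw stdout bytes."""
    env = dict(os.environ)  # keep the existing environment (needed on Windows)
    env.update(extra_env)
    result = subprocess.run(
        # The escape keeps the command line ASCII; the child prints "Café".
        [sys.executable, "-c", "print('Caf\\u00e9')"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.strip()


utf8_bytes = run_child({"PYTHONIOENCODING": "utf-8"})
print(utf8_bytes)
```

With the variable set, the child emits UTF-8 regardless of the console code page, which matches "encoding": "utf8" in the build file.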
    

    Linked topics:

    • Setting the correct encoding when piping stdout in Python and this answer in particular

    • How to change the preferred encoding in Sublime Text 3 for MacOS for the env trick.

  • 2021-01-13 07:44

    A possible quick fix:

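    # coding: utf8
    # Python 2 only (print statement; json.loads() dropped `encoding` in Python 3.9).
    import json
    # The UTF-8 bytes are decoded as latin1 and re-encoded as latin1 on output,
    # so the original UTF-8 bytes reach stdout unchanged.
    d = json.loads("""{"mykey": {"readme": "Café"}}""", encoding='latin1')
    print d['mykey']['readme'].encode('latin1')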
    