Python 3: reading UCS-2 (BE) file

Question

I can't decode UCS-2 BE files (legacy stuff) under Python 3.3 using the built-in open() function. The stack trace shows a UnicodeDecodeError and points into my readLine() method. In fact, I wasn't able to find an encoding name for specifying UCS-2.

Using Windows 8, terminal is set to codepage 65001, using 'Lucida Console' fonts.

Code snippet won't be of too much help, I guess:

def display_resource():
    f = open(r'D:\workspace\resources\JP.res', encoding=<??tried_several??>)
    while True:
        line = f.readline()
        if len(line) == 0:
            break

I'd appreciate any insight into this issue.

Answer 1:

UCS-2 is effectively UTF-16: every code point that was assigned while the encoding was still called UCS-2 is encoded identically in both, so a UTF-16 decoder reads UCS-2 data correctly.
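To illustrate the point, here is a small sketch comparing a character inside the Basic Multilingual Plane with one outside it. The sample characters are arbitrary choices for illustration, not taken from the asker's file.

```python
# A BMP character: UCS-2 and UTF-16 both use the same single 16-bit unit.
snowman = "\u2603"
bmp_bytes = snowman.encode("utf_16_be")

# A character outside the BMP: UTF-16 needs a surrogate pair
# (two 16-bit units), which plain UCS-2 cannot represent.
emoji = "\U0001F600"
astral_bytes = emoji.encode("utf_16_be")

print(bmp_bytes, len(bmp_bytes))
print(astral_bytes, len(astral_bytes))
```

A legacy UCS-2 file only contains characters of the first kind, so decoding it as UTF-16 gives the same text.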

Open the file with encoding='utf16'. If there is no BOM (the byte order mark, 2 bytes at the start of the file; for BE that'd be \xfe\xff), use encoding='utf_16_be' to force the byte order.
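A minimal sketch of both cases follows. It does not use the asker's JP.res file; instead it writes two small sample files to a temporary directory, and the helper name read_lines and the sample text are made up for the example.

```python
import os
import tempfile


def read_lines(path, encoding):
    """Return the lines of a text file decoded with the given encoding."""
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


# Hypothetical sample content standing in for the legacy resource file.
text = "日本語\nresource\n"
tmp_dir = tempfile.mkdtemp()

# Case 1: big-endian data preceded by a BOM (\xfe\xff).
with_bom = os.path.join(tmp_dir, "with_bom.res")
with open(with_bom, "wb") as f:
    f.write(b"\xfe\xff" + text.encode("utf_16_be"))

# Case 2: the same big-endian data without a BOM.
without_bom = os.path.join(tmp_dir, "without_bom.res")
with open(without_bom, "wb") as f:
    f.write(text.encode("utf_16_be"))

# 'utf_16' reads the BOM to pick the byte order and strips it.
lines_bom = read_lines(with_bom, "utf_16")

# 'utf_16_be' forces big-endian when there is no BOM to rely on.
lines_no_bom = read_lines(without_bom, "utf_16_be")

print(lines_bom)
print(lines_no_bom)
```

Note that 'utf_16_be' does not strip a BOM, so using it on a file that does start with one leaves a '\ufeff' character at the beginning of the first line.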

Source: https://stackoverflow.com/questions/14488346/python-3-reading-ucs-2-be-file

Tags

file

python-3.x

ucs2
