Question
I can't decode UCS-2 BE files (legacy stuff) under Python 3.3 using the built-in open() function. The stack trace shows a UnicodeDecodeError and points into my readLine() method. In fact, I wasn't able to find an encoding name for UCS-2 at all.
I'm on Windows 8; the terminal is set to codepage 65001 and uses the 'Lucida Console' font.
Code snippet won't be of too much help, I guess:
    def display_resource():
        f = open(r'D:\workspace\resources\JP.res', encoding=<??tried_several??>)
        while True:
            line = f.readline()
            if len(line) == 0:
                break
I'd appreciate any insight into this issue.
Answer 1:
UCS-2 is effectively UTF-16, at least for any code point that was assigned while it was still called UCS-2.
Open the file with encoding='utf16'. If there is no BOM (the byte order mark, 2 bytes at the start of the file; for BE that would be \xfe\xff), then use encoding='utf_16_be' to force a byte order.
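As a rough sketch of that advice, the snippet below peeks at the first two bytes to decide between the two codec names. The helper names pick_utf16_encoding and read_lines are made up for illustration, and falling back to big-endian when no BOM is found is an assumption based on the question saying the files are UCS-2 BE.

```python
import codecs


def pick_utf16_encoding(path):
    """Return a codec name for a UCS-2/UTF-16 file (hypothetical helper).

    If the file starts with a UTF-16 BOM, 'utf_16' lets the codec read the
    byte order from it. Otherwise we assume big-endian, as in the question.
    """
    with open(path, 'rb') as raw:
        head = raw.read(2)
    if head in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE):
        return 'utf_16'
    return 'utf_16_be'  # assumption: BOM-less files are big-endian


def read_lines(path):
    """Yield the lines of the file without trailing newlines."""
    with open(path, encoding=pick_utf16_encoding(path)) as f:
        for line in f:
            yield line.rstrip('\n')
```

With this in place, the loop from the question could become `for line in read_lines(r'D:\workspace\resources\JP.res'): ...`. Whether the decoded text then prints correctly still depends on the console's codepage and font.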
Source: https://stackoverflow.com/questions/14488346/python-3-reading-ucs-2-be-file