I have a problem with the "decode" method in Python 3.3.4. This is my code:
for lines in open('file','r'):
    decodedLine = lines.decode('ISO-8859-1')
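One encodes strings, and one decodes bytes. In Python 3, a line read from a file opened in text mode is already a str, so it has no decode method.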
You should read bytes from the file and decode them:
for lines in open('file','rb'):
    decodedLine = lines.decode('ISO-8859-1')
    line = decodedLine.split('\t')
Luckily, open has an encoding argument which makes this easy:
for decodedLine in open('file', 'r', encoding='ISO-8859-1'):
    line = decodedLine.split('\t')
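To check that the two approaches above give the same result, here is a small self-contained sketch. The sample text and the temporary file are made up for illustration; only open, str.encode and bytes.decode come from the answer itself.

```python
import os
import tempfile

# Hypothetical sample data: two tab-separated rows with non-ASCII characters.
sample = "café\tnaïve\nÅngström\tß\n"

# Store the sample as ISO-8859-1 bytes in a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample.encode('ISO-8859-1'))
    path = tmp.name

try:
    # Approach 1: read bytes and decode each line by hand.
    with open(path, 'rb') as f:
        manual = [raw.decode('ISO-8859-1').rstrip('\n').split('\t') for raw in f]

    # Approach 2: let open() do the decoding via its encoding argument.
    with open(path, 'r', encoding='ISO-8859-1') as f:
        automatic = [text.rstrip('\n').split('\t') for text in f]
finally:
    os.remove(path)

print(manual == automatic)
```

Both lists should hold the same fields, so the encoding argument is usually the simpler choice unless you need the raw bytes for something else.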
This works smoothly for me when reading Chinese text in Python 3.6. Open the file in binary mode so that each line is read as bytes, and then decode the bytes:
for l in open('chinese2.txt','rb'):
    decodedLine = l.decode('gb2312')
    print(decodedLine)
open already decodes to Unicode in Python 3 if you open in text mode. If you want to open the file as bytes, so that you can then decode them yourself, you need to open it with mode 'rb'.
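A short sketch of the difference between the two modes. The temporary file and the sample text are assumptions made for illustration, not part of the original question.

```python
import os
import tempfile

# Hypothetical sample: one line of text stored as ISO-8859-1 bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write('déjà vu\n'.encode('ISO-8859-1'))
    path = tmp.name

try:
    # Text mode yields str objects, which have no decode() method in Python 3.
    with open(path, 'r', encoding='ISO-8859-1') as f:
        text_line = f.readline()
    # Binary mode yields bytes objects, which can be decoded.
    with open(path, 'rb') as f:
        raw_line = f.readline()
finally:
    os.remove(path)

print(type(text_line).__name__, hasattr(text_line, 'decode'))  # str False
print(type(raw_line).__name__, hasattr(raw_line, 'decode'))    # bytes True
print(raw_line.decode('ISO-8859-1') == text_line)              # True
```

Calling decode on text_line would raise AttributeError, which is the error the original code runs into.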