Question
I am trying to write data to an Excel sheet using UTF-8 encoding. Right now I get the following error (complete traceback below):
Traceback (most recent call last):
  File "C:\Users\varun\Desktop\Python_testfiles\Reports Automation\Txn.py", line 142, in <module>
    domesticsheet.write(row, j, txn[payuid][j])
  File "C:\Python27\lib\site-packages\xlwt\Worksheet.py", line 1030, in write
    self.row(r).write(c, label, style)
  File "C:\Python27\lib\site-packages\xlwt\Row.py", line 240, in write
    StrCell(self.__idx, col, style_index, self.__parent_wb.add_str(label))
  File "C:\Python27\lib\site-packages\xlwt\Workbook.py", line 326, in add_str
    return self.__sst.add_str(s)
  File "C:\Python27\lib\site-packages\xlwt\BIFFRecords.py", line 24, in add_str
    s = unicode(s, self.encoding)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 11: invalid start byte
The main issue is that I get this error intermittently: I ran the same code on data for a different day and it ran just fine. I also tried the "utf-16" and "ascii" encodings instead of "utf-8", but the error persists (although the error message changes).
Is there some way I can get rid of this error? I would also like to know why it occurs (I am a beginner at Python). Is it even necessary to specify an encoding? Any help will be highly appreciated.
If you need to see the code, it is as follows:
filehandler[booknumber] = xlwt.Workbook(encoding="utf-8")
domesticsheet = filehandler[booknumber].add_sheet("Domestic_txn" + repr(booknumber), cell_overwrite_ok=True)
# Header row
for k in range(len(header)):
    domesticsheet.write(0, k, header[k])
# One transaction per row
for j in range(len(txn[payuid])):
    domesticsheet.write(row, j, txn[payuid][j])
Answer 1:
Bytes in the range 0x80 - 0xBF are reserved in UTF-8 encoding as continuation bytes.
0x00 - 0x7F - Single byte sequence, backwards compatible with ASCII
0x80 - 0xBF - Continuation byte for multi byte sequences
0xC0 - 0xDF - Starter byte for two byte sequence
0xE0 - 0xEF - Starter byte for three byte sequence
0xF0 - 0xF7 - Starter byte for four byte sequence
0xF8 - 0xFB - Starter byte for five byte sequence (original UTF-8 design only; not valid under RFC 3629)
0xFC - 0xFD - Starter byte for six byte sequence (original UTF-8 design only; not valid under RFC 3629)
0xFE - 0xFF - Illegal bytes
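To make the ranges above concrete, here is a minimal sketch that labels each byte of a string by its range. The function name `classify_utf8_byte` and the sample bytes are made up for illustration; the sample is just built so that 0x80 lands at position 11, as in the traceback.

```python
def classify_utf8_byte(b):
    """Label one byte value (0-255) using the ranges listed above."""
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a byte value in 0..255")
    if b <= 0x7F:
        return "single"
    if b <= 0xBF:
        return "continuation"
    if b <= 0xDF:
        return "start-2"
    if b <= 0xEF:
        return "start-3"
    if b <= 0xF7:
        return "start-4"
    if b <= 0xFB:
        return "start-5"
    if b <= 0xFD:
        return "start-6"
    return "illegal"


# Hypothetical cell value; bytearray yields ints on both Python 2 and 3
sample = bytearray(b"Price: 100 \x80")
labels = [classify_utf8_byte(b) for b in sample]
print(labels[11])  # prints: continuation
```

Every byte before position 11 is a plain single-byte character, so the decoder expects position 11 to start a new character and finds a continuation byte there instead.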
What Python is complaining about is that your data contains a continuation byte (0x80) with no valid starter byte before it, so the string is not valid UTF-8.
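One possible workaround, sketched under assumptions: decode each value yourself before handing it to xlwt, trying UTF-8 first and then a fallback encoding. The helper `to_text` is hypothetical, and cp1252 is only a guess at where the stray byte came from (in cp1252, 0x80 is the euro sign); the real source encoding of your data may be different.

```python
def to_text(raw, encodings=("utf-8", "cp1252")):
    """Decode a byte string by trying each encoding in turn.

    The cp1252 fallback is an assumption about the data source.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: decode anyway, replacing undecodable bytes with U+FFFD
    return raw.decode(encodings[0], "replace")


# Hypothetical cell value with byte 0x80 at position 11
sample = b"Price: 100 \x80"

try:
    sample.decode("utf-8")
except UnicodeDecodeError as exc:
    print(exc)  # same kind of error as in the traceback

text = to_text(sample)  # falls back to cp1252, giving a euro sign
```

Something like `domesticsheet.write(row, j, to_text(value))` for byte-string values should then pass already-decoded text to xlwt instead of relying on the decode step shown in the traceback. This would also fit the error appearing only on some days: only some days' data contain non-ASCII bytes.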
Source: https://stackoverflow.com/questions/29095360/unicodedecodeerror-utf8-codec-cant-decode-byte-0x80-in-position-11-invalid