Converting multiple tab-delimited .txt files into multiple .xls files

走远了吗. Posted on 2019-12-29 08:02:05

Question


I am new to Python and am trying to do what the title says with the code below. It runs up to the point where it saves the .xls output. Any help would be very much appreciated.

import glob
import csv
import xlwt

for filename in glob.glob("C:\xxxx\*.txt"):
    wb = xlwt.Workbook()
    sheet = wb.add_sheet('sheet 1')
    newName = filename
    spamReader = csv.reader(open(filename, 'rb'), delimiter=';',quotechar='"')
    for rowx, row in enumerate(spamReader):
        for colx, value in enumerate(row):
            sheet.write(rowx, colx, value)

    wb.save(newName + ".xls")

print "Done"

Traceback (most recent call last):
  File "C:/Users/Aline/Desktop/Python_tests/1st_trial.py", line 13, in <module>
    wb.save("C:\Users\Aline\Documents\Data2013\consulta_cand_2010\newName.xls")
  File "C:\Python27\lib\site-packages\xlwt\Workbook.py", line 662, in save
    doc.save(filename, self.get_biff_data())
  File "C:\Python27\lib\site-packages\xlwt\Workbook.py", line 637, in get_biff_data
    shared_str_table   = self.__sst_rec()
  File "C:\Python27\lib\site-packages\xlwt\Workbook.py", line 599, in __sst_rec
    return self.__sst.get_biff_record()
  File "C:\Python27\lib\site-packages\xlwt\BIFFRecords.py", line 76, in get_biff_record
    self._add_to_sst(s)
  File "C:\Python27\lib\site-packages\xlwt\BIFFRecords.py", line 91, in _add_to_sst
    u_str = upack2(s, self.encoding)
  File "C:\Python27\lib\site-packages\xlwt\UnicodeUtils.py", line 50, in upack2
    us = unicode(s, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc7 in position 4: ordinal not in range(128)

[edit] This code works.

import glob
import csv
import xlwt

for filename in glob.glob("C:\\Users\\Aline\\Documents\\Data2013\\consulta_cand_2010\\*.txt"):
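    # The input files are read as semicolon-delimited bytes.
    spamReader = csv.reader(open(filename, 'rb'), delimiter=';', quotechar='"')
    # latin1 is the encoding that worked for these files.
    encoding = 'latin1'
    # Tell xlwt how to decode the byte strings written to the sheet.
    wb = xlwt.Workbook(encoding=encoding)
    sheet = wb.add_sheet('sheet 1')
    newName = filename
    for rowx, row in enumerate(spamReader):
        for colx, value in enumerate(row):
            sheet.write(rowx, colx, value)
    # The output keeps the input name, e.g. "file.txt.xls".
    wb.save(newName + ".xls")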

print "Done"

Answer 1:


I believe you need to set the encoding on the output spreadsheet, which means you need to know what encoding the input file uses. The csv module (in Python 2) does not directly support Unicode, but it is 8-bit clean, so it just works for most Western languages.
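
To see why the encoding matters, here is a minimal sketch of the failure in the traceback. The sample bytes are made up for illustration (only the 0xc7 byte at position 4 comes from the error message), and try_decode is a hypothetical helper, not part of xlwt or csv.

```python
# Hypothetical sample: only the 0xc7 byte at position 4 comes from the traceback.
raw = b"ELEI\xc7\xc3O"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# ASCII only covers bytes below 0x80, so 0xc7 cannot be decoded.
ascii_result = try_decode(raw, "ascii")
# These bytes are not a valid UTF-8 sequence either.
utf8_result = try_decode(raw, "utf-8")
# latin1 maps every byte to a character, so decoding always succeeds.
latin1_result = try_decode(raw, "latin1")

print(repr(ascii_result))
print(repr(utf8_result))
print(repr(latin1_result))
```

Because latin1 never fails to decode, success alone does not prove the file is really latin1; it is worth checking that the accented characters look right in the output.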

Without knowing the encoding of your text file, you have two options. Option 1 is to use your locale's encoding as reported by Python:

   >>> import locale
   >>> lang_code, encoding = locale.getdefaultlocale()

^^ Be careful with getdefaultlocale(): the documentation states that the encoding may be None.

Or just fall back to UTF-8 and cross your fingers :D.

   >>> encoding = 'UTF8'
   >>> workbook = xlwt.Workbook(encoding=encoding)



Answer 2:


You're not escaping the backslashes in the file names. For example, in Python the string "consulta_cand_2010\newName.xls" has "\n" in the middle, which is a newline character and is invalid in a file name!

On Windows, write string literals that contain file names as "C:\\Like\\This", "C:/Like/This", or even r"C:\Like\This".
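
A short sketch of the difference, using the file name fragment from the example above (the variable names are only for illustration):

```python
# Unescaped: "\n" is read as a single newline character.
unescaped = "consulta_cand_2010\newName.xls"
# Three ways to get a path with a real separator instead.
doubled = "consulta_cand_2010\\newName.xls"
raw = r"consulta_cand_2010\newName.xls"
forward = "consulta_cand_2010/newName.xls"

has_newline = "\n" in unescaped
same_text = doubled == raw

print(repr(unescaped))
print(repr(doubled))
print(has_newline, same_text)
```

The doubled-backslash and raw forms produce the same string; the forward-slash form is a different string that Windows file APIs generally accept as the same path.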



Source: https://stackoverflow.com/questions/17110674/converting-multiple-tab-delimited-txt-files-into-multiple-xls-files
