problem with closing python pypdf - writing. getting a valueError: I/O operation on closed file

落花浮王杯 提交于 2019-12-10 15:59:27

问题


can't figure this up this function (part of class for scraping internet site into a pdf) supposed to merge the pdf file generated from web pages using pypdf.

this is the method code:

def mergePdf(self,mainname,inputlist=0):
    """merging the pdf pages
    getting an inputlist to merge or defaults to the class instance self.pdftomerge list"""
    from pyPdf import PdfFileWriter, PdfFileReader
    self._mergelist = inputlist or self.pdftomerge
    self.pdfoutput = PdfFileWriter()

    for name in self._mergelist:
        print "merging %s into main pdf file: %s" % (name,mainname)
        self._filestream = file(name,"rb")
        self.pdfinput = PdfFileReader(self._filestream)
        for p in self.pdfinput.pages:
            self.pdfoutput.addPage(p)
        self._filestream.close()

    self._pdfstream = file(mainname,"wb")
    self._pdfstream.open()
    self.pdfoutput.write(self._pdfstream)
    self._pdfstream.close()

I keep getting this error:

  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 264, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 324, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, data[i])
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 345, in _sweepIndirectReferences
    newobj = data.pdf.getObject(data)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 645, in getObject
    self.stream.seek(start, 0)
ValueError: I/O operation on closed file

but when I check the status of self._pdfstream I get:

<open file 'c:\python27\learn\dive.pdf', mode 'wb' at 0x013B2020>

what am I doing wrong?

i'll be glad for any help


回答1:


OK, I found your problem. You were right to call file(). Don't try to call open() at all.

Your problem is the input file still needs to be open when you call self.pdfoutput.write(self._pdfstream), so you need to remove the line self._filestream.close().

Edit: This script will trigger the problem. The first write will succeed and the second will fail.

from pyPdf import PdfFileReader as PfR, PdfFileWriter as PfW

input_filename = 'in.PDF' # replace with a real file
output_filename = 'out.PDF' # something that doesn't exist

infile = file(input_filename, 'rb')
reader = PfR(infile)
writer = PfW()

writer.addPage(reader.getPage(0))
outfile = file(output_filename, 'wb')
writer.write(outfile)
print "First Write Successful!"
infile.close()
outfile.close()

infile = file(input_filename, 'rb')
reader = PfR(infile)
writer = PfW()

writer.addPage(reader.getPage(0))
outfile = file(output_filename, 'wb')
infile.close() # BAD!

writer.write(outfile)
print "You'll get an IOError Before this line"
outfile.close()


来源:https://stackoverflow.com/questions/6773631/problem-with-closing-python-pypdf-writing-getting-a-valueerror-i-o-operation

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!