PyPDF watermarking returns error

Question

Hi, I'm trying to watermark a PDF file using PyPDF2, but I get an error and can't figure out what is going wrong.

I get the following error:

Traceback (most recent call last):
  File "test.py", line 13, in <module>
    page.mergePage(watermark.getPage(0))
  File "C:\Python27\site-packages\PyPDF2\pdf.py", line 1594, in mergePage
    self._mergePage(page2)
  File "C:\Python27\site-packages\PyPDF2\pdf.py", line 1651, in _mergePage
    page2Content, rename, self.pdf)
  File "C:Python27\site-packages\PyPDF2\pdf.py", line 1547, in _contentStreamRename
    op = operands[i]
KeyError: 0

I'm using Python 2.7.6 with PyPDF2 1.19 on 32-bit Windows. Hopefully someone can tell me what I'm doing wrong.

My Python file:

from PyPDF2 import PdfFileWriter, PdfFileReader

output = PdfFileWriter()
input = PdfFileReader(open("test.pdf", "rb"))
watermark = PdfFileReader(open("watermark.pdf", "rb"))

# print how many pages each input file has:
print("test.pdf has %d pages." % input.getNumPages())
print("watermark.pdf has %d pages." % watermark.getNumPages())

# add page 0 from input, but first add a watermark from another PDF:
page = input.getPage(0)
page.mergePage(watermark.getPage(0))
output.addPage(page)

# finally, write "output" to outputs.pdf
outputStream = file("outputs.pdf", "wb")
output.write(outputStream)
outputStream.close()

Answer 1:

Try writing to a StringIO object instead of a disk file. So, replace this:

outputStream = file("outputs.pdf", "wb")
output.write(outputStream)
outputStream.close()

with this:

import StringIO  # Python 2 module; add this at the top of the script

outputStream = StringIO.StringIO()
output.write(outputStream)  # write merged output to the in-memory StringIO object
outputStream.close()

If the above code works, you may have a file-write permission issue. For reference, look at the PyPDF working example in my article.
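To test the permission hypothesis on its own, without involving PyPDF2 at all, you could try creating a throwaway file in the output directory. The following is only a minimal sketch: can_write_to_dir is a hypothetical helper name, not something from PyPDF2, and it assumes the output goes to the directory you pass in.

```python
import os
import tempfile


def can_write_to_dir(directory):
    """Return True if a temporary file can be created in `directory`."""
    try:
        # mkstemp raises an OSError subclass if the directory is missing
        # or not writable by the current user.
        fd, path = tempfile.mkstemp(dir=directory)
    except (IOError, OSError):
        return False
    # Clean up the probe file so the check leaves nothing behind.
    os.close(fd)
    os.remove(path)
    return True


# Check the current working directory, where outputs.pdf would be written.
print(can_write_to_dir("."))
```

If this reports that the directory is writable, a permission problem is unlikely to be what stops the script.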

Answer 2:

I encountered this error when using PyPDF2 to merge in a page generated by reportlab that used an inline image, canvas.drawInlineImage(...), which stores the image data directly in the page's content stream. Other PDFs that use a similar technique for images might be affected in the same way: effectively, the content stream has a data object embedded in it where PyPDF2 doesn't expect one.

If you're able to, one solution is to regenerate the source PDF without inline images stored in the content stream, e.g. by generating it with canvas.drawImage(...) in reportlab.
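To check whether a particular PDF is likely to be affected before regenerating it, you can look for the inline-image operators (BI ... ID ... EI) in a page's content stream. The sketch below is a rough heuristic, not part of PyPDF2 or reportlab: has_inline_image is a hypothetical helper, it assumes you already have the decoded (uncompressed) content-stream bytes, and a regex like this can give false positives on binary data or text that happens to contain those tokens.

```python
import re

# Inline images appear in a content stream as:
#   BI <key/value pairs> ID <raw image data> EI
# Each operator is delimited by whitespace.
_INLINE_IMAGE = re.compile(br"(?:^|\s)BI\s.*?\sID\s.*?EI(?=\s|$)", re.DOTALL)


def has_inline_image(content):
    """Return True if decoded content-stream bytes seem to hold an inline image."""
    return _INLINE_IMAGE.search(content) is not None


# Illustrative content streams (made up for this example).
with_inline = b"q 10 0 0 10 0 0 cm BI /W 1 /H 1 /CS /G /BPC 8 ID \x00 EI Q"
with_xobject = b"q 10 0 0 10 0 0 cm /Im0 Do Q"

print(has_inline_image(with_inline))
print(has_inline_image(with_xobject))
```

A stream that draws its image through a named XObject (the /Im0 Do form, which is what canvas.drawImage(...) is meant to produce) does not contain the BI/ID/EI sequence, so it should not match.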

Here's an issue about this on PyPDF2.

Source: https://stackoverflow.com/questions/20221991/pypdf-watermarking-returns-error

Tags

python-2.7

pypdf