Python: Numbering Pages in a PDF using PyPDF2 and io

Submitted by 帅比萌擦擦* on 2019-12-11 11:26:58

Question


I am trying to retroactively add page numbers to a PDF file, but I don't understand how this works. I pieced the code together from here and here. I keep running into a problem I can't seem to fix on my own, probably because I still don't understand what is happening, even after reading the PyPDF2 documentation.

from PyPDF2 import PdfFileWriter, PdfFileReader
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A4


packet = io.BytesIO()
can = canvas.Canvas(packet, pagesize=A4)
can.drawString(10, 100, "Page " + str(15))  # just a random test number
can.save()
packet.seek(0)

watermark = PdfFileReader(packet)
watermark_page = watermark.getPage(0)

pdf = PdfFileReader('in.pdf')
pdf_writer = PdfFileWriter()

for page in range(pdf.getNumPages()):

    pdf_page = pdf.getPage(page)
    pdf_page.mergePage(watermark_page)
    pdf_writer.addPage(pdf_page)

with open('out.pdf', 'wb') as fh:
    pdf_writer.write(fh)

This works fine. However, I would like to give every page a different number. So I changed the for loop to this:

from PyPDF2 import PdfFileWriter, PdfFileReader
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A4

packet = io.BytesIO()

pdf = PdfFileReader('in.pdf')
pdf_writer = PdfFileWriter()

for page in range(pdf.getNumPages()):

    can = canvas.Canvas(packet, pagesize=A4)


    can.drawString(10, 200, "Page " + str(page) )
    can.save()
    packet.seek(0)
    watermark = PdfFileReader(packet)
    watermark_page = watermark.getPage(0)



    pdf_page = pdf.getPage(page)
    pdf_page.mergePage(watermark_page)
    pdf_writer.addPage(pdf_page)

with open('out.pdf', 'wb') as fh:
    pdf_writer.write(fh)

This does not work.

I get:

Traceback (most recent call last):

  File "<ipython-input-44-c6a76740be9f>", line 1, in <module>
    runfile('//DIR/pdftest.py', wdir='//DIR')

  File "C:\Program Files (x86)\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "C:\Program Files (x86)\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "//DIR/pdftest.py", line 55, in <module>
    watermark = PdfFileReader(packet)

  File "C:\Program Files (x86)\Anaconda\lib\site-packages\PyPDF2\pdf.py", line 1084, in __init__
    self.read(stream)

  File "C:\Program Files (x86)\Anaconda\lib\site-packages\PyPDF2\pdf.py", line 1901, in read
    raise utils.PdfReadError("Could not find xref table at specified location")

PdfReadError: Could not find xref table at specified location
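
A likely explanation for this error is that `packet` is reused. Every iteration writes another PDF into the same `io.BytesIO`, starting wherever the previous `PdfFileReader` left the stream position, so the buffer ends up holding pieces of several documents and the xref offset recorded in the newest one no longer points at the right bytes. The sketch below reproduces only that buffer behaviour with the standard library. `render` is a hypothetical stand-in for `can.save()`, and the 5-byte read is an arbitrary stand-in for whatever the reader actually consumes.

```python
import io


def render(buffer, data):
    """Hypothetical stand-in for can.save(): write one whole document
    into the buffer at its current position."""
    buffer.write(data)


# Reused buffer, as in the failing loop.
shared = io.BytesIO()
render(shared, b"DOC-0 with a long body")
shared.seek(0)
shared.read(5)             # a reader moves the position (5 is arbitrary here)
render(shared, b"DOC-1")   # lands at offset 5, not at offset 0
shared.seek(0)
reused = shared.read()

# Fresh buffer for the second document.
fresh = io.BytesIO()
render(fresh, b"DOC-1")
fresh.seek(0)
clean = fresh.read()

print(reused)  # b'DOC-0DOC-1 a long body'
print(clean)   # b'DOC-1'
```

With a fresh buffer, the only bytes present are those of the current document, so offsets stored inside it stay valid.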

A bit of help understanding as well as fixing this would be greatly appreciated.

Thank you!
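
One way to avoid the error is to create a new `io.BytesIO` for every page instead of sharing one `packet` across the loop. The following is a minimal sketch under a few assumptions: it uses the same legacy PyPDF2 names as the code above (`PdfFileReader`, `getPage`, `mergePage`, `addPage`), which newer PyPDF2/pypdf releases renamed; `page_label`, `make_overlay` and `number_pages` are hypothetical helper names; and numbering starts at 1 rather than 0. The third-party imports sit inside the functions only so that the small label helper can be loaded without them.

```python
import io


def page_label(index, first_number=1):
    """Return the text to stamp on the page at zero-based position `index`."""
    return "Page {}".format(first_number + index)


def make_overlay(text, x=10, y=200):
    """Render `text` on a single blank A4 page that lives in its own buffer."""
    # Imported lazily (an assumption of this sketch, not a requirement).
    from PyPDF2 import PdfFileReader
    from reportlab.lib.pagesizes import A4
    from reportlab.pdfgen import canvas

    packet = io.BytesIO()  # new, empty buffer for every overlay
    can = canvas.Canvas(packet, pagesize=A4)
    can.drawString(x, y, text)
    can.save()
    packet.seek(0)         # rewind so the reader starts at the beginning
    return PdfFileReader(packet).getPage(0)


def number_pages(src_path, dst_path):
    """Copy `src_path` to `dst_path`, stamping a page number on each page."""
    from PyPDF2 import PdfFileReader, PdfFileWriter

    pdf = PdfFileReader(src_path)
    pdf_writer = PdfFileWriter()
    for index in range(pdf.getNumPages()):
        pdf_page = pdf.getPage(index)
        pdf_page.mergePage(make_overlay(page_label(index)))
        pdf_writer.addPage(pdf_page)
    with open(dst_path, "wb") as fh:
        pdf_writer.write(fh)
```

Calling `number_pages('in.pdf', 'out.pdf')` would then mirror the original script. The overlay is always A4 here, as in the question; pages of a different size would probably need the overlay sized to match.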

Source: https://stackoverflow.com/questions/53864602/python-numbering-pages-in-a-pdf-using-pypdf2-and-io
