ReportLab and pdfrw: Importing Scanned PDF

Submitted by 放荡痞女 on 2019-12-11 03:39:35

Question


Using the code below, I am trying to import a PDF page into an existing canvas object and save the result as a PDF. This usually works just fine, but I noticed that when I try it with a PDF generated from a scanned document, the output is a blank page. Any takers?

from reportlab.lib.units import inch
from reportlab.pdfgen import canvas
from pdfrw import PdfReader
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

# Out_Folder, pdf_file_name and folder are assumed to be defined earlier
c = canvas.Canvas(Out_Folder+pdf_file_name)

page = PdfReader(folder+'2_VisionMissionValues.pdf',decompress=False).pages
p = pagexobj(page[0]) #Convert the first page to a form XObject
c.setPageSize([11*inch, 8.5*inch]) #Set page size (for landscape)
c.doForm(makerl(c, p))
c.showPage()
c.save()

Thanks in advance!


Answer 1:


Sooo...

On the one hand, I have absolutely no idea why this is happening, and not really much time to debug it right now.

On the other hand, I have a workaround for you (I tried it on v0.3 as well as on the current GitHub master, and it worked for me in both cases).

I started off by verifying that your code failed on your page and that it worked on another PDF. Then I asked myself "What happens if I use my watermark example to create a PDF with your page as a watermark?" (because that uses some of the same form XObject code). That worked, so then I asked myself "What does it look like if I pass my watermarked page through your reportlab code?"

Interestingly, the entire watermarked page, including your image, made it through. So I modified your code to do the minimal stuff that the watermark example does, which winds up putting a form XObject inside a form XObject when it's passed to reportlab. That worked.

Here's a slightly modified version of your code that I used for this.

import sys

from reportlab.pdfgen import canvas
from pdfrw import PdfReader, PageMerge
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

inch = 72 #Points per inch

#Expects exactly one command-line argument: the input PDF
fname, = sys.argv[1:]
page = PdfReader(fname, decompress=False).pages[0]
#Render the page through PageMerge first, then convert the result,
#so the original page ends up as a form XObject inside a form XObject
p = pagexobj(PageMerge().add(page).render())

c = canvas.Canvas('outstuff.pdf')
c.setPageSize([8.5*inch, 11.0*inch]) #Set page size (for portrait)
c.doForm(makerl(c, p))
c.showPage()
c.save()


Source: https://stackoverflow.com/questions/43773477/reportlab-and-pdfrw-importing-scanned-pdf
