Converting PDF to image, but after zooming in

Submitted by 老子叫甜甜 on 2020-05-14 20:48:07

Question


This link shows how PDFs can be converted to images. Is there a way to zoom into my PDFs before converting them to images? In my project, I am converting PDFs to PNGs and then using the Python-tesseract library to extract text. I noticed that if I zoom into the PDFs and then save parts of them as PNGs, OCR gives much better results. So is there a way to zoom into PDFs before converting them to PNGs?


Answer 1:


I think that raising the quality (resolution) of your image is a better solution than zooming into the PDF.

Using pdf2image, you can accomplish this quite easily.

Install pdf2image: pip install pdf2image

Then, in Python, convert your PDF into a high-quality image:

from pdf2image import convert_from_path

pages = convert_from_path('sample.pdf', dpi=400)  # dpi is the render resolution (default is 200)

pages[0].save("sample.png")

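By experimenting with the dpi value, you should get the result you want. A higher DPI gives the OCR engine more pixels per character, which is the same benefit you get from zooming in.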
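If you still want the "zoom into a part of the page" workflow from the question, one possible approach is to render at a higher DPI and then crop the region of interest. The sketch below is only an illustration under a few assumptions: the helper names (zoom_to_dpi, points_to_pixels, render_region) are hypothetical and not part of pdf2image, a zoom of 1.0 is assumed to mean the 200 DPI default, and the box is assumed to be given in PDF points (1/72 inch) measured from the top-left corner of the page. pdf2image also needs Poppler installed on the system.

```python
POINTS_PER_INCH = 72  # PDF unit: 1 point = 1/72 inch


def zoom_to_dpi(zoom, base_dpi=200):
    """Map a zoom factor to a render DPI.

    base_dpi=200 is an assumed "100%" baseline (the pdf2image default).
    """
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    return round(base_dpi * zoom)


def points_to_pixels(box_pt, dpi):
    """Scale a (left, upper, right, lower) box from PDF points to pixels."""
    return tuple(round(v * dpi / POINTS_PER_INCH) for v in box_pt)


def render_region(pdf_path, box_pt, zoom=2.0, page_number=1):
    """Render one page at the zoomed DPI and crop it to box_pt.

    Hypothetical helper: box_pt is assumed to be measured from the
    top-left corner of the page, matching PIL's crop convention.
    """
    # Imported lazily so the two helpers above work without pdf2image.
    from pdf2image import convert_from_path

    dpi = zoom_to_dpi(zoom)
    pages = convert_from_path(
        pdf_path, dpi=dpi, first_page=page_number, last_page=page_number
    )
    return pages[0].crop(points_to_pixels(box_pt, dpi))
```

For example, render_region('sample.pdf', (72, 72, 300, 200), zoom=2.0) should return a PIL image of that area rendered at 400 DPI, which can then be passed to pytesseract.image_to_string(...) or saved with .save("part.png"). The pixel box may be off by a pixel or so because of rounding, so it is worth leaving a small margin around the text.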



Source: https://stackoverflow.com/questions/55305385/converting-pdf-to-image-but-after-zooming-in
