I\'m trying to use the command line program convert to take a PDF into an image (JPEG or PNG). Here is one of the PDFs that I\'m trying to convert.
I want the progr
One more suggestion is that you can use GIMP.
Just load the PDF file in GIMP->save as .xcf and then you can do whatever you want to the image.
I have used pdf2image. A simple python library that works like charm.
First install poppler on non linux machine. You can just download the zip. Unzip in Program Files and add bin to Machine Path.
After that you can use pdf2image in python class like this:
from pdf2image import convert_from_path, convert_from_bytes
images_from_path = convert_from_path(
inputfile,
output_folder=outputpath,
grayscale=True, fmt='jpeg')
I am not good with python but was able to make exe of it. Later you may use the exe with file input and output parameter. I have used it in C# and things are working fine.
Image quality is good. OCR works fine.
normally I extract the embedded image with 'pdfimages' at the native resolution, then use ImageMagick's convert to the needed format:
$ pdfimages -list fileName.pdf
$ pdfimages fileName.pdf fileName # save in .ppm format
$ convert fileName-000.ppm fileName-000.png
this generate the best and smallest result file.
Note: For lossy JPG embedded images, you had to use -j:
$ pdfimages -j fileName.pdf fileName # save in .jpg format
With recent poppler you can use -all that save lossy as jpg and lossless as png
On little provided Win platform you had to download a recent (0.37 2015) 'poppler-util' binary from: http://blog.alivate.com.au/poppler-windows/
I really haven't had good success with convert
[update May 2020: actually: it pretty much never works for me], but I've had EXCELLENT success with pdftoppm
. Here's a couple examples of producing high-quality images from a PDF:
[Produces ~25 MB-sized files per pg] Output uncompressed .tif file format at 300 DPI into a folder called "images", with files being named pg-1.tif, pg-2.tif, pg-3.tif, etc:
mkdir -p images && pdftoppm -tiff -r 300 mypdf.pdf images/pg
[Produces ~1MB-sized files per pg] Output in .jpg format at 300 DPI:
mkdir -p images && pdftoppm -jpeg -r 300 mypdf.pdf images/pg
[Produces ~2MB-sized files per pg] Output in .jpg format at highest quality (least compression) and still at 300 DPI:
mkdir -p images && pdftoppm -jpeg -jpegopt quality=100 -r 300 mypdf.pdf images/pg
https://askubuntu.com/questions/150100/extracting-embedded-images-from-a-pdf/1187844#1187844.
pdf2searchablepdf
] https://askubuntu.com/questions/473843/how-to-turn-a-pdf-into-a-text-searchable-pdf/1187881#1187881I use pdftoppm
on the command line to get the initial image, typically with a resolution of 300dpi, so pdftoppm -r 300
, then use convert
to do the trimming and PNG conversion.
You can do it in LibreOffice Draw (which is usually preinstalled in Ubuntu):