Is there any easy (scriptable) way to convert a PDF with vector images into a PDF with raster images? In other words, I want to generate a PDF with the exact same (un-rasterized
PitStop Pro v2 update 3 from Enfocus can do exactly that. It has an action called "Rasterize page content, keeping text", which works pretty well. It is a plug-in for Adobe Acrobat, so it requires Acrobat to be installed, but it is also available as a server solution.
I used the following:
gswin32c -o "%2" -dFirstPage=1 -dLastPage=1 -sDEVICE=pngalpha -r72x72 -dUseCropBox -dFitPage "%1" -dBATCH -dNOPAUSE
where %1 is the input file and %2 is the output. This can be used with LaTeX: the generated PNG has the same aspect ratio and page size as the original PDF, so the relative position of the image will not change.
Note that on Linux, you may need to use gs rather than gswin32c.
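On Linux or macOS, the same call can be wrapped in a small shell function. This is only a sketch: the function name pdf_page_to_png and its DRY_RUN switch are made up for illustration, and it assumes gs is on the PATH. -dBATCH and -dNOPAUSE are left out because -o already implies them.

```shell
#!/bin/sh
# Hypothetical helper: render page 1 of a PDF to a transparent PNG at 72 dpi.
# Usage: pdf_page_to_png input.pdf output.png
# Set DRY_RUN=1 to print the command instead of running it.
pdf_page_to_png() {
    if [ "$#" -ne 2 ]; then
        echo "usage: pdf_page_to_png input.pdf output.png" >&2
        return 2
    fi
    # Same flags as the Windows command above, with gs instead of gswin32c.
    set -- gs -o "$2" -dFirstPage=1 -dLastPage=1 -sDEVICE=pngalpha \
        -r72x72 -dUseCropBox -dFitPage "$1"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling pdf_page_to_png figure.pdf figure.png then produces the PNG. With DRY_RUN=1 set, it only prints the gs command line, which is handy for checking the flags first.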
You can also set the page range and then print the pages back to PDF. The downside is that the text gets rasterized as well.
It's a little complicated, but you asked for any possible solution. Note that this solution cannot be automated.
1) Open the PDF with the vector images in Inkscape, then select the whole image with the select tool (F1).
2) If the vector image consists of more than one SVG graphic, group them with Ctrl + G (Object --> Group).
3) Cut the grouped SVG image with Ctrl + X.
4) Open a new Inkscape window with Ctrl + N and paste the image with Ctrl + V.
5) Choose File --> Export Bitmap (Shift + Ctrl + E); you may want to increase the DPI.
6) Go back to the first Inkscape window, choose File --> Import (Ctrl + I), and select the previously exported bitmap.
7) Place the bitmap at the location where the SVG image was.
Save the PDF, and the vector image is replaced by a bitmap image.
I had a similar issue and solved it using ImageMagick's convert tool (http://www.imagemagick.org/script/index.php). It comes with many Linux distributions and runs fine on Windows/Cygwin or OS X:
convert -density 300 largeVectorFileFromR.pdf out.pdf
With -density 300 you control resolution (as DPI).
Downside: text is rasterized as well, which I understand Michael does not want.
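To get a feel for what a given -density means for the output size, here is a rough sketch. PDF page sizes are measured in points (1/72 inch), so the rasterized pixel count is roughly points × DPI / 72. The function name px_for and the US Letter page size (612 × 792 pt) are illustrative assumptions only.

```shell
#!/bin/sh
# Hypothetical helper: pixels produced for a length in PDF points at a given DPI.
# PDF user space uses 72 points per inch; integer arithmetic, rounded down.
px_for() {
    points=$1
    dpi=$2
    echo $(( points * dpi / 72 ))
}

# Example: a US Letter page (612 x 792 pt) rasterized with -density 300.
echo "$(px_for 612 300) x $(px_for 792 300) pixels"
```

For that page this prints 2550 x 3300 pixels, so doubling the density quadruples the number of pixels per page.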
Here's one way to solve your problem:
Step 1: Use an online PDF-to-HTML converter, like the one here:
http://www.idrsolutions.com/online-pdf-to-html5-converter/
This tool converts the PDF into a set of images and a text overlay. The vector images should be converted to raster at this point.
Step 2: Convert the HTML+images back into PDF:
http://pdfcrowd.com/#convert_by_upload+with_options
The resulting PDF will have all the vector images rasterized, and all text will remain text, so you can select, copy, etc.
After some days of searching for a solution, and based on "Remove all text from PDF file" and "How to add a picture onto an existing pdf file?", I found an (ugly) scriptable solution:
gs -o /tmp/onlytxt.pdf -sDEVICE=pdfwrite -dFILTERVECTOR -dFILTERIMAGE "$INPUT_FILE" && \
gs -o /tmp/graphics.pdf -sDEVICE=pdfwrite -dFILTERTEXT "$INPUT_FILE" && \
convert -density "$DPI" -quality 100 /tmp/graphics.pdf /tmp/graphics.png && \
convert -density "$DPI" -quality 100 /tmp/graphics.png /tmp/graphics.pdf && \
pdftk /tmp/graphics.pdf stamp /tmp/onlytxt.pdf output "$OUTPUT_FILE" && \
rm /tmp/onlytxt.pdf /tmp/graphics.pdf /tmp/graphics.png
where we have three variables: INPUT_FILE, OUTPUT_FILE, and DPI. We split the textual and graphical contents via Ghostscript, convert the graphical content to a raster image (PNG), and join the two using pdftk.
I've been using this successfully to convert huge vector images for use in scientific papers.
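The pipeline above writes to fixed paths in /tmp, which can collide if two runs overlap. Below is a slightly more defensive sketch, assuming the same tools (gs, ImageMagick's convert, pdftk) plus mktemp are installed. The function name rasterize_vectors and the default of 300 DPI are choices made here for illustration, not part of the original answer. It is intended for single-page input: for a multi-page PDF, convert writes one numbered PNG per page, so the second convert step would need adjusting.

```shell
#!/bin/sh
# Hypothetical wrapper around the pipeline above.
# Usage: rasterize_vectors input.pdf output.pdf [dpi]
rasterize_vectors() {
    if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
        echo "usage: rasterize_vectors input.pdf output.pdf [dpi]" >&2
        return 2
    fi
    in=$1 out=$2 dpi=${3:-300}   # 300 DPI default is an assumption
    # The subshell keeps the cleanup trap local to this call.
    (
        tmp=$(mktemp -d) || exit 1
        trap 'rm -rf "$tmp"' EXIT
        # Text only: drop vector and image content.
        gs -o "$tmp/onlytxt.pdf" -sDEVICE=pdfwrite -dFILTERVECTOR -dFILTERIMAGE "$in" &&
        # Graphics only: drop text.
        gs -o "$tmp/graphics.pdf" -sDEVICE=pdfwrite -dFILTERTEXT "$in" &&
        # Rasterize the graphics and wrap the PNG back into a PDF.
        convert -density "$dpi" -quality 100 "$tmp/graphics.pdf" "$tmp/graphics.png" &&
        convert -density "$dpi" -quality 100 "$tmp/graphics.png" "$tmp/raster.pdf" &&
        # Stamp the text layer over the rasterized graphics.
        pdftk "$tmp/raster.pdf" stamp "$tmp/onlytxt.pdf" output "$out"
    )
}
```

A call such as rasterize_vectors plot.pdf plot-raster.pdf 150 runs the whole chain and removes the temporary directory afterwards, whether or not a step fails.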