Convert PDF to image with high resolution

故里飘歌 2020-11-28 00:13

I'm trying to use the command-line program convert to turn a PDF into an image (JPEG or PNG). Here is one of the PDFs that I'm trying to convert.

I want the progr

18 answers
  • 2020-11-28 00:38

    I use ICEpdf, an open-source Java PDF engine. Check the office demo.

    package image2pdf;
    
    import org.icepdf.core.exceptions.PDFException;
    import org.icepdf.core.exceptions.PDFSecurityException;
    import org.icepdf.core.pobjects.Document;
    import org.icepdf.core.pobjects.Page;
    import org.icepdf.core.util.GraphicsRenderingHints;
    import javax.imageio.ImageIO;
    import java.awt.image.BufferedImage;
    import java.awt.image.RenderedImage;
    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    
    public class pdf2image {
    
       public static void main(String[] args) {
    
          Document document = new Document();
          try {
             document.setFile("C:\\Users\\Dell\\Desktop\\test.pdf");
          } catch (PDFException ex) {
             System.out.println("Error parsing PDF document " + ex);
          } catch (PDFSecurityException ex) {
             System.out.println("Error encryption not supported " + ex);
          } catch (FileNotFoundException ex) {
             System.out.println("Error file not found " + ex);
          } catch (IOException ex) {
             System.out.println("Error IOException " + ex);
          }
    
          // Save page captures to file.
          // scale = 1.0f renders the page at its native size; a larger value yields more pixels.
          float scale = 1.0f;
          float rotation = 0f;
    
          // Paint each page's content to an image and
          // write the image to a file.
          for (int i = 0; i < document.getNumberOfPages(); i++) {
             try {
                BufferedImage image = (BufferedImage) document.getPageImage(
                    i, GraphicsRenderingHints.PRINT, Page.BOUNDARY_CROPBOX, rotation, scale);
    
                RenderedImage rendImage = image;
                try {
                   System.out.println(" capturing page " + i);
                   File file = new File("C:\\Users\\Dell\\Desktop\\test_imageCapture1_" + i + ".png");
                   ImageIO.write(rendImage, "png", file);
                } catch (IOException e) {
                   e.printStackTrace();
                }
                image.flush();
             } catch (Exception e) {
                e.printStackTrace();
             }
          }
    
          // clean up resources
          document.dispose();
       }
    }
    

    I've also tried ImageMagick and pdftoppm; both pdftoppm and ICEpdf produce higher-resolution output than ImageMagick.
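
    For comparison with the Java code above, here is a minimal sketch of calling pdftoppm from Python. It assumes the Poppler utilities are installed and on PATH; the helper names and the file names `test.pdf` and `page` are placeholders for illustration, not part of the original answer.

```python
import shutil
import subprocess


def pdftoppm_command(pdf_path, out_prefix, dpi=300):
    """Build a pdftoppm argument list that renders each page to PNG at `dpi`."""
    return ["pdftoppm", "-png", "-r", str(dpi), pdf_path, out_prefix]


def render_pdf(pdf_path, out_prefix, dpi=300):
    """Run pdftoppm if it is installed; return True when the command ran."""
    if shutil.which("pdftoppm") is None:
        return False  # Poppler utilities are not on PATH
    subprocess.run(pdftoppm_command(pdf_path, out_prefix, dpi), check=True)
    return True


if __name__ == "__main__":
    # "test.pdf" and "page" are placeholder names for illustration.
    print(" ".join(pdftoppm_command("test.pdf", "page", dpi=300)))
```

    The `-r` option sets the rendering resolution in DPI, so raising it is the direct way to get more pixels per page.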

  • 2020-11-28 00:38

    The PNG file you attached looks really blurred. If you need additional post-processing for each image generated as a PDF preview, the performance of your solution will suffer.

    2JPEG can convert the PDF file you attached to a nice, sharp JPG and crop the empty margins in one call:

    2jpeg.exe -src "C:\In\*.*" -dst "C:\Out" -oper Crop method:autocrop
    
  • 2020-11-28 00:38

    It's actually pretty easy to do with Preview on a Mac. Open the file in Preview and use Save As (or Export) to write a PNG or JPEG, making sure the resolution setting at the bottom of the window is at least 300 dpi to get a high-quality image.

  • 2020-11-28 00:39

    In ImageMagick, you can do "supersampling". You specify a large density and then resize down as much as desired for the final output size. For example with your image:

    convert -density 600 test.pdf -background white -flatten -resize 25% test.png
    


    Download the image to view it at full resolution for comparison.

    I do not recommend saving to JPG if you are expecting to do further processing.

    If you want the output to be the same size as the input, then resize by the inverse of the ratio of your density to 72. For example, use -density 288 with -resize 25%, since 288 = 4 × 72 and 25% = 1/4.

    The larger the density the better the resulting quality, but it will take longer to process.
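
    The density-to-resize arithmetic can be sketched in a few lines of Python. This is only an illustration of the ratio described above; the function names are made up here, and the command list simply mirrors the convert call shown earlier with the two numbers filled in.

```python
def resize_percent(density, base_dpi=72.0):
    """Percentage that scales a render at `density` back to its size at `base_dpi`."""
    return 100.0 * base_dpi / density


def supersample_command(pdf_path, png_path, density):
    """Build a convert argument list that renders at `density`, then shrinks back."""
    pct = resize_percent(density)
    return ["convert", "-density", str(density), pdf_path,
            "-background", "white", "-flatten",
            "-resize", "%g%%" % pct, png_path]


if __name__ == "__main__":
    # "test.pdf" and "test.png" are the file names used in the answer above.
    print(" ".join(supersample_command("test.pdf", "test.png", 288)))
```

    Passing a different density changes only the two numbers, so the output stays the same pixel size while the amount of supersampling varies.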

  • 2020-11-28 00:45

    It appears that the following works:

    convert           \
       -verbose       \
       -density 150   \
       -trim          \
        test.pdf      \
       -quality 100   \
       -flatten       \
       -sharpen 0x1.0 \
        24-18.jpg
    

    It results in the left image. Compare this to the result of my original command (the image on the right):

      

    (To really see and appreciate the differences between the two, right-click on each and select "Open Image in New Tab...".)

    Also keep the following facts in mind:

    • The worse, blurry image on the right has a file size of 1,941,702 bytes (1.85 MByte). Its resolution is 3060x3960 pixels, using 16-bit RGB color space.
    • The better, sharp image on the left has a file size of 337,879 bytes (330 kByte). Its resolution is 758x996 pixels, using 8-bit Gray color space.

    So there is no need to resize; add the -density flag instead. The density value 150 is odd, though: values on either side of it give a worse-looking image!
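
    Since the best density was found by trial, one way to repeat the experiment is to generate one command per candidate value and compare the outputs by eye. The sketch below only builds the argument lists; the function name, the `out` stem, and the chosen densities are assumptions for illustration.

```python
def density_sweep(pdf_path, densities, stem="out"):
    """Return one convert argument list per density, each writing its own JPEG."""
    commands = []
    for d in densities:
        commands.append([
            "convert", "-density", str(d), "-trim", pdf_path,
            "-quality", "100", "-flatten", "-sharpen", "0x1.0",
            "%s-%d.jpg" % (stem, d),
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected first.
    for cmd in density_sweep("test.pdf", [100, 150, 200]):
        print(" ".join(cmd))
```

    Each list can be handed to `subprocess.run(cmd, check=True)` once ImageMagick is installed.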

  • 2020-11-28 00:45

    The following Python script will work on any Mac (Snow Leopard and upward). It can be used on the command line with successive PDF files as arguments, or you can put it into a Run Shell Script action in Automator and make a Service (Quick Action in Mojave).

    You can set the resolution of the output image in the script.

    The script and a Quick Action can be downloaded from GitHub.

    #!/usr/bin/python
    # coding: utf-8
    
    import os, sys
    import Quartz
    from LaunchServices import (kUTTypeJPEG, kUTTypeTIFF, kUTTypePNG, kCFAllocatorDefault) 
    
    resolution = 300.0 #dpi
    scale = resolution/72.0
    
    cs = Quartz.CGColorSpaceCreateWithName(Quartz.kCGColorSpaceSRGB)
    whiteColor = Quartz.CGColorCreate(cs, (1, 1, 1, 1))
    # Options: kCGImageAlphaNoneSkipLast (no trans), kCGImageAlphaPremultipliedLast 
    transparency = Quartz.kCGImageAlphaNoneSkipLast
    
    #Save image to file
    def writeImage (image, url, type, options):
        destination = Quartz.CGImageDestinationCreateWithURL(url, type, 1, None)
        Quartz.CGImageDestinationAddImage(destination, image, options)
        Quartz.CGImageDestinationFinalize(destination)
        return
    
    def getFilename(filepath):
        i=0
        newName = filepath
        while os.path.exists(newName):
            i += 1
            newName = filepath + " %02d"%i
        return newName
    
    if __name__ == '__main__':
    
        for filename in sys.argv[1:]:
            pdf = Quartz.CGPDFDocumentCreateWithProvider(Quartz.CGDataProviderCreateWithFilename(filename))
            numPages = Quartz.CGPDFDocumentGetNumberOfPages(pdf)
            shortName = os.path.splitext(filename)[0]
            prefix = os.path.splitext(os.path.basename(filename))[0]
            folderName = getFilename(shortName)
            try:
                os.mkdir(folderName)
            except OSError:
                # Report the failure and stop with a non-zero exit status
                print("Can't create directory '%s'" % folderName)
                sys.exit(1)
    
            # For each page, create a file
            for i in range (1, numPages+1):
                page = Quartz.CGPDFDocumentGetPage(pdf, i)
                if page:
                    # Get mediabox
                    mediaBox = Quartz.CGPDFPageGetBoxRect(page, Quartz.kCGPDFMediaBox)
                    x = Quartz.CGRectGetWidth(mediaBox)
                    y = Quartz.CGRectGetHeight(mediaBox)
                    x *= scale
                    y *= scale
                    r = Quartz.CGRectMake(0, 0, x, y)
                    # Create a Bitmap Context, draw a white background and add the PDF
                    writeContext = Quartz.CGBitmapContextCreate(None, int(x), int(y), 8, 0, cs, transparency)
                    Quartz.CGContextSaveGState(writeContext)
                    Quartz.CGContextScaleCTM(writeContext, scale, scale)
                    Quartz.CGContextSetFillColorWithColor(writeContext, whiteColor)
                    Quartz.CGContextFillRect(writeContext, r)
                    Quartz.CGContextDrawPDFPage(writeContext, page)
                    Quartz.CGContextRestoreGState(writeContext)
                    # Convert to an "Image"
                    image = Quartz.CGBitmapContextCreateImage(writeContext)
                    # Create unique filename per page
                    outFile = folderName + "/" + prefix + " %03d.png" % i
                    url = Quartz.CFURLCreateFromFileSystemRepresentation(kCFAllocatorDefault, outFile, len(outFile), False)
                    # kUTTypeJPEG, kUTTypeTIFF, kUTTypePNG
                    type = kUTTypePNG
                    # See the full range of image properties on Apple's developer pages.
                    options = {
                        Quartz.kCGImagePropertyDPIHeight: resolution,
                        Quartz.kCGImagePropertyDPIWidth: resolution
                        }
                    writeImage(image, url, type, options)
                    del page
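
    To see what a given `resolution` setting means in pixels, the scaling in the script (resolution divided by 72 points per inch) can be checked on its own. This is a small sketch; `pixel_size` is a made-up helper, and the 612 x 792 point page (US Letter) is just an example input. It rounds to the nearest pixel, whereas the script above truncates with `int()`.

```python
def pixel_size(width_pt, height_pt, dpi):
    """Pixel dimensions of a page measured in points (1/72 inch) rendered at `dpi`."""
    width_px = int(round(width_pt * dpi / 72.0))
    height_px = int(round(height_pt * dpi / 72.0))
    return width_px, height_px


if __name__ == "__main__":
    # A US Letter page is 612 x 792 points.
    print(pixel_size(612, 792, 300))
```

    At 72 dpi the pixel size equals the size in points, so every increase in `resolution` multiplies both dimensions by resolution/72.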
    