Display first page of PDF as Image

攒了一身酷 2021-02-01 10:03

I am creating a web application that displays images and PDFs as thumbnails. Clicking an image or PDF opens it in a new window.

For PDF, I have (this is

3 answers
  •  粉色の甜心
    2021-02-01 10:22

    Warning: Don't use Ma9ic's script (posted in another answer) unless you want to...

    • ...make the PDF->JPEG conversion consume much more time and resources than it should
    • ...give up your own control over the PDF->JPEG conversion process altogether.

    While it may work well for you, there are many problems in these 8 little lines of Bash.

    First,
    it uses identify to extract the number of pages from the input PDF. However, identify (part of ImageMagick) is completely unable to process PDFs all by itself. It has to run Ghostscript as a 'delegate' to handle PDF input. It would be much more efficient to use Ghostscript directly instead of running it indirectly, via ImageMagick.

    Second,
    it uses convert for the PDF->JPEG conversion. Same remark as above: it uses Ghostscript anyway, so why not run it directly?

    Third,
    it loops over the pages and runs a different convert process for every single page of the PDF, that is 100 converts for a 100 page PDF file. That means: it also runs 100 Ghostscript commands to produce 100 JPEGs.

    Fourth,
    Fahim Parkar's question was to get a thumbnail from the first page of the PDF, not from all of them.

    The script runs at least 201 different commands for a 100-page PDF (one identify plus 100 convert runs, each of which spawns its own Ghostscript process), when it could all be done in just 1 command. If you use Ghostscript directly...

    1. ...not only will it run faster and more efficiently,
    2. ...but also it will give you more fine-grained and better control over the JPEGs' quality settings.

    Use the right tool for the job, and use it correctly!
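
For the first-page-only case raised in the fourth point, a minimal sketch could look like the following. It assumes Ghostscript's `-dFirstPage` and `-dLastPage` switches are used to limit rendering to page 1. The function name `first_page_thumb`, the `GS_BIN` override variable, and the `_thumb.jpg` output suffix are illustrative choices, not part of any script discussed here.

```shell
#!/bin/bash
# Sketch: render only page 1 of a PDF as a JPEG thumbnail.
# GS_BIN, first_page_thumb and the "_thumb.jpg" suffix are hypothetical names.
GS_BIN="${GS_BIN:-gs}"

first_page_thumb() {
    local infile="$1"
    # Default output name: input basename without ".pdf", plus "_thumb.jpg".
    local outfile="${2:-$(basename "${infile}" .pdf)_thumb.jpg}"

    # -dFirstPage/-dLastPage restrict processing to page 1;
    # -dPDFFitPage scales that page into the 200x400 pixel canvas set by -g.
    "${GS_BIN}" -q -o "${outfile}" -sDEVICE=jpeg \
        -dFirstPage=1 -dLastPage=1 \
        -dPDFFitPage -g200x400 "${infile}"
}
```

Calling `first_page_thumb report.pdf` would then write `report_thumb.jpg` to the current directory with a single Ghostscript invocation, no matter how many pages the PDF has.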


    Update:

    Since I was asked, here is my alternative implementation to Ma9ic's script.

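    #!/bin/bash
    infile="${1}"

    # Render every page of the input PDF to a numbered JPEG in a single Ghostscript run.
    # The output name is quoted so that file names containing spaces still work.
    gs -q -o "$(basename "${infile}")_p%04d.jpeg" -sDEVICE=jpeg "${infile}"

    # To get thumbnail JPEGs with a width of 200 pixels use the following command: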
    # gs -q -o name_200px_p%04d.jpg -sDEVICE=jpeg -dPDFFitPage -g200x400 "${infile}"
    
    # To get higher quality JPEGs (but also bigger-in-size ones) with a 
    # resolution of 300 dpi use the following command:
    # gs -q -o name_300dpi_p%04d.jpg -sDEVICE=jpeg -dJPEGQ=100 -r300 "${infile}"
    
    echo "Done"
    

    I've even run a benchmark on it. I converted the 756-page PDF-1.7 specification to JPEGs with both scripts:

    • Ma9ic's version needs 1413 seconds to generate the 756 JPEGs.
    • My version saves 93% of that time and takes 91 seconds.
    • Moreover, on my system Ma9ic's script produces mostly black JPEG images; mine are OK.
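
The 93% figure follows directly from the two timings above. As a quick check, using Bash integer arithmetic (the variable names are illustrative):

```shell
#!/bin/bash
# Timings in seconds, taken from the benchmark above.
slow=1413   # per-page convert loop
fast=91     # single Ghostscript run

# Integer division truncates, so 93.56% is reported as 93.
saved_pct=$(( (slow - fast) * 100 / slow ))
echo "Time saved: ${saved_pct}%"   # prints: Time saved: 93%
```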
