Determine if a PDF file is openable and not corrupt

盖世英雄少女心 2021-01-26 02:40

I am wondering if anybody has a reliable way of determining whether a PDF document is actually a PDF document, and that it isn't corrupted.

I generate reports on my syste

2 answers
  • 2021-01-26 02:47

    If you only want to confirm that the file is a PDF, without checking that it is fully intact, read the first 5 bytes of the file: for a PDF they are exactly equal to the string "%PDF-".

    This is how the file program on Linux identifies PDF files.
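    The header check described above could be sketched in PHP roughly as follows. The function name hasPdfHeader is hypothetical and not part of any library; the sketch only confirms the signature and says nothing about whether the rest of the file is intact.

```php
<?php
// Hypothetical helper: returns true when the file begins with the
// PDF signature "%PDF-". It does NOT validate the rest of the file.
function hasPdfHeader(string $path): bool
{
    // Open in binary mode; suppress the warning for a missing or unreadable file.
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    // Read only the first 5 bytes, which is all the signature needs.
    $header = fread($handle, 5);
    fclose($handle);

    return $header === '%PDF-';
}
```

    A call such as hasPdfHeader('report.pdf') (the file name is only an illustration) returns false for a missing file, an empty file, or any file that starts with other bytes.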

    But if you want to make sure there are no errors anywhere in the file, run a program that processes the entire file and check whether that program reports success.

    On Linux you can use Ghostscript ("gs") to render the PDF document to another format.

    Or you can install Acrobat Reader and run acroread as a command-line program to convert the file to PostScript:

    acroread -print -toPostScript [your_file.pdf]
    

    To do either of these you will need to use PHP's system function. To check whether the program ran successfully, pass a variable as the second parameter to system; it will receive the command's return status.
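    As a rough sketch of that approach, the snippet below runs a command through system and inspects the return status. The helper names commandSucceeded and pdfRendersWithGhostscript are hypothetical, and the Ghostscript flags are one plausible choice (the nullpage device processes every page and discards the output), assuming gs is installed and on the PATH. Ghostscript may repair some damaged files and still exit with 0, so a zero status is a useful signal rather than proof.

```php
<?php
// Hypothetical helper: runs a shell command and reports whether it
// exited with status 0. Output is discarded; only the status matters.
function commandSucceeded(string $command): bool
{
    $status = -1;
    // The second argument of system() receives the command's exit status.
    system($command . ' > /dev/null 2>&1', $status);
    return $status === 0;
}

// Hypothetical helper: asks Ghostscript to process the whole document
// without writing any rendered output. Assumes "gs" is installed.
function pdfRendersWithGhostscript(string $path): bool
{
    // escapeshellarg() guards against spaces and shell metacharacters in the path.
    $command = 'gs -q -dNOPAUSE -dBATCH -sDEVICE=nullpage ' . escapeshellarg($path);
    return commandSucceeded($command);
}
```

    The same commandSucceeded helper could wrap the acroread command instead; the redirect to /dev/null assumes a Unix-like shell.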

  • 2021-01-26 03:01

    You can use pdfinfo. On CentOS, install it with:

    yum install poppler-utils
    

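    ... and then run the pdfinfo command. The PHP code is as follows:

    // A non-zero exit status means pdfinfo could not open or read the file.
    exec("pdfinfo " . escapeshellarg("test.pdf"), $output, $status);
    if ($status !== 0) {
      echo "file is corrupted";
    }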
    