With PHP, how can I check if a PDF file has errors

Submitted by 我是研究僧i on 2019-12-12 00:34:11

Question


I have a database system built in PHP/MySQL, and I'm fairly new at this. The system allows a user to upload an invoice, and others then give permission to pay it. The accounting person uploads the check. After the check is uploaded, the system generates a PDF cover page, then uses PDFTK (via Ben Squire's PDFTK-PHP-Library) to combine all of the files and present the user with a single PDF to download.

Some users upload PDF files that cause PDFTK to hang indefinitely when it tries to combine them with the other files (most of the time it works fine). No error is returned; it just hangs. To get back into the system, the user must clear their cache and log in again. The server logs no error messages; it just freezes. The only difference I can find between working and failing files when I inspect them in Acrobat is that the bad files are legal sized (8.5 x 14), but if I create my own legal-sized file and try that, it works fine.

Using PuTTY, I went to the command line and replicated the problem: PDFTK can't read the file and hangs there as well. I also tried PDFMerge, which uses FPDF to combine the files, and it fails on the same file with: "FPDF error: Unable to find object (4, 0) at expected location". On the command line I was able to use ImageMagick to convert the PDF to JPG, but it reports: "Warning: File has an invalid xref entry: 2. Rebuilding xref table." It then converts the file to a JPG, with a few other less helpful warnings.

If I could get PHP to check whether the PDF file is valid without hanging the system, I could use ImageMagick to convert a bad file to an image and then back to a PDF, but I don't want to do this to every file. How can I check the validity of a file when it is uploaded, to see whether it needs to be converted, without causing the system to hang?

Here is a link to a file that is causing problems: http://www.cssc-testing.org/accounting/school_9/20130604-a1atransportation-1.pdf

Thanks in advance for any guidance you can offer!

My Code (which I'm guessing is not very clean, as I'm new):

$pdftk = new pdftk();
if($create_cover) { $pdftk->setInputFile(array("filename" => $cover_page['server'])); }

// Load a list of attachments
$sql = "SELECT * FROM actg_attachments WHERE trans_id = {$trans_id}";
$attachments = Attachment::find_by_sql($sql);
foreach($attachments as $attachment) {
    // Check if the file exists from the attachments
    $attachment->set_variables();
    $file = $attachment->abs_path . DS . $attachment->filename;
    if(file_exists($file)){
        // Use the pdftk tool to attach the documents to this PDF
        $pdftk->setInputFile(array("filename" => $file));
    }
}

$pdftk->setOutputFile($save_file);
$pdftk->_renderPdf();

The pdftk class it uses is from: https://github.com/bensquire/php-pdtfk-toolkit


Answer 1:


You could possibly call Ghostscript from PHP with exec() to check the file.

The non-accepted answer here may help:

How can you find a problem with a programmatically generated PDF?
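
As a rough sketch of the Ghostscript idea, the snippet below runs `gs` against an uploaded file with a hard time limit, so a problem file cannot hang the request. The function names (`run_with_timeout`, `output_indicates_damage`, `pdf_looks_damaged`) and the 20-second limit are made up for illustration. It assumes a Unix-like server with `gs` on the PATH and PHP 7.4 or later (for the array form of `proc_open()`). Treating any "warning" or "error" text in Ghostscript's output as a sign of damage is only a heuristic, based on the xref warning quoted in the question.

```php
<?php
// Hypothetical helper: run a command, capture its output, and kill it if it
// is still running after $timeoutSeconds. Assumes a POSIX system.
function run_with_timeout(array $command, float $timeoutSeconds): array
{
    $descriptors = [
        0 => ['pipe', 'r'], // stdin (closed immediately below)
        1 => ['pipe', 'w'], // stdout
        2 => ['pipe', 'w'], // stderr
    ];
    $result = ['timed_out' => false, 'exit_code' => null, 'output' => ''];

    $process = proc_open($command, $descriptors, $pipes);
    if (!is_resource($process)) {
        return $result; // could not start; exit_code stays null
    }
    fclose($pipes[0]);
    stream_set_blocking($pipes[1], false);
    stream_set_blocking($pipes[2], false);

    $deadline = microtime(true) + $timeoutSeconds;
    while (true) {
        // Drain both pipes so the child never blocks on a full buffer.
        $result['output'] .= stream_get_contents($pipes[1]) . stream_get_contents($pipes[2]);

        $status = proc_get_status($process);
        if (!$status['running']) {
            $result['exit_code'] = $status['exitcode'];
            break;
        }
        if (microtime(true) >= $deadline) {
            $result['timed_out'] = true;
            proc_terminate($process, 9); // SIGKILL
            break;
        }
        usleep(50000); // poll every 50 ms
    }

    // Pick up anything written between the last read and process exit.
    $result['output'] .= stream_get_contents($pipes[1]) . stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);
    proc_close($process);

    return $result;
}

// Heuristic (an assumption, not a documented Ghostscript contract): a timeout,
// a non-zero exit code, or any warning/error text counts as "damaged".
function output_indicates_damage(array $result): bool
{
    return $result['timed_out']
        || $result['exit_code'] !== 0
        || stripos($result['output'], 'warning') !== false
        || stripos($result['output'], 'error') !== false;
}

// Interpret the PDF with Ghostscript's "nullpage" device, which reads every
// page but writes no output file.
function pdf_looks_damaged(string $pdfPath, float $timeoutSeconds = 20.0): bool
{
    $result = run_with_timeout(
        ['gs', '-dNOPAUSE', '-dBATCH', '-sDEVICE=nullpage', $pdfPath],
        $timeoutSeconds
    );
    return output_indicates_damage($result);
}
```

In the question's loop, a call such as `pdf_looks_damaged($file)` could run before `$pdftk->setInputFile(...)`, so that only flagged files go through the ImageMagick round trip. The same `run_with_timeout()` wrapper could also guard the pdftk call itself if the library is bypassed in favor of a direct command.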




Answer 2:


I won't say this is an appropriate fix, let alone the best one, but it may resolve your problem.

In pdf_parser.php, comment out the line:

$this->error("Unable to find object ({$obj_spec[1]}, {$obj_spec[2]}) at expected location");

It should be near line 544.

You'll also likely need to replace:

    if (!is_array($kids))
        $this->error('Cannot find /Kids in current /Page-Dictionary');

with:

    if (!is_array($kids)){
     //   $this->error('Cannot find /Kids in current /Page-Dictionary');
     return;
    }

in the fpdi_pdf_parser.php file.

Hope that helps. It worked for me.



Source: https://stackoverflow.com/questions/16928698/with-php-how-can-i-check-if-a-pdf-file-has-errors
