Debugging PDF for error

Submitted by 不问归期 on 2019-12-13 11:43:49

Question


I'm creating PDF files using the PDFClown Java library.

Sometimes, when opening these files with Adobe Acrobat Reader, I get the famous error message:

"An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem."

With the attached file, the error appears in Adobe Reader only when I scroll down to the 8th page and then scroll back up to the 3rd page. Alternatively, zooming out to 33.3% also produces the message.

For the record, Foxit Reader opens the file flawlessly, as do other PDF viewers such as browsers.

My questions are:

  1. What's wrong with my file? (The file is attached.)

  2. How can I find out what's wrong with it? Is there a tool that tells you where the error lies?

Thanks!


Answer 1:


OK, this wasn't easy.

Due to a bug in PDFClown, the main content stream of the PDF page was corrupted. After its end, it contained a copy of part of an earlier version of itself. This produced a partial text section without the starting command "BT", which left a single "ET" without a matching "BT" at the end of the stream.

Once I corrected this, it ran great.

Thank you all for your help. Debugging this would have been much harder without RUPS, the tool that @Bruno suggested.

Edit:

The bug was in Buffer.java, in clone() (line 217).

Instead of the line:

clone.append(data);

it needs to be:

clone.append(data, 0, this.length);

Without this correction, clone() copies the whole data buffer and sets the cloned Buffer's length to data[].length. This is very problematic if Buffer.length is smaller than data[].length. The result in my case was garbage at the end of the stream.
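To illustrate why that one-argument append is harmful, here is a minimal sketch of a growable byte buffer with a buggy and a fixed clone. SketchBuffer, cloneBuggy and cloneFixed are hypothetical names made up for this illustration; this is not PDFClown's actual Buffer class, only the same pattern of a backing array that is larger than the bytes in use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a growable byte buffer; NOT PDFClown's real Buffer class.
class SketchBuffer {
    private byte[] data = new byte[8]; // backing array, usually larger than the content
    private int length = 0;            // number of bytes actually in use

    void append(byte[] src) {
        append(src, 0, src.length);
    }

    void append(byte[] src, int offset, int count) {
        if (length + count > data.length) {
            // Grow the backing array; the unused tail is zero-filled by Java.
            data = Arrays.copyOf(data, Math.max(data.length * 2, length + count));
        }
        System.arraycopy(src, offset, data, length, count);
        length += count;
    }

    int length() {
        return length;
    }

    byte[] toByteArray() {
        return Arrays.copyOf(data, length);
    }

    // Buggy variant: copies the whole backing array, including unused capacity.
    SketchBuffer cloneBuggy() {
        SketchBuffer clone = new SketchBuffer();
        clone.append(data);
        return clone;
    }

    // Fixed variant: copies only the bytes in use.
    SketchBuffer cloneFixed() {
        SketchBuffer clone = new SketchBuffer();
        clone.append(data, 0, length);
        return clone;
    }

    public static void main(String[] args) {
        SketchBuffer original = new SketchBuffer();
        original.append("BT ET".getBytes(StandardCharsets.US_ASCII));
        System.out.println("original length:    " + original.length());              // 5
        System.out.println("buggy clone length: " + original.cloneBuggy().length()); // 8
        System.out.println("fixed clone length: " + original.cloneFixed().length()); // 5
    }
}
```

In this sketch the extra bytes of the buggy clone are zeros because the backing array was never reused. In a buffer whose backing array has held longer content before, the same mistake would carry stale bytes from that earlier content instead.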




Answer 2:


The error shows while reading (with Adobe) the attached file only when scrolling down to the 8th page, then scrolling back up to the 3rd page. Alternatively, zooming out to 33.3% will also produce the message.

Well, I can trigger it more easily: I merely open the PDF and scroll down using the cursor keys. As soon as the top 2 cm of page 3 appear, the message appears.

What's wrong with my file??

The content of pages 1 and 2 looks OK, so let's look at the content of page 3.

I initially attributed the issue to the use of text-specific operations (especially Tf and Tw) outside of a text object. That was wrong, as Stefano Chizzolini pointed out: some text-related operations are indeed allowed outside text objects, namely the text state operations, cf. figure 9 of the PDF specification.

So while less common, text state operations at page description level are completely OK.

After my incorrect attempt to explain the issue, the OP's own answer indicated that the

main stream of information in the PDF page has been corrupted. After its end it had a copy of a past instance of it. This caused a partial text section without the starting command "BT" - which left a single "ET" without a "BT" in the end of the stream.

An ET without a prior BT would indeed be an error, and quite likely it would be accompanied by operations at the wrong level. Inspecting the content stream of that third page (the page this issue focuses on), however, I could not find any unmatched ET. During that inspection I did discover that the content stream contains more than 2000 trailing 0 bytes! Adobe Reader seems unable to cope with these 0 bytes.

The bug the OP found can explain the issue:
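A check for such trailing 0 bytes is easy to automate. The following is a minimal sketch that assumes the decoded content stream is already available as a byte array (for example, exported from a PDF browser such as RUPS); TrailingZeroCheck and countTrailingZeroBytes are hypothetical names, not part of any PDF library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper; how the decoded content stream bytes are obtained is left open.
class TrailingZeroCheck {
    // Returns how many 0x00 bytes sit at the very end of the given content stream.
    static int countTrailingZeroBytes(byte[] content) {
        int end = content.length;
        while (end > 0 && content[end - 1] == 0) {
            end--;
        }
        return content.length - end;
    }

    public static void main(String[] args) {
        byte[] clean = "BT /F1 12 Tf (Hi) Tj ET\n".getBytes(StandardCharsets.US_ASCII);
        // Simulate a stream that was padded with unused buffer capacity.
        byte[] padded = Arrays.copyOf(clean, clean.length + 2000);

        System.out.println(countTrailingZeroBytes(clean));  // 0
        System.out.println(countTrailingZeroBytes(padded)); // 2000
    }
}
```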

in Buffer.java, in clone() (line 217)

instead of the line:

clone.append(data);

it needs to be:

clone.append(data, 0, this.length);

Without this correction, clone() copies the whole data buffer and sets the cloned Buffer's length to data[].length. This is very problematic if Buffer.length is smaller than data[].length.

Trailing 0 bytes can be an effect of such a buffer-copying bug.

Furthermore, the symptoms the OP found (after its end, the stream had a copy of a past instance of itself) can also be an effect of such a bug. So I assume the OP found those symptoms on a different page, not page 3, and that fixing the bug cured all of them.

How can I find what's wrong with it? Is there a tool which tells you where the error lies?

There are PDF syntax checkers, e.g. the Preflight tool included in Adobe Acrobat, but even that fails on your file.

So essentially you have to extract the page content (using a PDF browser, e.g. RUPS) and check it manually, with the PDF specification open on the other screen.
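Part of that manual check can be scripted. As a rough sketch, the class below scans extracted content stream bytes for BT and ET operators and reports any that do not pair up. TextObjectBalanceCheck and findProblems are hypothetical names. The scanner skips comments, literal strings, and names, but it does not understand inline image data (BI ... ID ... EI), so treat its output as a hint rather than a verdict.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a simplified scanner, not a full PDF content stream parser.
class TextObjectBalanceCheck {
    // A regular character is anything that is neither PDF whitespace nor a delimiter.
    static boolean isRegular(int c) {
        return !(c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32)
                && "()<>[]{}/%".indexOf(c) < 0;
    }

    static List<String> findProblems(byte[] content) {
        List<String> problems = new ArrayList<>();
        boolean inText = false;
        int i = 0;
        int n = content.length;
        while (i < n) {
            int c = content[i] & 0xFF;
            if (c == '%') {                      // comment: skip to end of line
                while (i < n && content[i] != '\n' && content[i] != '\r') i++;
            } else if (c == '(') {               // literal string: honor nesting and escapes
                int depth = 1;
                i++;
                while (i < n && depth > 0) {
                    int s = content[i] & 0xFF;
                    if (s == '\\') i++;          // skip the escaped character as well
                    else if (s == '(') depth++;
                    else if (s == ')') depth--;
                    i++;
                }
            } else if (c == '/') {               // name such as /F1: skip it entirely
                i++;
                while (i < n && isRegular(content[i] & 0xFF)) i++;
            } else if (isRegular(c)) {           // operator or number token
                int start = i;
                while (i < n && isRegular(content[i] & 0xFF)) i++;
                String token = new String(content, start, i - start, StandardCharsets.ISO_8859_1);
                if (token.equals("BT")) {
                    if (inText) problems.add("BT inside an open text object at byte " + start);
                    inText = true;
                } else if (token.equals("ET")) {
                    if (!inText) problems.add("ET without a prior BT at byte " + start);
                    inText = false;
                }
            } else {                             // whitespace or another delimiter
                i++;
            }
        }
        if (inText) problems.add("BT without a closing ET");
        return problems;
    }

    public static void main(String[] args) {
        byte[] broken = "q 1 0 0 1 0 0 cm (Hi) Tj ET Q".getBytes(StandardCharsets.ISO_8859_1);
        for (String problem : findProblems(broken)) {
            System.out.println(problem);
        }
    }
}
```

A clean result from this sketch does not prove the stream is valid; as with page 3 of the file discussed here, the actual defect may be something else entirely, such as trailing 0 bytes.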




Answer 3:


The general post on debugging PDFs, "How do you debug PDF files?", might also have been helpful, since RUPS, pdfstreamdump, and similar tools are mentioned there.



Source: https://stackoverflow.com/questions/18812789/debugging-pdf-for-error
