Understanding PDF operators - for iOS app

[愿得一人] 2021-01-15 10:20

I have been tasked with creating a PDF reader app for our company. After some research, I became confused by the different operators inside a PDF. Here are a few things that I wou

2 answers
  • 2021-01-15 10:46

    Hrmm... you've been tasked with a very non-trivial job then. You should tell them that the PDF-1.7 spec is a dense document of roughly 800 pages...

    Yes, it's a very good idea to use a third-party library for this. It's practically impossible for a single person to implement a conforming PDF reader that faithfully displays all the graphic objects, fonts, colors, transparency, vector graphics and images that may be embedded in a PDF-1.7 (ISO spec) file.

    The first few things you need to be aware of:

    • PDF builds on the same graphics model as PostScript did. (But PostScript is a Turing-complete programming language, while PDF has been -- on purpose! -- stripped of all programming language capabilities.)
    • Like PostScript, the PDF graphics description "language" is stack-based and uses reverse Polish (postfix) notation for expressions: the operands come first, the operator comes last. To express "1 + 2" you'd write "1 2 add" in PostScript.
    • PDF is hardly "line based". So regarding your question about Tm: it's not the starting point of a new line, it's the end of the expression 1 0 0 1 100 100 Tm, saying: "the previous 6 numbers are the new text matrix (and text line matrix), which is set to these values for now". If anything, Tm is the end of an expression rather than the start of one!
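To make the "operands first, operator last" point concrete, here is a minimal sketch in Python (chosen only for brevity, not because it is what you'd ship on iOS). The names `TOKEN` and `parse_content_stream` are made up for this illustration, and the tokenizer is deliberately incomplete: it assumes an already-decompressed content stream and ignores arrays, hex strings, dictionaries, comments, inline images and nested parentheses, all of which a real parser has to handle.

```python
import re

# Simplified token grammar for this sketch only.
TOKEN = re.compile(r"""
    \((?:[^()\\]|\\.)*\)          # literal string, e.g. (Hello)
  | /[^\s/()\[\]<>]+              # name, e.g. /F1
  | [-+]?(?:\d+\.?\d*|\.\d+)      # number, e.g. 12 or -0.5
  | [A-Za-z'"*]+                  # operator keyword, e.g. Tm, T*, '
""", re.VERBOSE)


def parse_content_stream(data):
    """Yield (operator, operands) pairs in the order they appear."""
    operands = []
    for tok in TOKEN.findall(data):
        if tok[0] in "(/":
            # Strings and names are operands: keep them until an operator shows up.
            operands.append(tok)
        elif tok[0] in "+-.0123456789":
            operands.append(float(tok))
        else:
            # An operator ends the expression and consumes everything collected so far.
            yield tok, operands
            operands = []


stream = "BT /F1 12 Tf 1 0 0 1 100 100 Tm (Hello) Tj ET"
for operator, operands in parse_content_stream(stream):
    print(operator, operands)
```

In this sample stream, Tm is reported together with the six numbers that precede it, and Tj with the string (Hello), which is why reading an operator as the "start" of something leads to confusion.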
  • 2021-01-15 10:55

    You'll need to familiarize yourself with the PDF specification. Its Annex A contains a summary of all the operators, with links to more detailed documentation about their parameters, so that may be a good starting point.

    The Tm operator doesn't necessarily set the starting point of each line. It sets the text matrix, which is basically the equivalent of a CGAffineTransform in Quartz 2D terms. To move to the next line, a document could also use the Td, TD, ', " or T* operators. PDF documents don't necessarily draw their text in the order in which it appears on screen; they may move around the page freely and position the glyphs in any order they see fit. PDF doesn't really have the concept of a "line", so you'll have to infer lines from the positions of the glyphs yourself (which can be tricky for things like subscript/superscript).
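As a rough illustration of how these positioning operators interact, here is a small Python sketch of the text matrix bookkeeping. `TextState`, `multiply` and `origin` are hypothetical names for this example. Matrices are stored as six numbers (a, b, c, d, e, f), the same layout as the a, b, c, d, tx, ty fields of a CGAffineTransform. The sketch only tracks where the next piece of text would start in user space; it assumes the operands are already parsed, and it does not apply the page's current transformation matrix or advance the position after glyphs are shown.

```python
IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def multiply(m, n):
    """Return m x n for 3x3 affine matrices stored as (a, b, c, d, e, f)."""
    return (
        m[0] * n[0] + m[1] * n[2],
        m[0] * n[1] + m[1] * n[3],
        m[2] * n[0] + m[3] * n[2],
        m[2] * n[1] + m[3] * n[3],
        m[4] * n[0] + m[5] * n[2] + n[4],
        m[4] * n[1] + m[5] * n[3] + n[5],
    )


class TextState:
    """Tracks the text matrix (tm), text line matrix (tlm) and leading."""

    def __init__(self):
        self.leading = 0.0
        self.tm = self.tlm = IDENTITY

    def _move(self, tx, ty):
        # Td semantics: translate the *line* matrix, then restart tm from it.
        self.tlm = multiply((1.0, 0.0, 0.0, 1.0, tx, ty), self.tlm)
        self.tm = self.tlm

    def apply(self, op, args):
        if op == "BT":
            self.tm = self.tlm = IDENTITY
        elif op == "Tm":
            # Tm replaces both matrices; it is not relative to the old ones.
            self.tm = self.tlm = tuple(float(a) for a in args)
        elif op == "TL":
            self.leading = float(args[0])
        elif op == "Td":
            self._move(args[0], args[1])
        elif op == "TD":
            # TD is Td that also sets the leading to -ty.
            self.leading = -float(args[1])
            self._move(args[0], args[1])
        elif op in ("T*", "'", '"'):
            # All three start a new line using the current leading.
            self._move(0.0, -self.leading)

    def origin(self):
        """Start of the next glyph run, before the page CTM is applied."""
        return self.tm[4], self.tm[5]


state = TextState()
for op, args in [("BT", []), ("Tm", [1, 0, 0, 1, 100, 700]),
                 ("TD", [0, -14]), ("T*", [])]:
    state.apply(op, args)
print(state.origin())  # (100.0, 672.0)
```

Collecting the origin each time text is shown gives you the raw positions from which lines have to be inferred, for example by grouping runs whose baselines are close together.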
