Calculating the exact positions from the Td, TD, Tm, cm, and T* operators in a PDF content stream?

Posted by 那年仲夏 on 2020-02-24 05:00:08

Question


How can I get or calculate the exact positions produced by the Td, TD, Tm, cm, and T* operators in a PDF content stream?

As a human, I can work out how each position operator in the content stream should be applied (whether it replaces the last Td, adds to the last Td, or is multiplied by the font size) by comparing where the glyphs are located in the PDF with the position values in the content stream. But I am unable to calculate the exact glyph positions programmatically. Please see the screenshot.

In the image above, the box on the left shows the glyphs as rendered in the PDF viewer, and the box on the right contains the related content stream. In the content stream I have highlighted two Td positions.

In the first circle:

3.321 -6.475999832 Td

the Td operands should be added to the position set by the previous Td. Call that position (x1, y1):

Current_x_pos = x1 + 3.321

Current_y_pos = y1 - 6.475999832

This gives the exact position of the glyph "t".

In the second highlighted circle, the new Td operands (231.544 366.377990 Td) completely replace the previous position:

Current_x_pos = 231.544

Current_y_pos = 366.377990

In addition, sometimes the preceding operator is Tm, in which case the formula seems to be:

Current_x_pos = x1 + (tdx1 * font_size)

Current_y_pos = y1 + (tdy1 * font_size)

Sometimes we need to multiply as above, and sometimes we only add. How can I tell programmatically which one applies, so that I can parse the exact positions? (A new screenshot has been added for the multiplication case.)

Any help? Thanks.


Answer 1:


"Sometimes we need to multiply as above, and sometimes we only add. How can I tell programmatically which one applies, so that I can parse the exact positions?"

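It's quite simple: for a Td operation you always multiply. The specification ISO 32000-1 (and similarly ISO 32000-2) defines tx ty Td as a matrix multiplication: the translation matrix with rows 1 0 0, 0 1 0 and tx ty 1 is multiplied onto the current text line matrix Tlm from the left, and the result is assigned to both the text matrix Tm and Tlm.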

For a freshly initialized (i.e. identity) text line matrix Tlm this matrix multiplication looks like replacing its bottom row with tx ty 1.

For a text line matrix Tlm that differs from the identity only in its bottom row, this matrix multiplication looks like an addition to the bottom row, e.g. x y 1 becomes x+tx y+ty 1.

For a text line matrix Tlm like in your second example

a 0 0
0 a 0
x y 1

this matrix multiplication looks like a multiplication by a followed by an addition to the bottom row, i.e. x y 1 becomes x+a·tx y+a·ty 1. If the font size parameter of the preceding Tf operation was 1, then a is effectively the resulting font size, which explains your assumption that the font size is part of the formula.

In general, for an arbitrary, non-degenerate text line matrix Tlm

a b 0
c d 0
x y 1

this matrix multiplication looks even more complex: the bottom row x y 1 becomes x+a·tx+c·ty y+b·tx+d·ty 1.

Thus, concerning your question

"How can I tell programmatically which one applies, so that I can parse the exact positions?"

your program should simply always use matrix multiplication and ignore what it looks like on the level of the separate coordinates.
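As a minimal sketch of that advice, the following Python snippet applies Td as one 3x3 matrix multiplication and shows that the "replace", "add", and "multiply then add" behaviours all fall out of the same operation. The function names (matmul3, apply_td) and the sample matrices are illustrative assumptions, not taken from any PDF library.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_td(tlm, tx, ty):
    """Return the text line matrix after 'tx ty Td'.

    Per ISO 32000-1 the translation matrix is multiplied onto Tlm
    from the left; the result becomes both Tm and Tlm.
    """
    translation = [[1, 0, 0], [0, 1, 0], [tx, ty, 1]]
    return matmul3(translation, tlm)


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Case 1: identity Tlm (e.g. right after BT): looks like a replacement.
fresh = apply_td(IDENTITY, 231.544, 366.37799)

# Case 2: Tlm is a pure translation: looks like an addition.
# The starting position (100, 200) is a made-up example value.
moved = apply_td([[1, 0, 0], [0, 1, 0], [100, 200, 1]], 3.321, -6.475999832)

# Case 3: Tlm scales by a = 10: looks like "multiply by a, then add".
# The bottom row becomes 100 + 10*2 = 120 and 200 + 10*(-1.5) = 185.
scaled = apply_td([[10, 0, 0], [0, 10, 0], [100, 200, 1]], 2, -1.5)

for name, m in (("fresh", fresh), ("moved", moved), ("scaled", scaled)):
    print(name, m[2][:2])
```

The program never has to decide between the three cases; it only has to keep the current Tlm and multiply.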


What makes the second circled instruction look like a mere replacement is that the prior text line matrix is the identity matrix. This is not due to the restore-state operation, as assumed by François, but more simply to the begin-text-object operator BT, which initializes both the text matrix Tm and the text line matrix Tlm to the identity matrix.

As the text matrix and the text line matrix are reset at the start of a text object and the graphics state cannot be saved or restored in a text object, the save and restore graphics state operations are not to blame in this case.
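To illustrate the reset at BT, here is a small, hedged sketch of a text-position tracker in Python. The class and method names are hypothetical. It only models BT, Td, TD, T* and Tm as ISO 32000-1 defines them; it ignores the glyph advances caused by text-showing operators and does not apply the current transformation matrix set by cm, so the coordinates it reports are in text space, not page space.

```python
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def matmul3(a, b):
    """Multiply two 3x3 matrices given as row-major nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


class TextState:
    """Hypothetical tracker for the text matrix Tm and text line matrix Tlm."""

    def __init__(self):
        self.tm = IDENTITY
        self.tlm = IDENTITY
        self.leading = 0.0  # Tl lives in the graphics state and survives BT

    def bt(self):
        # BT resets both matrices to the identity.
        self.tm = self.tlm = IDENTITY

    def td(self, tx, ty):
        # tx ty Td: left-multiply Tlm by a translation.
        self.tm = self.tlm = matmul3(((1, 0, 0), (0, 1, 0), (tx, ty, 1)), self.tlm)

    def td_upper(self, tx, ty):
        # tx ty TD: same as Td, but also sets the leading to -ty.
        self.leading = -ty
        self.td(tx, ty)

    def t_star(self):
        # T*: same as '0 -Tl Td'.
        self.td(0, -self.leading)

    def set_tm(self, a, b, c, d, e, f):
        # a b c d e f Tm: replaces both matrices, no multiplication.
        self.tm = self.tlm = ((a, b, 0), (c, d, 0), (e, f, 1))

    def line_origin(self):
        """Start of the current line in text space."""
        return self.tlm[2][0], self.tlm[2][1]


state = TextState()
state.bt()
state.set_tm(10, 0, 0, 10, 100, 200)   # made-up example operands
state.td(2, -1.5)                      # scaled: 100 + 10*2, 200 + 10*(-1.5)
print(state.line_origin())

state.bt()                             # new text object: back to identity
state.td(231.544, 366.37799)           # now looks like a plain replacement
print(state.line_origin())
```

A fuller tracker would also advance Tm after each text-showing operator and concatenate the result with the current transformation matrix to obtain page coordinates.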

(Screenshots are from the ISO 32000-1 copy shared by Adobe.)




Answer 2:


When you say:

In second highlighted circle the new Td positions (231.544 366.377990 Td) are completely replaced

Actually, the positions Current_x_pos and Current_y_pos are not replaced. This Td command does exactly what it always does:

Current_x_pos = x1 + 231.544
Current_y_pos = y1 + 366.377990

It is the Q three lines above that restores the previous graphics state, right after the current graphics state was saved with q.



Source: https://stackoverflow.com/questions/57039685/calculating-the-exact-positions-oftd-td-tm-cm-t-content-stream-in-pdf
