Write Unicode text on a visible signature - PDFBox

Submitted by 梦想与她 on 2019-12-23 01:37:29

Question


I have built a PDF using PDFBox, and it has a visible signature too. I write some text into it like this:

...
builderString.append("Tm\n");
builderString.append(" /F1 " + fontSize + "\n");
builderString.append("Tf\n");
builderString.append("(hello world)");
builderString.append("Tj\n");
builderString.append("ET");
...
PDStream stream= ...;
stream.createOutputStream().write(builderString.toString().getBytes("ISO-8859-1"));

Everything works well. But if I write some Unicode characters into builderString, "???" appears instead of the text.

Here is a sample PDF: link here

QUESTION 1) When I look at the PDF structure, there are question marks instead of the text. How do I write Unicode characters?

9 0 obj
<<
/Type /XObject
/Subtype /Form
/BBox [100 50 0 0]
/Matrix [1 0 0 1 0 0]
/Resources <<
/Font 11 0 R
/XObject <<
/img0 12 0 R
>>
/ProcSet [/PDF /Text /ImageB /ImageC /ImageI]
>>
/FormType 1
/Length 13 0 R
>>
stream
q 93.70079 0 0 50 0 0 cm /img0 Do Q
BT
1 0 0 1 93.70079 25 Tm
 /F1 2
Tf
(????)Tj
ET
endstream
endobj

I have a font with the encoding WinAnsiEncoding. Can I use another encoding in PDFBox?

PDFont font = PDTrueTypeFont.loadTTF(template, new File("//fontName.ttf"));
font.setFontEncoding(new WinAnsiEncoding());

QUESTION 2) I have embedded a font in the PDF, but the text is not written with this font (in the visible signature rectangle). Why?

Question 3) When I remove the font, the text is still there (when the text is in English). What is the default font? /F1 - is that the first font?

Question 4) How do I calculate the width of my text in the visible signature? Any ideas?


Answer 1:


QUESTION 1) When I look at the PDF structure, there are question marks instead of the text. How do I write Unicode characters?

I assume that by Unicode characters you mean characters present in Unicode but not in, e.g., Latin-1. (The letter 'a', for example, has a Unicode representation too, but most likely won't cause you trouble.)

You call getBytes("ISO-8859-1") on your StringBuilder result. Your unicode characters most likely are not in ISO 8859-1. Thus, String.getBytes returns the ASCII code for a question mark in their respective place.
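The replacement behaviour is easy to reproduce in isolation. The following is a minimal sketch, not PDFBox code; the class and method names are made up for illustration. It encodes a string as ISO-8859-1 and decodes it again, so characters outside Latin-1 come back as question marks.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementDemo {
    // Hypothetical helper: encode as ISO-8859-1, then decode again,
    // to show which characters survive the conversion.
    static String roundTripLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Plain ASCII survives unchanged.
        System.out.println(roundTripLatin1("hello"));
        // Two Georgian letters (U+10D0, U+10D1) are not in ISO-8859-1;
        // getBytes substitutes the byte for '?' for each of them.
        System.out.println(roundTripLatin1("\u10D0\u10D1"));
    }
}
```

The second call prints "??", which matches the (????) seen in the content stream dump.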

If the question were merely how to write Unicode characters to an output stream in Java, the answer would be easy: choose an encoding which contains all your characters and which all consumers of your program support, e.g. UTF-8, and call String.getBytes for that encoding.

The case at hand is different, though, as you want to serialize that information as a PDF form XObject stream. In this context your whole approach is somewhere along the route from highly questionable to completely wrong:

In PDFs, each font might come along with its own encoding which might be similar to a common encoding, e.g. /WinAnsiEncoding, or completely custom. These encodings, furthermore, in many cases are restricted to one byte per character, but in case of composite fonts they can also be multi-byte-encodings.

As a corollary, not all elements of the stream need to be encoded using the same encoding. E.g. the operator names Tm, Tf, and Tj are encoded using their ASCII codes, while the characters of a string to be displayed have to be encoded using the respective font's encoding (and may thereafter be hex-encoded if written in angle brackets <>).

Thus, creating the stream as a string and then converting it to bytes with a single encoding only works if all fonts used share the same encoding (for the code points actually used), and that encoding must furthermore be ASCII-like to represent the operators correctly.

Essentially, you should directly construct the stream in some byte buffer and for each inserted element use the appropriate encoding. In case of characters to be displayed, therefore, you have to be aware of the encoding used by the currently selected font.
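The idea can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not PDFBox API: the class and method names are hypothetical, and the font is assumed to use a single-byte encoding that coincides with ISO-8859-1 for the characters shown (which holds for /WinAnsiEncoding only for part of its range). Operators are written as ASCII, the shown text is encoded separately with the font's encoding and emitted as a hex string, and text the encoding cannot represent is rejected instead of silently turning into question marks.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ContentStreamSketch {
    // ASSUMPTION: stands in for the encoding of the currently selected font.
    static final Charset FONT_CHARSET = StandardCharsets.ISO_8859_1;

    // Operators and numeric operands are plain ASCII.
    static void writeAscii(ByteArrayOutputStream out, String syntax) {
        byte[] bytes = syntax.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    // Text to be shown is encoded with the font's encoding and then
    // hex-encoded, so no escaping of parentheses or backslashes is needed.
    static void writeShownText(ByteArrayOutputStream out, String text) {
        if (!FONT_CHARSET.newEncoder().canEncode(text)) {
            throw new IllegalArgumentException("Text not representable in the font's encoding");
        }
        StringBuilder hex = new StringBuilder("<");
        for (byte b : text.getBytes(FONT_CHARSET)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        writeAscii(out, hex.append("> Tj\n").toString());
    }

    // Builds a minimal text object: select font, show text.
    static byte[] buildTextObject(String fontName, int fontSize, String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeAscii(out, "BT\n/" + fontName + " " + fontSize + " Tf\n");
        writeShownText(out, text);
        writeAscii(out, "ET\n");
        return out.toByteArray();
    }
}
```

For characters outside any single-byte encoding, a composite font with a multi-byte encoding would be needed, and writeShownText would have to emit that font's character codes instead.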

If you want to do it right, first study the PDF specification ISO 32000-1, especially the sections on general syntax and chapter 9 Text.

QUESTION 2) I have embedded a font in the PDF, but the text is not written with this font (in the visible signature rectangle). Why?

In the resources of the form XObject in question there is exactly one embedded font, associated with the name /F0. In your stream, though, you have /F1 2 Tf, i.e. you select a font /F1 at size 2, a name the resources do not define.

Question 3) When I remove the font, the text is still there (when the text is in English). What is the default font?

According to the specification, section 9.3.1,

font shall be the name of a font resource in the Font subdictionary of the current resource dictionary [...] There is no initial value for either font or size

Most likely, though, PDF viewers for the sake of compatibility with old or broken documents use some default font.

Question 4) How do I calculate the width of my text in the visible signature? Any ideas?

The width obviously depends on the metrics of the font used (glyph widths in this case) and the graphics state you set (font size, character spacing, word spacing, current transformation matrix, text transformation matrix, ...).

In your case you hardly do anything in the graphics state and, therefore, only the selected font size from it is of interest, so the more interesting part is the character widths from the font metrics. As long as you use the standard 14 fonts, you find the metrics here. As soon as you start using other, custom fonts, you have to read them from the font definition files yourself.
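For the simple case described (no character or word spacing, no scaling beyond the font size), the calculation reduces to summing the glyph widths, which font metrics give in thousandths of a text space unit, and multiplying by the font size. A minimal sketch follows; the class name, method name and the width table are illustrative assumptions, and real widths have to come from the metrics of the font actually used.

```java
import java.util.HashMap;
import java.util.Map;

public class TextWidthSketch {
    // Sums per-glyph widths (1/1000 text space units) and scales by font size.
    // Ignores character spacing, word spacing and horizontal scaling.
    static double textWidth(String text, Map<Character, Integer> glyphWidths, double fontSize) {
        long total = 0;
        for (char c : text.toCharArray()) {
            Integer width = glyphWidths.get(c);
            if (width == null) {
                throw new IllegalArgumentException("No width known for character: " + c);
            }
            total += width;
        }
        return total / 1000.0 * fontSize;
    }

    public static void main(String[] args) {
        // ASSUMPTION: illustrative widths only; read real values from the font metrics.
        Map<Character, Integer> widths = new HashMap<>();
        widths.put('h', 556);
        widths.put('i', 222);
        System.out.println(textWidth("hi", widths, 2.0));
    }
}
```

If the font is available as a PDFBox PDFont object, its getStringWidth method should give the same unscaled sum in 1/1000 units, which then still has to be divided by 1000 and multiplied by the font size.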




Answer 2:


Ad 1)

Could it be that

stream.createOutputStream().write(builderString.toString().getBytes("ISO-8859-1"));

should be

stream.createOutputStream().write(builderString.toString().getBytes("UTF-8"));

The conversion to ISO-8859-1 in getBytes turns any character that is missing from ISO-8859-1 into a ?.



Source: https://stackoverflow.com/questions/17697777/write-in-unicode-text-on-visible-signature-pdfbox
