PDFBox Inconsistent PDTextField Autosize Behavior after setValue

情深已故 2021-01-29 04:30

I am using Apache PDFBox to configure PDTextFields in a PDF document, where I load the Lato font into the document using:

font = PDType         


        
1 answer
  •  佛祖请我去吃肉
    2021-01-29 05:02

    The appearances of the Resident Name fields, the Phone fields, and the Care Providers Address fields are especially conspicuous. Only the former two are mentioned by the OP.

    Let's inspect these fields; all screenshots were made using Adobe Reader DC on MS Windows:

    The Resident Name fields

    The filled in Resident Name fields look like this

    While the height is appropriate, the glyphs are narrower than they should be. Actually this effect can already be seen in the original PDF:

    This horizontal compression is caused by the field widget rectangles having a different aspect ratio than the respectively matching normal appearance stream bounding box:

    • The widget rectangles: [ 45.72 601.44 118.924 615.24 ] and [ 119.282 601.127 192.486 614.927 ], i.e. 73.204*13.8 in both cases.
    • The appearance bounding box: [ 0 0 147.24 13.8 ], i.e. 147.24*13.8.

    So they have the same height but the appearance bounding box is approximately twice as wide as the widget rectangle. Thus, the text drawn normally in the appearance stream gets compressed to half its width when the appearance is displayed in the widget rectangle.
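
    The scaling can be checked with a few lines of arithmetic. The following is a minimal sketch in plain Java without PDFBox; the class and method names are made up for illustration. It assumes the appearance stream has no Matrix entry (or an identity one), so that a viewer simply maps the bounding box onto the widget rectangle.

```java
import java.util.Locale;

/** Illustrative helper (hypothetical name), not part of PDFBox. */
final class AppearanceScale {

    /**
     * Returns {sx, sy}: the factors by which a viewer stretches an appearance
     * stream with bounding box bbox so that it fills the widget rectangle rect.
     * Both arrays are [llx, lly, urx, ury]; an identity Matrix is assumed.
     */
    static double[] scale(double[] rect, double[] bbox) {
        double sx = (rect[2] - rect[0]) / (bbox[2] - bbox[0]);
        double sy = (rect[3] - rect[1]) / (bbox[3] - bbox[1]);
        return new double[] { sx, sy };
    }

    public static void main(String[] args) {
        // Values of the first Resident Name widget quoted above.
        double[] rect = { 45.72, 601.44, 118.924, 615.24 };
        double[] bbox = { 0, 0, 147.24, 13.8 };
        double[] s = scale(rect, bbox);
        System.out.printf(Locale.ROOT, "sx=%.3f sy=%.3f%n", s[0], s[1]);
    }
}
```

    A horizontal factor of roughly 0.5 combined with a vertical factor of 1 is exactly the narrow-glyph effect seen in the screenshots.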

    When setting the value of a field, PDFBox unfortunately re-uses the existing appearance stream as is. It only updates the details taken from the default appearance (font name, font size, and color) and the actual text value, apparently assuming that the other properties of the appearance are as they are for a reason. Thus, the PDFBox output also shows this horizontal compression.

    To make PDFBox create a proper appearance, it is necessary to remove the old appearances before setting the new value.

    The Phone fields

    The filled in Phone fields look like this

    and again there is a similar display in the original file

    Only the first two letters are shown, even though there is enough space for the whole word, because of how these fields are configured: they are comb fields with a maximum length of 2 characters.

    To have a value set with PDFBox displayed completely and not so spaced out, you have to remove the maximum length (or at least make it no less than the length of your value) and unset the comb flag.
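
    To illustrate why a comb field with a maximum length of 2 shows two widely spaced letters, here is a small model in plain Java. It is not PDFBox code: the names are hypothetical, the field width of 100 units is an assumption, and the model simply places at most MaxLen characters, one per equally wide cell.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of comb-field layout (hypothetical names), not PDFBox code. */
final class CombLayout {

    /**
     * Returns the x coordinates of the centres of the cells that receive a
     * character. The field width is split into maxLen equal cells, and this
     * model places at most maxLen characters, one per cell.
     */
    static List<Double> cellCentres(String value, int maxLen, double fieldWidth) {
        double cellWidth = fieldWidth / maxLen;
        int shown = Math.min(value.length(), maxLen);
        List<Double> centres = new ArrayList<>();
        for (int i = 0; i < shown; i++) {
            centres.add((i + 0.5) * cellWidth);
        }
        return centres;
    }

    public static void main(String[] args) {
        // Hypothetical 100-unit-wide field and a four-letter value.
        System.out.println(cellCentres("Test", 2, 100.0)); // prints [25.0, 75.0]
    }
}
```

    With a maximum length of 2, only two cells exist, so the two visible letters sit a quarter and three quarters of the way across the field, no matter how long the value is.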

    The Care Providers Address fields

    Filled in they look like this:

    Originally they look similar:

    This vertical compression is again caused by the field widget rectangles having a different aspect ratio than the respectively matching normal appearance stream bounding box:

    • A widget rectangle: [ 278.6 642.928 458.36 657.96 ], i.e. 179.76*15.032.
    • The appearance bounding box: [ 0 0 179.76 58.56 ], i.e. 179.76*58.56.

    Just like in the case of the Resident Name fields above it is necessary to remove the old appearances before setting the new value to make PDFBox create a proper appearance.

    A complication

    Actually, there is an additional issue when filling in the Care Providers Address fields. After removing the old appearances they look like this:

    This is due to a shortcoming of PDFBox: these fields are configured as multi line text fields. For single line text fields, PDFBox properly calculates the font size based on the content and later makes sure that the text fits quite well vertically. For multi line fields it proceeds very crudely: it selects a hard coded font size of 12 and does not fine tune the vertical position. See the code of the AppearanceGeneratorHelper methods calculateFontSize(PDFont, PDRectangle) and insertGeneratedAppearance(PDAnnotationWidget, PDAppearanceStream, OutputStream).
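
    The difference between the two code paths can be modelled roughly as follows. This is a deliberately simplified sketch with hypothetical names, not the actual AppearanceGeneratorHelper logic: it ignores padding and the exact font metrics PDFBox uses, and only shows that the multi line branch ignores both the content and the box.

```java
import java.util.Locale;

/** Simplified model (hypothetical names); see AppearanceGeneratorHelper for the real logic. */
final class FontSizeModel {

    /** The hard coded size used for multi line fields, as described above. */
    static final double MULTILINE_FONT_SIZE = 12.0;

    /**
     * textWidthAt1pt and lineHeightAt1pt are the width of the value and the
     * height of one line at font size 1, in the same units as the box.
     */
    static double autoSize(boolean multiline, double boxWidth, double boxHeight,
                           double textWidthAt1pt, double lineHeightAt1pt) {
        if (multiline) {
            return MULTILINE_FONT_SIZE; // content and box size are ignored
        }
        double widthBased = boxWidth / textWidthAt1pt;
        double heightBased = boxHeight / lineHeightAt1pt;
        return Math.min(widthBased, heightBased); // largest size that fits both ways
    }

    public static void main(String[] args) {
        // Box of an address widget quoted above; the two metrics are assumed values.
        double w = 179.76, h = 15.032, textWidth = 2.0, lineHeight = 1.2;
        System.out.printf(Locale.ROOT, "single line: %.2f, multi line: %.2f%n",
                autoSize(false, w, h, textWidth, lineHeight),
                autoSize(true, w, h, textWidth, lineHeight));
    }
}
```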

    Since these address fields are only one line high in your form anyway, an obvious solution is to make them single line fields, i.e. clear the Multiline flag.
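
    Clearing a field flag is plain bit arithmetic on the integer stored in the field's Ff entry. The sketch below uses plain Java with hypothetical helper names; the bit values follow the PDF specification, where Multiline is bit position 13 and Comb is bit position 25. The starting flag value in main is an assumption for demonstration only.

```java
/** Illustrative helper (hypothetical name) for field flag arithmetic. */
final class FieldFlags {

    static final int FLAG_MULTILINE = 1 << 12; // bit position 13
    static final int FLAG_COMB = 1 << 24;      // bit position 25

    /** Returns flags with every bit of mask cleared; other bits are untouched. */
    static int clear(int flags, int mask) {
        return flags & ~mask;
    }

    /** Returns true if at least one bit of mask is set in flags. */
    static boolean isSet(int flags, int mask) {
        return (flags & mask) != 0;
    }

    public static void main(String[] args) {
        // Assumed starting value: Required (bit position 2), Multiline and Comb set.
        int ff = (1 << 1) | FLAG_MULTILINE | FLAG_COMB;
        int cleared = clear(ff, FLAG_MULTILINE | FLAG_COMB);
        System.out.println(isSet(cleared, FLAG_MULTILINE)); // prints false
        System.out.println(isSet(cleared, FLAG_COMB));      // prints false
        System.out.println(cleared);                        // prints 2
    }
}
```

    Combining both masks with a bitwise OR, as done here, clears the two flags in one step while leaving every other flag of the field as it was.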

    Example code

    Using Java one can implement the solutions explained above like this:

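    // Imports needed by this snippet (PDFBox 2.x API).
    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.font.PDType0Font;
    import org.apache.pdfbox.pdmodel.interactive.annotation.PDAppearanceEntry;
    import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
    import org.apache.pdfbox.pdmodel.interactive.form.PDField;
    import org.apache.pdfbox.pdmodel.interactive.form.PDTextField;

    // Field flag bits of the Ff entry: Multiline is bit position 13, Comb is bit position 25.
    final int FLAG_MULTILINE = 1 << 12;
    final int FLAG_COMB = 1 << 24;
    
    PDDocument doc = PDDocument.load(originalStream);
    PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
    
    // Embed the font without subsetting and register it in the form's default resources.
    PDType0Font font = PDType0Font.load(doc, fontStream, false);
    String font_name = acroForm.getDefaultResources().add(font).getName();
    
    for (PDField field : acroForm.getFieldTree()) {
        if (field instanceof PDTextField) {
            PDTextField textField = (PDTextField) field;
            // Drop the maximum length and clear the Comb and Multiline flags.
            textField.getCOSObject().removeItem(COSName.MAX_LEN);
            textField.getCOSObject().setFlag(COSName.FF, FLAG_COMB | FLAG_MULTILINE, false);
            // Font size 0 requests auto-sizing; "0 g" selects black text.
            textField.setDefaultAppearance(String.format("/%s 0 Tf 0 g", font_name));
            // Remove the old normal appearances so that setValue generates new ones.
            textField.getWidgets().forEach(w -> w.getAppearance().setNormalAppearance((PDAppearanceEntry) null));
            textField.setValue("Test");
        }
    }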
    

    (FillInForm test testFill0DropOldAppearanceNoCombNoMaxNoMultiLine)

    Screenshots of the output of the example code

    The Resident Name field value is no longer horizontally compressed:

    The Phone and Care Providers Address fields also look appropriate now:
