PDFBox Form fill - saveIncremental does not work

予麋鹿 2021-01-14 11:41

I have a PDF file with some form fields that I want to fill from Java. Right now I'm trying to fill just one field, which I look up by its name. My code looks like this:

1 answer
  • 2021-01-14 12:00

    In general

    When saving changes to a PDF as an incremental update with PDFBox 2.0.x, you have to set the property NeedToBeUpdated to true for every PDF object changed. Furthermore, the object must be reachable from the PDF catalog via a chain of references, and each PDF object in this chain also has to have the property NeedToBeUpdated set to true.

    This is due to the way PDFBox saves incrementally: starting from the catalog, it inspects the NeedToBeUpdated property of each object. Only if the property is set to true does PDFBox store the object and recurse into the objects it references in search of more objects to store.

    In particular this implies that some objects unnecessarily have to be marked NeedToBeUpdated, e.g. the PDF catalog itself, and in some cases this even defeats the purpose of the incremental update at large, see below.
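
    The traversal rule described above can be illustrated with a small toy model. This is only a sketch of the described behaviour, not PDFBox code: the classes ToyObject and IncrementalSaveModel and the methods referTo and collect are hypothetical names made up for this illustration, and real PDFBox objects are considerably more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy stand-in for a PDF object (hypothetical, not a PDFBox class). */
class ToyObject {
    final String name;
    boolean needToBeUpdated;                               // mirrors the NeedToBeUpdated property
    final List<ToyObject> references = new ArrayList<>();  // objects referenced from this one

    ToyObject(String name) {
        this.name = name;
    }

    /** Adds a reference to {@code target} and returns {@code target} to allow chaining. */
    ToyObject referTo(ToyObject target) {
        references.add(target);
        return target;
    }
}

class IncrementalSaveModel {

    /** Returns the names of the objects a flag-guided walk from the catalog would store. */
    static List<String> collect(ToyObject catalog) {
        Set<ToyObject> stored = new LinkedHashSet<>();
        walk(catalog, stored);
        List<String> names = new ArrayList<>();
        for (ToyObject object : stored) {
            names.add(object.name);
        }
        return names;
    }

    private static void walk(ToyObject object, Set<ToyObject> stored) {
        // Unflagged: not stored, and nothing it references is inspected.
        // Already stored: skip, so that reference cycles terminate.
        if (!object.needToBeUpdated || !stored.add(object)) {
            return;
        }
        for (ToyObject referenced : object.references) {
            walk(referenced, stored);
        }
    }

    public static void main(String[] args) {
        ToyObject catalog = new ToyObject("catalog");
        ToyObject acroForm = catalog.referTo(new ToyObject("acroForm"));
        ToyObject fields = acroForm.referTo(new ToyObject("fields"));
        ToyObject field = fields.referTo(new ToyObject("field"));

        field.needToBeUpdated = true;              // only the changed object is flagged
        System.out.println(collect(catalog));      // prints []

        catalog.needToBeUpdated = true;            // flag the whole chain as well
        acroForm.needToBeUpdated = true;
        fields.needToBeUpdated = true;
        System.out.println(collect(catalog));      // prints [catalog, acroForm, fields, field]
    }
}
```

    In this model a flagged field below an unflagged ancestor is never reached, which matches the symptom of an incremental save that silently drops the change.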

    In case of the OP's document

    Setting the NeedToBeUpdated properties

    On one hand one has to extend the setField method to mark the chain of field dictionaries up to and including the changed field and also the appearance:

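    import java.io.IOException;
    import org.apache.pdfbox.cos.COSBase;
    import org.apache.pdfbox.cos.COSDictionary;
    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.cos.COSStream;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
    import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
    import org.apache.pdfbox.pdmodel.interactive.form.PDCheckBox;
    import org.apache.pdfbox.pdmodel.interactive.form.PDField;
    import org.apache.pdfbox.pdmodel.interactive.form.PDTextField;
    
    public static void setField(PDDocument pdfDocument, String name, String value) throws IOException 
    {
        PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        PDField field = acroForm == null ? null : acroForm.getField(name);
    
        if (field instanceof PDCheckBox) {
            field.setValue("Yes");
        }
        else if (field instanceof PDTextField) {
            System.out.println("Original value: " + field.getValueAsString());
            field.setValue(value);
            System.out.println("New value: " + field.getValueAsString());
        }
        else {
            // no such field, or a field type not handled here: nothing was changed
            System.out.println("Field not found");
            return;
        }
    
        // vvv--- new 
        COSDictionary fieldDictionary = field.getCOSObject();
        // mark the appearance dictionary and, if it is a stream, the normal appearance
        // (for check boxes /N is a dictionary of states, so it is not cast blindly)
        COSBase appearance = fieldDictionary.getDictionaryObject(COSName.AP);
        if (appearance instanceof COSDictionary) {
            COSDictionary dictionary = (COSDictionary) appearance;
            dictionary.setNeedToBeUpdated(true);
            COSBase normalAppearance = dictionary.getDictionaryObject(COSName.N);
            if (normalAppearance instanceof COSStream) {
                ((COSStream) normalAppearance).setNeedToBeUpdated(true);
            }
        }
        // mark the changed field and all its ancestors up to the top-level field
        while (fieldDictionary != null)
        {
            fieldDictionary.setNeedToBeUpdated(true);
            fieldDictionary = (COSDictionary) fieldDictionary.getDictionaryObject(COSName.PARENT);
        }
        // ^^^--- new 
    }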
    

    (FillInFormSaveIncremental method setField)

    On the other hand the main code has to be extended to mark a chain from the catalog to the fields array:

    PDDocument document = PDDocument.load(...);
    PDDocumentCatalog docCatalog = document.getDocumentCatalog();
    PDAcroForm acroForm = docCatalog.getAcroForm();
    
    String formName = "topmostSubform[0].Page1[0].pana_pania[0]";
    PDField f = acroForm.getField(formName);
    setField(document, formName, "Artur");
    System.out.println("New value 2nd: " + f.getValueAsString());
    
    // vvv--- new 
    COSDictionary dictionary = document.getDocumentCatalog().getCOSObject();
    dictionary.setNeedToBeUpdated(true);
    dictionary = (COSDictionary) dictionary.getDictionaryObject(COSName.ACRO_FORM);
    dictionary.setNeedToBeUpdated(true);
    COSArray array = (COSArray) dictionary.getDictionaryObject(COSName.FIELDS);
    array.setNeedToBeUpdated(true);
    // ^^^--- new 
    
    document.saveIncremental(new FileOutputStream(...));
    document.close();
    

    (FillInFormSaveIncremental test testFillInSkierowanie3)

    Beware: for use with generic PDFs one obviously should introduce some null tests...


    Opening the result file in Adobe Reader one will unfortunately see that the program complains about changes which disable extended features in the file.

    This is due to the quirk in PDFBox's incremental saving that it requires some unnecessary objects in the update section. In particular, the catalog is saved there, and it contains a usage rights signature (the technology granting extended features). The re-saved signature obviously is no longer at its original position in its original revision. Thus, it is invalidated.

    Most likely the OP wanted to save the PDF incrementally so as not to break this signature, but PDFBox does not permit this. Oh well...

    Thus, the only thing one can do is prevent the warning by completely removing the signature.

    Removing the usage rights signature

    We already have retrieved the catalog object in the additions above, so removing the signature is easy:

    COSDictionary dictionary = document.getDocumentCatalog().getCOSObject();
    // vvv--- new 
    dictionary.removeItem(COSName.PERMS);
    // ^^^--- new 
    dictionary.setNeedToBeUpdated(true);
    

    (FillInFormSaveIncremental test testFillInSkierowanie3)


    Opening the result file in Adobe Reader, one will unfortunately see that the program complains that the file lacks the extended features required to save it.

    This is due to the fact that Adobe Reader requires extended features to save changes to XFA forms, extended features we had to remove in this step.

    But the document at hand is a hybrid AcroForm & XFA form document, and Adobe Reader requires no extended features to save AcroForm documents. Thus, all we have to do is remove the XFA form. As our code only sets the AcroForm value, this is a good idea anyway...

    Removing the XFA form

    We already have retrieved the acroform object in the additions above, so removing the XFA form referenced from there is easy:

    dictionary = (COSDictionary) dictionary.getDictionaryObject(COSName.ACRO_FORM);
    // vvv--- new 
    dictionary.removeItem(COSName.XFA);
    // ^^^--- new 
    dictionary.setNeedToBeUpdated(true);
    

    (FillInFormSaveIncremental test testFillInSkierowanie3)


    Opening the result file in Adobe Reader one will see that one now can without further ado edit the form and save the file.

    Beware: a sufficiently new Adobe Reader version is required for this; earlier versions (up to at least version 9) did require extended features even for saving changes to an AcroForm form.
