Using pdfbox to get form field values

隐瞒了意图╮ 2021-02-11 02:11

I'm using pdfbox for the first time. Now I'm reading something on the website Pdf

To summarize, I have a PDF like this:

2 Answers
  • 2021-02-11 02:41

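    A top-level field is not necessarily a field that holds a value: it can be a non-terminal field, which is only a container for child fields. So you need to recurse until you reach a terminal field, and then you can get its value. The code snippet below walks through all the fields and outputs the field names and values.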

    {
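        //from your original code, but with PDDocument.load: loadNonSeq was removed
        //in PDFBox 2.0, which is the version that has PDNonTerminalField
        //(PDFBox 3.x uses Loader.loadPDF( myFile ) instead)
        PDDocument pdDoc = PDDocument.load( myFile );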
        PDDocumentCatalog pdCatalog = pdDoc.getDocumentCatalog();
        PDAcroForm pdAcroForm = pdCatalog.getAcroForm();
    
    
        //get all fields in form
        List<PDField> fields = pdAcroForm.getFields();
        System.out.println(fields.size() + " top-level fields were found on the form");
    
        //inspect field values
        for (PDField field : fields)
        {
                processField(field, "|--", field.getPartialName());
        }
    
        ...
    }
    
    
    private void processField(PDField field, String sLevel, String sParent) throws IOException
    {
            String partialName = field.getPartialName();
    
            if (field instanceof PDNonTerminalField)
            {
                    if (!sParent.equals(field.getPartialName()))
                    {
                            if (partialName != null)
                            {
                                    sParent = sParent + "." + partialName;
                            }
                    }
                    System.out.println(sLevel + sParent);
    
                    for (PDField child : ((PDNonTerminalField)field).getChildren())
                    {
                            processField(child, "|  " + sLevel, sParent);
                    }
            }
            else
            {
                    //field has no children; output its value
                    String fieldValue = field.getValueAsString();
                    StringBuilder outputString = new StringBuilder(sLevel);
                    outputString.append(sParent);
                    if (partialName != null)
                    {
                            outputString.append(".").append(partialName);
                    }
                    outputString.append(" = ").append(fieldValue);
                    outputString.append(",  type=").append(field.getClass().getName());
                    System.out.println(outputString);
            }
    }
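
The same recursion can be tried without a PDF at hand. What follows is only a minimal sketch under stated assumptions: `FieldNode` is a hypothetical stand-in, not a PDFBox class, and the field names and values in `main` are made up for illustration. A node with children plays the role of a non-terminal field, and a node without children plays the role of a terminal field that carries a value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldTreeSketch {

    // Hypothetical stand-in for a form field; NOT part of PDFBox.
    static final class FieldNode {
        final String partialName;
        final String value; // only meaningful for nodes without children
        final List<FieldNode> children = new ArrayList<>();

        FieldNode(String partialName, String value) {
            this.partialName = partialName;
            this.value = value;
        }

        // Appends a child and returns this node so calls can be chained.
        FieldNode add(FieldNode child) {
            children.add(child);
            return this;
        }
    }

    // Descends until a node has no children, building the dotted name on the way down.
    static void collect(FieldNode node, String parentName, Map<String, String> out) {
        String name = parentName.isEmpty()
                ? node.partialName
                : parentName + "." + node.partialName;
        if (node.children.isEmpty()) {
            out.put(name, node.value);      // terminal: record name and value
        } else {
            for (FieldNode child : node.children) {
                collect(child, name, out);  // non-terminal: keep descending
            }
        }
    }

    // Flattens a list of top-level nodes into "dotted.name" -> value, in visit order.
    static Map<String, String> flatten(List<FieldNode> topLevel) {
        Map<String, String> out = new LinkedHashMap<>();
        for (FieldNode field : topLevel) {
            collect(field, "", out);
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative data only: one container with two children, one plain field.
        FieldNode person = new FieldNode("person", null)
                .add(new FieldNode("firstName", "Ada"))
                .add(new FieldNode("lastName", "Lovelace"));
        FieldNode agree = new FieldNode("agree", "Yes");

        flatten(Arrays.asList(person, agree))
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Returning a map instead of printing makes the result easier to reuse. One simplification to keep in mind: this sketch treats any node without children as terminal, whereas the answer's code decides by the field's class.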
    
  • 2021-02-11 02:45

    The code you have should work. If you are actually looking to do something with the values, you'll likely need to use some other methods. For example, you can get specific fields using pdAcroForm.getField(<fieldName>):

    PDField firstNameField = pdAcroForm.getField("firstName");
    PDField lastNameField = pdAcroForm.getField("lastName");
    

    Note that PDField is just a base class. You can cast fields to subclasses to get more interesting information from them (the check box class is named PDCheckbox in PDFBox 1.8 and PDCheckBox in 2.x). For example:

    PDCheckbox fullTimeSalary = (PDCheckbox) pdAcroForm.getField("fullTimeSalary");
    if(fullTimeSalary.isChecked()) {
        log.debug("The person earns a full-time salary");
    } else {
        log.debug("The person does not earn a full-time salary");
    }
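
A direct cast like the one above fails if the form has no field with that name or if the field is of a different type. As far as I know, `getField` returns null when nothing matches, so it may be worth checking before casting. The sketch below shows that pattern with hypothetical stand-in classes (`Field`, `CheckBox`, `TextField` are not PDFBox types, and the map lookup only imitates a lookup by field name):

```java
import java.util.HashMap;
import java.util.Map;

class SafeFieldLookupSketch {

    // Hypothetical stand-ins for a field base class and two subclasses; NOT PDFBox types.
    static class Field {
        final String name;
        Field(String name) { this.name = name; }
    }

    static class CheckBox extends Field {
        private final boolean checked;
        CheckBox(String name, boolean checked) { super(name); this.checked = checked; }
        boolean isChecked() { return checked; }
    }

    static class TextField extends Field {
        final String value;
        TextField(String name, String value) { super(name); this.value = value; }
    }

    // Looks a field up by name and only casts after both checks have passed.
    static String describeCheckBox(Map<String, Field> form, String name) {
        Field field = form.get(name);        // null when no such field exists
        if (field == null) {
            return "missing";
        }
        if (!(field instanceof CheckBox)) {  // guards against a ClassCastException
            return "not a check box";
        }
        return ((CheckBox) field).isChecked() ? "checked" : "unchecked";
    }

    public static void main(String[] args) {
        Map<String, Field> form = new HashMap<>();
        form.put("fullTimeSalary", new CheckBox("fullTimeSalary", true));
        form.put("firstName", new TextField("firstName", "Ada")); // made-up value

        System.out.println(describeCheckBox(form, "fullTimeSalary"));
        System.out.println(describeCheckBox(form, "firstName"));
        System.out.println(describeCheckBox(form, "middleName"));
    }
}
```

With real PDFBox objects the same two checks (null, then `instanceof`) would go in front of the cast.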
    

    As you suggest, you'll find more information at the Apache PDFBox website.
