How can I convert XHTML nested list to pdf with iText?

情书的邮戳 2020-11-30 14:21

I have XHTML content and need to create a PDF file from it on the fly. I use the iText PDF converter. I tried the simple way, but I always get a bad result after call

2 Answers
  • 2020-11-30 14:59

    Please take a look at the example NestedListHtml

    In this example, I take your code snippet list.html:

    <ul>
      <li>First
        <ol>
          <li>Second</li>
          <li>Second</li>
        </ol>
      </li>
      <li>First</li>
    </ul>
    

    And I parse it into an ElementList:

    // CSS
    CSSResolver cssResolver =
        XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
    
    // HTML
    HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
    htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
    htmlContext.autoBookmark(false);
    
    // Pipelines
    ElementList elements = new ElementList();
    ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
    HtmlPipeline html = new HtmlPipeline(htmlContext, end);
    CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
    
    // XML Worker
    XMLWorker worker = new XMLWorker(css, true);
    XMLParser p = new XMLParser(worker);
    p.parse(new FileInputStream(HTML));
    

    Now I can add this list to the Document:

    for (Element e : elements) {
        document.add(e);
    }
    

    Or I can add this list to a Paragraph:

    Paragraph para = new Paragraph();
    for (Element e : elements) {
        para.add(e);
    }
    document.add(para);
    

    You will get the desired result as shown in nested_list.pdf

    You cannot add nested lists to a PdfPCell or to a ColumnText. For instance, this will not work:

    PdfPTable table = new PdfPTable(2);
    table.addCell("Nested lists don't work in a cell");
    PdfPCell cell = new PdfPCell();
    for (Element e : elements) {
        cell.addElement(e);
    }
    table.addCell(cell);
    document.add(table);
    

    This is due to a long-standing limitation in the ColumnText class. We have evaluated the problem, and the only way to fix it would be to rewrite ColumnText entirely. That is not an item on our current technical road map.

  • 2020-11-30 15:05

    Here's a workaround for nested ordered and unordered lists.

    The rich text editor I am using puts a class attribute such as "ql-indent-1" or "ql-indent-2" on the li tags instead of nesting the lists. Based on that attribute, the code below adds the ul/ol starting and ending tags.

    public String replaceIndentSubList(String htmlContent) {
        org.jsoup.nodes.Document document = Jsoup.parseBodyFragment(htmlContent);
        Elements element_UL = document.select("ul");
        Elements element_OL = document.select("ol");
        if (!element_UL.isEmpty()) {
            htmlContent = replaceIndents(htmlContent, element_UL, "ul");
        }
        if (!element_OL.isEmpty()) {
            htmlContent = replaceIndents(htmlContent, element_OL, "ol");
        }
        return htmlContent;
    }
    
    
    public String replaceIndents(String htmlContent, Elements element, String tagType) {
        String attributeKey = "class";
        String startingULTgas = "<" + tagType + ">";
        String endingULTags = "</" + tagType + ">";
        int lengthOfQLIndenet = "ql-indent-".length();
        HashMap<String, String> startingLiTagMap = new HashMap<String, String>();
        HashMap<String, String> lastLiTagMap = new HashMap<String, String>();
        Pattern regex = Pattern.compile("ql-indent-\\d");
        HashSet<String> hash_Set = new HashSet<String>();
        Elements element_Tag = element.select("li");
        for (org.jsoup.nodes.Element element2 : element_Tag) {
            org.jsoup.nodes.Attributes att = element2.attributes();
            if (att.hasKey(attributeKey)) {
                String attributeValue = att.get(attributeKey);
                Matcher matcher = regex.matcher(attributeValue);
                if (matcher.find()) {
                    // Key both maps by the matched "ql-indent-N" token: the loop
                    // further down looks entries up by that token, not by the
                    // full class value (which may contain other class names).
                    String indentClass = matcher.group(0);
                    if (!startingLiTagMap.containsKey(indentClass)) {
                        startingLiTagMap.put(indentClass, element2.toString());
                    }
                    hash_Set.add(indentClass);
                    if (!startingLiTagMap.get(indentClass)
                            .equalsIgnoreCase(element2.toString())) {
                        lastLiTagMap.put(indentClass, element2.toString());
                    }
                }
            }
        }
        System.out.println(htmlContent);
        for (String liAttributeKey : hash_Set) {
            int noOfIndentes = Integer
                    .parseInt(liAttributeKey.substring(lengthOfQLIndenet));
            if (noOfIndentes > 1)
                for (int i = 1; i < noOfIndentes; i++) {
                    startingULTgas = startingULTgas + "<" + tagType + ">";
                    endingULTags = endingULTags + "</" + tagType + ">";
                }
            htmlContent = htmlContent.replace(startingLiTagMap.get(liAttributeKey),
                    startingULTgas + startingLiTagMap.get(liAttributeKey));
            if (lastLiTagMap.get(liAttributeKey) != null) {
                System.out.println("Inside last Li Map");
                htmlContent = htmlContent.replace(lastLiTagMap.get(liAttributeKey),
                        lastLiTagMap.get(liAttributeKey) + endingULTags);
            }
            else {
                htmlContent = htmlContent.replace(startingLiTagMap.get(liAttributeKey),
                        startingLiTagMap.get(liAttributeKey) + endingULTags);
            }
            startingULTgas = "<" + tagType + ">";
            endingULTags = "</" + tagType + ">";
        }
        System.out.println(htmlContent);
        return htmlContent;
    }
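
For comparison, the same idea can be sketched without jsoup by walking the li items in order and tracking the current indent depth. This is only a minimal sketch under some assumptions: the class name QuillListNester and the method nest are made up for this illustration, the input is the flat run of li items from a single list, no item contains another li, and every sub-list uses the same tag type as its parent. The "ql-indent-N" class is dropped from the output because the nesting replaces it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of iText or jsoup): nests flat "ql-indent-N" items. */
final class QuillListNester {

    // One <li> item: group 1 = its attributes, group 2 = its inner HTML.
    // Assumption: an item never contains another <li>.
    private static final Pattern ITEM =
            Pattern.compile("<li\\b([^>]*)>(.*?)</li>", Pattern.DOTALL);
    private static final Pattern INDENT = Pattern.compile("ql-indent-(\\d+)");

    /**
     * @param items   the flat li elements of one list, without the outer ul/ol tag
     * @param tagType "ul" or "ol"; used for the outer list and for every sub-list
     */
    static String nest(String items, String tagType) {
        String open = "<" + tagType + ">";
        String close = "</" + tagType + ">";
        StringBuilder out = new StringBuilder(open);
        int depth = 0;        // indent level of the item whose <li> is still open
        boolean first = true;
        Matcher item = ITEM.matcher(items);
        while (item.find()) {
            Matcher indent = INDENT.matcher(item.group(1));
            int level = indent.find() ? Integer.parseInt(indent.group(1)) : 0;
            if (first) {
                // A list that starts indented needs wrapper items down to its level.
                for (int d = 0; d < level; d++) out.append("<li>").append(open);
            } else if (level > depth) {
                // Deeper: open a sub-list inside the item that is still open.
                out.append(open);
                for (int d = depth + 1; d < level; d++) out.append("<li>").append(open);
            } else {
                // Same level or shallower: close the open item, then any sub-lists.
                out.append("</li>");
                for (int d = depth; d > level; d--) out.append(close).append("</li>");
            }
            out.append("<li>").append(item.group(2));
            depth = level;
            first = false;
        }
        if (!first) {
            out.append("</li>");
            for (int d = depth; d > 0; d--) out.append(close).append("</li>");
        }
        return out.append(close).toString();
    }

    public static void main(String[] args) {
        String flat = "<li>First</li>"
                + "<li class=\"ql-indent-1\">Second</li>"
                + "<li class=\"ql-indent-1\">Second</li>"
                + "<li>First</li>";
        // Prints:
        // <ul><li>First<ul><li>Second</li><li>Second</li></ul></li><li>First</li></ul>
        System.out.println(nest(flat, "ul"));
    }
}
```

The output keeps each sub-list inside the li of its parent item, which is the same shape as the list.html snippet in the first answer, so it should be suitable input for the XML Worker pipeline shown there.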
    