Extracting <table> from HTML string and generating PDF using

Submitted by 本小妞迷上赌 on 2019-12-13 04:39:45

Question


I am trying to extract the <table> tags from an HTML string and render them as tables in a PDF that I download to my local machine.

Because the string containing the HTML is dynamic, I can't map it cell by cell or row by row.

For example:

private String message = "<html><body><p class=\"MsoNormal\"><b><span style=\"color: rgb(68, 84, 106);\">Dear Agent,<br><br>Please be informed that because no TRMF or reason for delay were received by the due date mentioned below, we consider the Transaction to be Paid in Error. We are going to act accordingly which means charging the Paying Account in case we are not able to defend legal dispute without TRMF.</span></b><span style=\"font-size: 10pt; line-height: 14.2667px;\"><o:p></o:p></span></p><p class=\"MsoNormal\"><span style=\"font-size: 10pt; line-height: 14.2667px;\">&nbsp;</span></p><div><span style=\"font-size: 10pt; line-height: 14.2667px;\"><br></span></div><table class=\"MsoNormalTable\" border=\"0\" cellspacing=\"0\" cellpadding=\"0\" width=\"0\" style=\"width: 472.9pt; margin-left: 5.9pt;border-collapse: collapse;\"><tr><td>Neeraj</td><td>Chand</td></tr><tr><td>Sowmya</td><td>Javvadi</td></tr></table></body></html>";

I will receive strings like this that hold HTML content, and I have to generate a PDF file from that content. The input string may or may not contain a table.
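To make the "dynamic" part concrete, here is a minimal sketch of how one might check whether a given string contains any table and how many columns its first row has. It uses only java.util.regex; the class name TableProbe and its two methods are hypothetical, not part of any PDF library. Regex matching on HTML is fragile (it assumes no nested tables and ignores colspan), so a real HTML parser would likely be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds <table> blocks in an HTML string and counts columns.
class TableProbe {
    // One <table ...>...</table> block; DOTALL lets '.' span line breaks.
    // Assumption: tables are not nested, otherwise the lazy match stops too early.
    private static final Pattern TABLE = Pattern.compile(
            "<table\\b[^>]*>.*?</table>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // One <tr ...>...</tr> row; group 1 is the row's inner markup.
    private static final Pattern ROW = Pattern.compile(
            "<tr\\b[^>]*>(.*?)</tr>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // An opening <td> or <th> tag.
    private static final Pattern CELL = Pattern.compile(
            "<t[dh]\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Returns every table block found; the list is empty when there is no table.
    static List<String> extractTables(String html) {
        List<String> tables = new ArrayList<>();
        Matcher m = TABLE.matcher(html);
        while (m.find()) {
            tables.add(m.group());
        }
        return tables;
    }

    // Counts the cells in the first row of a table block (colspan is ignored).
    // Returns 0 when the table has no row at all.
    static int columnCount(String tableHtml) {
        Matcher row = ROW.matcher(tableHtml);
        if (!row.find()) {
            return 0;
        }
        Matcher cell = CELL.matcher(row.group(1));
        int count = 0;
        while (cell.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Dear Agent,</p><table><tr><td>Neeraj</td>"
                + "<td>Chand</td></tr><tr><td>Sowmya</td><td>Javvadi</td></tr></table></body></html>";
        for (String table : extractTables(html)) {
            System.out.println("columns in first row: " + columnCount(table));
        }
    }
}
```

A column count obtained this way could be passed to new PdfPTable(...) instead of the hard-coded 2, and an empty list from extractTables would mean the message has no table to render.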

I tried the code below, but it doesn't work; I get an error saying "table width can't be 0".

public StreamedContent getFile() throws IOException, DocumentException {
        final PortletResponse portletResponse = (PortletResponse) FacesContext.getCurrentInstance().getExternalContext()
                .getResponse();
        final HttpServletResponse res = PortalUtil.getHttpServletResponse(portletResponse);
        res.setContentType("application/pdf");
        res.setHeader("Cache-Control", "no-store, no-cache, must-revalidate");
        // res.setHeader("Content-Disposition", "attachment; filename=\".pdf\"");
        res.setHeader("Content-Disposition", "attachment; filename=" + subject + ".pdf");
        res.setHeader("Refresh", "1");
        res.flushBuffer();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        OutputStream out = res.getOutputStream();
        Document document = new Document(PageSize.LETTER);
        PdfWriter.getInstance(document, baos);
        document.open();
        document.addCreationDate();
        /* without parsing html, it works and generates pdf
        Table table = new Table(2, 2);
        document.add(new Paragraph("converted to PdfPTable:"));
        table.setConvert2pdfptable(true);
        document.add(table);
         */

        //below doesn't work
        HTMLWorker htmlWorker = new HTMLWorker(document);
        String str = this.getMessage();
        htmlWorker.parse(new StringReader(str));
        PdfPTable table = new PdfPTable(2); // not sure what to give here as the number of columns is dynamic
        table.setTotalWidth(document.getPageSize().getWidth() - 80);
        document.add(table);
        document.close();
        baos.writeTo(out);
        out.flush();
        out.close();
        return null;
    }

Is there a way to generate a PDF from any given HTML string? If there is another tool I can use for this, please let me know.
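One thing that may be worth ruling out: the error mentions a zero table width, and the sample markup carries width="0" on its <table> tag. It is only an assumption, not verified against the library source, that HTMLWorker takes this attribute literally as the table width. The sketch below strips such zero-width attributes from the string before parsing. The class ZeroWidthCleaner and its method are hypothetical names, and only the standard library is used.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing step: drop width="0" / width='0' attributes.
class ZeroWidthCleaner {
    // Whitespace, then width = "0" or '0' (the closing quote must match the opening one).
    // CSS widths inside style="..." use a colon, not '=', so they are left untouched.
    private static final Pattern ZERO_WIDTH = Pattern.compile(
            "\\s+width\\s*=\\s*([\"'])0\\1", Pattern.CASE_INSENSITIVE);

    // Returns the HTML with every zero-width attribute removed.
    // Caveat: this is plain text matching, so the same text inside element content
    // would be removed as well.
    static String stripZeroWidth(String html) {
        return ZERO_WIDTH.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<table border=\"0\" cellpadding=\"0\" width=\"0\" "
                + "style=\"width: 472.9pt;\"><tr><td>Neeraj</td></tr></table>";
        System.out.println(stripZeroWidth(html));
    }
}
```

The cleaned string could then be handed to htmlWorker.parse(new StringReader(...)) in place of the raw message. If the error persists, the zero width comes from somewhere else and this assumption is wrong.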

Source: https://stackoverflow.com/questions/52773998/extracting-table-from-html-string-and-generating-pdf-using
