I want to create a Word document from an HTML page. I am planning to extract the values from the HTML page and then pass these values to a document template. I have used JSOUP to
JODReports and Docmosis might also be useful options for you, since both offer template population and Doc output. If DOCX is your real target, you can write out the document yourself, since the XML format is published, but that is a lot of work.
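To give a sense of what writing DOCX by hand involves, here is a minimal sketch in Java that uses only java.util.zip. The class MinimalDocx and its methods are hypothetical names made up for illustration. It writes only the three package parts a bare document needs (content types, root relationships, and word/document.xml) and no styles, so treat it as a starting point under those assumptions rather than a complete writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of any library.
final class MinimalDocx {
    private static final String XML_DECL =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>";

    // Declares the content type of each part in the package.
    private static final String CONTENT_TYPES = XML_DECL
        + "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">"
        + "<Default Extension=\"rels\" ContentType=\"application/vnd.openxmlformats-package.relationships+xml\"/>"
        + "<Default Extension=\"xml\" ContentType=\"application/xml\"/>"
        + "<Override PartName=\"/word/document.xml\" ContentType=\"application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml\"/>"
        + "</Types>";

    // Points the package at its main document part.
    private static final String ROOT_RELS = XML_DECL
        + "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
        + "<Relationship Id=\"rId1\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument\" Target=\"word/document.xml\"/>"
        + "</Relationships>";

    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds word/document.xml with one plain paragraph per argument. */
    static String documentXml(String... paragraphs) {
        StringBuilder xml = new StringBuilder(XML_DECL)
            .append("<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\"><w:body>");
        for (String p : paragraphs) {
            xml.append("<w:p><w:r><w:t>").append(escapeXml(p)).append("</w:t></w:r></w:p>");
        }
        return xml.append("</w:body></w:document>").toString();
    }

    /** Zips the three parts into an in-memory DOCX package. */
    static byte[] build(String... paragraphs) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            addPart(zip, "[Content_Types].xml", CONTENT_TYPES);
            addPart(zip, "_rels/.rels", ROOT_RELS);
            addPart(zip, "word/document.xml", documentXml(paragraphs));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static void addPart(ZipOutputStream zip, String name, String content) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(content.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    public static void main(String[] args) throws IOException {
        // Example values; in practice these would come from the parsed HTML page.
        Files.write(Paths.get("output.docx"), build("Name: Alice", "City: Paris"));
    }
}
```

Anything beyond plain paragraphs (styles, tables, images) needs additional parts and relationships, which is where the "lot of work" comes in.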
I suggest you use XSLT, because your data is already in XML format and Microsoft provides well-defined XML formats for its documents.
You could write a document template in Word and save it in XML format. Then you can convert the Word XML into an XSL template that takes your HTML (as XML) as input. After the XSLT transformation, you have a valid Word XML document containing the dynamic values from your HTML.
XSLT example for Excel:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="no" />
  <xsl:template match="/">
    <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office"
              xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
              xmlns:html="http://www.w3.org/TR/REC-html40">
      ...
      <xsl:for-each select="/yourroot/person">
        ...
        <Cell ss:StyleID="uf">
          <Data ss:Type="String">
            <xsl:value-of select="@Name" />
          </Data>
        </Cell>
        ...
      </xsl:for-each>
      ...
    </Workbook>
  </xsl:template>
</xsl:stylesheet>
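To run a stylesheet of this kind from Java, the JDK's built-in javax.xml.transform API is enough. The sketch below is an illustration under assumptions, not a tested Word template: the class XsltRunner, the inline stylesheet, and the sample input are hypothetical, and the stylesheet emits Word 2003 XML paragraphs instead of Excel cells. It declares version 1.0 because the transformer bundled with the JDK implements XSLT 1.0; an XSLT 2.0 stylesheet would need a processor such as Saxon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration.
final class XsltRunner {

    /** Applies the stylesheet to the input XML and returns the result as text. */
    static String transform(String stylesheet, String inputXml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed stylesheet: one Word 2003 XML paragraph per <person> element.
        String xsl =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xmlns:w='http://schemas.microsoft.com/office/word/2003/wordml'>"
          + "<xsl:template match='/'>"
          + "<w:wordDocument><w:body>"
          + "<xsl:for-each select='/yourroot/person'>"
          + "<w:p><w:r><w:t><xsl:value-of select='@Name'/></w:t></w:r></w:p>"
          + "</xsl:for-each>"
          + "</w:body></w:wordDocument>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

        // Assumed input; the real input would be the HTML page converted to well-formed XML.
        String xml = "<yourroot><person Name='Alice'/><person Name='Bob'/></yourroot>";

        System.out.println(transform(xsl, xml));
    }
}
```

In a real setup the stylesheet and the input would be read from files with new StreamSource(new File(...)), and the result written to a file that Word can open.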
I found something very interesting and simple. We just need to create a simple XML template for the document we want to create, then programmatically change the contents of the XML file and save it as an MS Word document.
You can find the XML template and the code here.
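The idea can be sketched in Java roughly as follows. This is not the linked code: the ${name} placeholder syntax, the class WordXmlTemplate, and the inline template are assumptions made for illustration. The template uses the Word 2003 XML format, and values are XML-escaped before insertion so that characters such as & do not break the document.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration.
final class WordXmlTemplate {

    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Replaces every ${key} placeholder in the template with the escaped value. */
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("${" + entry.getKey() + "}", escapeXml(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumed template; a real one would be saved from Word as XML
        // and then edited to contain the placeholders.
        String template =
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
          + "<?mso-application progid=\"Word.Document\"?>\n"
          + "<w:wordDocument xmlns:w=\"http://schemas.microsoft.com/office/word/2003/wordml\">"
          + "<w:body><w:p><w:r><w:t>Hello ${name} from ${city}</w:t></w:r></w:p></w:body>"
          + "</w:wordDocument>";

        // Example values; in practice these would come from the parsed HTML page.
        Map<String, String> values = new HashMap<>();
        values.put("name", "Alice");
        values.put("city", "Paris");

        Files.write(Paths.get("output.doc"),
                    fill(template, values).getBytes(StandardCharsets.UTF_8));
    }
}
```

One thing to watch for with templates saved from Word: Word may split a placeholder across several runs, in which case a plain text replacement will not find it until the template XML is cleaned up by hand.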