TransformerFactory - avoiding network lookups to verify DTDs

ぐ巨炮叔叔 提交于 2019-12-05 20:56:37

Ok, here is the answer which works for me.

1st step : load the original document, turning off validation and dtd loading within the factory.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// stop the network loading of DTD files
factory.setValidating(false);
factory.setNamespaceAware(true);
factory.setFeature("http://xml.org/sax/features/namespaces", false);
factory.setFeature("http://xml.org/sax/features/validation", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
// open up the xml document
DocumentBuilder docbuilder = factory.newDocumentBuilder();
Document doc = docbuilder.parse(new FileInputStream(m_strFilePath));

2nd step : Now that I have got the document in memory ... and after having detected that I need to transform it -

TransformerFactory transformfactory = TransformerFactory.newInstance();
Templates xsl = transformfactory.newTemplates(new StreamSource(new FileInputStream((String)m_XslFile)));
Transformer transformer = xsl.newTransformer();
Document newdoc = docbuilder.newDocument();
Result XmlResult = new DOMResult(newdoc);
// now transform
transformer.transform(
        new DOMSource(doc.getDocumentElement()),
        XmlResult);

I needed to do this as I have further processing going on afterwards and did not want the overhead of outputting to file and reloading.

Little explanation :

The trick is to use the original DOM object which has had all the validation features turned off. You can see this here :

transformer.transform(
        new DOMSource(doc.getDocumentElement()),  // <<-----
        XmlResult);

This has been tested with network access TURNED OFF. So I know that there are no more network lookups.

However, if the DTDs, MODs, etc are available locally, then, as per the suggestions, the use of an EntityResolver is the answer. This to be applied, again, to the original docbuilder object.

I now have a transformed document stored in newdoc, ready to play with.

I hope this will help others.

You can use a library like Apache xml-commons-resolver and write a catalog file to map web URLs to your local copy of the relevant files. To wire this catalog up to the transformer mechanism you would need to use a SAXSource instead of a StreamSource as the source of your stylesheet:

SAXSource styleSource = new SAXSource(new InputSource("file:/path/to/stylesheet.xsl"));
CatalogResolver resolver = new CatalogResolver();
styleSource.getXMLReader().setEntityResolver(resolver);
TransformerFactory tf = TransformerFactory.newInstance();
tf.setURIResolver(resolver);
Transformer transformer = tf.newTransformer(styleSource);

The usual way to do this in Java is to use an LSResourceResolver to resolve the system ID (and/or public ID) to your local file. This is documented at http://docs.oracle.com/javase/7/docs/api/org/w3c/dom/ls/LSResourceResolver.html. You shouldn't need anything outside of standard Java XML parser features to get this working.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!