Efficient XSLT pipeline in Java (or redirecting Results to Sources)

Posted by 纵饮孤独 on 2019-11-27 18:49:59
Mads Hansen

I found this: "#3. Chaining Transformations", which shows two ways to use the TransformerFactory to chain transformations, with the result of one transform feeding the next and the final output going to System.out. This avoids the need for an intermediate serialization to a String, file, etc. between transforms.

When multiple, successive transformations are required to the same XML document, be sure to avoid unnecessary parsing operations. I frequently run into code that transforms a String to another String, then transforms that String to yet another String. Not only is this slow, but it can consume a significant amount of memory as well, especially if the intermediate Strings aren't allowed to be garbage collected.

Most transformations are based on a series of SAX events. A SAX parser will typically parse an InputStream or another InputSource into SAX events, which can then be fed to a Transformer. Rather than having the Transformer output to a File, String, or another such Result, a SAXResult can be used instead. A SAXResult accepts a ContentHandler, which can pass these SAX events directly to another Transformer, etc.
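To make the event flow described above concrete, here is a minimal sketch in which a SAX parser pushes its events straight into a TransformerHandler, with no identity Transformer in between. The class name SaxFeed and the inline stylesheet are assumptions made up for illustration; they do not come from the linked article.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SaxFeed {

    // Hypothetical stylesheet for illustration: rewrites <a> as <b>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='a'><b><xsl:value-of select='.'/></b></xsl:template>"
      + "</xsl:stylesheet>";

    public static String transform(String inputXml) throws Exception {
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();

        // A TransformerHandler is a ContentHandler, so it can receive SAX events directly.
        TransformerHandler handler =
            stf.newTransformerHandler(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        // XSLT matching relies on namespace information, so the parser must be namespace-aware.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();

        // The parser turns the input into SAX events and hands them to the handler.
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(inputXml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("<a>hi</a>"));
    }
}
```

To chain a second transformation, the StreamResult would be swapped for a SAXResult wrapping another TransformerHandler.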

Here is one approach, and the one I usually prefer as it provides more flexibility for various input and output sources. It also makes it fairly easy to create a transformation chain dynamically and with a variable number of transformations.

SAXTransformerFactory stf = (SAXTransformerFactory)TransformerFactory.newInstance();

// These Templates objects are thread-safe; they could be cached and reused, or obtained from elsewhere.
Templates templates1 = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("MyStylesheet1.xslt")));
Templates templates2 = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("MyStylesheet2.xslt")));

TransformerHandler th1 = stf.newTransformerHandler(templates1);
TransformerHandler th2 = stf.newTransformerHandler(templates2);

th1.setResult(new SAXResult(th2));
th2.setResult(new StreamResult(System.out));

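// A Transformer created without a stylesheet performs an identity transform;
// here it only parses the input and pushes SAX events into th1.
Transformer t = stf.newTransformer();
t.transform(new StreamSource(System.in), new SAXResult(th1));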

// th1 feeds th2, which in turn feeds System.out.
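
The remark above about building a chain dynamically, with a variable number of transformations, could be sketched roughly as follows. The class XsltChain, its run method, and the rename helper that generates toy stylesheets are hypothetical names introduced for illustration only; the wiring itself uses the same SAXTransformerFactory calls as the snippet above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Result;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltChain {

    /** Runs the input through every stylesheet in order, without intermediate serialization. */
    public static String run(String inputXml, List<String> stylesheets)
            throws TransformerException {
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
        StringWriter out = new StringWriter();

        // Build the chain back to front: the last handler writes to the final result,
        // and each earlier handler feeds the one after it.
        Result next = new StreamResult(out);
        for (int i = stylesheets.size() - 1; i >= 0; i--) {
            Templates templates = stf.newTemplates(
                new StreamSource(new StringReader(stylesheets.get(i))));
            TransformerHandler handler = stf.newTransformerHandler(templates);
            handler.setResult(next);
            next = new SAXResult(handler);
        }

        // An identity transformer parses the input and pushes SAX events into the head of the chain.
        stf.newTransformer().transform(new StreamSource(new StringReader(inputXml)), next);
        return out.toString();
    }

    /** Hypothetical helper: a toy stylesheet that rewrites element 'from' as element 'to'. */
    public static String rename(String from, String to) {
        return "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:output omit-xml-declaration='yes'/>"
             + "<xsl:template match='" + from + "'><" + to + ">"
             + "<xsl:value-of select='.'/></" + to + "></xsl:template>"
             + "</xsl:stylesheet>";
    }

    public static void main(String[] args) throws TransformerException {
        // Two stages here, but the list can hold any number of stylesheets.
        System.out.println(run("<a>hi</a>", Arrays.asList(rename("a", "b"), rename("b", "c"))));
    }
}
```

In real code the Templates objects would more likely be compiled once and cached instead of being rebuilt on every call.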

Your best bet is to stick to DOM as you're doing, because an XSLT processor has to build a tree anyway. Streaming is only an option for a very limited category of transforms, and few if any processors can detect such cases automatically and switch to a streaming-only implementation; otherwise they just read the input and build the tree.
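
For comparison, a DOM-based hand-off between two transforms might look like the sketch below. The asker's actual code is not shown here, so this is only an assumption about what "sticking to DOM" could mean: the first transform writes into a DOMResult and the second reads that tree through a DOMSource. The class name DomChain and the rename helper are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class DomChain {

    public static String run(String inputXml, String xslt1, String xslt2)
            throws TransformerException {
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t1 = tf.newTransformer(new StreamSource(new StringReader(xslt1)));
        Transformer t2 = tf.newTransformer(new StreamSource(new StringReader(xslt2)));

        // The first transform builds its output as an in-memory DOM tree
        // instead of serializing it to text.
        DOMResult intermediate = new DOMResult();
        t1.transform(new StreamSource(new StringReader(inputXml)), intermediate);

        // The second transform reads that tree directly, so nothing is re-parsed.
        StringWriter out = new StringWriter();
        t2.transform(new DOMSource(intermediate.getNode()), new StreamResult(out));
        return out.toString();
    }

    /** Hypothetical helper: a toy stylesheet that rewrites element 'from' as element 'to'. */
    public static String rename(String from, String to) {
        return "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:output omit-xml-declaration='yes'/>"
             + "<xsl:template match='" + from + "'><" + to + ">"
             + "<xsl:value-of select='.'/></" + to + "></xsl:template>"
             + "</xsl:stylesheet>";
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(run("<a>hi</a>", rename("a", "b"), rename("b", "c")));
    }
}
```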

Vadzim

A related question, "Efficient XSLT pipeline, with params, in Java", clarifies how to pass parameters correctly to such a transformer chain.

It also hints at a slightly shorter solution that does not need a third transformer:

SAXTransformerFactory stf = (SAXTransformerFactory)TransformerFactory.newInstance();

Templates templates1 = stf.newTemplates(new StreamSource(
        getClass().getResourceAsStream("MyStylesheet1.xslt")));
Templates templates2 = stf.newTemplates(new StreamSource(
        getClass().getResourceAsStream("MyStylesheet2.xslt")));

TransformerHandler th1 = stf.newTransformerHandler(templates1);
TransformerHandler th2 = stf.newTransformerHandler(templates2);

th2.setResult(new StreamResult(System.out));

// Note that indent, etc. should be applied to the last transformer in the chain:
th2.getTransformer().setOutputProperty(OutputKeys.INDENT, "yes");

th1.getTransformer().transform(new StreamSource(System.in), new SAXResult(th2));
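
As for the parameter passing mentioned above, one way it might look is to set each stage's parameters on the Transformer behind its TransformerHandler, much as the output property is set on th2 in the snippet above. This is a sketch under assumptions: the class ChainWithParams, the two inline stylesheets, and the parameter names prefix and suffix are all invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ChainWithParams {

    // Hypothetical stage 1: rewrites <a> as <b> and prepends $prefix to its text.
    static final String XSLT1 =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='prefix'/>"
      + "<xsl:template match='a'><b><xsl:value-of select='concat($prefix, .)'/></b></xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical stage 2: rewrites <b> as <c> and appends $suffix to its text.
    static final String XSLT2 =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:param name='suffix'/>"
      + "<xsl:template match='b'><c><xsl:value-of select='concat(., $suffix)'/></c></xsl:template>"
      + "</xsl:stylesheet>";

    public static String run(String inputXml, String prefix, String suffix)
            throws TransformerException {
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler th1 = stf.newTransformerHandler(
            stf.newTemplates(new StreamSource(new StringReader(XSLT1))));
        TransformerHandler th2 = stf.newTransformerHandler(
            stf.newTemplates(new StreamSource(new StringReader(XSLT2))));

        // Each stage has its own Transformer, so parameters are set per stage.
        th1.getTransformer().setParameter("prefix", prefix);
        th2.getTransformer().setParameter("suffix", suffix);

        StringWriter out = new StringWriter();
        th2.setResult(new StreamResult(out));

        // The first stage's Transformer reads the input and feeds the second stage.
        th1.getTransformer().transform(
            new StreamSource(new StringReader(inputXml)), new SAXResult(th2));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(run("<a>x</a>", "pre-", "-post"));
    }
}
```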