HTML DOM Tree to String - Transformer NullPointerException

回眸只為那壹抹淺笑 提交于 2019-12-11 19:13:55

问题


I'm trying to convert the content of an org.w3c.dom.Document object into a string. I get the Document object of the current page displayed in the JBrowser component. The most common way to convert a document dom tree into a string seems to be using a javax.xml.transform.Transformer. So I implemented this:

ByteArrayOutputStream baos = new ByteArrayOutputStream();

TransformerFactory.newInstance().newTransformer().transform(
            new DOMSource(aDocument), new StreamResult(baos));

return baos.toString();

This works for simple websites, but the more complex they get the higher the probability of me getting this exception:

    ERROR:  ''
05.07.2012 10:17:09 com.de.test.Demonstrator$1 run
FATAL: null
javax.xml.transform.TransformerException: java.lang.NullPointerException
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:717)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at com.de.test.DocumentUtils.toHTML(DocumentUtils.java:47)
at com.de.test.Demonstrator$1.run(Demonstrator.java:172)
Caused by: java.lang.NullPointerException
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:178)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:132)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:94)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:662)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:708)
... 3 more
---------
java.lang.NullPointerException
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:178)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:132)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:94)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:662)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:708)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at com.de.test.DocumentUtils.toHTML(DocumentUtils.java:47)
at com.de.test.Demonstrator$1.run(Demonstrator.java:172)

After some research found out about the hint, that some text elements might be null and that this causes the Transformer to crash. So I did just that:

    public static final void traverseLevel(TreeWalker walker, Document aDocument, String indent)
{
    // describe current node:
    Node parent = walker.getCurrentNode();

    if (parent != null && parent.getNodeValue() == null)
        parent.setNodeValue(" ");

    System.out.println(indent + "- <" + ((Element) parent).getTagName() + ">" + parent.getNodeValue());

    // traverse children:
    for (Node n = walker.firstChild(); n != null; n = walker.nextSibling())
    {
        if(n != null)
            traverseLevel(walker, aDocument, indent + '\t');
    }

    System.out.println("</"+ ((Element) parent).getTagName() + ">");

    // return position to the current (level up):
    walker.setCurrentNode(parent);
}

This is where I found out that "parent.getNodeValue()" always returns null. The funny thing is, that the problem also occurs on simple websites, but the transformer still outputs the tree's values. Any idea what's wrong with my replacement of null text nodes? Are there other potential problems that might cause this issue?

Thanks!


回答1:


Ok, I found a workaround for my problem. I've changed the JBrowser to the DJ Project browser which has a convertToHtml() function. I could not solve the Transformer problem that's why I've taken this route.



来源:https://stackoverflow.com/questions/11340779/html-dom-tree-to-string-transformer-nullpointerexception

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!