I am parsing an XML file in Java using the W3C DOM. I am stuck on a specific problem: I can't figure out how to get the whole inner XML of a node.
The node looks lik
er... you could also call toString() and just chop off the beginning and end tags, either manually or using regexps.
edit: toString() doesn't do what I expected. I pulled out the O'Reilly Java & XML book, which talks about the Load and Save module of the Java DOM.
See in particular the LSSerializer which looks very promising. You could either call writeToString(node) and chop off the beginning and end tags, as I suggested, or try to use LSSerializerFilter to not print the top node tags (not sure if that would work; I admit I've never used LSSerializer before.)
Reading the O'Reilly book seems to indicate doing something like this:
DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementationLS lsImpl =
(DOMImplementationLS)registry.getDOMImplementation("LS");
LSSerializer serializer = lsImpl.createLSSerializer();
String nodeString = serializer.writeToString(node);
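To get only the inner XML rather than the node with its own tags, one option is to serialize each child separately and concatenate the results. The following is a minimal sketch, not a definitive implementation: the class name LsInnerXml and the helpers innerXml and parse are hypothetical, and it assumes the DOM implementation behind the document supports the "LS" feature and the "xml-declaration" serializer parameter.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

class LsInnerXml {
    // Hypothetical helper: serializes every child of `node` and joins the pieces.
    static String innerXml(Node node) {
        Document doc = node.getNodeType() == Node.DOCUMENT_NODE
                ? (Document) node
                : node.getOwnerDocument();
        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        // Ask the serializer not to prepend an <?xml ...?> declaration.
        serializer.getDomConfig().setParameter("xml-declaration", false);

        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(serializer.writeToString(children.item(i)));
        }
        return sb.toString();
    }

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<td><b>bold</b> text <i>italic</i></td>");
        // Prints the children of <td> without the <td> tags themselves.
        System.out.println(innerXml(doc.getDocumentElement()));
    }
}
```

Serializing child by child avoids chopping tags off a string, which tends to be fragile when the outer element carries attributes.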
node.getTextContent();
You ought to be using JDOM or dom4j to handle nodes, if for no other reason than to handle whitespace correctly.
To remove unnecessary tags, code like this can probably be used:
DOMConfiguration config = serializer.getDomConfig();
config.setParameter("canonical-form", true);
But it will not always work, because support for "canonical-form" = true is optional, so an implementation may reject it.
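Because the parameter is optional, it seems safer to ask the serializer before setting it. The sketch below uses DOMConfiguration.canSetParameter for that; the class name SerializerConfigCheck, the helper names, and the fallback to "xml-declaration" = false are assumptions of this example, not something taken from the answer above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

class SerializerConfigCheck {
    // Hypothetical helper: builds a serializer and applies only supported settings.
    static LSSerializer configuredSerializer(Document doc) {
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        DOMConfiguration config = serializer.getDomConfig();

        if (config.canSetParameter("canonical-form", Boolean.TRUE)) {
            // Supported here, so it is safe to set.
            config.setParameter("canonical-form", Boolean.TRUE);
        } else if (config.canSetParameter("xml-declaration", Boolean.FALSE)) {
            // Assumed fallback: at least drop the <?xml ...?> declaration.
            config.setParameter("xml-declaration", Boolean.FALSE);
        }
        return serializer;
    }

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<td><b>bold</b></td>");
        LSSerializer serializer = configuredSerializer(doc);
        System.out.println(serializer.writeToString(doc.getDocumentElement()));
    }
}
```

Checking first avoids the DOMException that setParameter may throw for an unsupported name and value combination.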
I know this was asked long ago but for the next person searching (was me today), this works with JDOM:
JDOMXPath xpath = new JDOMXPath("/td");
String innerXml = (new XMLOutputter()).outputString(xpath.selectNodes(document));
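This passes the list of selected nodes into outputString, which will serialize them out in order. Note that "/td" selects the td element itself; to select only its child nodes, the expression would presumably need to be "/td/node()".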
What do you say about this? I had the same problem today on Android, but I managed to write a simple serializer:
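// Returns the serialized children of the node, without the node's own tags.
private String innerXml(Node node) {
    StringBuilder sb = new StringBuilder();
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
        sb.append(serializeNode(children.item(i)));
    }
    return sb.toString();
}

// Serializes text and element nodes only; special characters in text and
// attribute values are not escaped, and comments or CDATA are not handled.
private String serializeNode(Node node) {
    if (node.getNodeType() == Node.TEXT_NODE) return node.getTextContent();
    StringBuilder sb = new StringBuilder("<").append(node.getNodeName());
    NamedNodeMap attributes = node.getAttributes();
    if (attributes != null) {
        for (int i = 0; i < attributes.getLength(); i++) {
            // The leading space keeps consecutive attributes from running together.
            sb.append(' ').append(attributes.item(i).getNodeName())
              .append("=\"").append(attributes.item(i).getNodeValue()).append('"');
        }
    }
    NodeList children = node.getChildNodes();
    if (children == null || children.getLength() == 0) {
        return sb.append("/>").toString();
    }
    sb.append('>');
    for (int i = 0; i < children.getLength(); i++) {
        sb.append(serializeNode(children.item(i)));
    }
    return sb.append("</").append(node.getNodeName()).append('>').toString();
}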
You can use the transform/XSLT API (javax.xml.transform), with your <b> node as the source to be transformed, and put the result into a new StreamResult(new StringWriter()). See how-to-pretty-print-xml-from-java.
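A minimal sketch of that approach, assuming an identity Transformer (one created without a stylesheet) is enough. The class name TransformerInnerXml and the helpers innerXml and parse are hypothetical. Each child is transformed into the same StringWriter so that the parent's own tags are left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TransformerInnerXml {
    // Hypothetical helper: writes every child of `node` through an identity transform.
    static String innerXml(Node node) {
        try {
            // newTransformer() without a stylesheet copies the source to the result.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Without this, each transform call may emit its own XML declaration.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                transformer.transform(new DOMSource(children.item(i)), new StreamResult(out));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<td><b>bold</b> text <i>italic</i></td>");
        System.out.println(innerXml(doc.getDocumentElement()));
    }
}
```

Passing the node itself as the DOMSource instead of its children would give the outer XML, including the node's own start and end tags.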