I have a file that contains multiple root elements. How can I extract the root elements one by one?
This is my XML
If your XML is valid, use a SAX or DOM parser. Please consult the XML Developer's Kit Programmer's Guide for more details.
You cannot parse your file with an XML parser because it is not well-formed XML: an XML document must have exactly one root element.
You have to treat it as text, repair it to be well-formed, and then you can parse it with an XML parser.
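As a rough illustration of the "repair it as text" idea, here is a minimal sketch. It assumes the file is small enough to hold in memory as a String; the class name RepairAsText and the wrapper element name "wrapper" are made up for this example. It strips any XML declarations, wraps the text in one synthetic root, and parses the result.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class RepairAsText {

    // Makes multi-root text well-formed by wrapping it in one synthetic element.
    // "wrapper" is a hypothetical name; choose one that cannot clash with your data.
    static String repair(String text) {
        // An XML declaration is only legal at the very start of a document,
        // so drop any that appear in the concatenated text.
        String body = text.replaceAll("<\\?xml\\s[^?]*\\?>", "");
        return "<wrapper>" + body + "</wrapper>";
    }

    // Returns the element names of the original top-level elements, in order.
    static List<String> rootNames(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(repair(text))));
        List<String> names = new ArrayList<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes between the elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String text = "<person><name>A</name></person>\n<person><name>B</name></person>";
        System.out.println(rootNames(text)); // one entry per original root element
    }
}
```

For a real file you would first read its content into a String (for example with java.nio.file.Files) and pass that to repair.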
Use java.io.SequenceInputStream to wrap the file content in a synthetic root element, so the XML parser sees a single well-formed document:
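import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MultiRootXML {
    public static void main(String[] args) throws Exception {
        // Surround the file with <root> ... </root> without modifying it on disk.
        // This fails if persons.xml starts with an <?xml ...?> declaration,
        // because the declaration would no longer be at the start of the input.
        List<InputStream> streams = Arrays.asList(
                new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8)),
                new FileInputStream("persons.xml"),
                new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8))
        );
        // Closing the SequenceInputStream also closes the underlying streams.
        try (InputStream is = new SequenceInputStream(Collections.enumeration(streams))) {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
            // Each element child of the synthetic root is one of the original root elements.
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes between the elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    System.out.println("person: " + child.getNodeName());
                }
            }
        }
    }
}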
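If you need each original root element back as its own XML string rather than as a DOM node, one option is to serialize every child with an identity Transformer. This is only a sketch: the class name SplitRoots is made up, and it takes the already wrapped XML as a String to stay self-contained.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

public class SplitRoots {

    // Takes XML that already has a single synthetic root and returns
    // each element child serialized as a separate XML string.
    static List<String> split(String wrappedXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrappedXml)));

        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> fragments = new ArrayList<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                StringWriter out = new StringWriter();
                identity.transform(new DOMSource(child), new StreamResult(out));
                fragments.add(out.toString());
            }
        }
        return fragments;
    }

    public static void main(String[] args) throws Exception {
        String wrapped = "<root><person><name>A</name></person>"
                + "<person><name>B</name></person></root>";
        for (String fragment : split(wrapped)) {
            System.out.println(fragment);
        }
    }
}
```

Each returned string is a well-formed document by itself, so it can be written to a separate file or handed to another parser.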