XPath and Java with Repeated Tags

问题

I'm having some trouble parsing an XML file in Java. The file takes the form:

<root>
  <thing>
    <name>Thing1</name>
    <property>
      <name>Property1</name>
    </property>
    ...
  </thing>
  ...
</root>

Ultimately, I would like to convert this file into a list of Thing objects, which will have a String name (Thing1) and a list of Property objects, which will each also have a name (Property1).

I've been trying to use xpaths to get this data out, but when I try to get just the name for 'thing', it gives me all of the names that appear in 'thing', including those of the 'property's. My code is:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse(filename);
XPath xpath = XPathFactory.newInstance().newXPath();


XPathExpression thingExpr = xpath.compile("//thing");
NodeList things = (NodeList)thingExpr.evaluate(dom, XPathConstants.NODESET);
for(int count = 0; count < things.getLength(); count++)
{
    Element thing = (Element)things.item(count);
    XPathExpression nameExpr = xpath.compile(".//name/text()");
    NodeList name = (NodeList) nameExpr.evaluate(thing, XPathConstants.NODESET);
    for(int i = 0; i < name.getLength(); i++)
    {
        System.out.println(name.item(i).getNodeValue());    
    }
}

Can anyone help? Thanks in advance!

回答1:

You could try something like...

public class TestXPath {

    public static void main(String[] args) {
        String xml =
                        "<root>\n"
                        + "    <thing>\n"
                        + "        <name>Thing1</name>\n"
                        + "        <property>\n"
                        + "            <name>Property1</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property2</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property3</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property4</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property5</name>\n"
                        + "        </property>\n"
                        + "    </thing>/n"
                        + "    <NoAThin>\n"
                        + "        <name>Thing2</name>\n"
                        + "        <property>\n"
                        + "            <name>Property1</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property2</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property3</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property4</name>\n"
                        + "        </property>\n"
                        + "        <property>\n"
                        + "            <name>Property5</name>\n"
                        + "        </property>\n"
                        + "    </NoAThin>/n"
                        + "</root>";

        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            ByteArrayInputStream bais = new ByteArrayInputStream(xml.getBytes());
            Document dom = db.parse(bais);
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Find the "thing" node...
            XPathExpression thingExpr = xpath.compile("/root/thing");
            NodeList things = (NodeList) thingExpr.evaluate(dom, XPathConstants.NODESET);

            System.out.println("Found " + things.getLength() + " thing nodes...");

            // Find the property nodes of thing
            XPathExpression expr = xpath.compile("property");
            NodeList nodes = (NodeList) expr.evaluate(things.item(0), XPathConstants.NODESET);

            System.out.println("Found " + nodes.getLength() + " thing/property nodes...");

            // Find all the property "name" nodes under thing
            expr = xpath.compile("property/name");
            nodes = (NodeList) expr.evaluate(things.item(0), XPathConstants.NODESET);

            System.out.println("Found " + nodes.getLength() + " name nodes...");
            System.out.println("Property value = " + nodes.item(0).getTextContent());

            // Find all nodes that have property nodes
            XPathExpression exprAll = xpath.compile("/root/*/property");
            NodeList nodesAll = (NodeList) exprAll.evaluate(dom, XPathConstants.NODESET);
            System.out.println("Found " + nodesAll.getLength() + " property nodes...");

        } catch (Exception exp) {
            exp.printStackTrace();
        }
    }
}

Which will give you an output of something like

Found 1 thing nodes...
Found 5 thing/property nodes...
Found 5 name nodes...
Property value = Property1
Found 10 property nodes...

回答2:

How about "//thing/name/text()" ?

The double slashes you have now before name mean "anywhere in the tree, not necessarily direct child nodes".

回答3:

Use these XPath expressions:

//thing[name='Thing1']

this selects any thing element in the XML document, that has a name child whose string value is "Thing1".

also use:

//property[name='Property1']

this selects any property element in the XML document, that has a name child whose string value is "Property1".

Update:

To get all text nodes, each containing a string value of a thing element, just do:

//thing/text()

In XPath 2.0 one can get a sequence of the strings themselves, using:

//thing/string(.)

This isn't possible with a single XPath expression, but one can get the string value of a particular (the n-th)thing element like this:

string((//thing)[$n])

where $n must be substituted with a specific number from 1 to count(//thing) .

So, in your prograaming language, you can first determine cnt by evaluating this XPath expression:

count(//thing)

And then in a loop for $n from 1 to cnt dynamically produce the xpath expression and evaluate it:

string((//thing)[$n])

Exactly the same goes for obtaining all the values for property elements.

来源：https://stackoverflow.com/questions/12966127/xpath-and-java-with-repeated-tags

标签

java

xml

xpath