How to read XML using XPath in Java

无人及你 2020-11-21 05:38

I want to read XML data using XPath in Java. With the information I have gathered so far, I have not been able to parse the XML the way I need.

Here is what I want to do:

8 Answers
  • 2020-11-21 05:55

    This shows you how to

    1. Read in an XML file to a DOM
    2. Filter out a set of Nodes with XPath
    3. Perform a certain action on each of the extracted Nodes.

    The code is called with a statement like this:

    processFilteredXml(xmlIn, xpathExpr,(node) -> {/*Do something...*/;});
    

    In this example we print the creatorName elements of book.xml: the XPath "//book/creators/creator/creatorName" selects the nodes, and printNode is called on each node that matches.

    Full code

    @Test
    public void printXml() {
        try (InputStream in = readFile("book.xml")) {
            processFilteredXml(in, "//book/creators/creator/creatorName", (node) -> {
                printNode(node, System.out);
            });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    
    private InputStream readFile(String yourSampleFile) {
        return Thread.currentThread().getContextClassLoader().getResourceAsStream(yourSampleFile);
    }
    
    private void processFilteredXml(InputStream in, String xpath, Consumer<Node> process) {
        Document doc = readXml(in);
        NodeList list = filterNodesByXPath(doc, xpath);
        for (int i = 0; i < list.getLength(); i++) {
            Node node = list.item(i);
            process.accept(node);
        }
    }
    
    public Document readXml(InputStream xmlin) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            return db.parse(xmlin);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    
    private NodeList filterNodesByXPath(Document doc, String xpathExpr) {
        try {
            XPathFactory xPathFactory = XPathFactory.newInstance();
            XPath xpath = xPathFactory.newXPath();
            XPathExpression expr = xpath.compile(xpathExpr);
            Object eval = expr.evaluate(doc, XPathConstants.NODESET);
            return (NodeList) eval;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    
    private void printNode(Node node, PrintStream out) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
            StreamResult result = new StreamResult(new StringWriter());
            DOMSource source = new DOMSource(node);
            transformer.transform(source, result);
            String xmlString = result.getWriter().toString();
            out.println(xmlString);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    

    Prints

    <creatorName>Fosmire, Michael</creatorName>
    
    <creatorName>Wertz, Ruth</creatorName>
    
    <creatorName>Purzer, Senay</creatorName>
    

    For book.xml

    <book>
      <creators>
        <creator>
          <creatorName>Fosmire, Michael</creatorName>
          <givenName>Michael</givenName>
          <familyName>Fosmire</familyName>
        </creator>
        <creator>
          <creatorName>Wertz, Ruth</creatorName>
          <givenName>Ruth</givenName>
          <familyName>Wertz</familyName>
        </creator>
        <creator>
          <creatorName>Purzer, Senay</creatorName>
          <givenName>Senay</givenName>
          <familyName>Purzer</familyName>
        </creator>
      </creators>
      <titles>
        <title>Critical Engineering Literacy Test (CELT)</title>
      </titles>
    </book>
    
  • 2020-11-21 05:58

    Getting started example:

    XML file:

    <inventory>
        <book year="2000">
            <title>Snow Crash</title>
            <author>Neal Stephenson</author>
            <publisher>Spectra</publisher>
            <isbn>0553380958</isbn>
            <price>14.95</price>
        </book>
    
        <book year="2005">
            <title>Burning Tower</title>
            <author>Larry Niven</author>
            <author>Jerry Pournelle</author>
            <publisher>Pocket</publisher>
            <isbn>0743416910</isbn>
            <price>5.99</price>
        </book>
    
        <book year="1995">
            <title>Zodiac</title>
            <author>Neal Stephenson</author>
            <publisher>Spectra</publisher>
            <isbn>0553573862</isbn>
            <price>7.50</price>
        </book>
    
        <!-- more books... -->
    
    </inventory>
    

    Java code:

    import java.io.File;
    
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;
    
    
    try {
    
        DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
        Document doc = docBuilder.parse (new File("c:\\tmp\\my.xml"));
    
        // normalize text representation
        doc.getDocumentElement().normalize();
        System.out.println ("Root element of the doc is " + doc.getDocumentElement().getNodeName());
    
        NodeList listOfBooks = doc.getElementsByTagName("book");
        int totalBooks = listOfBooks.getLength();
        System.out.println("Total no of books : " + totalBooks);
    
        for (int i = 0; i < listOfBooks.getLength(); i++) {
    
            Node bookNode = listOfBooks.item(i);
            if (bookNode.getNodeType() == Node.ELEMENT_NODE) {
    
                Element bookElement = (Element) bookNode;
                System.out.println("Year :" + bookElement.getAttribute("year"));
    
                // Each book has one <title>; print its text content
                NodeList titleList = bookElement.getElementsByTagName("title");
                Element titleElement = (Element) titleList.item(0);
                System.out.println("title : " + titleElement.getTextContent().trim());
            }
        } // end of loop over books
    } catch (SAXParseException err) {
        System.out.println ("** Parsing error" + ", line " + err.getLineNumber () + ", uri " + err.getSystemId ());
        System.out.println(" " + err.getMessage ());
    } catch (SAXException e) {
        Exception x = e.getException ();
        ((x == null) ? e : x).printStackTrace ();
    } catch (Throwable t) {
        t.printStackTrace ();
    }                
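
    The code above walks the DOM with getElementsByTagName rather than XPath. Since the question asks about XPath, here is a rough sketch of the same listing done with XPath expressions. The class name InventoryXPath, the method describeBooks and the shortened inline copy of the inventory document are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class InventoryXPath {

    // Shortened copy of the inventory document, inlined so the sketch runs without a file (assumption).
    static final String XML =
        "<inventory>"
      + "<book year=\"2000\"><title>Snow Crash</title><price>14.95</price></book>"
      + "<book year=\"2005\"><title>Burning Tower</title><price>5.99</price></book>"
      + "<book year=\"1995\"><title>Zodiac</title><price>7.50</price></book>"
      + "</inventory>";

    /** Returns one "year: title" line per book selected by the given XPath expression. */
    public static String describeBooks(String bookXPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList books = (NodeList) xpath.evaluate(bookXPath, doc, XPathConstants.NODESET);
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                // A relative expression is evaluated against the current <book> element.
                String title = xpath.evaluate("title", book);
                out.append(book.getAttribute("year")).append(": ").append(title).append('\n');
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(describeBooks("/inventory/book"));
        // A predicate narrows the selection, here to books published after 2000.
        System.out.print(describeBooks("/inventory/book[@year > 2000]"));
    }
}
```

    The predicate in the second call does the filtering that would otherwise need an if statement inside the loop.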
    
  • 2020-11-21 06:09

    You can try this.

    XML Document

    Save as employees.xml.

    <?xml version="1.0" encoding="UTF-8"?>
    <Employees>
        <Employee id="1">
            <age>29</age>
            <name>Pankaj</name>
            <gender>Male</gender>
            <role>Java Developer</role>
        </Employee>
        <Employee id="2">
            <age>35</age>
            <name>Lisa</name>
            <gender>Female</gender>
            <role>CEO</role>
        </Employee>
        <Employee id="3">
            <age>40</age>
            <name>Tom</name>
            <gender>Male</gender>
            <role>Manager</role>
        </Employee>
        <Employee id="4">
            <age>25</age>
            <name>Meghan</name>
            <gender>Female</gender>
            <role>Manager</role>
        </Employee>
    </Employees>
    

    Parser class

    The class has the following methods:

    • A method that returns the employee name for a given ID.
    • A method that returns the names of employees older than a given age.
    • A method that returns the names of all female employees.

    Source Code

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathExpressionException;
    import javax.xml.xpath.XPathFactory;
    
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;
    
    
    public class Parser {
    
        public static void main(String[] args) {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder;
            Document doc = null;
            try {
                builder = factory.newDocumentBuilder();
                doc = builder.parse("employees.xml");
    
                // Create XPathFactory object
                XPathFactory xpathFactory = XPathFactory.newInstance();
    
                // Create XPath object
                XPath xpath = xpathFactory.newXPath();
    
                String name = getEmployeeNameById(doc, xpath, 4);
                System.out.println("Employee Name with ID 4: " + name);
    
                List<String> names = getEmployeeNameWithAge(doc, xpath, 30);
                System.out.println("Employees with 'age>30' are:" + Arrays.toString(names.toArray()));
    
                List<String> femaleEmps = getFemaleEmployeesName(doc, xpath);
                System.out.println("Female Employees names are:" +
                        Arrays.toString(femaleEmps.toArray()));
    
            } catch (ParserConfigurationException | SAXException | IOException e) {
                e.printStackTrace();
            }
    
        }
    
    
        private static List<String> getFemaleEmployeesName(Document doc, XPath xpath) {
            List<String> list = new ArrayList<>();
            try {
                //create XPathExpression object
                XPathExpression expr =
                    xpath.compile("/Employees/Employee[gender='Female']/name/text()");
                //evaluate expression result on XML document
                NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++)
                    list.add(nodes.item(i).getNodeValue());
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }
            return list;
        }
    
    
        private static List<String> getEmployeeNameWithAge(Document doc, XPath xpath, int age) {
            List<String> list = new ArrayList<>();
            try {
                XPathExpression expr =
                    xpath.compile("/Employees/Employee[age>" + age + "]/name/text()");
                NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++)
                    list.add(nodes.item(i).getNodeValue());
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }
            return list;
        }
    
    
        private static String getEmployeeNameById(Document doc, XPath xpath, int id) {
            String name = null;
            try {
                XPathExpression expr =
                    xpath.compile("/Employees/Employee[@id='" + id + "']/name/text()");
                name = (String) expr.evaluate(doc, XPathConstants.STRING);
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }
    
            return name;
        }
    
    }
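
    The methods above build the XPath string by concatenation, which is fine for int arguments. If the value were a string from outside the program, a quote inside it could break the expression. One possible alternative is an XPath variable bound through XPathVariableResolver, sketched below. The class name EmployeeLookup, the method nameById and the shortened inline document are assumptions for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EmployeeLookup {

    // Shortened copy of employees.xml, inlined for the sketch (assumption).
    static final String XML =
        "<Employees>"
      + "<Employee id=\"1\"><age>29</age><name>Pankaj</name></Employee>"
      + "<Employee id=\"4\"><age>25</age><name>Meghan</name></Employee>"
      + "</Employees>";

    /** Returns the name of the employee with the given id, or "" if there is none. */
    public static String nameById(String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind $id to the caller's value; returning null signals that no such variable exists.
            xpath.setXPathVariableResolver(
                    qname -> "id".equals(qname.getLocalPart()) ? id : null);
            // The value is passed as data, so it is never parsed as part of the expression.
            return xpath.evaluate("/Employees/Employee[@id = $id]/name", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameById("4"));
    }
}
```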
    
  • 2020-11-21 06:17

    You need something along the lines of this:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(<uri_as_string>);
    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    XPathExpression expr = xpath.compile(<xpath_expression>);
    

    Then call expr.evaluate(), passing in the document and the return type you expect (one of the XPathConstants values), and cast the result to the matching Java type.
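
    As a rough illustration of how the return type and the cast go together, the sketch below evaluates four expressions against a small made-up catalog document. The class name ReturnTypes, the helper eval and the XML are assumptions for the example; the XPathConstants values and the Java types they map to are part of the standard javax.xml.xpath API.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ReturnTypes {

    // Hypothetical sample document, not taken from the question.
    static final String XML =
        "<catalog><item price=\"2.5\">pen</item><item price=\"4\">ink</item></catalog>";

    /** Compiles the expression and evaluates it against the sample document. */
    public static Object eval(String expression, QName returnType) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPathExpression expr = XPathFactory.newInstance().newXPath().compile(expression);
            return expr.evaluate(doc, returnType);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // STRING -> String, NUMBER -> Double, BOOLEAN -> Boolean, NODESET -> NodeList
        String first = (String) eval("/catalog/item[1]", XPathConstants.STRING);
        Double total = (Double) eval("sum(/catalog/item/@price)", XPathConstants.NUMBER);
        Boolean several = (Boolean) eval("count(/catalog/item) > 1", XPathConstants.BOOLEAN);
        NodeList items = (NodeList) eval("/catalog/item", XPathConstants.NODESET);
        System.out.println(first + " " + total + " " + several + " " + items.getLength());
    }
}
```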

    If you need help with a specific XPath expression, you should probably ask that as a separate question (unless that was your question in the first place; I understood it to be about how to use the API in Java).

    Edit: (Response to comment): This XPath expression will get you the text of the first URL element under PowerBuilder:

    /howto/topic[@name='PowerBuilder']/url/text()
    

    This will get you the second:

    /howto/topic[@name='PowerBuilder']/url[2]/text()
    

    You get that with this code:

    expr.evaluate(doc, XPathConstants.STRING);
    

    If you don't know how many URLs a given node contains, it is better to do something like this:

    XPathExpression expr = xpath.compile("/howto/topic[@name='PowerBuilder']/url");
    NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    

    And then loop over the NodeList.
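
    The loop itself could look roughly like the sketch below. The howto document used here is a guess at the structure implied by the expressions above (the original XML is not shown), and the class name UrlList and the method texts are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UrlList {

    // Assumed document shape; the example.org URLs are placeholders.
    static final String XML =
        "<howto>"
      + "<topic name=\"Java\"><url>http://example.org/java</url></topic>"
      + "<topic name=\"PowerBuilder\">"
      + "<url>http://example.org/pb1</url><url>http://example.org/pb2</url>"
      + "</topic>"
      + "</howto>";

    /** Returns the text content of every node matched by the expression. */
    public static List<String> texts(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPathExpression expr = XPathFactory.newInstance().newXPath().compile(expression);
            NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            // NodeList is not Iterable, so an index-based loop is needed.
            for (int i = 0; i < nl.getLength(); i++) {
                result.add(nl.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String url : texts("/howto/topic[@name='PowerBuilder']/url")) {
            System.out.println(url);
        }
    }
}
```

    An expression that matches nothing yields an empty NodeList rather than null, so the loop needs no null check.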

  • 2020-11-21 06:17

    Expanding on the excellent answer by @bluish and @Yishai, here is how you make the NodeLists and node attributes support iterators, i.e. the for(Node n: nodelist) interface.

    Use it like:

    NodeList nl = ...
    for(Node n : XmlUtil.asList(nl))
    {...}
    

    and

    Node n = ...
    for(Node attr : XmlUtil.asList(n.getAttributes()))
    {...}
    

    The code:

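    import java.util.AbstractList;
    import java.util.Collections;
    import java.util.List;
    import java.util.RandomAccess;
    
    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    
    /**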
     * Converts NodeList to an iterable construct.
     * From: https://stackoverflow.com/a/19591302/779521
     */
    public final class XmlUtil {
        private XmlUtil() {}
    
        public static List<Node> asList(NodeList n) {
            return n.getLength() == 0 ? Collections.<Node>emptyList() : new NodeListWrapper(n);
        }
    
        static final class NodeListWrapper extends AbstractList<Node> implements RandomAccess {
            private final NodeList list;
    
            NodeListWrapper(NodeList l) {
                this.list = l;
            }
    
            public Node get(int index) {
                return this.list.item(index);
            }
    
            public int size() {
                return this.list.getLength();
            }
        }
    
        public static List<Node> asList(NamedNodeMap n) {
            return n.getLength() == 0 ? Collections.<Node>emptyList() : new NodeMapWrapper(n);
        }
    
        static final class NodeMapWrapper extends AbstractList<Node> implements RandomAccess {
            private final NamedNodeMap list;
    
            NodeMapWrapper(NamedNodeMap l) {
                this.list = l;
            }
    
            public Node get(int index) {
                return this.list.item(index);
            }
    
            public int size() {
                return this.list.getLength();
            }
        }
    }
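
    If a full List view is more than you need, a stream over the indices is another option on Java 8 or later. This is only a sketch of that alternative, not part of XmlUtil; the class name NodeStreams and both method names are invented for the example.

```java
import java.io.StringReader;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeStreams {

    /** Exposes a NodeList as a Stream by mapping each index to its node. */
    public static Stream<Node> stream(NodeList list) {
        return IntStream.range(0, list.getLength()).mapToObj(list::item);
    }

    /** Joins the text content of all nodes matched by the expression with commas. */
    public static String joinedText(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return stream(nodes).map(Node::getTextContent).collect(Collectors.joining(","));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(joinedText("<r><a>x</a><a>y</a></r>", "/r/a"));
    }
}
```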
    
  • 2020-11-21 06:18

    If you have an XML document like the one below

    <e:Envelope
        xmlns:d = "http://www.w3.org/2001/XMLSchema"
        xmlns:e = "http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:wn0 = "http://systinet.com/xsd/SchemaTypes/"
        xmlns:i = "http://www.w3.org/2001/XMLSchema-instance">
        <e:Header>
            <Friends>
                <friend>
                    <Name>Testabc</Name>
                    <Age>12121</Age>
                    <Phone>Testpqr</Phone>
                </friend>
            </Friends>
        </e:Header>
        <e:Body>
            <n0:ForAnsiHeaderOperResponse xmlns:n0 = "http://systinet.com/wsdl/com/magicsoftware/ibolt/localhost/ForAnsiHeader/ForAnsiHeaderImpl#ForAnsiHeaderOper?KExqYXZhL2xhbmcvU3RyaW5nOylMamF2YS9sYW5nL1N0cmluZzs=">
                <response i:type = "d:string">12--abc--pqr</response>
            </n0:ForAnsiHeaderOperResponse>
        </e:Body>
    </e:Envelope>
    

    and want to extract the following fragment

    <e:Header>
       <Friends>
          <friend>
             <Name>Testabc</Name>
             <Age>12121</Age>
             <Phone>Testpqr</Phone>
          </friend>
       </Friends>
    </e:Header>
    

    the following code does that:

    public static void main(String[] args) {
    
        File fXmlFile = new File("C://Users//abhijitb//Desktop//Test.xml");
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document document;
        Node result = null;
        try {
            document = dbf.newDocumentBuilder().parse(fXmlFile);
            XPath xPath = XPathFactory.newInstance().newXPath();
            String xpathStr = "//Envelope//Header";
            result = (Node) xPath.evaluate(xpathStr, document, XPathConstants.NODE);
            System.out.println(nodeToString(result));
        } catch (SAXException | IOException | ParserConfigurationException | XPathExpressionException
                | TransformerException e) {
            e.printStackTrace();
        }
    }
    
    private static String nodeToString(Node node) throws TransformerException {
        StringWriter buf = new StringWriter();
        Transformer xform = TransformerFactory.newInstance().newTransformer();
        xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        xform.transform(new DOMSource(node), new StreamResult(buf));
        return (buf.toString());
    }
    

    If you want only the content of the header, like this:

    <Friends>
       <friend>
          <Name>Testabc</Name>
          <Age>12121</Age>
          <Phone>Testpqr</Phone>
       </friend>
    </Friends>
    

    change the XPath expression from "//Envelope//Header" to "//Envelope//Header/*":

    String xpathStr = "//Envelope//Header/*";
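
    The code in this answer uses a DocumentBuilderFactory that is not namespace-aware. If the document is parsed with setNamespaceAware(true) instead, the prefixed elements have to be addressed through a NamespaceContext. A minimal sketch of that variant follows; the prefix "soap", the class name SoapHeaderXPath and the shortened inline envelope are assumptions chosen for the example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SoapHeaderXPath {

    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Shortened copy of the envelope, inlined for the sketch (assumption).
    static final String XML =
        "<e:Envelope xmlns:e=\"" + SOAP_NS + "\">"
      + "<e:Header><Friends><friend><Name>Testabc</Name><Age>12121</Age></friend></Friends></e:Header>"
      + "<e:Body/>"
      + "</e:Envelope>";

    /** Maps the prefix "soap" used in the XPath to the envelope namespace. */
    static final NamespaceContext SOAP_CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "soap".equals(prefix) ? SOAP_NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return SOAP_NS.equals(uri) ? "soap" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singleton(prefix).iterator();
        }
    };

    /** Evaluates the expression against the envelope and returns its string value. */
    public static String evaluate(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(SOAP_CONTEXT);
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Friends, friend and Name carry no prefix because they are in no namespace.
        System.out.println(evaluate("/soap:Envelope/soap:Header/Friends/friend/Name"));
    }
}
```

    The prefix in the XPath does not have to match the one in the document (e: there, soap: here); only the namespace URI has to agree.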
