XPath, XML Namespaces and Java

悲&欢浪女 2020-12-19 09:47

I've spent the past day attempting to extract a single XML node from the following document, and I am unable to grasp the nuances of XML Namespaces well enough to make it work.

The

3 Answers
  • 2020-12-19 09:48

    Aha, I tried to debug your expression + got it to work. You missed a few things. This XPath expression should do it:

    /XFDL/globalpage/global/xmlmodel/instances/instance/form_metadata/title/documentnbr/@number
    
    1. You need to include the root element (XFDL in this case)
    2. I didn't end up needing to use any namespaces in the expression for some reason. Not sure why. If this is the case, then the NamespaceContext.getNamespaceURI() never gets called. If I replace instance with xforms:instance then getNamespaceURI() gets called once with xforms as the input argument, but the program throws an exception.
    3. The syntax for attribute values is @attr, not [attr].
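
To illustrate point 3, here is a minimal sketch that contrasts the attribute step with the predicate form. The one-line XML fragment and the class and method names are made up for the illustration; they are not taken from the form in the question. In XPath 1.0, documentnbr[number] keeps only documentnbr elements that have a child element named number, so it matches nothing when number is an attribute.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AttributeVersusPredicate {

    // Hypothetical fragment, not the real form from the question.
    static final String XML = "<title><documentnbr number=\"2062\"/></title>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates the expression and returns the string value of the first match.
    static String stringValue(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, parse(xml));
    }

    // Counts the nodes selected by the expression.
    static int countMatches(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Double count = (Double) xpath.evaluate(
                "count(" + expression + ")", parse(xml), XPathConstants.NUMBER);
        return count.intValue();
    }

    public static void main(String[] args) throws Exception {
        // Attribute step: selects the attribute node itself.
        System.out.println(stringValue(XML, "/title/documentnbr/@number"));
        // Predicate on a child element named "number": no such child exists.
        System.out.println(countMatches(XML, "/title/documentnbr[number]"));
        // Predicate on the attribute: selects the element, not the attribute.
        System.out.println(countMatches(XML, "/title/documentnbr[@number]"));
    }
}
```

The third expression is valid too, but it returns the documentnbr element rather than the value of its number attribute.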

    My complete sample code:

    import java.io.File;
    import java.io.IOException;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;
    
    import javax.xml.XMLConstants;
    import javax.xml.namespace.NamespaceContext;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpressionException;
    import javax.xml.xpath.XPathFactory;
    
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.xml.sax.SAXException;
    
    public class XPathNamespaceExample {
        static public class MyNamespaceContext implements NamespaceContext {
            final private Map<String, String> prefixMap;
            MyNamespaceContext(Map<String, String> prefixMap)
            {
                if (prefixMap != null)
                {
                    this.prefixMap = Collections.unmodifiableMap(new HashMap<String, String>(prefixMap));
                }
                else
                {
                    this.prefixMap = Collections.emptyMap();
                }
            }
            public String getPrefix(String namespaceURI) {
                // TODO Auto-generated method stub
                return null;
            }
            public Iterator getPrefixes(String namespaceURI) {
                // TODO Auto-generated method stub
                return null;
            }
            public String getNamespaceURI(String prefix) {
                    if (prefix == null) throw new NullPointerException("Invalid Namespace Prefix");
                    else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX))
                        return "http://www.PureEdge.com/XFDL/6.5";
                    else if ("custom".equals(prefix))
                        return "http://www.PureEdge.com/XFDL/Custom";
                    else if ("designer".equals(prefix)) 
                        return "http://www.PureEdge.com/Designer/6.1";
                    else if ("pecs".equals(prefix)) 
                        return "http://www.PureEdge.com/PECustomerService";
                    else if ("xfdl".equals(prefix))
                        return "http://www.PureEdge.com/XFDL/6.5";      
                    else if ("xforms".equals(prefix)) 
                        return "http://www.w3.org/2003/xforms";
                    else    
                        return XMLConstants.NULL_NS_URI;
            }
    
    
        }
    
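        // Apparently left over from the original attempt; unused, kept for reference only.
        // It does not work as written: [number] is a predicate, not an attribute step,
        // and a prefixed step led to the exception mentioned in point 2.
        protected static final String QUERY_FORM_NUMBER = 
            "/XFDL/globalpage/global/xmlmodel/xforms:instances/instance" + 
            "/form_metadata/title/documentnbr[number]";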
    
        public static void main(String[] args) {
            try
            {
                DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
                DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
                Document doc = docBuilder.parse(new File(args[0]));
                System.out.println(extractNodeValue(doc, "/XFDL/globalpage/@sid"));
                System.out.println(extractNodeValue(doc, "/XFDL/globalpage/global/xmlmodel/instances/instance/@id" ));
                System.out.println(extractNodeValue(doc, "/XFDL/globalpage/global/xmlmodel/instances/instance/form_metadata/title/documentnbr/@number" ));
            } catch (SAXException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (ParserConfigurationException e) {
                e.printStackTrace();
            }
        }
    
        private static String extractNodeValue(Document doc, String expression) {
            try{
    
                XPath xPath = XPathFactory.newInstance().newXPath();
                xPath.setNamespaceContext(new MyNamespaceContext(null));
    
                Node result = (Node)xPath.evaluate(expression, doc, XPathConstants.NODE);
                if(result != null) {
                    return result.getNodeValue();
                } else {
                    throw new RuntimeException("can't find expression");
                }
    
            } catch (XPathExpressionException err) {
                throw new RuntimeException(err);
            }
        }
    }
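
A possible explanation for point 2, offered as a guess rather than a diagnosis: DocumentBuilderFactory is not namespace-aware by default, so the sample above parses the file without namespace information, and unprefixed steps happen to match. The sketch below shows the namespace-aware variant, assuming a heavily trimmed, hypothetical stand-in for the XFDL form (the real document is not reproduced in this thread, so the element nesting is an assumption). The class name and the map-backed NamespaceContext are also illustrative. Note that XPath 1.0 has no default namespace, so even elements in the document's default namespace need a prefix (xfdl here) in the expression.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceAwareXPathSketch {

    // Hypothetical, trimmed stand-in for the form; the nesting is assumed.
    static final String SAMPLE_XML =
          "<XFDL xmlns=\"http://www.PureEdge.com/XFDL/6.5\""
        + " xmlns:xforms=\"http://www.w3.org/2003/xforms\">"
        + "<globalpage sid=\"global\"><global sid=\"global\"><xmlmodel>"
        + "<xforms:instances><xforms:instance id=\"metadata\">"
        + "<form_metadata><title><documentnbr number=\"2062\"/></title></form_metadata>"
        + "</xforms:instance></xforms:instances>"
        + "</xmlmodel></global></globalpage></XFDL>";

    // Minimal map-backed context; XPath evaluation only needs getNamespaceURI.
    static final class MapNamespaceContext implements NamespaceContext {
        private final Map<String, String> prefixToUri;

        MapNamespaceContext(Map<String, String> prefixToUri) {
            this.prefixToUri = new HashMap<>(prefixToUri);
        }

        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            String uri = prefixToUri.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return null; // reverse lookup is not needed for this sketch
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            return Collections.<String>emptyList().iterator();
        }
    }

    static String extractNumber(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // off by default
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Map<String, String> namespaces = new HashMap<>();
        namespaces.put("xfdl", "http://www.PureEdge.com/XFDL/6.5");
        namespaces.put("xforms", "http://www.w3.org/2003/xforms");

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new MapNamespaceContext(namespaces));
        return xpath.evaluate(
              "/xfdl:XFDL/xfdl:globalpage/xfdl:global/xfdl:xmlmodel"
            + "/xforms:instances/xforms:instance"
            + "/xfdl:form_metadata/xfdl:title/xfdl:documentnbr/@number", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extractNumber(SAMPLE_XML));
    }
}
```

The prefixes used in the expression only have to agree with the NamespaceContext; they do not have to match the prefixes used in the document itself.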
    
  • 2020-12-19 10:06

    Have a look at the XPathAPI library. It is a simpler way to use XPath without messing with the low-level Java API, especially when dealing with namespaces.

    The code to get the number attribute would be:

    String num = XPathAPI.selectSingleNodeAsString(doc, "//documentnbr/@number");
    

    Namespaces are automatically extracted from the root node (doc in this case). In case you need to explicitly define additional namespaces you can use this:

    Map<String, String> nsMap = new HashMap<String, String>();
    nsMap.put("xforms", "http://www.w3.org/2003/xforms");
    
    String num =
        XPathAPI.selectSingleNodeAsString(doc, "//documentnbr/@number", nsMap);
    

    (Disclaimer: I'm the author of the library.)

  • 2020-12-19 10:08

    SAX (alternative to XPath) version:

    SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
    final String[] number = new String[1];
    DefaultHandler handler = new DefaultHandler()
    {           
        @Override
        public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException
        {
            if (qName.equals("documentnbr"))
                number[0] = attributes.getValue("number");
        }
    };
    saxParser.parse("input.xml", handler);
    System.out.println(number[0]);
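
The snippet above matches on qName, which works with the default parser settings as long as the element is written without a prefix. If the element should be matched by namespace instead, a namespace-aware variant might look like the sketch below. The sample XML, the class name and the assumption that documentnbr lives in the XFDL default namespace are all hypothetical.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceAwareSaxSketch {

    static final String XFDL_NS = "http://www.PureEdge.com/XFDL/6.5";

    // Hypothetical fragment; assumes documentnbr is in the XFDL namespace.
    static final String SAMPLE_XML =
          "<XFDL xmlns=\"http://www.PureEdge.com/XFDL/6.5\">"
        + "<title><documentnbr number=\"2062\"/></title></XFDL>";

    static String findNumber(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed for uri and localName to be populated
        SAXParser parser = factory.newSAXParser();

        final String[] number = new String[1];
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                // Match on namespace URI plus local name instead of the raw qName.
                if (XFDL_NS.equals(uri) && "documentnbr".equals(localName)) {
                    // Unprefixed attributes are in no namespace, hence the empty URI.
                    number[0] = attributes.getValue("", "number");
                }
            }
        });
        return number[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(findNumber(SAMPLE_XML));
    }
}
```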
    

    In my opinion, using XPath with namespaces is more complicated than it should be. Here is my (simple) code:

    XPath xpath = XPathFactory.newInstance().newXPath();
    
    NamespaceContextMap contextMap = new NamespaceContextMap();
    contextMap.put("custom", "http://www.PureEdge.com/XFDL/Custom");
    contextMap.put("designer", "http://www.PureEdge.com/Designer/6.1");
    contextMap.put("pecs", "http://www.PureEdge.com/PECustomerService");
    contextMap.put("xfdl", "http://www.PureEdge.com/XFDL/6.5");
    contextMap.put("xforms", "http://www.w3.org/2003/xforms");
    contextMap.put("", "http://www.PureEdge.com/XFDL/6.5");
    
    xpath.setNamespaceContext(contextMap);
    String expression = "//:documentnbr/@number";
    InputSource inputSource = new InputSource("input.xml");
    String number;
    number = (String) xpath.evaluate(expression, inputSource, XPathConstants.STRING);
    System.out.println(number);
    

    You can get the NamespaceContextMap class (not mine) from here (GPL license). There is also bug 6376058.
