Getting XML Node text value with Java DOM

滥情空心 2020-11-27 06:16

I can't fetch the text value with Node.getNodeValue(), Node.getFirstChild().getNodeValue(), or Node.getTextContent().

My XML

4 Answers
  • 2020-11-27 06:52

    If you are open to vtd-xml, which excels at both performance and memory efficiency, below is code that does what you are looking for, using both manual navigation and XPath. The overall code is much more concise and easier to understand.

    import com.ximpleware.*;
    public class queryText {
        public static void main(String[] s) throws VTDException{
            VTDGen vg = new VTDGen();
            if (!vg.parseFile("input.xml", true))
                return;
            VTDNav vn = vg.getNav();
            AutoPilot ap = new AutoPilot(vn);
            // first manually navigate
            if(vn.toElement(VTDNav.FC,"tag")){
                int i= vn.getText();
                if (i!=-1){
                    System.out.println("text ===>"+vn.toString(i));
                }
                if (vn.toElement(VTDNav.NS,"tag")){
                    i=vn.getText();
                    // getText() returns -1 when the element has no text node
                    if (i!=-1){
                        System.out.println("text ===>"+vn.toString(i));
                    }
                }
            }
    
            // second version use XPath
            ap.selectXPath("/add/tag/text()");
            int i=0;
            while((i=ap.evalXPath())!= -1){
                System.out.println("text node ====>"+vn.toString(i));
            }
        }
    }
    
  • 2020-11-27 07:07

    I'd print out the result of an2.getNodeName() as well for debugging purposes. My guess is that your tree-crawling code isn't reaching the nodes you think it is. That suspicion is reinforced by the lack of node-name checks in your code.

    Other than that, the javadoc for Node defines "getNodeValue()" to return null for Nodes of type Element. Therefore, you really should be using getTextContent(). I'm not sure why that wouldn't give you the text that you want.

    Perhaps iterate the children of your tag node and see what types are there?

    I tried this code and it works for me:

    String xml = "<add job=\"351\">\n" +
                 "    <tag>foobar</tag>\n" +
                 "    <tag>foobar2</tag>\n" +
                 "</add>";
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    ByteArrayInputStream bis = new ByteArrayInputStream(xml.getBytes());
    Document doc = db.parse(bis);
    Node n = doc.getFirstChild();
    NodeList nl = n.getChildNodes();
    Node an,an2;
    
    for (int i=0; i < nl.getLength(); i++) {
        an = nl.item(i);
        if(an.getNodeType()==Node.ELEMENT_NODE) {
            NodeList nl2 = an.getChildNodes();
    
            for(int i2=0; i2<nl2.getLength(); i2++) {
                an2 = nl2.item(i2);
                // DEBUG PRINTS
                System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");
                if(an2.hasChildNodes()) System.out.println(an2.getFirstChild().getTextContent());
                if(an2.hasChildNodes()) System.out.println(an2.getFirstChild().getNodeValue());
                System.out.println(an2.getTextContent());
                System.out.println(an2.getNodeValue());
            }
        }
    }
    

    Output was:

    #text: type (3):
    foobar
    foobar
    #text: type (3):
    foobar2
    foobar2
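The point about getNodeValue() returning null for Element nodes can be checked without any hand-written tree crawling. The following is a minimal sketch, assuming the same sample XML as above; the class name TagTextDemo and the helpers parse and tagTexts are hypothetical names introduced only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TagTextDemo {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Hypothetical helper: collects the text of every <tag> element in the document.
    static List<String> tagTexts(Document doc) {
        List<String> out = new ArrayList<>();
        NodeList tags = doc.getElementsByTagName("tag");
        for (int i = 0; i < tags.getLength(); i++) {
            Element tag = (Element) tags.item(i);
            out.add(tag.getTextContent());
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = parse("<add job=\"351\">\n"
                + "    <tag>foobar</tag>\n"
                + "    <tag>foobar2</tag>\n"
                + "</add>");
        Element first = (Element) doc.getElementsByTagName("tag").item(0);
        System.out.println(first.getNodeValue());                 // null: an Element has no node value
        System.out.println(first.getFirstChild().getNodeValue()); // foobar: value of the child text node
        System.out.println(tagTexts(doc));                        // [foobar, foobar2]
    }
}
```

Because getElementsByTagName() returns only elements, the whitespace text nodes between the tags never enter the loop, so no node-type check is needed.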
    
  • 2020-11-27 07:14

    If your XML goes quite deep, you might want to consider using XPath, which comes with your JRE, so you can access the contents far more easily using:

    String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()", 
        document.getDocumentElement());
    

    Full example:

    import static org.junit.Assert.assertEquals;
    import java.io.StringReader;    
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;    
    import org.junit.Before;
    import org.junit.Test;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;
    
    public class XPathTest {
    
        private Document document;
    
        @Before
        public void setup() throws Exception {
            String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            document = db.parse(new InputSource(new StringReader(xml)));
        }
    
        @Test
        public void testXPath() throws Exception {
            XPathFactory xpf = XPathFactory.newInstance();
            XPath xp = xpf.newXPath();
            String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()",
                    document.getDocumentElement());
            assertEquals("foobar", text);
        }
    }
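The string-returning form of evaluate() used above yields only the string value of the first matching node. To read every tag, one option is to request a node set instead. This is a rough sketch under that assumption; the class name XPathAllTags and the method tagTexts are hypothetical, and the expression /add/tag simply mirrors the sample document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathAllTags {

    // Hypothetical helper: returns the text of every /add/tag element in the given XML.
    static List<String> tagTexts(String xml) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            // NODESET asks for all matches as a NodeList rather than a single string.
            NodeList hits = (NodeList) xp.evaluate(
                    "/add/tag",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                out.add(hits.item(i).getTextContent());
            }
            return out;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
        System.out.println(tagTexts(xml)); // [foobar, foobar2]
    }
}
```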
    
  • 2020-11-27 07:15

    I use a very old Java, JDK 1.4.08, and I had the same issue. In that version the Node class did not have the getTextContent() method, so I had to use Node.getFirstChild().getNodeValue() instead of Node.getNodeValue() to get the value of the node. This fixed it for me.
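One caveat with the workaround above is that getFirstChild() returns null for an empty element, and the text may be split across several child nodes. The following is a minimal sketch of a fallback for parsers without getTextContent(), assuming only direct children matter; LegacyText, parse and directText are hypothetical names, and unlike getTextContent() this helper does not descend into nested elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class LegacyText {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Hypothetical stand-in for getTextContent(): concatenates the direct
    // text and CDATA children of a node, and returns "" when there are none.
    static String directText(Node node) {
        StringBuffer sb = new StringBuffer();
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            short type = c.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(c.getNodeValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Document doc = parse("<add job=\"351\"><tag>foobar</tag><tag/></add>");
        Node first = doc.getDocumentElement().getFirstChild();
        System.out.println(directText(first));                  // foobar
        System.out.println(directText(first.getNextSibling())); // (empty line, no exception)
    }
}
```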
