Java: Most efficient method to iterate over all elements in a org.w3c.dom.Document?

后端 未结 3 371
不知归路
不知归路 2020-12-02 06:17

What is the most efficient way to iterate through all DOM elements in Java?

Something like this but for every single DOM elements on current org.w3c.dom.Docume

相关标签:
3条回答
  • 2020-12-02 07:00

    I also stumbled over this problem recently. Here is my solution. I wanted to avoid recursion, so I used a while loop.

    Because of the adds and removes in arbitrary places on the list, I went with the LinkedList implementation.

    /* traverses tree starting with given node */
      private static List<Node> traverse(Node n)
      {
        return traverse(Arrays.asList(n));
      }
    
      /* traverses tree starting with given nodes */
      private static List<Node> traverse(List<Node> nodes)
      {
        List<Node> open = new LinkedList<Node>(nodes);
        List<Node> visited = new LinkedList<Node>();
    
        ListIterator<Node> it = open.listIterator();
        while (it.hasNext() || it.hasPrevious())
        {
          Node unvisited;
          if (it.hasNext())
            unvisited = it.next();
          else
            unvisited = it.previous();
    
          it.remove();
    
          List<Node> children = getChildren(unvisited);
          for (Node child : children)
            it.add(child);
    
          visited.add(unvisited);
        }
    
        return visited;
      }
    
      private static List<Node> getChildren(Node n)
      {
        List<Node> children = asList(n.getChildNodes());
        Iterator<Node> it = children.iterator();
        while (it.hasNext())
          if (it.next().getNodeType() != Node.ELEMENT_NODE)
            it.remove();
        return children;
      }
    
      private static List<Node> asList(NodeList nodes)
      {
        List<Node> list = new ArrayList<Node>(nodes.getLength());
        for (int i = 0, l = nodes.getLength(); i < l; i++)
          list.add(nodes.item(i));
        return list;
      }
    
    0 讨论(0)
  • 2020-12-02 07:01

    Basically you have two ways to iterate over all elements:

    1. Using recursion (the most common way I think):

    public static void main(String[] args) throws SAXException, IOException,
            ParserConfigurationException, TransformerException {
    
        DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
            .newInstance();
        DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
        Document document = docBuilder.parse(new File("document.xml"));
        doSomething(document.getDocumentElement());
    }
    
    public static void doSomething(Node node) {
        // do something with the current node instead of System.out
        System.out.println(node.getNodeName());
    
        NodeList nodeList = node.getChildNodes();
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node currentNode = nodeList.item(i);
            if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
                //calls this method for all the children which is Element
                doSomething(currentNode);
            }
        }
    }
    

    2. Avoiding recursion using getElementsByTagName() method with * as parameter:

    public static void main(String[] args) throws SAXException, IOException,
            ParserConfigurationException, TransformerException {
    
        DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
                .newInstance();
        DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
        Document document = docBuilder.parse(new File("document.xml"));
    
        NodeList nodeList = document.getElementsByTagName("*");
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node node = nodeList.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                // do something with the current element
                System.out.println(node.getNodeName());
            }
        }
    }
    

    I think these ways are both efficient.
    Hope this helps.

    0 讨论(0)
  • 2020-12-02 07:02

    for (int i = 0; i < nodeList.getLength(); i++)

    change to

    for (int i = 0, len = nodeList.getLength(); i < len; i++)

    to be more efficient.

    The second way of javanna answer may be the best as it tends to use a flatter, predictable memory model.

    0 讨论(0)
提交回复
热议问题