Removing DOM nodes when traversing a NodeList

南方客 2020-12-06 04:54

I'm about to delete certain elements in an XML document, using code like the following:

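NodeList nodes = ...;
for (int i = 0; i < nodes.getLength(); i++) {
  Element e = (Element)nodes.item(i);
  if (certain criteria involving Element e) {
    e.getParentNode().removeChild(e);
  }
}
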
7 answers
  • 2020-12-06 05:24

    According to the DOM Level 3 Core specification, the result of a call to node.getElementsByTagName("...") is a reference to a "live" NodeList:

    NodeList and NamedNodeMap objects in the DOM are live; that is, changes to the underlying document structure are reflected in all relevant NodeList and NamedNodeMap objects. ... changes are automatically reflected in the NodeList, without further action on the user's part.

    1.1.1 The DOM Structure Model, para. 2

    Java SE 7 conforms to the DOM Level 3 specification: it defines the NodeList interface as a type, and it exposes the getElementsByTagName method on the Element interface, which returns a live NodeList.
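The "live" behavior quoted above can be observed directly. The following is a minimal sketch, assuming the JDK's bundled DOM parser; the class name, the helper names, and the sample XML are made up for illustration. It reads the length of the same NodeList object before and after one removal.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LiveNodeListDemo {
    // Parses a small XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Returns the length of one NodeList before and after removing its first element.
    static int[] lengthsAroundRemoval() {
        Document doc = parse("<root><item/><item/><item/></root>");
        NodeList items = doc.getElementsByTagName("item");
        int before = items.getLength();
        Element first = (Element) items.item(0);
        first.getParentNode().removeChild(first);
        int after = items.getLength(); // same NodeList object, queried again
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] lengths = lengthsAroundRemoval();
        System.out.println("before=" + lengths[0] + ", after=" + lengths[1]);
    }
}
```

With a conforming implementation the second length should be one smaller than the first, even though the list was never re-fetched.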


    References

    W3C - Document Object Model (DOM) Level 3 Core Specification - getElementsByTagName

    JavaSE 7 - Interface Element

    JavaSE 7 - NodeList Type

  • 2020-12-06 05:27

    Removing nodes while looping will cause undesirable results, e.g. missed or duplicated nodes. This is not an issue of synchronization or thread safety; it arises because the nodes are modified by the loop itself. Most of Java's Iterators will throw a ConcurrentModificationException in such a case, something that NodeList does not account for.

    It can be fixed by decrementing the NodeList size and the loop index at the same time, i.e. by iterating backwards. This solution works only if each loop iteration performs at most one removal.

    NodeList nodes = ...;
    // Walk from the last index down, so a removal never shifts a node we have yet to visit.
    for (int i = nodes.getLength() - 1; i >= 0; i--) {
      Element e = (Element)nodes.item(i);
      if (certain criteria involving Element e) {
        e.getParentNode().removeChild(e);
      }
    }
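For a version that can be run as-is, here is a hedged sketch of the backward loop. The sample XML, the `drop` attribute used as the removal criterion, and all class and method names are assumptions chosen for illustration; they are not part of the original question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BackwardRemovalDemo {
    // Parses a small XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Removes every <item drop='true'> by iterating backwards, then lists the surviving ids.
    static String removeBackwards(String xml) {
        Document doc = parse(xml);
        NodeList nodes = doc.getElementsByTagName("item");
        for (int i = nodes.getLength() - 1; i >= 0; i--) {
            Element e = (Element) nodes.item(i);
            if ("true".equals(e.getAttribute("drop"))) { // hypothetical criterion
                e.getParentNode().removeChild(e);
            }
        }
        StringBuilder ids = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (ids.length() > 0) ids.append(',');
            ids.append(((Element) nodes.item(i)).getAttribute("id"));
        }
        return ids.toString();
    }

    public static void main(String[] args) {
        String xml = "<root><item id='1' drop='true'/><item id='2' drop='true'/>"
                + "<item id='3'/><item id='4' drop='true'/><item id='5'/></root>";
        System.out.println(removeBackwards(xml));
    }
}
```

The sample deliberately places two matching elements next to each other (ids 1 and 2), which is the case a naive forward loop would get wrong.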
    
  • 2020-12-06 05:29

    As already mentioned, removing an element reduces the size of the list but the counter is still increasing (i++):

    [element 1] <- Delete 
    [element 2]
    [element 3]
    [element 4]
    [element 5]
    
    [element 2]  
    [element 3] <- Delete
    [element 4]
    [element 5]
    --
    
    [element 2]  
    [element 4] 
    [element 5] <- Delete
    --
    --
    
    [element 2]  
    [element 4] 
    --
    --
    --
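The skipping shown in the diagram can be reproduced with a short sketch. It assumes a live NodeList from the JDK's bundled parser; the element name `el`, the attribute `n`, and the class and method names are made up for illustration. The loop tries to delete every element, yet two survive.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ForwardSkipDemo {
    // Parses a small XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Naive forward loop that intends to remove all five elements.
    static String naiveRemoveAll() {
        Document doc = parse(
                "<root><el n='1'/><el n='2'/><el n='3'/><el n='4'/><el n='5'/></root>");
        NodeList nodes = doc.getElementsByTagName("el");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            e.getParentNode().removeChild(e); // list shrinks, but i still advances
        }
        // Collect whatever the loop failed to remove.
        StringBuilder survivors = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (survivors.length() > 0) survivors.append(',');
            survivors.append(((Element) nodes.item(i)).getAttribute("n"));
        }
        return survivors.toString();
    }

    public static void main(String[] args) {
        System.out.println("survivors: " + naiveRemoveAll());
    }
}
```

Elements 2 and 4 are left behind, matching the last frame of the diagram.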
    

    The simplest solution, in my opinion, would be to drop the i++ from the loop header and increment i only when the current element was not deleted.

    NodeList nodes = ...;
    for (int i = 0; i < nodes.getLength();) {
      Element e = (Element)nodes.item(i);
      if (certain criteria involving Element e) {
        e.getParentNode().removeChild(e);        
      } else {
        i++;
      }
    }
    

    The index stays in the same place when the current element is deleted, because the list shifts by itself.

    [element 1] <- Delete 
    [element 2]
    [element 3]
    [element 4]
    [element 5]
    
    [element 2] <- Leave
    [element 3]
    [element 4]
    [element 5]
    --
    
    [element 2] 
    [element 3] <- Leave
    [element 4]
    [element 5]
    --
    
    [element 2] 
    [element 3] 
    [element 4] <- Delete
    [element 5]
    --
    
    [element 2] 
    [element 3] 
    [element 5] <- Delete
    --
    --
    
    [element 2] 
    [element 3] 
    --
    --
    --
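The same scenario as the second diagram (delete elements 1, 4 and 5, leave 2 and 3) can be sketched as a runnable program. The `drop` attribute standing in for "certain criteria", the sample XML, and the class and method names are assumptions for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ConditionalIncrementDemo {
    // Parses a small XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Forward loop that advances the index only when nothing was removed.
    static String removeMarked() {
        Document doc = parse("<root><el n='1' drop='true'/><el n='2'/><el n='3'/>"
                + "<el n='4' drop='true'/><el n='5' drop='true'/></root>");
        NodeList nodes = doc.getElementsByTagName("el");
        for (int i = 0; i < nodes.getLength();) {
            Element e = (Element) nodes.item(i);
            if ("true".equals(e.getAttribute("drop"))) { // hypothetical criterion
                e.getParentNode().removeChild(e);        // next node slides into index i
            } else {
                i++;
            }
        }
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (kept.length() > 0) kept.append(',');
            kept.append(((Element) nodes.item(i)).getAttribute("n"));
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println("kept: " + removeMarked());
    }
}
```

Note that this relies on the list being live: on a static snapshot the index would never move past a removed node, and the loop would try to remove the same node again.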
    
  • 2020-12-06 05:30

    According to the DOM specification, the result of a call to node.getElementsByTagName("...") is supposed to be "live", that is, any modification made to the DOM tree will be reflected in the NodeList object. Well, for conforming implementations, that is...

    NodeList and NamedNodeMap objects in the DOM are live; that is, changes to the underlying document structure are reflected in all relevant NodeList and NamedNodeMap objects.

    (DOM Specification)

    So, when you modify the tree structure, a conforming implementation will change the NodeList to reflect these changes.

  • 2020-12-06 05:42

    The Practical XML library now contains NodeListIterator, which wraps a NodeList and provides full Iterator support (this seemed like a better choice than posting the code that we discussed in the comments). If you don't want to use the full library, feel free to copy that one class: http://practicalxml.svn.sourceforge.net/viewvc/practicalxml/trunk/src/main/java/net/sf/practicalxml/util/NodeListIterator.java?revision=125&view=markup
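To give an idea of what such a wrapper might look like, here is a rough sketch of an Iterator over a NodeList with remove() support. This is not the Practical XML class; `SimpleNodeListIterator`, the demo class, the `drop` attribute, and the sample XML are all hypothetical, and the index arithmetic assumes the wrapped list is live.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListIteratorDemo {
    // Removes every <item drop='true'> through the iterator and lists the surviving ids.
    static String run() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(
                            "<root><item id='1' drop='true'/><item id='2' drop='true'/>"
                            + "<item id='3'/><item id='4' drop='true'/><item id='5'/></root>")));
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
        NodeList items = doc.getElementsByTagName("item");
        Iterator<Node> it = new SimpleNodeListIterator(items);
        while (it.hasNext()) {
            Element e = (Element) it.next();
            if ("true".equals(e.getAttribute("drop"))) { // hypothetical criterion
                it.remove();
            }
        }
        StringBuilder ids = new StringBuilder();
        for (int i = 0; i < items.getLength(); i++) {
            if (ids.length() > 0) ids.append(',');
            ids.append(((Element) items.item(i)).getAttribute("id"));
        }
        return ids.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

// Hypothetical wrapper; assumes a live NodeList that shrinks when a node is removed.
class SimpleNodeListIterator implements Iterator<Node> {
    private final NodeList list;
    private int next = 0;   // index of the node that next() will return
    private Node current;   // node last returned by next(), or null after remove()

    SimpleNodeListIterator(NodeList list) {
        this.list = list;
    }

    @Override
    public boolean hasNext() {
        return next < list.getLength();
    }

    @Override
    public Node next() {
        if (!hasNext()) throw new NoSuchElementException();
        current = list.item(next++);
        return current;
    }

    @Override
    public void remove() {
        if (current == null) throw new IllegalStateException("next() has not been called");
        current.getParentNode().removeChild(current);
        current = null;
        next--; // the live list shifted left by one, so step the index back
    }
}
```

Unlike the fail-fast iterators of the standard collections, this sketch does not detect modifications made behind its back; it only keeps its own remove() consistent with the live list.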

  • 2020-12-06 05:42

    Old post, but nothing marked as answer. My approach is to iterate from the end, i.e.

    for (int i = nodes.getLength() - 1; i >= 0; i--) {
        Element e = (Element) nodes.item(i);
        // do processing, and then
        e.getParentNode().removeChild(e);
    }
    

    With this, you needn't worry about the NodeList getting shorter while you delete.
