Obtain attribute names from XML using VBA

前端 未结 1 1830
猫巷女王i
猫巷女王i 2020-12-02 01:42

I need to get the distinct attributes names from the XML using VBA.

this is my code.

 sub test() 
 Dim XMLFile As Object
Dim XMLFileName As String
Se         


        
相关标签:
1条回答
  • 2020-12-02 01:54

    Displaying XML structures including attributes via recursive function calls

    My example code demonstrates a way to

    • [1] assign the entire XML structure to a 2-dim array using XMLDOM methods and
    • [2] optionally write it back to a sheet.

    Amplifying hints:

    I added these ► structured hints to offer more help than by displaying code only, as I remarked that many of these points lead to repeated questions by other users, too:

    • Trying to list XML structures you lose good view with increasing hierarchy depth of your node elements (type constant 1 NODE_ELEMENT), so I urgently recommend the use of ► recursive calls as used in this example code.
    • Furthermore you might have not considered the special construction of node text (type constant 3 NODE_TEXT) being the first child of a name giving parent element - c.f. sections A. and B. in main function listChildNodes. Your loops through child nodes would not distinguish between the mentioned types. Just study the comments in the cited function for details.
    • I suppose your XML file starts with a needed processing instruction like e.g. <?xml version="1.0" encoding="utf-8"?>, so that it can be actually identified as XML file.
    • The calling procedure DisplayXML() uses late binding instead of early bound reference to MS XML similar to your post, but uses the recommended MSXML2 version 6.0. It calls the main function via its DocumentElement <Elements> (BTW a single node element) and a second argument referring to a predefined 2-dim array v.
    • Versioning: If you would set your XMLFILE object to memory with Set XDoc = CreateObject("MSXML2.DOMDocument") generally you are getting the older default Version (3.0), so in most cases it's preferrable to use explicitly Set XDoc = CreateObject("MSXML2.DOMDocument.6.0") instead (including XPath automatically).
    • If you don't use the Load function to get a True (file loaded successfully) or False (load error) back, it is not necessary to set the file name into brackets ().
    • The XPath operator // in search strings would return any occurences at any level (c.f. XMLFile.SelectNodes("//Elements") in your OP).
    • Consider also the use of XSLT, a special-purpose language designed to tranform XML files into all kind of end-use formats.

    Calling procedure DisplayXML

    Hint: It would sufficient to dimension the array 's row count only with an estimated number of items in the calling procedure (e.g. 1000), as the main function executes a ReDim (including a double transposition) automatically if needed. Nevertheless I added the exact items count here from the start via XPath/XMLDOM expression XMLFile.SelectNodes("//*").Length counting any item in the entire file.

    Option Explicit          ' declaration head of your code module
    
    Sub DisplayXML()
    Dim XMLFile As Object
    Dim XMLFileName As String
    'Set XMLFile = CreateObject("Microsoft.XMLDOM")   ' former style not recommended
    Set XMLFile = CreateObject("MSXML2.DOMDocument.6.0")
    
    XMLFileName = "C:\Users\Input.xml"                             ' << change to your xml file name
    XMLFile.Async = False
    XMLFile.ValidateOnParse = False
    Debug.Print XMLFile.XML
    
    If XMLFile.Load(XMLFileName) Then
    ' [1] write xml info to array with exact or assumed items count
      Dim v As Variant: ReDim v(1 To XMLFile.SelectNodes("//*").Length, 1 To 2)
      listChildNodes XMLFile.DocumentElement, v                 ' call helper function
    
    ' [2] write results to sheet "Dump"                         ' change to your sheet name
      With ThisWorkbook.Worksheets("Dump")
           .Range("A:B") = ""                                   ' clear result range
           .Range("A1:B1") = Split("XML Tag,Node Value", ",")   ' titles
           .Range("A2").Resize(UBound(v), UBound(v, 2)) = v     ' get  2-dim info array
      End With
    Else
           MsgBox "Load Error " & XMLFileName
    End If
    Set XMLFile = Nothing
    End Sub
    

    Structured results displayed in sheet

    Hint: If you don't want the level indentation or enumerated Level hierarchy, you can easily adapt the main function listChildNodes() below.

    +----+---------------------+-----------------+
    |    |         A           |       B         |
    +----+---------------------+-----------------+
    |1   | XML Tag             | Node Value      |
    +----+---------------------+-----------------+
    |2   | 0 Elements          |                 |
    +----+---------------------+-----------------+
    |3   |   1 Details         |                 |
    +----+---------------------+-----------------+
    |4   |     2 Name          | ABC             |
    +----+---------------------+-----------------+
    |5   |     2 Address       | 123ABC          |
    +----+---------------------+-----------------+
    |6   |     2 College       |                 |
    +----+---------------------+-----------------+
    |7   |       3 collname    | safasf          |
    +----+---------------------+-----------------+
    |8   |       3 collnumber  |                 |
    +----+---------------------+-----------------+
    |9   |   1 Dept[@num="123"]|                 |
    +----+---------------------+-----------------+
    |10  |     2 Deptname      | IT              |
    +----+---------------------+-----------------+
    |11  |     2 ID            | A123            |
    +----+---------------------+-----------------+
    

    It is also possible to refer to a precise node element, e.g. via

    listChildNodes XMLFile.DocumentElement.SelectSingleNode("Dept[@num=""123""]"),v, 1, 1       ' starting from item no 1 and Level no 1
    

    This would list the indicated node set alone:

    +----+---------------------+-----------------+
    |    |         A           |       B         |
    +----+---------------------+-----------------+
    |1   | XML Tag             | Node Value      |
    +----+---------------------+-----------------+
    |2   |   1 Dept[@num="123"]|                 |
    +----+---------------------+-----------------+
    |3   |     2 Deptname      | IT              |
    +----+---------------------+-----------------+
    |4   |     2 ID            | A123            |
    +----+---------------------+-----------------+
    

    Recursive main function listChildNodes()

    Looping through childnode collections this function calls itself (i.e. the current node object) repeatedly ("recursively") and assigns the entire XML structure to a given 2-dim array (2nd argument). Furthermore it allows indendation and indicates the hierarchy levels. Note that the array in this example has to be 1-based.

    Edit 20/8 2018 includes an automatic increase of array size if the items counter i exceeds the current array's upper boundary (UBound(v), i.e. in its first dimension = items count). Technical note: As such a ReDim isn't possible in a minor (here 1st) dimension, an intermediate transposition changing 'rows' (dim 1) to 'columns' (dim 2) is necessary.

    Function listChildNodes(oCurrNode As Object, _
                            ByRef v As Variant, _
                            Optional ByRef i As Long = 1, _
                            Optional iLvl As Integer = 0 _
                            ) As Boolean
    ' Purpose: assign the complete node structure with contents to a 1-based 2-dim array
    ' Author:  T.M.
    ' Note: Late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
    '       (1 ... NODE_ELEMENT, 2 ... NODE_ATTRIBUTE, 3 ... NODE_TEXT etc.)
    ' Escape
      If oCurrNode Is Nothing Then Exit Function
      If i < 1 Then i = 1                                       ' one based items Counter
    ' Edit 20/8 2018 - Automatic increase of array size if needed 
      If i >= UBound(v) Then                                    ' change array size if needed
         Dim tmp As Variant
         tmp = Application.Transpose(v)                         ' change rows to columns
         ReDim Preserve tmp(1 To 2, 1 To UBound(v) + 1000)      ' increase row numbers
         v = Application.Transpose(tmp)                         ' transpose back
         Erase tmp
      End If
      Const NAMEColumn& = 1, VALUEColumn& = 2                   ' constants for column 1 and 2
    ' Declare variables
      Dim oChildNode As Object                                  ' late bound node object
      Dim bDisplay   As Boolean
    ' ---------------------------------------------------------------------
    ' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
    ' ---------------------------------------------------------------------
    If (oCurrNode.NodeType = 3) Then                                 ' 3 ... NODE_TEXT
      ' display pure text content (NODE_TEXT) of parent elements
        v(i, VALUEColumn) = oCurrNode.Text                           ' nodeValue of text node
      ' return
        listChildNodes = True
    ElseIf oCurrNode.NodeType = 1 Then                                ' 1 ... NODE_ELEMENT
       ' --------------------------------------------------------------
       ' B.1 NODE_ELEMENT WITHOUT text node immediately below,
       '     a) e.g. <Details> followed by node element <NAME>,
       '        (i.e. FirstChild.NodeType must not be of type NODE_TEXT = 3)
       '     b) e.g. <College> node element without any child node
       '     Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
       '           (see section A. getting the FirstChild of a NODE_ELEMENT)
       ' --------------------------------------------------------------
       ' a) display parent elements of other element nodes
         If oCurrNode.HasChildNodes Then
             If Not oCurrNode.FirstChild.NodeType = 3 Then             ' <>3 ... not a NODE_TEXT
                bDisplay = True
             End If
       ' b) always display empty node elements
         Else                                                           ' empty NODE_ELEMENT
                bDisplay = True
         End If
         If bDisplay Then
                v(i, NAMEColumn) = String(iLvl * 2, " ") & _
                                   iLvl & " " & _
                                   oCurrNode.nodename & getAtts(oCurrNode)
                i = i + 1
         End If
    
       ' --------------------------------------------------------------
       ' B.2 check child nodes
       ' --------------------------------------------------------------
         For Each oChildNode In oCurrNode.ChildNodes
          ' ~~~~~~~~~~~~~~~~~
          ' recursive call <<
          ' ~~~~~~~~~~~~~~~~~
            bDisplay = listChildNodes(oChildNode, v, i, iLvl + 1)
    
            If bDisplay Then
                v(i, NAMEColumn) = String(iLvl * 2, " ") & _
                                   iLvl & " " & _
                                   oCurrNode.nodename & getAtts(oCurrNode)
                i = i + 1
            End If
         Next oChildNode
       ' return
         listChildNodes = False
    
    Else    ' just to demonstrate the use of other xml types as e.g. <!-- comments -->
         If oCurrNode.NodeType = 8 Then   ' 8 ... NODE_COMMENT
            v(i, VALUEColumn) = "<!-- " & oCurrNode.NodeValue & "-->"
            i = i + 1
         End If
       ' return
         listChildNodes = False
    End If
    
    End Function
    

    'Helper function getAtts()

    This helper function called by the above function returns a string enumerating all attribute names and values of a given node similar to XPath notation; the code can be easily adapted to your needs.

    Function getAtts(ByRef node As Object) As String
    ' Purpose: return attribute(s) string in brackets, e.g. '[@num="123"]'
    ' Note:    called by above function listChildNodes()
    ' Author:  T.M.
      Dim sAtts$, ii&
      If node.Attributes.Length > 0 Then
          ii = 0: sAtts = ""
          For ii = 0 To node.Attributes.Length - 1
            sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodename & "=""" & node.Attributes.Item(ii).NodeValue & """]"
          Next ii
      End If
    ' return
      getAtts = sAtts
    End Function
    
    0 讨论(0)
提交回复
热议问题