I need to get the distinct attributes names from the XML using VBA.
this is my code.
sub test()
Dim XMLFile As Object
Dim XMLFileName As String
Se
Displaying XML structures including attributes via recursive function calls
My example code demonstrates a way to
[1]
assign the entire XML structure to a 2-dim array using XMLDOM methods and [2]
optionally write it back to a sheet.Amplifying hints:
I added these ► structured hints to offer more help than by displaying code only, as I remarked that many of these points lead to repeated questions by other users, too:
XML
structures you lose good view with increasing hierarchy depth of your node elements (type constant 1 NODE_ELEMENT
), so I urgently recommend the use of ► recursive calls as used in this example code. NODE_TEXT
) being the first child of a name giving parent element - c.f. sections A. and B. in main function listChildNodes
. Your loops through child nodes would not distinguish between the mentioned types. Just study the comments in the cited function for details.<?xml version="1.0" encoding="utf-8"?>
, so that it can be actually identified as XML file.DisplayXML()
uses late binding instead of early bound reference to MS XML similar to your post, but uses the recommended MSXML2 version 6.0. It calls the main function via its DocumentElement
<Elements>
(BTW a single node element) and a second argument referring to a predefined 2-dim array v
.XMLFILE
object to memory with Set XDoc = CreateObject("MSXML2.DOMDocument")
generally you are getting the older default Version (3.0), so in most cases it's preferrable to use explicitly Set XDoc = CreateObject("MSXML2.DOMDocument.6.0")
instead (including XPath automatically).Load
function to get a True
(file loaded successfully) or False
(load error) back, it is not necessary to set the file name into brackets ().//
in search strings would return any occurences at any level (c.f. XMLFile.SelectNodes("//Elements")
in your OP).Calling procedure DisplayXML
Hint: It would sufficient to dimension the array 's row count only with an estimated number of items in the calling procedure (e.g. 1000), as the main function executes a ReDim
(including a double transposition) automatically if needed. Nevertheless I added the exact items count here from the start via XPath/XMLDOM expression XMLFile.SelectNodes("//*").Length
counting any item in the entire file.
Option Explicit ' declaration head of your code module
Sub DisplayXML()
Dim XMLFile As Object
Dim XMLFileName As String
'Set XMLFile = CreateObject("Microsoft.XMLDOM") ' former style not recommended
Set XMLFile = CreateObject("MSXML2.DOMDocument.6.0")
XMLFileName = "C:\Users\Input.xml" ' << change to your xml file name
XMLFile.Async = False
XMLFile.ValidateOnParse = False
Debug.Print XMLFile.XML
If XMLFile.Load(XMLFileName) Then
' [1] write xml info to array with exact or assumed items count
Dim v As Variant: ReDim v(1 To XMLFile.SelectNodes("//*").Length, 1 To 2)
listChildNodes XMLFile.DocumentElement, v ' call helper function
' [2] write results to sheet "Dump" ' change to your sheet name
With ThisWorkbook.Worksheets("Dump")
.Range("A:B") = "" ' clear result range
.Range("A1:B1") = Split("XML Tag,Node Value", ",") ' titles
.Range("A2").Resize(UBound(v), UBound(v, 2)) = v ' get 2-dim info array
End With
Else
MsgBox "Load Error " & XMLFileName
End If
Set XMLFile = Nothing
End Sub
Structured results displayed in sheet
Hint: If you don't want the level indentation or enumerated Level hierarchy, you can easily adapt the main function listChildNodes()
below.
+----+---------------------+-----------------+
| | A | B |
+----+---------------------+-----------------+
|1 | XML Tag | Node Value |
+----+---------------------+-----------------+
|2 | 0 Elements | |
+----+---------------------+-----------------+
|3 | 1 Details | |
+----+---------------------+-----------------+
|4 | 2 Name | ABC |
+----+---------------------+-----------------+
|5 | 2 Address | 123ABC |
+----+---------------------+-----------------+
|6 | 2 College | |
+----+---------------------+-----------------+
|7 | 3 collname | safasf |
+----+---------------------+-----------------+
|8 | 3 collnumber | |
+----+---------------------+-----------------+
|9 | 1 Dept[@num="123"]| |
+----+---------------------+-----------------+
|10 | 2 Deptname | IT |
+----+---------------------+-----------------+
|11 | 2 ID | A123 |
+----+---------------------+-----------------+
It is also possible to refer to a precise node element, e.g. via
listChildNodes XMLFile.DocumentElement.SelectSingleNode("Dept[@num=""123""]"),v, 1, 1 ' starting from item no 1 and Level no 1
This would list the indicated node set alone:
+----+---------------------+-----------------+
| | A | B |
+----+---------------------+-----------------+
|1 | XML Tag | Node Value |
+----+---------------------+-----------------+
|2 | 1 Dept[@num="123"]| |
+----+---------------------+-----------------+
|3 | 2 Deptname | IT |
+----+---------------------+-----------------+
|4 | 2 ID | A123 |
+----+---------------------+-----------------+
Recursive main function listChildNodes()
Looping through childnode collections this function calls itself (i.e. the current node object) repeatedly ("recursively") and assigns the entire XML structure to a given 2-dim array (2nd argument). Furthermore it allows indendation and indicates the hierarchy levels. Note that the array in this example has to be 1-based.
Edit 20/8 2018
includes an automatic increase of array size if the items counter i
exceeds the current array's upper boundary (UBound(v)
, i.e. in its first dimension = items count). Technical note: As such a ReDim
isn't possible in a minor (here 1st) dimension, an intermediate transposition changing 'rows' (dim 1) to 'columns' (dim 2) is necessary.
Function listChildNodes(oCurrNode As Object, _
ByRef v As Variant, _
Optional ByRef i As Long = 1, _
Optional iLvl As Integer = 0 _
) As Boolean
' Purpose: assign the complete node structure with contents to a 1-based 2-dim array
' Author: T.M.
' Note: Late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
' (1 ... NODE_ELEMENT, 2 ... NODE_ATTRIBUTE, 3 ... NODE_TEXT etc.)
' Escape
If oCurrNode Is Nothing Then Exit Function
If i < 1 Then i = 1 ' one based items Counter
' Edit 20/8 2018 - Automatic increase of array size if needed
If i >= UBound(v) Then ' change array size if needed
Dim tmp As Variant
tmp = Application.Transpose(v) ' change rows to columns
ReDim Preserve tmp(1 To 2, 1 To UBound(v) + 1000) ' increase row numbers
v = Application.Transpose(tmp) ' transpose back
Erase tmp
End If
Const NAMEColumn& = 1, VALUEColumn& = 2 ' constants for column 1 and 2
' Declare variables
Dim oChildNode As Object ' late bound node object
Dim bDisplay As Boolean
' ---------------------------------------------------------------------
' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
' ---------------------------------------------------------------------
If (oCurrNode.NodeType = 3) Then ' 3 ... NODE_TEXT
' display pure text content (NODE_TEXT) of parent elements
v(i, VALUEColumn) = oCurrNode.Text ' nodeValue of text node
' return
listChildNodes = True
ElseIf oCurrNode.NodeType = 1 Then ' 1 ... NODE_ELEMENT
' --------------------------------------------------------------
' B.1 NODE_ELEMENT WITHOUT text node immediately below,
' a) e.g. <Details> followed by node element <NAME>,
' (i.e. FirstChild.NodeType must not be of type NODE_TEXT = 3)
' b) e.g. <College> node element without any child node
' Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
' (see section A. getting the FirstChild of a NODE_ELEMENT)
' --------------------------------------------------------------
' a) display parent elements of other element nodes
If oCurrNode.HasChildNodes Then
If Not oCurrNode.FirstChild.NodeType = 3 Then ' <>3 ... not a NODE_TEXT
bDisplay = True
End If
' b) always display empty node elements
Else ' empty NODE_ELEMENT
bDisplay = True
End If
If bDisplay Then
v(i, NAMEColumn) = String(iLvl * 2, " ") & _
iLvl & " " & _
oCurrNode.nodename & getAtts(oCurrNode)
i = i + 1
End If
' --------------------------------------------------------------
' B.2 check child nodes
' --------------------------------------------------------------
For Each oChildNode In oCurrNode.ChildNodes
' ~~~~~~~~~~~~~~~~~
' recursive call <<
' ~~~~~~~~~~~~~~~~~
bDisplay = listChildNodes(oChildNode, v, i, iLvl + 1)
If bDisplay Then
v(i, NAMEColumn) = String(iLvl * 2, " ") & _
iLvl & " " & _
oCurrNode.nodename & getAtts(oCurrNode)
i = i + 1
End If
Next oChildNode
' return
listChildNodes = False
Else ' just to demonstrate the use of other xml types as e.g. <!-- comments -->
If oCurrNode.NodeType = 8 Then ' 8 ... NODE_COMMENT
v(i, VALUEColumn) = "<!-- " & oCurrNode.NodeValue & "-->"
i = i + 1
End If
' return
listChildNodes = False
End If
End Function
'Helper function getAtts()
This helper function called by the above function returns a string enumerating all attribute names and values of a given node similar to XPath
notation; the code can be easily adapted to your needs.
Function getAtts(ByRef node As Object) As String
' Purpose: return attribute(s) string in brackets, e.g. '[@num="123"]'
' Note: called by above function listChildNodes()
' Author: T.M.
Dim sAtts$, ii&
If node.Attributes.Length > 0 Then
ii = 0: sAtts = ""
For ii = 0 To node.Attributes.Length - 1
sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodename & "=""" & node.Attributes.Item(ii).NodeValue & """]"
Next ii
End If
' return
getAtts = sAtts
End Function