Read an unknown XML with namespaces, attributes etc. into a full EAV list

Submitted by 陌路散爱 on 2020-06-09 05:37:30

Question


After answering a question about how to read an unknown JSON document, I tried to find something similar for XML (triggered by this related question).

The question is: How can I read everything out of an unknown XML document?

The ideal output is an EAV-list in the expected sort order together with a full XPath for the element:

  • Full XPath (aware of any namespace and element position)
  • Element's or attribute's name (with namespace)
  • Content (text() node or attribute's value)
  • As the sort order is an inherent part of the XML document, the output should be sorted as such.
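To make the first bullet concrete: a namespace- and position-aware XPath is one that can be pasted back into an XQuery call and addresses exactly one node. The following is only a sketch. The sample XML, the variable @sample and the prefixes ns1 and ns2 are assumptions chosen for illustration; the prefixes are bound to the namespace URIs via WITH XMLNAMESPACES.

```sql
-- Hypothetical sample: a default namespace plus one prefixed namespace
DECLARE @sample XML =
N'<root xmlns="defaultNs" xmlns:x="dummy1">
    <x:level1 test1="first"/>
    <x:level1 test1="second"/>
  </root>';

-- The prefixes used in the path need not match those in the document;
-- only the namespace URIs they are bound to matter.
WITH XMLNAMESPACES('defaultNs' AS ns1, 'dummy1' AS ns2)
SELECT @sample.value('(/ns1:root[1]/ns2:level1[2]/@test1)[1]','nvarchar(max)') AS NodeValue;
```

With this sample the path addresses the test1 attribute of the second level1 element, which is why the element position has to be part of the XPath.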

Some research brought me to the well-known function tvf-xml-hier() by John Cappelletti. But it lacks namespace support and does not shred multi-text elements.
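A multi-text element is one whose text is interrupted by child elements, so it owns several text() nodes. As a minimal sketch (the variable @mixed and the column names are assumptions for illustration), each text() node can be shredded into its own row with .nodes(). The numbering uses ROW_NUMBER() ordered by the node column, which is commonly used for this purpose but is not a documented ordering guarantee.

```sql
-- Hypothetical sample: one element with three separate text() nodes
DECLARE @mixed XML =
N'<multiText>text1<someInner>blah</someInner>text2<someInner/>text3</multiText>';

-- One row per text() node that is a direct child of <multiText>;
-- the text inside <someInner> is not a direct child and is therefore skipped.
SELECT ROW_NUMBER() OVER(ORDER BY t.txt)      AS TextPosition
      ,t.txt.value('.','nvarchar(max)')       AS NodeValue
FROM @mixed.nodes('/multiText/text()') t(txt);
```

Reading the element with a plain .value('.') call would instead return all the text, including that of the nested elements, as one concatenated string.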

So I came up with an approach you can find below.

Find a full MCVE in my self-answer.


Answer 1:


UPDATE (too much text for one answer-body)

This is not the real answer but might help too...

Just to mention it: There is the thoroughly outdated FROM OPENXML, which is, as far as I know, the only way to get literally everything back (use the XML from the other answer, or any other XML, in a variable @xml):

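DECLARE @DocHandle INT;
-- Parse the XML and get a handle to its internal representation
EXEC sp_xml_preparedocument @DocHandle OUTPUT, @xml;
-- Without a WITH clause, OPENXML returns the edge table: one row per node
SELECT * FROM OPENXML(@DocHandle,'/*');
-- Release the memory held by the parsed document
EXEC sp_xml_removedocument @DocHandle;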

The result

+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | parentid | nodetype | localname    | prefix | namespaceuri | datatype | prev | text                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 0  | NULL     | 1        | root         | NULL   | defaultNs    | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2  | 0        | 2        | xmlns        | xmlns  | NULL         | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 36 | 2        | 3        | #text        | NULL   | NULL         | NULL     | NULL | defaultNs                                                                                                                                                       |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 3  | 0        | 2        | ns1          | xmlns  | NULL         | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 37 | 3        | 3        | #text        | NULL   | NULL         | NULL     | NULL | dummy1                                                                                                                                                          |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 4  | 0        | 2        | other        | xmlns  | NULL         | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 38 | 4        | 3        | #text        | NULL   | NULL         | NULL     | NULL | SomeOther                                                                                                                                                       |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 5  | 0        | 8        | #comment     | NULL   | NULL         | NULL     | NULL | this element contains several attributes in various namespaces      Hint: An attribute without a prefix is assumed to live in the same namespace as its element |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 6  | 0        | 1        | level1       | ns1    | dummy1       | NULL     | 5    | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 7  | 6        | 2        | test1        | NULL   | NULL         | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 39 | 7        | 3        | #text        | NULL   | NULL         | NULL     | NULL | test1                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 8  | 6        | 2        | test2        | ns1    | dummy1       | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 40 | 8        | 3        | #text        | NULL   | NULL         | NULL     | NULL | test2                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 9  | 6        | 2        | test3        | other  | SomeOther    | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 41 | 9        | 3        | #text        | NULL   | NULL         | NULL     | NULL | test3                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 10 | 6        | 1        | InnerElement | other  | SomeOther    | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 42 | 10       | 3        | #text        | NULL   | NULL         | NULL     | NULL | Some inner element                                                                                                                                              |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 11 | 6        | 8        | #comment     | NULL   | NULL         | NULL     | 10   | this element contains several text nodes                                                                                                                        |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 12 | 6        | 1        | multiText    | NULL   | defaultNs    | NULL     | 11   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 14 | 12       | 3        | #text        | NULL   | NULL         | NULL     | NULL | text1                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 13 | 12       | 1        | someInner    | NULL   | defaultNs    | NULL     | 14   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 43 | 13       | 3        | #text        | NULL   | NULL         | NULL     | NULL | blah                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 15 | 12       | 3        | #text        | NULL   | NULL         | NULL     | 13   | text2                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 16 | 12       | 1        | someInner    | NULL   | defaultNs    | NULL     | 15   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 17 | 12       | 3        | #text        | NULL   | NULL         | NULL     | 16   | text3                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 18 | 6        | 8        | #comment     | NULL   | NULL         | NULL     | 12   | repeating elements some of them with attributes                                                                                                                 |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 19 | 6        | 1        | repeating    | NULL   | defaultNs    | NULL     | 18   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 44 | 19       | 3        | #text        | NULL   | NULL         | NULL     | NULL | rep 1                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 20 | 6        | 1        | repeating    | NULL   | defaultNs    | NULL     | 19   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 21 | 20       | 2        | r2           | NULL   | NULL         | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 45 | 21       | 3        | #text        | NULL   | NULL         | NULL     | NULL | r2                                                                                                                                                              |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 22 | 20       | 3        | #text        | NULL   | NULL         | NULL     | NULL | rep 2                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 23 | 6        | 8        | #comment     | NULL   | NULL         | NULL     | 20   | one with the same name, but living in another namespace                                                                                                         |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 24 | 6        | 1        | repeating    | other  | SomeOther    | NULL     | 23   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 25 | 24       | 2        | r4           | NULL   | NULL         | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 46 | 25       | 3        | #text        | NULL   | NULL         | NULL     | NULL | r4                                                                                                                                                              |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 26 | 24       | 3        | #text        | NULL   | NULL         | NULL     | NULL | rep 4                                                                                                                                                           |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 27 | 6        | 8        | #comment     | NULL   | NULL         | NULL     | 24   | some deeper nesting                                                                                                                                             |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 28 | 6        | 1        | level2       | NULL   | defaultNs    | NULL     | 27   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 29 | 28       | 1        | level3       | NULL   | defaultNs    | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 30 | 28       | 1        | level3       | NULL   | defaultNs    | NULL     | 29   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 31 | 30       | 1        | content      | NULL   | defaultNs    | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 47 | 31       | 3        | #text        | NULL   | NULL         | NULL     | NULL | Content in second level3 element                                                                                                                                |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 32 | 6        | 8        | #comment     | NULL   | NULL         | NULL     | 28   | and one more of the repeating, but listed in a lower position                                                                                                   |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 33 | 6        | 1        | repeating    | NULL   | defaultNs    | NULL     | 32   | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 34 | 33       | 2        | oneMore      | NULL   | NULL         | NULL     | NULL | NULL                                                                                                                                                            |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 48 | 34       | 3        | #text        | NULL   | NULL         | NULL     | NULL | oneMore                                                                                                                                                         |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 35 | 33       | 3        | #text        | NULL   | NULL         | NULL     | NULL | one more                                                                                                                                                        |
+----+----------+----------+--------------+--------+--------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+

As you can see, this result contains namespaces, prefixes and content, even the comments! But it is very clumsy and far away from how XML is handled "today" :-)
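In the edge table above, the value of an attribute or element is not stored in its own row but in a child row of nodetype 3 (#text). A rough sketch of pairing names with values follows, assuming a small sample XML in @sample and a temp table #Edge; both names are made up for illustration.

```sql
-- Hypothetical sample with one namespaced and one plain attribute
DECLARE @sample XML = N'<root xmlns:a="urnA"><item a:id="1" code="x">hello</item></root>';
DECLARE @DocHandle INT;

EXEC sp_xml_preparedocument @DocHandle OUTPUT, @sample;

-- Materialize the edge table so it can be joined to itself;
-- the [text] column is NTEXT, so cast it to NVARCHAR(MAX) right away.
SELECT id, parentid, nodetype, localname, prefix, namespaceuri
      ,CAST([text] AS NVARCHAR(MAX)) AS NodeText
INTO #Edge
FROM OPENXML(@DocHandle,'/*');

EXEC sp_xml_removedocument @DocHandle;

-- Elements (nodetype 1) and attributes (nodetype 2) together with their #text children.
-- Namespace declarations also show up as nodetype 2 rows with prefix 'xmlns'.
SELECT p.nodetype, p.prefix, p.namespaceuri, p.localname, t.NodeText
FROM #Edge p
INNER JOIN #Edge t ON t.parentid = p.id AND t.nodetype = 3
WHERE p.nodetype IN (1,2);

DROP TABLE #Edge;
```

This flattens the two-row representation into one row per name/value pair, but it still needs extra work (via parentid and prev) to rebuild paths and document order, which is part of why the approach feels clumsy.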




Answer 2:


In the following I create an XML document with several namespaces, one multi-text element and various nestings, repetitions, name clashes and attributes. This should cover most real-world scenarios.

Hint: It's easy to wrap this as an inline TVF and call it as a one-liner, passing the XML as a parameter.
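A minimal sketch of such a wrapper, shown only for the namespace-collecting step that forms the first CTE of the query below. The function name dbo.ListXmlNamespaces is an assumption for illustration; the full query could be placed after RETURN in the same way, since an inline TVF may start with a CTE. GO is the batch separator of SSMS/sqlcmd, needed because CREATE FUNCTION must be the first statement in its batch.

```sql
-- Hypothetical inline TVF; the body is a single SELECT with the XML passed as a parameter
CREATE FUNCTION dbo.ListXmlNamespaces(@xml XML)
RETURNS TABLE
AS
RETURN
    SELECT CONCAT('ns',ROW_NUMBER() OVER(ORDER BY B.namespaceUri)) AS Prefix
          ,B.namespaceUri
    FROM @xml.nodes('//*') A(nd)
    CROSS APPLY(VALUES(A.nd.value('namespace-uri(.)','nvarchar(max)'))) B(namespaceUri)
    WHERE LEN(B.namespaceUri)>0
    GROUP BY B.namespaceUri;
GO

-- The one-liner call: the string literal is implicitly converted to XML
SELECT * FROM dbo.ListXmlNamespaces(N'<root xmlns="defaultNs" xmlns:ns1="dummy1"><ns1:a/></root>');
```

An inline TVF is expanded into the calling query much like a view, so it can also be applied row-wise with CROSS APPLY against a table that holds an XML column.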

DECLARE @xml XML=
N'<root xmlns="defaultNs" xmlns:ns1="dummy1" xmlns:other="SomeOther">

  <!-- this element contains several attributes in various namespaces
       Hint: An attribute without a prefix is assumed to live in the same namespace as its element -->
  <ns1:level1 test1="test1" ns1:test2="test2" other:test3="test3">

    <other:InnerElement>Some inner element</other:InnerElement>

    <!-- this element contains several text nodes -->
    <multiText>text1<someInner>blah</someInner>text2<someInner/>text3</multiText>

    <!-- repeating elements some of them with attributes -->
    <repeating>rep 1</repeating>
    <repeating r2="r2">rep 2</repeating>

    <!-- one with the same name, but living in another namespace -->
    <other:repeating r4="r4">rep 4</other:repeating>

    <!-- some deeper nesting -->
    <level2>
        <level3/>
        <level3>
            <content>Content in second level3 element</content>
        </level3>
    </level2>

    <!-- and one more of the repeating, but listed in a lower position -->
    <repeating oneMore="oneMore">one more</repeating>

  </ns1:level1>
</root>';

--the query

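--Collect all distinct, non-empty namespace URIs used by elements and assign generated prefixes (ns1, ns2, ...)
WITH AllNamespaces AS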
(
    SELECT  CONCAT('ns',ROW_NUMBER() OVER(ORDER BY (B.namespaceUri))) Prefix
           ,B.namespaceUri
    FROM @xml.nodes('//*') A(nd)
    CROSS APPLY(VALUES(A.nd.value('namespace-uri(.)','nvarchar(max)')))B(namespaceUri)
    WHERE LEN(B.namespaceUri)>0
    GROUP BY B.namespaceUri
)
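--Walk down the element tree level by level; SortString concatenates the zero-padded element positions to preserve document order
,recCte AS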
(
    SELECT 1 AS RecursionLevel
          ,1 AS NodeType
          ,ROW_NUMBER() OVER(ORDER BY A.nd) AS ElementPosition
          ,CAST(REPLACE(STR(ROW_NUMBER() OVER(ORDER BY A.nd),5),' ','0') AS VARCHAR(900)) COLLATE DATABASE_DEFAULT AS SortString
          ,ns.Prefix AS CurrentPrefix
          ,ns.namespaceUri AS CurrentUri
          ,CONCAT(ns.Prefix+':',A.nd.value('local-name(.)','nvarchar(max)'),'[',ROW_NUMBER() OVER(PARTITION BY CONCAT(ns.Prefix+':',A.nd.value('local-name(.)','nvarchar(max)')) ORDER BY A.nd),']') AS FullName
          ,CAST(CONCAT('/',ns.Prefix+':',A.nd.value('local-name(.)','nvarchar(max)'),'[',ROW_NUMBER() OVER(PARTITION BY CONCAT(ns.Prefix+':',A.nd.value('local-name(.)','nvarchar(max)')) ORDER BY A.nd),']') AS NVARCHAR(MAX)) COLLATE DATABASE_DEFAULT AS XPath
          ,A.nd.query('.') CurrentFragment
          ,A.nd.query('./*') NextFragment
    FROM @xml.nodes('/*') A(nd)
    LEFT JOIN AllNamespaces ns ON ns.namespaceUri=A.nd.value('namespace-uri(.)','nvarchar(max)') 

    UNION ALL

    SELECT r.RecursionLevel+1
          ,1 
          ,ROW_NUMBER() OVER(ORDER BY A.nd)  
          ,CAST(CONCAT(r.SortString,REPLACE(STR(ROW_NUMBER() OVER(ORDER BY A.nd),5),' ','0')) AS VARCHAR(900)) COLLATE DATABASE_DEFAULT
          ,ns.Prefix
          ,ns.namespaceUri
          ,CONCAT(ns.Prefix+':',A.nd.value('local-name(.)','nvarchar(max)'),'[',ROW_NUMBER() OVER(PARTITION BY CONCAT(ns.Prefix+':',A.nd.value('local-name(.)','nvarchar(max)')) ORDER BY A.nd),']') 
          ,CONCAT(r.XPath,'/',ns.Prefix+':',A.nd.value('local-name(.)','nvarchar(max)'),'[',ROW_NUMBER() OVER(PARTITION BY CONCAT(ns.Prefix+':',A.nd.value('local-name(.)','nvarchar(max)')) ORDER BY A.nd),']') 
          ,A.nd.query('.') CurrentFragment
          ,A.nd.query('./*') NextFragment
    FROM recCte r
    CROSS APPLY NextFragment.nodes('*') A(nd)
    OUTER APPLY(SELECT Prefix,namespaceUri FROM AllNamespaces ns WHERE ns.namespaceUri=A.nd.value('namespace-uri(.)','nvarchar(max)')) ns
)
--Add one row per text() node; elements without text keep a single row with NodeValue NULL
,WithValues AS
(
    SELECT r.RecursionLevel
          ,CASE WHEN LEN(B.NodeValue)>0 THEN 3 ELSE r.NodeType END AS NodeType
          ,r.ElementPosition
          ,CASE WHEN LEN(B.NodeValue)>0 THEN CONCAT(r.SortString,REPLACE(STR(ROW_NUMBER() OVER(PARTITION BY r.XPath ORDER BY A.txt),5),' ','0')) ELSE r.SortString END AS SortString
          ,r.CurrentPrefix 
          ,r.CurrentUri
          ,CASE WHEN LEN(B.NodeValue)>0 THEN 'text()' ELSE r.FullName END AS FullName
          ,r.XPath AS OrigXPath
          ,CASE WHEN LEN(B.NodeValue)>0 THEN CONCAT(r.XPath,'/text()[',ROW_NUMBER() OVER(PARTITION BY r.XPath ORDER BY A.txt),']') ELSE r.XPath END AS XPath
          ,CASE WHEN LEN(B.NodeValue)>0 THEN B.NodeValue ELSE NULL END AS NodeValue
          ,r.CurrentFragment
          ,r.NextFragment
    FROM recCte r
    OUTER APPLY r.CurrentFragment.nodes('*/text()') A(txt)
    OUTER APPLY (SELECT A.txt.value('.','nvarchar(max)')) B(NodeValue)
)
--Append the attribute rows (NodeType 2) to the element and text rows
,WithAttributes AS
(
    SELECT RecursionLevel
          ,NodeType
          ,ElementPosition
          ,SortString
          ,CurrentPrefix
          ,CurrentUri
          ,FullName
          ,XPath
          ,NodeValue
          ,CurrentFragment
          ,NextFragment 
    FROM WithValues

    UNION ALL

    SELECT wv.RecursionLevel
          ,2
          ,wv.ElementPosition
          ,wv.SortString 
          ,CASE WHEN ns.Prefix IS NOT NULL THEN ns.Prefix ELSE wv.CurrentPrefix END AS CurrentPrefix
          ,CASE WHEN ns.namespaceUri IS NOT NULL THEN ns.namespaceUri ELSE wv.CurrentUri END AS CurrentUri
          ,CONCAT('@',ns.Prefix+':',B.AttrName) AS FullName
          ,CONCAT(wv.OrigXPath,'/@',ns.Prefix+':',B.AttrName) AS XPath
          ,A.attr.value('.','nvarchar(max)') AS NodeValue
          ,wv.CurrentFragment
          ,wv.NextFragment
    FROM WithValues wv
    CROSS APPLY wv.CurrentFragment.nodes('*/@*') A(attr)
    CROSS APPLY (SELECT A.attr.value('local-name(.)','nvarchar(max)') AS AttrName
                       ,A.attr.value('.','nvarchar(max)') AS AttrValue
                       ,A.attr.value('namespace-uri(.)','nvarchar(max)') AS namespaceUri) B
    OUTER APPLY(SELECT Prefix,namespaceUri FROM AllNamespaces ns WHERE ns.namespaceUri=B.namespaceUri) ns
)

SELECT NodeType
      ,CurrentPrefix
      ,CurrentUri
      ,FullName
      ,XPath
      ,NodeValue 
FROM WithAttributes
WHERE NodeValue IS NOT NULL
ORDER BY SortString;

--The result

/*
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| NodeType | CurrentPrefix | CurrentUri | FullName   | XPath                                                                           | NodeValue                        |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 2        | ns2           | dummy1     | @test1     | /ns1:root[1]/ns2:level1[1]/@test1                                               | test1                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 2        | ns2           | dummy1     | @ns2:test2 | /ns1:root[1]/ns2:level1[1]/@ns2:test2                                           | test2                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 2        | ns3           | SomeOther  | @ns3:test3 | /ns1:root[1]/ns2:level1[1]/@ns3:test3                                           | test3                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns3           | SomeOther  | text()     | /ns1:root[1]/ns2:level1[1]/ns3:InnerElement[1]/text()[1]                        | Some inner element               |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns1           | defaultNs  | text()     | /ns1:root[1]/ns2:level1[1]/ns1:multiText[1]/text()[1]                           | text1                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns1           | defaultNs  | text()     | /ns1:root[1]/ns2:level1[1]/ns1:multiText[1]/ns1:someInner[1]/text()[1]          | blah                             |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns1           | defaultNs  | text()     | /ns1:root[1]/ns2:level1[1]/ns1:multiText[1]/text()[2]                           | text2                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns1           | defaultNs  | text()     | /ns1:root[1]/ns2:level1[1]/ns1:multiText[1]/text()[3]                           | text3                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns1           | defaultNs  | text()     | /ns1:root[1]/ns2:level1[1]/ns1:repeating[1]/text()[1]                           | rep 1                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns1           | defaultNs  | text()     | /ns1:root[1]/ns2:level1[1]/ns1:repeating[2]/text()[1]                           | rep 2                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 2        | ns1           | defaultNs  | @r2        | /ns1:root[1]/ns2:level1[1]/ns1:repeating[2]/@r2                                 | r2                               |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 2        | ns3           | SomeOther  | @r4        | /ns1:root[1]/ns2:level1[1]/ns3:repeating[1]/@r4                                 | r4                               |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns3           | SomeOther  | text()     | /ns1:root[1]/ns2:level1[1]/ns3:repeating[1]/text()[1]                           | rep 4                            |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns1           | defaultNs  | text()     | /ns1:root[1]/ns2:level1[1]/ns1:level2[1]/ns1:level3[2]/ns1:content[1]/text()[1] | Content in second level3 element |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 2        | ns1           | defaultNs  | @oneMore   | /ns1:root[1]/ns2:level1[1]/ns1:repeating[3]/@oneMore                            | oneMore                          |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
| 3        | ns1           | defaultNs  | text()     | /ns1:root[1]/ns2:level1[1]/ns1:repeating[3]/text()[1]                           | one more                         |
+----------+---------------+------------+------------+---------------------------------------------------------------------------------+----------------------------------+
*/

--Just to show that the generated XPaths return the expected values (attention: we must use our own prefixes, even for the default namespace):

WITH XMLNAMESPACES( 'defaultNs' AS ns1
                   ,'dummy1'    AS ns2
                   ,'SomeOther' AS ns3)
SELECT @xml.value('/ns1:root[1]/ns2:level1[1]/ns1:multiText[1]/ns1:someInner[1]/text()[1]','nvarchar(max)') Is_blah
      ,@xml.value('/ns1:root[1]/ns2:level1[1]/ns1:level2[1]/ns1:level3[2]/ns1:content[1]/text()[1]','nvarchar(max)') Is_Content_in_second_level3_element
      ,@xml.value('/ns1:root[1]/ns2:level1[1]/ns1:repeating[3]/@oneMore','nvarchar(max)') Is_attribute_oneMore
      ,@xml.value('/ns1:root[1]/ns2:level1[1]/ns1:multiText[1]/text()[3]','nvarchar(max)') Is_3rd_text_in_multiText;
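
Why the own prefixes are required can be seen with a much smaller document. The following is a minimal sketch and not part of the query above; the variable @demo and its content are made up for illustration. It assumes only that the document declares a default namespace.

```sql
-- Hypothetical mini document using nothing but a default namespace
DECLARE @demo XML = N'<root xmlns="defaultNs"><child>hello</child></root>';

-- No namespace declared in the query: the path matches nothing, the result is NULL
SELECT @demo.value('/root[1]/child[1]/text()[1]','nvarchar(max)') AS WithoutNamespace;

-- The prefix ns1 is our own choice; only the URI has to match the document
WITH XMLNAMESPACES('defaultNs' AS ns1)
SELECT @demo.value('/ns1:root[1]/ns1:child[1]/text()[1]','nvarchar(max)') AS WithOwnPrefix;
```

The document itself never mentions ns1. The match happens through the URI, which is why the generated XPaths can use any prefix as long as it is bound to the right URI.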

The idea in short:

  • The namespace prefixes can be chosen freely. There is no XQuery function available in T-SQL to find the actual prefix, so we just use our own prefixes. Only the underlying URI matters.
  • The first CTE creates a set of all occurring URIs and returns it together with a prefix.
  • The recursive CTE will traverse deeper and deeper into the XML. This will continue as long as APPLY with .nodes() can return nested nodes.
  • One CTE adds text() nodes - if there are any.
  • One CTE adds attributes - if there are any.
  • The full name is concatenated as well as the full XPath.
  • The NodeType helps to distinguish between elements (=1), attributes (=2) and text() nodes (=3).
  • The CASTs and COLLATEs help to avoid data type mismatches (recursive CTEs are very picky about this).
  • The concatenated SortString is needed to return the output in the same order as in the XML document.
  • You might use SELECT * ... to see all returned columns.
  • You might query this without WHERE NodeValue IS NOT NULL to see more of the empty structure.
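
The zero-padding used for the SortString can be tried in isolation. This is a small sketch under assumptions: the position numbers and the column names Pos, Unpadded and Padded are invented for the demonstration, and only the REPLACE(STR(...,5),' ','0') expression is taken from the query above.

```sql
-- Hypothetical position numbers, as they might occur for sibling nodes
SELECT v.Pos
      ,CAST(v.Pos AS VARCHAR(10))    AS Unpadded  -- plain string: '10' would sort before '2'
      ,REPLACE(STR(v.Pos,5),' ','0') AS Padded    -- STR right-aligns to 5 characters, REPLACE turns the blanks into zeros
FROM (VALUES(1),(2),(10),(123)) v(Pos)
ORDER BY Padded;
```

Sorting by Padded keeps the numeric order, while sorting by Unpadded would place 10 and 123 before 2. With a fixed width per level, the concatenated string sorts like the document itself, as long as no position needs more than five digits.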


Source: https://stackoverflow.com/questions/61886650/read-an-unknown-xml-with-namespaces-attributes-etc-into-a-full-eav-list
