Merge XML nodes sharing the same name with “_LIST” in the node name and also at root level

Submitted by 非 Y 不嫁゛ on 2019-12-11 19:15:15

Question


Below is the input XML, followed by the output I am looking for:

   <xml>
    <a>
        <element0>987</element0>
    </a>
    <a>
        <a_list_one>
            <a_lag_one>
                <element1>123</element1>
                <element2>456</element2>
            </a_lag_one>
        </a_list_one>
        <a_list_one>
            <a_lag_one>
                <element1>789</element1>
                <element2>678</element2>
            </a_lag_one>                
        </a_list_one>
        <a_list_two>
            <a_lag_two>
                <a_list_three>
                    <a_lag_three>
                        <element3>570</element3>
                        <element4>678</element4>
                    </a_lag_three>
                </a_list_three>
                <a_list_three>
                    <a_lag_three>
                        <element3>989</element3>
                        <element4>231</element4>
                    </a_lag_three>
                </a_list_three>
            </a_lag_two>
            <a_lag_two>
                <a_list_three>
                    <a_lag_three>
                        <element3>570</element3>
                        <element4>678</element4>
                    </a_lag_three>
                </a_list_three>
                <a_list_three>
                    <a_lag_three>
                        <element3>9873</element3>
                        <element4>278</element4>
                    </a_lag_three>
                </a_list_three>
                <a_list_four>
                    <a_lag_four>
                        <element5>9121</element5>
                        <element6>9879</element6>
                    </a_lag_four>
                </a_list_four>
                <a_list_three>
                    <a_lag_four>
                        <element5>098</element5>
                        <element6>231</element6>
                    </a_lag_four>
                </a_list_three>
            </a_lag_two>
        </a_list_two>
        <a_list_four>
                    <a_lag_four>
                        <element5>654</element5>
                        <element6>7665</element6>
                    </a_lag_four>
        </a_list_four>
    </a>
    <b>
        <b_list_one>
            <b_lag_one>
                <element8>123</element8>
                <element9>456</element9>
            </b_lag_one>
        </b_list_one>
    </b>
    <b>
        <b_list_one>
            <b_lag_one>
                <element8>789</element8>
                <element9>678</element9>
            </b_lag_one>            
        </b_list_one>
    </b>
</xml>

Desired XML is:

   <xml>
    <a>
        <element0>987</element0>
        <a_list_one>
            <a_lag_one>
                <element1>123</element1>
                <element2>456</element2>
            </a_lag_one>
            <a_lag_one>
                <element1>789</element1>
                <element2>678</element2>
            </a_lag_one>
        </a_list_one>
        <a_list_two>
            <a_lag_two>
                <a_list_three>
                    <a_lag_three>
                        <element3>570</element3>
                        <element4>678</element4>
                    </a_lag_three>
                    <a_lag_three>
                        <element3>989</element3>
                        <element4>231</element4>
                    </a_lag_three>
                </a_list_three>
            </a_lag_two>
            <a_lag_two>
                <a_list_three>
                    <a_lag_three>
                        <element3>570</element3>
                        <element4>678</element4>
                    </a_lag_three>
                    <a_lag_three>
                        <element3>9873</element3>
                        <element4>278</element4>
                    </a_lag_three>
                    <a_lag_four>
                        <element5>098</element5>
                        <element6>231</element6>
                    </a_lag_four>
                </a_list_three>
                <a_list_four>
                    <a_lag_four>
                        <element5>9121</element5>
                        <element6>9879</element6>
                    </a_lag_four>
                </a_list_four>
            </a_lag_two>
        </a_list_two>
        <a_list_four>
            <a_lag_four>
                <element5>654</element5>
                <element6>7665</element6>
            </a_lag_four>
        </a_list_four>      
    </a>
    <b>
        <b_list_one>
            <b_lag_one>
                <element8>123</element8>
                <element9>456</element9>
            </b_lag_one>
            <b_lag_one>
                <element8>789</element8>
                <element9>678</element9>
            </b_lag_one>            
        </b_list_one>
    </b>
</xml>

I am looking for an XSLT stylesheet that converts the input into the desired output. Sibling nodes that share the same name, where that name contains "_LIST", should be merged into a single node. However, this logic should apply only within the first "_LIST" node and not to inner nodes. Secondly, nodes at the root level should be merged as well: for example, the output should contain only one "a" element and one "b" element. Kindly help.


Answer 1:


Here is a solution for XSLT 1.0

  <xsl:stylesheet version="1.0"
  xmlns:msxml="urn:schemas-microsoft-com:xslt"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes"/>

    <xsl:key name="xmlChildren" match="xml/*" use="local-name()"/>
    <xsl:key name="list" match="*[contains(local-name(),'_list')]" use="generate-id(..)"/>

    <!-- Select the child nodes of the xml node. -->
    <xsl:template match="xml/*">
      <!-- Get the name of the current node. -->
      <xsl:variable name="localName" select="local-name()"/>
      <!-- Is this the first child of the xml node with this name? -->
      <xsl:if test="generate-id(.) = generate-id(key('xmlChildren', $localName)[1])">
        <xsl:copy>
          <!-- Output all of the xml grandchild nodes of any xml child node with same name as the current node. -->
          <xsl:apply-templates select="key('xmlChildren', $localName)/*">
              <xsl:with-param name="parentName" select="$localName"/>
          </xsl:apply-templates>
        </xsl:copy>
      </xsl:if>
    </xsl:template>

    <!-- Select the nodes with a local name that contains '_list'. -->
    <xsl:template match="*[contains(local-name(),'_list')]">
      <xsl:param name="parentName"/>

      <xsl:variable name="parentID" select="generate-id(..)"/>

      <!-- Get the name of the current node. -->
      <xsl:variable name="localName" select="local-name()"/>

      <xsl:choose>
        <!-- Is this list a first generation grandchild of xml? -->
        <xsl:when test="parent::*/parent::xml">
          <!-- Is this the first instance of this list? -->
          <xsl:if test="generate-id(.) = generate-id(key('xmlChildren', $parentName)/*[local-name()=$localName][1])">
            <xsl:copy>
              <xsl:apply-templates select="key('xmlChildren', $parentName)/*[local-name()=$localName]/*"/>
            </xsl:copy>
          </xsl:if> 
        </xsl:when>
        <xsl:otherwise>
          <!-- Is this the first instance of this list? -->
          <xsl:if test="generate-id(.) = generate-id(key('list', $parentID)[local-name()=$localName][1])">
            <xsl:copy>
              <xsl:apply-templates select="key('list', $parentID)[local-name() = $localName]/*"/>
            </xsl:copy>
          </xsl:if>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

    <xsl:template match="node()|@*">
      <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
    </xsl:template>  

  </xsl:stylesheet>



Answer 2:


I think in XQuery 3 you can solve this using two nested for .. group by expressions:

/*/element { node-name(.) } {
    for $child-element at $pos in *
    group by $element-name := node-name($child-element)
    order by $pos[1]
    return
        element { $element-name } {
            for $grand-child at $pos in $child-element/*
            let $grand-child-name := node-name($grand-child)
            group by $key := $grand-child-name, $handle := contains(string($grand-child-name), '_list')
            order by $pos[1]
            return
                if ($handle)
                then
                    element { $key } {
                        $grand-child/*
                    }
                else $grand-child
        }
}

https://xqueryfiddle.liberty-development.net/pPgCcor
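
For readers who do not have an XQuery 3 processor at hand, the two nested groupings can be approximated in Python with the standard library. This is only a sketch of the same idea, not a translation of the query above: the function name `merge_two_levels` is made up for illustration, and it relies on Python dicts preserving insertion order to play the role of `order by $pos[1]`.

```python
import xml.etree.ElementTree as ET


def merge_two_levels(root):
    """Merge same-named children of root, then same-named '_list' grandchildren."""
    out = ET.Element(root.tag)

    # Outer grouping: children of the root by name, in first-occurrence order.
    groups = {}
    for child in root:
        groups.setdefault(child.tag, []).append(child)

    for name, members in groups.items():
        merged = ET.SubElement(out, name)
        lists = {}  # '_list' name -> the single merged element created for it
        # Inner grouping: walk the grandchildren of every member of the group.
        for member in members:
            for grand in member:
                if "_list" in grand.tag:
                    target = lists.get(grand.tag)
                    if target is None:
                        target = ET.SubElement(merged, grand.tag)
                        lists[grand.tag] = target
                    target.extend(list(grand))
                else:
                    # Non-list grandchildren are passed through unchanged.
                    merged.append(grand)
    return out


if __name__ == "__main__":
    doc = ET.fromstring(
        "<xml><a><e0>1</e0></a>"
        "<a><a_list_one><x>1</x></a_list_one><a_list_one><x>2</x></a_list_one></a></xml>"
    )
    print(ET.tostring(merge_two_levels(doc), encoding="unicode"))
```

Like the query, this handles exactly two levels; "_list" elements nested deeper are copied as they are.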

For XSLT 1 I would use keys, like the solution already suggested, but I think it is then easier to use two different match patterns for each key: one for the first item in a group established by the key, which makes a copy and processes the child nodes of the whole group, and a second, empty one that suppresses processing of the remaining elements with the same name in that group:

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:key name="child-group" match="/*/*" use="name()"/>
  <xsl:key name="grand-child-group" match="/*/*/*[contains(local-name(), '_list')]" use="name()"/>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/*/*[generate-id() = generate-id(key('child-group', name())[1])]">
      <xsl:copy>
          <xsl:apply-templates select="key('child-group', name())/node()"/>
      </xsl:copy>
  </xsl:template>

  <xsl:template match="/*/*[not(generate-id() = generate-id(key('child-group', name())[1]))]"/>

  <xsl:template match="/*/*/*[contains(local-name(), '_list')][generate-id() = generate-id(key('grand-child-group', name())[1])]">
      <xsl:copy>
          <xsl:apply-templates select="key('grand-child-group', name())/node()"/>
      </xsl:copy>
  </xsl:template>

  <xsl:template match="/*/*/*[contains(local-name(), '_list')][not(generate-id() = generate-id(key('grand-child-group', name())[1]))]"/>  

</xsl:stylesheet>

https://xsltfiddle.liberty-development.net/jyH9rN5
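
The "first item in the group copies, the rest produce nothing" pattern used by the stylesheet above may be easier to follow outside XSLT. The Python sketch below mimics it for the root-level merge only, under the assumption that a dict of lists is a fair stand-in for `xsl:key` and that an identity check (`is`) stands in for the `generate-id()` comparison. The names `build_key` and `merge_children` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_key(elements):
    """Rough analogue of xsl:key: map each element name to its elements in document order."""
    index = defaultdict(list)
    for el in elements:
        index[el.tag].append(el)
    return index


def merge_children(root):
    child_group = build_key(root)  # plays the role of the 'child-group' key
    out = ET.Element(root.tag)
    for child in root:
        group = child_group[child.tag]
        if child is group[0]:
            # "First in group" template: copy once, pull in the children of the whole group.
            merged = ET.SubElement(out, child.tag)
            for member in group:
                merged.extend(list(member))
        # Otherwise: the "empty template" case, so the duplicate produces no output.
    return out


if __name__ == "__main__":
    doc = ET.fromstring("<xml><a><p/></a><b><q/></b><a><r/></a></xml>")
    merged_doc = merge_children(doc)
    print([(c.tag, [g.tag for g in c]) for c in merged_doc])
```

The second level of the stylesheet (the `grand-child-group` key) follows the same shape, applied to the "_list" grandchildren.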

Based on your comment I have also tried to make the XQuery 3 solution recursive:

declare function local:group($elements as element()*) as element()*
{
  for $child-element at $pos in $elements
  let $child-name := node-name($child-element)
  group by $name-group := $child-name, $match := contains(string($child-name), '_list')
  order by $pos[1]
  return
      if ($match)
      then element { $name-group } {
          local:group($child-element/*)
      }
      else if (not($child-element/*))
      then $child-element
      else $child-element/element {$name-group} { local:group(*) }
};

/*/element { node-name(.) } {
    for $child-element at $pos in *
    group by $element-name := node-name($child-element)
    order by $pos[1]
    return element { $element-name } {
         local:group($child-element/*)
    }

}

https://xqueryfiddle.liberty-development.net/pPgCcor/1
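
The recursive idea can also be sketched in Python with the standard library. This is an approximation rather than a port of `local:group`: the helper names `group` and `merge` are made up, attributes and leaf text are carried over on rebuilt elements (which the query does not do explicitly), and whitespace tails are dropped.

```python
import xml.etree.ElementTree as ET


def group(elements):
    """Recursively merge same-named siblings whose name contains '_list'."""
    buckets = {}  # name -> siblings with that name, in first-occurrence order
    for el in elements:
        buckets.setdefault(el.tag, []).append(el)

    result = []
    for name, members in buckets.items():
        if "_list" in name:
            # One merged element holding the (recursively grouped) children of all members.
            merged = ET.Element(name)
            merged.extend(group([c for m in members for c in m]))
            result.append(merged)
        else:
            # Other elements stay separate, but their children are still regrouped.
            for m in members:
                copy = ET.Element(name, m.attrib)
                copy.text = m.text
                copy.extend(group(list(m)))
                result.append(copy)
    return result


def merge(root):
    """Merge same-named children of the root, then regroup everything below them."""
    out = ET.Element(root.tag)
    top = {}
    for child in root:
        top.setdefault(child.tag, []).append(child)
    for name, members in top.items():
        merged = ET.SubElement(out, name)
        merged.extend(group([c for m in members for c in m]))
    return out
```

As in the query, siblings with the same name come out next to each other, ordered by the first occurrence of each name, so interleaved siblings may be reordered.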



Source: https://stackoverflow.com/questions/52762727/merge-xml-nodes-sharing-the-same-name-with-list-in-the-node-name-and-also-at
