how to get the most deeply nested element nodes using xpath? (implementation with XMLTWIG)

邮差的信 提交于 2019-11-28 01:20:03

问题


I need to extract (XSLT, xpath, xquery... Preferably xpath) the most deeply nested element nodes with method (DEST id="RUSSIA" method="delete"/>) and his direct ancestor (SOURCE id="AFRICA" method="modify">).

I don't want to get the top nodes with methods ( main method="modify"> or main method="modify"> ).

The deepest nested elements with method correspond to real actions. The top elements with method actually are dummy actions that must not be taken into account.

Here is my XML sample file:

<?xml version="1.0" encoding="UTF-8"?>
<main method="modify">
<MACHINE method="modify">  
  <SOURCE id="AFRICA" method="modify">
    <DEST id="RUSSIA" method="delete"/>
    <DEST id="USA" method="modify"/>
  </SOURCE>

  <SOURCE id="USA" method="modify">
    <DEST id="AUSTRALIA" method="modify"/>
    <DEST id="CANADA" method="create"/>
  </SOURCE>
</MACHINE>
</main>

This is Xpath output I expect:

<SOURCE id="AFRICA" method="modify"><DEST id="RUSSIA" method="delete"/>

<SOURCE id="AFRICA" method="modify"><DEST id="USA" method="modify"/>

<SOURCE id="USA" method="modify"><DEST id="AUSTRALIA" method="modify"/>

<SOURCE id="USA" method="modify"><DEST id="CANADA" method="create"/>

My current xpath command does not provide the adequate result.

Command xpath("//[@method]/ancestor::*") which is returning:

<main><MACHINE method="modify">                                        # NOT WANTED

<MACHINE method="modify"><SOURCE id="AFRICA" method="modify">          # NOT WANTED

<MACHINE method="modify"><SOURCE id="USA" method="modify">             # NOT WANTED

<SOURCE id="AFRICA" method="modify"><DEST id="RUSSIA" method="delete"/>

<SOURCE id="AFRICA" method="modify"><DEST id="USA" method="modify"/>

<SOURCE id="USA" method="modify"><DEST id="AUSTRALIA" method="modify"/>

<SOURCE id="USA" method="modify"><DEST id="CANADA" method="create"/>

My xmltwig code for additional information (context):

#!/usr/bin/perl -w
use warnings;
use XML::Twig;
use XML::XPath;

@my $t= XML::Twig->new;
my $v= XML::Twig::Elt->new;
$t-> parsefile ('input.xml');

@abc=$t->get_xpath("\/\/[\@method]\/ancestor\:\:\*") ;
 foreach $v (@abc)   # outer 1
 {
    foreach $v ($v ->children)  # internal 1
    {
      $w=$v->parent;
      print $w->start_tag;
      print $v->start_tag;
    }
  }

回答1:


The nodes with maximum depth can be found with

//*[count(ancestor::*) = max(//*/count(ancestor::*))]

but it might perform horribly, depending how smart your optimizer is.

Having found those nodes, it is of course trivial to find their ancestors. But you are looking for output with more structure than XPath alone can provide.




回答2:


As I mentioned in my comment on the question, I don't think this is possible with pure XPath as XPath doesn't have anything like a current() function that would allow to refer to the context outside of a [] restriction.

The most similar solution should be this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:ZD="http://xyz.abc">
    <xsl:output method="text"/>

    <xsl:template match="//*">
        <xsl:choose>
            <xsl:when test="not(//*[count(ancestor::node()) > count(current()/ancestor::node())])"><xsl:value-of select="local-name(.)"/><xsl:text>
</xsl:text></xsl:when>
            <xsl:otherwise>
                <xsl:copy>
                    <xsl:apply-templates select="@*|node()"/>
                </xsl:copy>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <xsl:template match="text()|@*"/>
</xsl:stylesheet>

The <xsl:when> element finds the most deeply nested elements. As an example, I'm outputting the local names of the found elements, followed by a newline, but of course you can output anything you need there.

Update: Note that this is based on XPath 1.0 knowledge/tools. It seems that this is indeed possible to express in XPath 2.0.




回答3:


One such XPath2.0 expression is:

//*[not(*)
  and
   count(ancestor::*)
  =
   max(//*[not(*)]/count(ancestor::*))
   ]
     /(self::node|..)

To illustrate this with a complete XSLT 2.0 example:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:variable name="vResult" select=
     "//*[not(*)
        and
          count(ancestor::*)
       =
        max(//*[not(*)]/count(ancestor::*))
        ]
          /(self::node|..)
     "/>

 <xsl:template match="/">
     <xsl:sequence select="$vResult"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the provided XML document:

<main method="modify">
    <MACHINE method="modify">
        <SOURCE id="AFRICA" method="modify">
            <DEST id="RUSSIA" method="delete"/>
            <DEST id="USA" method="modify"/>
        </SOURCE>
        <SOURCE id="USA" method="modify">
            <DEST id="AUSTRALIA" method="modify"/>
            <DEST id="CANADA" method="create"/>
        </SOURCE>
    </MACHINE>
</main>

the XPath expression is evaluated and the selected elements (the elements at maximum depth and their parents) are copied to the output:

<SOURCE id="AFRICA" method="modify">
            <DEST id="RUSSIA" method="delete"/>
            <DEST id="USA" method="modify"/>
        </SOURCE>
<SOURCE id="USA" method="modify">
            <DEST id="AUSTRALIA" method="modify"/>
            <DEST id="CANADA" method="create"/>
        </SOURCE>



回答4:


The stylesheet

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/">
  <xsl:apply-templates 
     select="//DEST[@method and not(node())]"/>
</xsl:template>

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="DEST[@method and not(node())]">
  <xsl:apply-templates select="..">
    <xsl:with-param name="leaf" select="current()"/>
  </xsl:apply-templates>
</xsl:template>

<xsl:template match="*[DEST[@method and not(node())]]">
  <xsl:param name="leaf"/>
  <xsl:copy>
    <xsl:copy-of select="@* , $leaf"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

transforms

<?xml version="1.0" encoding="UTF-8"?>
<main method="modify">
<MACHINE method="modify">  
  <SOURCE id="AFRICA" method="modify">
    <DEST id="RUSSIA" method="delete"/>
    <DEST id="USA" method="modify"/>
  </SOURCE>

  <SOURCE id="USA" method="modify">
    <DEST id="AUSTRALIA" method="modify"/>
    <DEST id="CANADA" method="create"/>
  </SOURCE>
</MACHINE>
</main>

into

<SOURCE id="AFRICA" method="modify">
   <DEST id="RUSSIA" method="delete"/>
</SOURCE>
<SOURCE id="AFRICA" method="modify">
   <DEST id="USA" method="modify"/>
</SOURCE>
<SOURCE id="USA" method="modify">
   <DEST id="AUSTRALIA" method="modify"/>
</SOURCE>
<SOURCE id="USA" method="modify">
   <DEST id="CANADA" method="create"/>
</SOURCE>


来源:https://stackoverflow.com/questions/11135620/how-to-get-the-most-deeply-nested-element-nodes-using-xpath-implementation-wit

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!