I went through "XSLT Grouping Examples" and "Using for-each-group for high performance XSLT". I have a problem with for-each-group.
My XML
This transformation uses keys and handles h1-title to h6-title:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="body">
    <xsl:apply-templates select="p[@name='h1-title']" />
  </xsl:template>
  <xsl:key name="next-headings" match="p[@name='h6-title']"
      use="generate-id(preceding-sibling::p
             [   @name='h1-title'
              or @name='h2-title'
              or @name='h3-title'
              or @name='h4-title'
              or @name='h5-title'
             ][1])" />
  <xsl:key name="next-headings" match="p[@name='h5-title']"
      use="generate-id(preceding-sibling::p
             [   @name='h1-title'
              or @name='h2-title'
              or @name='h3-title'
              or @name='h4-title'
             ][1])" />
  <xsl:key name="next-headings" match="p[@name='h4-title']"
      use="generate-id(preceding-sibling::p
             [   @name='h1-title'
              or @name='h2-title'
              or @name='h3-title'
             ][1])" />
  <xsl:key name="next-headings" match="p[@name='h3-title']"
      use="generate-id(preceding-sibling::p
             [   @name='h1-title'
              or @name='h2-title'
             ][1])" />
  <xsl:key name="next-headings" match="p[@name='h2-title']"
      use="generate-id(preceding-sibling::p
             [@name='h1-title'][1])" />
  <xsl:key name="immediate-nodes" match=
      "node()[not(self::p)
             or
              not(contains('|h1-title|h2-title|h3-title|h4-title|h5-title|h6-title|',
                           concat('|',@name,'|')
                          )
                 )]"
      use="generate-id(preceding-sibling::p
             [contains('|h1-title|h2-title|h3-title|h4-title|h5-title|h6-title|',
                       concat('|',@name,'|')
                      )
             ][1])" />
  <xsl:template match=
      "p[contains('|h1-title|h2-title|h3-title|h4-title|h5-title|h6-title|',
                  concat('|',@name,'|')
                 )]">
    <xsl:variable name="vLevel" select="substring(@name,2,1)" />
    <xsl:element name="h{$vLevel}">
      <xsl:copy-of select="."/>
      <xsl:apply-templates select="key('immediate-nodes', generate-id())" />
      <xsl:apply-templates select="key('next-headings', generate-id())" />
    </xsl:element>
  </xsl:template>
  <xsl:template match="/*/node()" priority="-20">
    <xsl:copy-of select="." />
  </xsl:template>
</xsl:stylesheet>
When applied to this XML document (the provided one, corrected and using uniform values for the name attribute):
<body>
  <p name="h1-title" other="main">Introduction</p>
  <p name="h2-title" other="other-h2">XSLT and XQuery</p>
  <p name="h3-title" other=" other-h3">XSLT</p>
  <p name="">
    <p1 name="bold"> XSLT is used to write stylesheets.</p1>
  </p>
  <p name="h2-title" other="other-h2">XQuery</p>
  <p name="">
    <p1 name="bold"> XQuery is used to query XML databases.</p1>
  </p>
  <p name="h3-title" other="other-h3">XQuery and stylesheets</p>
  <p name="">
    <p1 name="bold"> XQuery is used to query XML databases.</p1>
  </p>
  <p name="h1-title" other="other-h1">XSLT and XQuery</p>
  <p name="h2-title" other=" other-h2">XSLT</p>
</body>
the wanted, correct result is produced:
<h1>
  <p name="h1-title" other="main">Introduction</p>
  <h2>
    <p name="h2-title" other="other-h2">XSLT and XQuery</p>
    <h3>
      <p name="h3-title" other=" other-h3">XSLT</p>
      <p name="">
        <p1 name="bold"> XSLT is used to write stylesheets.</p1>
      </p>
    </h3>
  </h2>
  <h2>
    <p name="h2-title" other="other-h2">XQuery</p>
    <p name="">
      <p1 name="bold"> XQuery is used to query XML databases.</p1>
    </p>
    <h3>
      <p name="h3-title" other="other-h3">XQuery and stylesheets</p>
      <p name="">
        <p1 name="bold"> XQuery is used to query XML databases.</p1>
      </p>
    </h3>
  </h2>
</h1>
<h1>
  <p name="h1-title" other="other-h1">XSLT and XQuery</p>
  <h2>
    <p name="h2-title" other=" other-h2">XSLT</p>
  </h2>
</h1>
Do note:
This transformation solves the main problem of generating the hierarchy. Only trivial changes are needed if the top-level name attribute is required to have the value "h-title".
If more hierarchy levels are necessary, this only requires mechanically adding the corresponding or clauses to the key definitions and extending the pipe-delimited string of all name attribute values with the corresponding new strings.
Here I have adapted and re-used a solution that Jeni Tennison gave for a similar problem.
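To illustrate the note about adding hierarchy levels, here is a rough sketch of what a seventh level might look like. The value "h7-title" is hypothetical (it does not occur in the sample document). The fragment is meant to be added as a top-level declaration to the XSLT 1.0 stylesheet above, not used on its own:

```xml
<!-- Assumption: a seventh heading level named "h7-title" (not in the original data).
     Its parent is the nearest preceding sibling heading of any higher level. -->
<xsl:key name="next-headings" match="p[@name='h7-title']"
    use="generate-id(preceding-sibling::p
           [   @name='h1-title'
            or @name='h2-title'
            or @name='h3-title'
            or @name='h4-title'
            or @name='h5-title'
            or @name='h6-title'
           ][1])" />
```

In addition, the pipe-delimited string would have to become '|h1-title|h2-title|h3-title|h4-title|h5-title|h6-title|h7-title|' in all three places where it appears (twice in the immediate-nodes key and once in the template match pattern). Note that substring(@name,2,1) only picks up a single digit, so this scheme works unchanged only up to h9-title.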
Each of your grouping steps is taking the original set of elements as input, whereas you need each step to work on the groups produced by the previous grouping step. And there are lots of other errors too, for example h1-title is not an attribute name.
It needs to be something like this:
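<xsl:for-each-group select="*" group-starting-with="*[@name='h1-title']">
  <xsl:choose>
    <xsl:when test="@name='h1-title'">
      <h1>
        <xsl:for-each-group select="current-group()" group-starting-with="*[@name='h2-title']">
          <xsl:choose>
            <xsl:when test="@name='h2-title'">
              <h2>
                ... similar logic for the next level ...
              </h2>
            </xsl:when>
            <xsl:otherwise>
              <!-- the h1 heading itself and anything before the first h2 -->
              <xsl:copy-of select="current-group()"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </h1>
    </xsl:when>
    <xsl:otherwise>
      <!-- anything before the first h1 heading -->
      <xsl:copy-of select="current-group()"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:for-each-group>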
You can nest that as deeply as you want, depending on how many levels you want to handle; or, if you want to handle an indefinite number, you can put the code in a named template and make a recursive call to handle the next level. At the innermost level, leave out the xsl:choose and just do <xsl:copy-of select="current-group()"/>.
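For the recursive variant, a minimal sketch of such a named template could look like the following. It assumes XSLT 2.0, a body root element, and heading values of the form h1-title, h2-title, and so on; the template name group-level and its parameters items and level are made up for this illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output indent="yes"/>

  <xsl:template match="body">
    <xsl:copy>
      <xsl:call-template name="group-level">
        <xsl:with-param name="items" select="*"/>
        <xsl:with-param name="level" select="1"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical helper: groups $items by the heading of the given level -->
  <xsl:template name="group-level">
    <xsl:param name="items" as="element()*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:variable name="heading" select="concat('h', $level, '-title')"/>
    <xsl:for-each-group select="$items" group-starting-with="*[@name = $heading]">
      <xsl:choose>
        <xsl:when test="@name = $heading">
          <xsl:element name="h{$level}">
            <!-- copy the heading itself, then group the rest one level deeper -->
            <xsl:copy-of select="."/>
            <xsl:call-template name="group-level">
              <xsl:with-param name="items" select="current-group() except ."/>
              <xsl:with-param name="level" select="$level + 1"/>
            </xsl:call-template>
          </xsl:element>
        </xsl:when>
        <xsl:otherwise>
          <!-- group does not start with a heading of this level: copy it as is -->
          <xsl:copy-of select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

The recursion stops by itself: each call passes on a strictly smaller sequence (the group minus its heading), and a group that does not start with a heading of the current level is simply copied.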
(I just noticed the trailing spaces in the "name" attribute. If these really exist, you will need to include them in the comparison test, or use normalize-space() to get rid of them.)
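As a sketch of the normalize-space() option, assuming the attribute values differ from 'h1-title' only by leading or trailing whitespace, the comparison would be written like this in both the grouping pattern and the test (a fragment for use inside a template, shown for the first level only):

```xml
<xsl:for-each-group select="*"
    group-starting-with="*[normalize-space(@name) = 'h1-title']">
  <xsl:choose>
    <!-- normalize-space() strips leading/trailing whitespace before comparing -->
    <xsl:when test="normalize-space(@name) = 'h1-title'">
      <h1>
        <xsl:copy-of select="current-group()"/>
      </h1>
    </xsl:when>
    <xsl:otherwise>
      <xsl:copy-of select="current-group()"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:for-each-group>
```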
Here is an XSLT 2.0 stylesheet using for-each-group in a recursive function (I prefer that to a named template with XSLT 2.0):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf">
  <xsl:param name="prefix" as="xs:string" select="'h'"/>
  <xsl:param name="suffix" as="xs:string" select="'-title'"/>
  <xsl:output method="html" version="4.0" indent="yes"/>
  <xsl:function name="mf:group" as="node()*">
    <xsl:param name="items" as="node()*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:for-each-group select="$items" group-starting-with="p[@name = concat($prefix, $level, $suffix)]">
      <xsl:choose>
        <xsl:when test="not(self::p[@name = concat($prefix, $level, $suffix)])">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:element name="h{$level}">
            <xsl:apply-templates select="."/>
            <xsl:sequence select="mf:group(current-group() except ., $level + 1)"/>
          </xsl:element>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:function>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* , node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="body">
    <xsl:copy>
      <xsl:sequence select="mf:group(*, 1)"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
When I apply that stylesheet with Saxon 9 to the input
<body>
  <p name="h-title" other="main">Introduction</p>
  <p name="h1-title" other="other-h1">XSLT and XQuery</p>
  <p name="h2-title" other=" other-h2">XSLT</p>
  <p name="">
    <p1 name="bold"> XSLT is used to write stylesheets.</p1>
  </p>
  <p name="h2-title" other="other-h2">XQuery</p>
  <p name="">
    <p1 name="bold"> XQuery is used to query XML databases.</p1>
  </p>
  <p name="h3-title" other="other-h3">XQuery and stylesheets</p>
  <p name="">
    <p1 name="bold"> XQuery is used to query XML databases.</p1>
  </p>
  <p name="h1-title" other="other-h1">XSLT and XQuery</p>
  <p name="h2-title" other=" other-h2">XSLT</p>
</body>
I get the result
<body>
  <p name="h-title" other="main">Introduction</p>
  <h1>
    <p name="h1-title" other="other-h1">XSLT and XQuery</p>
    <h2>
      <p name="h2-title" other=" other-h2">XSLT</p>
      <p name="">
        <p1 name="bold"> XSLT is used to write stylesheets.</p1>
      </p>
    </h2>
    <h2>
      <p name="h2-title" other="other-h2">XQuery</p>
      <p name="">
        <p1 name="bold"> XQuery is used to query XML databases.</p1>
      </p>
      <h3>
        <p name="h3-title" other="other-h3">XQuery and stylesheets</p>
        <p name="">
          <p1 name="bold"> XQuery is used to query XML databases.</p1>
        </p>
      </h3>
    </h2>
  </h1>
  <h1>
    <p name="h1-title" other="other-h1">XSLT and XQuery</p>
    <h2>
      <p name="h2-title" other=" other-h2">XSLT</p>
    </h2>
  </h1>
</body>