I'm a beginner in XSLT and figured out that I cannot just add numbers to a variable or change its value in any way.
I have an XML document with a list of numbers
You need a variation on Muenchian grouping. Start by defining a key as:
<xsl:key name="numbers" match="entry[field/@type='num']" use="generate-id(following-sibling::entry[field/@type='summary'][1])" />
then use:
#<xsl:value-of select="sum(key('numbers', generate-id())/field/@value)" />#
to sum the numbers in the current group.
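For illustration, the two fragments above can be dropped into a complete stylesheet. This is only a sketch, assuming the input has the shape list/entry/field used elsewhere in this thread. The surrounding templates and the text output method are assumptions added here and are not part of the fragments above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every num entry by the generated id of the first summary entry that follows it -->
  <xsl:key name="numbers" match="entry[field/@type='num']"
           use="generate-id(following-sibling::entry[field/@type='summary'][1])"/>

  <!-- Assumption: all entry elements are children of the document element -->
  <xsl:template match="/*">
    <xsl:apply-templates select="entry"/>
  </xsl:template>

  <!-- Print each number on its own line -->
  <xsl:template match="entry[field/@type='num']">
    <xsl:value-of select="concat(field/@value, '&#xA;')"/>
  </xsl:template>

  <!-- Look up the num entries keyed to this summary entry and print their sum -->
  <xsl:template match="entry[field/@type='summary']">
    <xsl:value-of select="concat('#', sum(key('numbers', generate-id())/field/@value), '#&#xA;')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample input from this thread, this sketch should print each number on its own line, with #191# after the first group and #30.5# after the second.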
Too late to the party, and almost the same as what matthias_h did:
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:template match="//field[@type='num']">
<xsl:value-of select="concat(@value,'&#xA;')"/>
</xsl:template>
<xsl:template match="//field[@type='summary']">
<xsl:variable name="prevSumCnt" select="count(preceding::field[@type='summary'])"/>
<xsl:variable name="sum" select="sum(preceding::field[count(preceding::field[@type='summary'])=$prevSumCnt]/@value)"/>
<xsl:value-of select="concat('#',$sum,'#&#xA;')"/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:transform>
The idea is to sum all fields that have the same number of summary fields before them as the current summary field.
I. Here is a simple, forward-only solution -- do note that no reverse axis is used and the time complexity is just O(N); the space complexity is O(1) only if the XSLT processor optimizes tail recursion.
This is probably the simplest and fastest of all presented solutions:
No monstrous complexity or grouping is required at all ...
No variables, no keys (and no space taken for caching key->values), no sum() ...
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/*"><xsl:apply-templates select="*[1]"/></xsl:template>
<xsl:template match="entry[field/@type = 'num']">
<xsl:param name="pAccum" select="0"/>
<xsl:value-of select="concat(field/@value, '&#xA;')"/>
<xsl:apply-templates select="following-sibling::entry[1]">
<xsl:with-param name="pAccum" select="$pAccum+field/@value"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="entry[field/@type = 'summary']">
<xsl:param name="pAccum" select="0"/>
<xsl:value-of select="concat('#', $pAccum, '#&#xA;')"/>
<xsl:apply-templates select="following-sibling::entry[1]"/>
</xsl:template>
</xsl:stylesheet>
This is an example of a streaming transformation -- it doesn't require the complete XML document tree to be present in memory and can be used to process documents of indefinite or infinite length.
When the transformation is applied on the provided source XML document:
<list>
<entry>
<field type="num" value="189.5" />
</entry>
<entry>
<field type="num" value="1.5" />
</entry>
<entry>
<field type="summary" />
</entry>
<entry>
<field type="num" value="9.5" />
</entry>
<entry>
<field type="num" value="11" />
</entry>
<entry>
<field type="num" value="10" />
</entry>
<entry>
<field type="summary" />
</entry>
</list>
the wanted, correct result is produced:
189.5
1.5
#191#
9.5
11
10
#30.5#
II. Update
The transformation above, when run on sufficiently big XML documents with XSLT processors that don't optimize tail recursion, causes a stack overflow due to a long chain of <xsl:apply-templates> calls.
Below is another transformation, which doesn't cause a stack overflow even with extremely big XML documents. Each chain of <xsl:apply-templates> now stops at the next summary entry, so the recursion depth is bounded by the size of a single group. Again, no reverse axes, no keys, no "grouping", no conditional instructions, no count(), no <xsl:variable> ...
And, most importantly, compared with the "efficient", key-based Muenchian grouping, this transformation takes only 61% of the time of the latter when run on an XML document of 105,000 lines:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/*">
<xsl:apply-templates select=
"*[1] | entry[field/@type = 'summary']/following-sibling::*[1]"/>
</xsl:template>
<xsl:template match="entry[field/@type = 'num']">
<xsl:param name="pAccum" select="0"/>
<xsl:value-of select="concat(field/@value, '&#xA;')"/>
<xsl:apply-templates select="following-sibling::entry[1]">
<xsl:with-param name="pAccum" select="$pAccum+field/@value"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="entry[field/@type = 'summary']">
<xsl:param name="pAccum" select="0"/>
<xsl:value-of select="concat('#', $pAccum, '#&#xA;')"/>
</xsl:template>
</xsl:stylesheet>
Additionally, this transformation can be sped up to take less than 50% of the time taken by the Muenchian grouping transformation (that is, to run more than twice as fast) by replacing every element name with just *.
A lesson for us all to learn: A non-key solution sometimes can be more efficient than a key-based one.
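The wildcard variant mentioned above could look like the following. This is a sketch of one reading of "replacing every element name by just *", not the exact stylesheet that was timed. It assumes that only entry elements are children of the document element and that each entry has a single field child.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Start one chain at the first child and one after every summary element -->
  <xsl:template match="/*">
    <xsl:apply-templates select="*[1] | *[*/@type = 'summary']/following-sibling::*[1]"/>
  </xsl:template>

  <!-- Print the number, then pass the running total on to the next sibling -->
  <xsl:template match="*[*/@type = 'num']">
    <xsl:param name="pAccum" select="0"/>
    <xsl:value-of select="concat(*/@value, '&#xA;')"/>
    <xsl:apply-templates select="following-sibling::*[1]">
      <xsl:with-param name="pAccum" select="$pAccum + */@value"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- End of a group: print the accumulated total and stop this chain -->
  <xsl:template match="*[*/@type = 'summary']">
    <xsl:param name="pAccum" select="0"/>
    <xsl:value-of select="concat('#', $pAccum, '#&#xA;')"/>
  </xsl:template>
</xsl:stylesheet>
```

The logic is unchanged from the named-element version; only the name tests are replaced with *, so any speed difference depends on the XSLT processor.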
As an alternative to the grouping suggested in a comment, you could also use match patterns to get the sums:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="field[@type='num']">
<xsl:value-of select="@value"/>
<xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="entry[field[@type='summary']]">
<xsl:variable name="sumCount" select="count(preceding-sibling::entry[field[@type='summary']])"/>
<xsl:text>#</xsl:text>
<xsl:value-of select="sum(preceding-sibling::entry[count(preceding-sibling::entry[field[@type='summary']]) = $sumCount]/field[@type='num']/@value)"/>
<xsl:text>#
</xsl:text>
</xsl:template>
</xsl:transform>
When applied to your input XML, this produces the output:
189.5
1.5
#191#
9.5
11
10
#30.5#
The template matching field[@type='num'] prints the value and adds a newline, and the template matching entry[field[@type='summary']] uses the variable
<xsl:variable name="sumCount" select="count(preceding-sibling::entry[field[@type='summary']])"/>
to count how many previous fields of type summary occurred. Then only the sum of the values of the entries of type num with the same number of preceding summary fields is printed:
<xsl:value-of select="sum(preceding-sibling::entry[
count(preceding-sibling::entry[field[@type='summary']]) = $sumCount
]/field[@type='num']/@value)"/>
Update: To explain in more detail how this works, as requested: in the template matching entry[field[@type='summary']], the variable sumCount counts all previous entries that have a field of type summary:
count(preceding-sibling::entry[field[@type='summary']])
So when the template matches the first summary field, the value of sumCount is 0, and when matching the second summary field, sumCount is 1.
The second line, using the sum function,
sum(
preceding-sibling::entry
[
count(preceding-sibling::entry[field[@type='summary']]) =
$sumCount
]
/field[@type='num']/@value
)
sums field[@type='num']/@value over all preceding entries that have the same number of preceding summary fields as the current summary field:
count(preceding-sibling::entry[field[@type='summary']]) = $sumCount
So when the second summary is matched, only the num fields with the values 9.5, 10 and 11 are summed, as they have the same number of preceding summary fields as the current summary field.
For the num fields with the values 189.5 and 1.5, count(preceding-sibling::entry[field[@type='summary']]) is 0, so these fields are omitted from the sum.