I have an XML grouping challenge: I need to group AND remove duplicates, as shown below:
<Person>
<name>John</name>
<date>June12</date>
<workTime taskID=1>34</workTime>
<workTime taskID=1>35</workTime>
<workTime taskID=2>12</workTime>
</Person>
<Person>
<name>John</name>
<date>June13</date>
<workTime taskID=1>21</workTime>
<workTime taskID=2>11</workTime>
<workTime taskID=2>14</workTime>
</Person>
Note that for a specific occurrence of name/taskID/date, only the first one is picked up. In this example,
<workTime taskID=1>35</workTime>
<workTime taskID=2>14</workTime>
would be left aside.
Below is the expected output:
<Person>
<name>John</name>
<taskID>1</taskID>
<workTime>
<date>June12</date>
<time>34</time>
</workTime>
<workTime>
<date>June13</date>
<time>21</time>
</workTime>
</Person>
<Person>
<name>John</name>
<taskID>2</taskID>
<workTime>
<date>June12</date>
<time>12</time>
</workTime>
<workTime>
<date>June13</date>
<time>11</time>
</workTime>
</Person>
I have tried to use Muenchian grouping in XSLT 1.0 with the key below:
<xsl:key name="PersonTasks" match="workTime" use="concat(@taskID, ../name)"/>
but then how do I pick up only the first occurrence of
concat(@taskID, ../name, ../date)
? It seems that I need two levels of keys!?
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:key name="kwrkTimeByNameTask" match="workTime"
use="concat(../name, '+', @taskID)"/>
<xsl:key name="kDateByName" match="date"
use="../name"/>
<xsl:key name="kwrkTimeByNameTaskDate" match="workTime"
use="concat(../name, '+', @taskID, '+', ../date)"/>
<xsl:template match="/">
<xsl:for-each select=
"*/*/workTime
[generate-id()
=
generate-id(key('kwrkTimeByNameTask',
concat(../name, '+', @taskID)
)[1]
)
]
">
<xsl:sort select="../name"/>
<xsl:sort select="@taskID" data-type="number"/>
<xsl:variable name="vcurTaskId" select="@taskID"/>
<Person>
<name><xsl:value-of select="../name"/></name>
<taskID><xsl:value-of select="@taskID"/></taskID>
<xsl:for-each select=
"key('kDateByName', ../name)
[key('kwrkTimeByNameTaskDate',
concat(../name, '+', current()/@taskID, '+', .)
)
]
">
<workTime>
<date><xsl:value-of select="."/></date>
<time>
<xsl:value-of select=
"key('kwrkTimeByNameTaskDate',
concat(../name, '+', $vcurTaskId, '+', .)
)"/>
</time>
</workTime>
</xsl:for-each>
</Person>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML (corrected from multiple issues to become well-formed):
<t>
<Person>
<name>John</name>
<date>June12</date>
<workTime taskID="1">34</workTime>
<workTime taskID="1">35</workTime>
<workTime taskID="2">12</workTime>
</Person>
<Person>
<name>John</name>
<date>June13</date>
<workTime taskID="1">21</workTime>
<workTime taskID="2">11</workTime>
<workTime taskID="2">14</workTime>
</Person>
</t>
produces the wanted, correct result:
<Person>
<name>John</name>
<taskID>1</taskID>
<workTime>
<date>June12</date>
<time>34</time>
</workTime>
<workTime>
<date>June13</date>
<time>21</time>
</workTime>
</Person>
<Person>
<name>John</name>
<taskID>2</taskID>
<workTime>
<date>June12</date>
<time>12</time>
</workTime>
<workTime>
<date>June13</date>
<time>11</time>
</workTime>
</Person>
Explanation:
First, we obtain all workTime elements with unique pairs of ../name and @taskID, using the Muenchian method for grouping.
We sort these by ../name and then by @taskID, in that order.
For each such workTime, we take all date elements that are listed with the ../name of this workTime, and keep only those date elements for which there is a workTime with the same ../name, the same ../date and the same @taskID.
That last step uses two auxiliary keys: 'kDateByName' indexes all date elements by their ../name, while 'kwrkTimeByNameTaskDate' indexes all workTime elements by their ../name, their @taskID and their ../date.
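To see the first step in isolation, here is a minimal sketch (an illustration only, not part of the stylesheet above) that just lists each distinct name/taskID pair once. The key name kByNameTask and the plain-text output format are assumptions made for this example; the input is assumed to be the well-formed document with the <t> root shown earlier.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <!-- Same composite key as kwrkTimeByNameTask above (name is hypothetical) -->
 <xsl:key name="kByNameTask" match="workTime"
  use="concat(../name, '+', @taskID)"/>

 <xsl:template match="/">
  <!-- Muenchian test: keep a workTime only if it is the first node
       of its name/taskID key group -->
  <xsl:for-each select="*/*/workTime
      [generate-id() =
       generate-id(key('kByNameTask', concat(../name, '+', @taskID))[1])]">
   <!-- One line of text per distinct name/taskID pair -->
   <xsl:value-of select="concat(../name, ' / task ', @taskID, '&#10;')"/>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
```

Applied to that input, it should print one line for John / task 1 and one for John / task 2, since only the first workTime of each key group passes the generate-id() comparison.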
So, the meaning of the following:
<xsl:for-each select=
"key('kDateByName', ../name)
[key('kwrkTimeByNameTaskDate',
concat(../name, '+', current()/@taskID, '+', .)
)
]
">
is:
For each date for that name, such that a workTime for that name, date and @taskID (of the current workTime of the outer <xsl:for-each>) exists, do whatever is in the body of this <xsl:for-each> instruction.
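The predicate works because a node-set used as a boolean is true exactly when it is non-empty. Below is a small self-contained sketch of that existence test. It assumes a hypothetical parameter pTaskId (fixed to '2' here) standing in for the outer loop's current()/@taskID, and a key named kByNameTaskDate that mirrors kwrkTimeByNameTaskDate; neither name comes from the answer above.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <!-- Hypothetical parameter: the task ID to look for -->
 <xsl:param name="pTaskId" select="'2'"/>

 <!-- Index every workTime by name, task ID and date -->
 <xsl:key name="kByNameTaskDate" match="workTime"
  use="concat(../name, '+', @taskID, '+', ../date)"/>

 <xsl:template match="/">
  <!-- The key() call in the predicate returns a node-set;
       the date is kept only when that node-set is non-empty -->
  <xsl:for-each select="*/*/date
      [key('kByNameTaskDate', concat(../name, '+', $pTaskId, '+', .))]">
   <xsl:value-of select="concat(../name, ' ', ., '&#10;')"/>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
```

With the sample input, both dates are listed, because John has at least one workTime for task 2 on June12 and on June13. Passing a task ID that never occurs would produce no output.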
Just for fun, another solution with two keys. This stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="kWorkTimeByName-TaskID" match="workTime"
use="concat(../name,'++',@taskID)"/>
<xsl:key name="kWorkTimeByName-Date-TaskID" match="workTime"
use="concat(../name,'++',../date,'++',@taskID)"/>
<xsl:template match="/">
<xsl:variable name="vAllWorkTime" select="*/*/workTime"/>
<result>
<xsl:for-each select="$vAllWorkTime
[count(.|key('kWorkTimeByName-TaskID',
concat(../name,'++',@taskID))[1])=1]">
<xsl:sort select="../name"/>
<xsl:sort select="@taskID" data-type="number"/>
<Person>
<xsl:copy-of select="../name"/>
<taskID>
<xsl:value-of select="@taskID"/>
</taskID>
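<!-- Iterate only over the current name/taskID group; testing every workTime
would wrongly keep nodes whenever the date-level key lookup is empty -->
<xsl:for-each select="key('kWorkTimeByName-TaskID',
concat(../name,'++',@taskID))
[count(.|key('kWorkTimeByName-Date-TaskID',
concat(../name,'++',../date,'++',@taskID))[1])=1]">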
<xsl:sort select="../date"/>
<xsl:copy>
<xsl:copy-of select="../date"/>
<time>
<xsl:value-of select="."/>
</time>
</xsl:copy>
</xsl:for-each>
</Person>
</xsl:for-each>
</result>
</xsl:template>
</xsl:stylesheet>
Output:
<result>
<Person>
<name>John</name>
<taskID>1</taskID>
<workTime>
<date>June12</date>
<time>34</time>
</workTime>
<workTime>
<date>June13</date>
<time>21</time>
</workTime>
</Person>
<Person>
<name>John</name>
<taskID>2</taskID>
<workTime>
<date>June12</date>
<time>12</time>
</workTime>
<workTime>
<date>June13</date>
<time>11</time>
</workTime>
</Person>
</result>
Grouping in XSLT 1.0 is usually done using a technique called the Muenchian method. More information here: http://www.jenitennison.com/xslt/grouping/muenchian.html
Source: https://stackoverflow.com/questions/3518399/xslt-1-0-grouping-and-removing-duplicate