Here is the sample XML document:
<root>
<node> count the number of words </node>
</root>
For this example I want
Here's an XSLT stylesheet I built based on the answer by Dimitre Novatchev. It counts the words under root/data/value (this is for .NET resource files [.RESX]), but you can easily adapt it.
For the node-set function it uses, see the URL mentioned in the XSLT on how to do the same with EXSLT-enabled processors or others that support this function natively. The msxsl namespace is for .NET/MSXML; you can easily change it to refer to EXSLT, etc.
<?xml version="1.0" encoding="UTF-8"?>
<!--
Filename: ResX_WordCount.xsl
Version: 20141006
-->
<xsl:stylesheet
version="1.0"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:output method="text" indent="yes"/>
<xsl:template match="/root">
<!-- see http://www.xml.com/pub/a/2003/07/16/nodeset.html -->
<xsl:variable name="WordCounts">
<xsl:for-each select="data/value">
<!-- see http://stackoverflow.com/questions/6188189/count-the-number-of-words-in-a-xml-node-using-xsl/ -->
<count>
<xsl:value-of select="string-length(normalize-space(text())) - string-length(translate(normalize-space(text()),' ','')) + 1"/>
</count>
</xsl:for-each>
</xsl:variable>
<xsl:value-of select="sum(msxsl:node-set($WordCounts)/count)"/>
</xsl:template>
</xsl:stylesheet>
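To sanity-check the total outside an XSLT processor, here is a rough Python sketch of the same per-value formula. The sample document and the names `SAMPLE`, `xpath_word_count`, `total_words` and `skip_empty` are made up for illustration, and `str.split()` only approximates normalize-space(). It also shows one thing to watch for: the formula yields 1 for an empty value, so empty `<value/>` elements each add one word unless you filter them out.

```python
import xml.etree.ElementTree as ET

# Hypothetical RESX-like sample; only root/data/value matters here.
SAMPLE = """<root>
  <data name="Greeting"><value>Hello  there world</value></data>
  <data name="Title"><value> Word count </value></data>
  <data name="Empty"><value></value></data>
</root>"""


def xpath_word_count(text):
    """Mirror the XPath formula: spaces in the normalized string, plus one."""
    normalized = " ".join(text.split())  # approximates normalize-space()
    return len(normalized) - len(normalized.replace(" ", "")) + 1


def total_words(xml_source, skip_empty=False):
    """Sum the per-value counts, optionally ignoring empty values."""
    root = ET.fromstring(xml_source)
    total = 0
    for value in root.findall("data/value"):
        text = value.text or ""  # first text node, like text() in the XSLT
        if skip_empty and not text.strip():
            continue
        total += xpath_word_count(text)
    return total


print(total_words(SAMPLE))                   # 6: the empty value counts as 1
print(total_words(SAMPLE, skip_empty=True))  # 5
```

In the stylesheet itself, a predicate such as `data/value[normalize-space()]` on the for-each would play the role of `skip_empty`.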
Use this XPath one-liner:
string-length(normalize-space(node))
-
string-length(translate(normalize-space(node),' ','')) +1
Here is a short verification using XSLT:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/*">
<xsl:value-of select=
" string-length(normalize-space(node))
-
string-length(translate(normalize-space(node),' ','')) +1"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<root>
<node> count the number of words </node>
</root>
the wanted, correct result is produced:
5
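Explanation: normalize-space() trims the string and collapses every run of whitespace into a single space, translate() then deletes those spaces, and the difference between the two string-length() values is the number of separators. Adding 1 gives the number of words.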
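If the arithmetic is not obvious, this small Python sketch walks through it for the sample node. The helpers `normalize_space` and `translate_remove` are stand-ins written for this illustration and only approximate the XPath functions of the same purpose.

```python
def normalize_space(s):
    # Approximates XPath normalize-space(): trim, collapse whitespace runs.
    return " ".join(s.split())


def translate_remove(s, chars):
    # Approximates translate(s, chars, ''): delete every listed character.
    return s.translate({ord(c): None for c in chars})


text = " count the number of words "
normalized = normalize_space(text)                  # 'count the number of words'
without_spaces = translate_remove(normalized, " ")  # 'countthenumberofwords'
spaces = len(normalized) - len(without_spaces)      # 25 - 21 = 4 separators

print(len(normalized), len(without_spaces), spaces + 1)  # 25 21 5
```

Four separators between words means five words, which is the 5 reported above.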
Update1:
The OP asked:
"Your (Dimitre Novatchev) code is working fine for the above XML. Will your code work for the following XML?"
<root>
<test>
<node> pass pass </node>
</test>
<test>
<node> fail pass fail </node>
</test>
<test>
<node> pass pass fail </node>
</test>
</root>
Answer: The same approach can be used:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:value-of select=
"string-length(normalize-space(.))
-
string-length(translate(normalize-space(.),' ','')) +1
"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is used on the newly-provided XML document (above), the wanted correct answer is produced:
8
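Why one expression over the whole document works: the string value of the root is the concatenation of all text nodes, so normalizing it produces one long space-separated list of words. A quick Python sketch of that idea (the variable names are mine, and `itertext()` plus `split()` only approximate the XPath string value and normalize-space()):

```python
import xml.etree.ElementTree as ET

DOC = (
    "<root> <test> <node> pass pass </node> </test> "
    "<test> <node> fail pass fail </node> </test> "
    "<test> <node> pass pass fail </node> </test> </root>"
)

root = ET.fromstring(DOC)
string_value = "".join(root.itertext())       # all text nodes, concatenated
normalized = " ".join(string_value.split())   # approximates normalize-space(.)
total = len(normalized) - len(normalized.replace(" ", "")) + 1

print(normalized)  # pass pass fail pass fail pass pass fail
print(total)       # 8
```

Note that this relies on whitespace separating the text of adjacent nodes, as it does here. If two nodes held `pass` and `fail` with no whitespace at all between them, the concatenated value would fuse them into one word.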
Update2: The OP then asked in a comment:
"Can I have a comparison of the words in the node with some default word? Consider a node that contains the value "pass pass fail". I want to calculate the number of pass and the number of fail, like pass=2 fail=1. Is it possible? Help me man"
Answer:
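The same approach works for this modification of the problem, too. Note that it counts occurrences of the letters p and f, which gives the right numbers here only because "pass" contains exactly one p and no f, and "fail" exactly one f and no p. In the general case you need proper tokenization (please ask about that in a new question):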
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node">
pass: <xsl:value-of select=
"string-length()
-
string-length(translate(.,'p',''))
"/>
<xsl:text/> fail: <xsl:value-of select=
"string-length()
-
string-length(translate(.,'f',''))
"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the last XML document (above), the wanted, correct result is produced:
pass: 2 fail: 0
pass: 1 fail: 2
pass: 2 fail: 1
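To see where counting letters stops being safe, compare it with counting whole tokens. This is only a Python sketch: `letter_count` and `token_count` are names invented here, and the value " pass skipped " is a made-up input that is not part of the question.

```python
def letter_count(text, letter):
    # Mirrors string-length(.) - string-length(translate(., letter, ''))
    return len(text) - len(text.replace(letter, ""))


def token_count(text, word):
    # Counts whole whitespace-separated tokens instead of single letters.
    return text.split().count(word)


for text in [" fail pass fail ", " pass skipped "]:
    print(
        repr(text),
        "letters p:", letter_count(text, "p"),
        "tokens pass:", token_count(text, "pass"),
    )
```

For the node from the question both counts agree (1 and 1). For the made-up value the letter count reports three p's while there is still only one pass token, which is why a real tokenizer is needed once other words can appear.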
In XSLT I think you would need to process the text to remove any double spacing and then count the remaining spaces to find the answer, although I'm sure there are better ways!
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="root">
<xsl:for-each select="node">
<xsl:call-template name="word-count">
<xsl:with-param name="data" select="normalize-space(.)"/>
<xsl:with-param name="num" select="1"/>
</xsl:call-template>
</xsl:for-each>
</xsl:template>
<xsl:template name="word-count">
<xsl:param name="data"/>
<xsl:param name="num"/>
<xsl:variable name="newdata" select="$data"/>
<xsl:variable name="remaining" select="substring-after($newdata,' ')"/>
<xsl:choose>
<xsl:when test="$remaining">
<xsl:call-template name="word-count">
<xsl:with-param name="data" select="$remaining"/>
<xsl:with-param name="num" select="$num+1"/>
</xsl:call-template>
</xsl:when>
<!-- Test the data itself: $num = 1 is also true for a node holding a single word. -->
<xsl:when test="not($data)">
no words...
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$num"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
This example code works; I amended it from a stylesheet I had that was processing some legacy code into useful HTML output!
Updated the code to make it more robust: it now copes with duplicate whitespace and also catches empty nodes :>
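For anyone who finds recursive templates hard to follow, here is roughly the same recursion as a Python sketch (the function name and the sample inputs are mine, not part of the stylesheet). Each call drops the first word with the equivalent of substring-after and bumps the counter. A one-word string and an empty string both hit the base case on the first call, so the sketch tells them apart by looking at the data rather than at the counter.

```python
def word_count(data, num=1):
    """Count words recursively; `data` must already be whitespace-normalized."""
    if not data:
        return 0  # empty node: no words at all
    # partition(" ") gives the text after the first space,
    # like substring-after($data, ' ') in the template.
    _, _, remaining = data.partition(" ")
    if remaining:
        return word_count(remaining, num + 1)
    return num  # last word reached


for raw in [" count the number   of words ", "single", "   "]:
    normalized = " ".join(raw.split())  # approximates normalize-space(.)
    print(repr(raw), word_count(normalized))
```

This prints 5, 1 and 0 for the three inputs.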
Updated to solve the additional problem (counting pass and fail per node)!
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="root">
<xsl:for-each select="test/node">
<xsl:call-template name="word-count">
<xsl:with-param name="data" select="normalize-space(.)"/>
<xsl:with-param name="num" select="1"/>
<xsl:with-param name="pass" select="0"/>
<xsl:with-param name="fail" select="0"/>
</xsl:call-template>
</xsl:for-each>
</xsl:template>
<xsl:template name="word-count">
<xsl:param name="data"/>
<xsl:param name="num"/>
<xsl:param name="fail"/>
<xsl:param name="pass"/>
<xsl:variable name="newdata" select="$data"/>
<xsl:variable name="first">
<xsl:choose>
<xsl:when test="substring-before($newdata,' ')">
<xsl:value-of select="substring-before($newdata,' ')"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$newdata"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:variable name="remaining" select="substring-after($newdata,' ')"/>
<xsl:variable name="newpass">
<xsl:choose>
<xsl:when test="$first='pass'">
<xsl:value-of select="$pass+1"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$pass"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:variable name="newfail">
<xsl:choose>
<xsl:when test="$first='fail'">
<xsl:value-of select="$fail+1"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$fail"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:choose>
<xsl:when test="$remaining">
<xsl:call-template name="word-count">
<xsl:with-param name="data" select="$remaining"/>
<xsl:with-param name="num" select="$num+1"/>
<xsl:with-param name="pass" select="$newpass"/>
<xsl:with-param name="fail" select="$newfail"/>
</xsl:call-template>
</xsl:when>
<!-- Test the data itself: $num = 1 is also true for a node holding a single word. -->
<xsl:when test="not($data)">
it was empty
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$first"/>
wordcount:<xsl:value-of select="$num"/>
pass:<xsl:value-of select="$newpass"/>
fail:<xsl:value-of select="$newfail"/><br/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>