Using XSLT 2.0, I'm trying to convert all-uppercase text so that only the first letter of the text in each node is uppercase. There are a large number of possible child elements.
The spaces in your text made this an interesting problem. To match all text() nodes below 'head', use an XPath expression to look at the ancestor.
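Here, I tokenize the string on whitespace, then loop through the resulting tokens, changing the first character of each to uppercase and the remaining characters to lowercase. xsl:value-of then joins the tokens with a single space, which is its default separator in XSLT 2.0 when the select attribute is used.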
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()[ ancestor::head ]">
<xsl:value-of select="
for $str in tokenize( ., '\s' )
return concat( upper-case(substring($str,1,1)),
lower-case(substring($str,2)) )"/>
</xsl:template>
</xsl:stylesheet>
This might help you get part of the way.
<xsl:template match="/">
<xsl:apply-templates select="node()" mode="firstup"/>
</xsl:template>
<xsl:template match="text()" mode="firstup">
<!--<x>-->
<xsl:value-of select="concat(upper-case(substring(.,1,1)),lower-case(substring(.,2)))"/>
<!--</x>-->
</xsl:template>
I'm not sure about the third "BLAH", though: this text() node starts with a space, so getting the capitalisation of sibling text nodes right will be harder. Uncomment the "x" element to see this. You might also want to look at normalizing spaces (normalize-space()) and the position() function to get further.
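A possible sketch for the leading-space case, assuming XSLT 2.0 and that "first letter" means the first non-whitespace character of each text node. It swaps the text() template above for one that uses xsl:analyze-string with three capture groups; the regex and the use of the "s" flag are my own choices, not something from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <xsl:apply-templates select="node()" mode="firstup"/>
  </xsl:template>

  <!-- Elements fall through to the built-in rules in this mode,
       so only the (re-cased) text is written to the output. -->
  <xsl:template match="text()" mode="firstup">
    <!-- group 1: leading whitespace (possibly empty)
         group 2: first non-whitespace character
         group 3: everything after it; flags="s" lets "." match newlines -->
    <xsl:analyze-string select="." regex="^(\s*)(\S)(.*)$" flags="s">
      <xsl:matching-substring>
        <xsl:value-of select="concat(regex-group(1),
                                     upper-case(regex-group(2)),
                                     lower-case(regex-group(3)))"/>
      </xsl:matching-substring>
      <!-- Whitespace-only text nodes never match and are copied as-is. -->
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

This only capitalises the first word of each text node. If a node can hold several words that each need a capital, a word-by-word approach is needed instead.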
Try this. I haven't tested it and you might have to tweak it a bit.
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
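<!-- xsl:apply-templates may only contain xsl:sort and xsl:with-param, so the
capitalisation lives in its own template for the text nodes under head.
The identity template above already copies head and its child elements. -->
<xsl:template match="head//text()">
<xsl:value-of select="concat(upper-case(substring(.,1,1)),lower-case(substring(.,2)))"/>
</xsl:template>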
This transformation produces the wanted result regardless of the punctuation that delimits the words:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="head//text()">
<xsl:analyze-string select="." regex="\p{{L}}+">
<xsl:matching-substring>
<xsl:value-of select=
"concat(upper-case(substring(.,1,1)), lower-case(substring(.,2)))"/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
When applied on the provided XML document:
<text> text text text
<head>BLAH <unkownTag>BLAH</unkownTag> BLAH </head>
</text>
the wanted, correct result is produced:
<text> text text text
<head>Blah <unkownTag>Blah</unkownTag> Blah </head>
</text>
When applied on this XML document:
<text> text text text
<head>BLAH$<unkownTag>BLAH</unkownTag>-BLAH;</head>
</text>
again the correct result is produced:
<text> text text text
<head>Blah$<unkownTag>Blah</unkownTag>-Blah;</head>
</text>
Explanation:
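The <xsl:analyze-string> instruction splits each text node into the substrings that match the regex and the substrings that do not.
The \p{L} character class matches any Unicode letter, so a word is any run of letters and every other character acts as a delimiter. It is written \p{{L}} in the regex attribute because that attribute is an attribute value template, in which literal curly braces must be doubled.
<xsl:matching-substring> capitalizes each word, while <xsl:non-matching-substring> copies the delimiters through unchanged.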
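If the same capitalisation is needed in more than one place, the technique can be wrapped in a stylesheet function. This is only a sketch, assuming XSLT 2.0: the my: prefix, the urn:example:my namespace and the function name my:title-case are made up for illustration and are not part of the answer above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="xs my">

  <xsl:output omit-xml-declaration="yes"/>

  <!-- Hypothetical helper: upper-cases the first letter of every run of
       letters and lower-cases the rest; other characters are kept as-is. -->
  <xsl:function name="my:title-case" as="xs:string">
    <xsl:param name="input" as="xs:string?"/>
    <xsl:variable name="parts" as="xs:string*">
      <!-- string() turns an empty sequence into '' before analysis;
           braces are doubled because regex is an attribute value template -->
      <xsl:analyze-string select="string($input)" regex="\p{{L}}+">
        <xsl:matching-substring>
          <xsl:sequence select="concat(upper-case(substring(., 1, 1)),
                                       lower-case(substring(., 2)))"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:sequence select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <!-- Reassemble the pieces without inserting any separator. -->
    <xsl:sequence select="string-join($parts, '')"/>
  </xsl:function>

  <!-- Identity template: copy everything unchanged by default. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Re-case only the text nodes that sit somewhere under head. -->
  <xsl:template match="head//text()">
    <xsl:value-of select="my:title-case(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

The function form keeps the match pattern separate from the string handling, so the same helper could also be called on attribute values or variables.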