Given the following XML file, whose structure and contents can change:
<child>Bird is the word 1.</child>
<child>Curd is the word 2.</child>
<child>Nerd is the word 3.</child>
<child>Bird is the word 4.</child>
<child>Word is the word 5.</child>
<child>Bird is the word 6.</child>
I would like a way to use XQuery (or even XSLT) to replace all instances of a supplied string with another. For example, replace the word "Bird" with "Dog". The results would then be:
<child>Dog is the word 1.</child>
<child>Curd is the word 2.</child>
<child>Nerd is the word 3.</child>
<child>Dog is the word 4.</child>
<child>Word is the word 5.</child>
<child>Dog is the word 6.</child>
I have no idea if this is even possible. Every attempt I have made has eliminated the tags. I have even tried this example (http://geekswithblogs.net/Erik/archive/2008/04/01/120915.aspx), but it works on a text string, not an entire document.
Please help!
I tried running with the xslt 2.0 suggestion as it seemed to fit the best. While attempting to modify it for my case, I keep coming up dry.
I want to pass in an XML parameter to define the replacements. So I modified the XSLT like this:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:param name="list">
    <words>
      <word><search>Bird</search><replace>Dog</replace></word>
      <word><search>word</search><replace>man</replace></word>
    </words>
  </xsl:param>
  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:param name="chosen" select="." />
    <xsl:for-each select="$list//word">
      <xsl:variable name="search"><xsl:value-of select="search" /></xsl:variable>
      <xsl:analyze-string select="$chosen" regex="{$search}">
        <xsl:matching-substring><xsl:value-of select="replace" /></xsl:matching-substring>
        <xsl:non-matching-substring><xsl:value-of select="$chosen"/></xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
The results are:
<child>Bird is the word 1.Bird is the word 1.</child>
<child>Curd is the word 2.Curd is the word 2.</child>
<child>Nerd is the word 3.Nerd is the word 3.</child>
<child>Bird is the word 4.Bird is the word 4.</child>
<child>Word is the word 5.Word is the word 5.</child>
<child>Bird is the word 6.Bird is the word 6.</child>
Needless to say, I don't want the text duplicated, and the output is still incorrect.
Please Help!
If both XQuery and XSLT are an option, you're probably using an XSLT 2.0 processor. If so, this should work:
XSLT 2.0
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:param name="search" select="'Bird'"/>
  <xsl:param name="replace" select="'Dog'"/>
  <!-- Copy elements, attributes, comments and processing instructions unchanged -->
  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Only text nodes are rewritten: matches become $replace, the rest is kept -->
  <xsl:template match="text()">
    <xsl:analyze-string select="." regex="{$search}">
      <xsl:matching-substring><xsl:value-of select="$replace"/></xsl:matching-substring>
      <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
Using the XML input from the question, this XSLT produces the following output:
<child>Dog is the word 1.</child>
<child>Curd is the word 2.</child>
<child>Nerd is the word 3.</child>
<child>Dog is the word 4.</child>
<child>Word is the word 5.</child>
<child>Dog is the word 6.</child>
Note: No elements/attributes/comments/processing-instructions would be altered in the creation of the output.
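One caveat: the regex attribute treats $search as a regular expression, so a search string containing characters such as . or ( would not be matched literally. Here is a minimal sketch of one workaround, assuming literal matching is what you want: backslash-escape the metacharacters before using the string as a regex. The variable name search-literal and the sample parameter values are my own, for illustration only; the escaping pattern follows the widely used FunctX escape-for-regex idiom.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Illustrative values: the search string contains a regex metacharacter (.) -->
  <xsl:param name="search" select="'word 1.'"/>
  <xsl:param name="replace" select="'WORD ONE!'"/>

  <!-- Hypothetical helper variable: put a backslash in front of every regex
       metacharacter so that $search is matched as a literal string -->
  <xsl:variable name="search-literal"
      select="replace($search, '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))', '\\$1')"/>

  <!-- Copy everything that is not a text node unchanged -->
  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Same text() template as above, but using the escaped search string -->
  <xsl:template match="text()">
    <xsl:analyze-string select="." regex="{$search-literal}">
      <xsl:matching-substring><xsl:value-of select="$replace"/></xsl:matching-substring>
      <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Without the escaping step, the . in the search string would match any character rather than only a full stop.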
The reason you're getting duplicates is that your xsl:for-each is looping over the two word elements. If you had 3, it would output the text 3 times.
You just need to build the regex a little differently:
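<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:param name="list">
    <words>
      <word><search>Bird</search><replace>Dog</replace></word>
      <word><search>word</search><replace>man</replace></word>
    </words>
  </xsl:param>
  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:variable name="search" select="concat('(',string-join($list/words/word/search,'|'),')')"/>
    <xsl:analyze-string select="." regex="{$search}">
      <xsl:matching-substring><xsl:value-of select="$list/words/word[search=current()]/replace"/></xsl:matching-substring>
      <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>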
This will produce:
<child>Dog is the man 1.</child>
<child>Curd is the man 2.</child>
<child>Nerd is the man 3.</child>
<child>Dog is the man 4.</child>
<child>Word is the man 5.</child>
<child>Dog is the man 6.</child>
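If the replacement list should live outside the stylesheet, as the question asks, one option is to load it with document(). This is only a sketch under assumptions: words.xml is a hypothetical file name, assumed to sit next to the stylesheet and to contain the same words/word/search/replace structure as the inline parameter above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumption: words.xml is a hypothetical file in the stylesheet's directory:
       <words>
         <word><search>Bird</search><replace>Dog</replace></word>
         <word><search>word</search><replace>man</replace></word>
       </words>
       A relative URI passed to document() is resolved against the stylesheet. -->
  <xsl:param name="list" select="document('words.xml')"/>

  <!-- Copy everything that is not a text node unchanged -->
  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Build one alternation regex from all search terms, then look up the
       replacement that belongs to whichever term matched -->
  <xsl:template match="text()">
    <xsl:variable name="search" select="concat('(',string-join($list/words/word/search,'|'),')')"/>
    <xsl:analyze-string select="." regex="{$search}">
      <xsl:matching-substring><xsl:value-of select="$list/words/word[search=current()]/replace"/></xsl:matching-substring>
      <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Because $list is still a declared parameter, a calling application can override it with another document if the processor's API allows passing nodes as parameter values.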
This should do it:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:param name="findText" select="'Bird'" />
  <xsl:param name="replaceText" select="'Dog'" />
  <!-- Identity template: copy every node and attribute as-is -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Text nodes are passed through the recursive replace template -->
  <xsl:template match="text()">
    <xsl:call-template name="string-replace-all">
      <xsl:with-param name="text" select="." />
      <xsl:with-param name="replace" select="$findText" />
      <xsl:with-param name="by" select="$replaceText" />
    </xsl:call-template>
  </xsl:template>
  <!-- Output the text before the first match, then $by, then recurse on the remainder -->
  <xsl:template name="string-replace-all">
    <xsl:param name="text" />
    <xsl:param name="replace" />
    <xsl:param name="by" />
    <xsl:choose>
      <xsl:when test="contains($text, $replace)">
        <xsl:value-of select="substring-before($text,$replace)" />
        <xsl:value-of select="$by" />
        <xsl:call-template name="string-replace-all">
          <xsl:with-param name="text"
                          select="substring-after($text,$replace)" />
          <xsl:with-param name="replace" select="$replace" />
          <xsl:with-param name="by" select="$by" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
Note that I have specified 'Bird' and 'Dog' as default values for the parameters so I can easily demonstrate the result, but it should be possible to pass in values for these parameters from external code. When run on your sample input, this produces:
<child>Dog is the word 1.</child>
<child>Curd is the word 2.</child>
<child>Nerd is the word 3.</child>
<child>Dog is the word 4.</child>
<child>Word is the word 5.</child>
<child>Dog is the word 6.</child>
I think the trick is to understand that the document model is different from string parsing. Once you have that, this use case is easy enough in either XQuery or XSLT; which one you prefer is a matter of taste. Here is a crude approach in XQuery. A more refined solution might use recursive function calls, à la http://docs.marklogic.com/4.1/guide/app-dev/typeswitch
let $in := <something>
  <parent>
    <child>Bird is the word 1.</child>
    <child>Curd is the word 2.</child>
    <child>Nerd is the word 3.</child>
    <child>Bird is the word 4.</child>
    <child>Word is the word 5.</child>
    <child>Bird is the word 6.</child>
  </parent>
</something>
return element { node-name($in) } {
  for $n in $in/node()
  return typeswitch($n)
    case element(parent) return element { node-name($n) } {
      for $c in $n/node()
      return typeswitch($c)
        case element(child) return element { node-name($c) } {
          replace($c, 'Bird', 'Dog') }
        default return $c }
    default return $n }
Here's another XQuery option...
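declare function local:searchReplace($element as element()) as element() {
  element {node-name($element)} { $element/@*,
    for $child in $element/node()
    return if ($child instance of element()) then local:searchReplace($child)
           else if ($child instance of text()) then text { replace($child, 'Bird', 'Dog') }
           else $child }
};
local:searchReplace(/*)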
This also produces the same output as my XSLT 2.0 answer:
<child>Dog is the word 1.</child>
<child>Curd is the word 2.</child>
<child>Nerd is the word 3.</child>
<child>Dog is the word 4.</child>
<child>Word is the word 5.</child>
<child>Dog is the word 6.</child>