Replacing strings in various XML files

Submitted by 烂漫一生 on 2020-01-20 08:31:52

Question


Given the following XML file, whose structure and contents can change:

<something>
  <parent>
    <child>Bird is the word 1.</child>
    <child>Curd is the word 2.</child>
    <child>Nerd is the word 3.</child>
  </parent>
  <parent>
    <child>Bird is the word 4.</child>
    <child>Word is the word 5.</child>
    <child>Bird is the word 6.</child>
  </parent>
</something>

I would like a way to use XQuery (or even XSLT) to replace all instances of a supplied string with another. For example, replace the word "Bird" with "Dog". The result would then be:

<something>
  <parent>
    <child>Dog is the word 1.</child>
    <child>Curd is the word 2.</child>
    <child>Nerd is the word 3.</child>
  </parent>
  <parent>
    <child>Dog is the word 4.</child>
    <child>Word is the word 5.</child>
    <child>Dog is the word 6.</child>
  </parent>
</something>

I have no idea if this is even possible. Every attempt I have made has eliminated the tags. I have even tried this example (http://geekswithblogs.net/Erik/archive/2008/04/01/120915.aspx), but it works on a text string, not an entire document.

Please help!

UPDATE

I tried running with the XSLT 2.0 suggestion, as it seemed to fit best. While attempting to modify it for my case, I keep coming up dry.

I want to pass in an XML parameter to define the replacements. So I modified the XSLT like this:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:param name="list">
    <words>
      <word>
        <search>Bird</search>
        <replace>Dog</replace>
      </word>
      <word>
        <search>word</search>
        <replace>man</replace>
      </word>
    </words>
  </xsl:param>

  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:param name="chosen" select="." />
    <xsl:for-each select="$list//word">
      <xsl:variable name="search"><xsl:value-of select="search" /></xsl:variable>
      <xsl:analyze-string select="$chosen" regex="{$search}">
        <xsl:matching-substring><xsl:value-of select="replace" /></xsl:matching-substring>
        <xsl:non-matching-substring><xsl:value-of select="$chosen"/></xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

The results are:

<something>
  <parent>
    <child>Bird is the word 1.Bird is the word 1.</child>
    <child>Curd is the word 2.Curd is the word 2.</child>
    <child>Nerd is the word 3.Nerd is the word 3.</child>
  </parent>
  <parent>
    <child>Bird is the word 4.Bird is the word 4.</child>
    <child>Word is the word 5.Word is the word 5.</child>
    <child>Bird is the word 6.Bird is the word 6.</child>
  </parent>
</something>

Needless to say, I don't want the text duplicated, and the replacements are not applied either.

Please Help!


Answer 1:


If both XQuery and XSLT are options, you're probably using an XSLT 2.0 processor. If so, this should work:

XSLT 2.0

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:param name="search" select="'Bird'"/>
    <xsl:param name="replace" select="'Dog'"/>

    <xsl:template match="@*|*|comment()|processing-instruction()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="text()">
        <xsl:analyze-string select="." regex="{$search}">
            <xsl:matching-substring><xsl:value-of select="$replace"/></xsl:matching-substring>
            <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>

Using the XML input from the question, this XSLT produces the following output:

<something>
   <parent>
      <child>Dog is the word 1.</child>
      <child>Curd is the word 2.</child>
      <child>Nerd is the word 3.</child>
   </parent>
   <parent>
      <child>Dog is the word 4.</child>
      <child>Word is the word 5.</child>
      <child>Dog is the word 6.</child>
   </parent>
</something>

Note: Only text nodes are rewritten. No elements, attributes, comments, or processing instructions are altered in the creation of the output.
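
If attribute values should get the same replacement, one possible extension is sketched below. This is an assumption about a requirement the question does not state, not part of the original answer. It adds a template for `@*` with an explicit priority so that it wins over the identity template, and rebuilds each attribute with the same xsl:analyze-string logic. As in the stylesheet above, $search is treated as a regular expression.

```xslt
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:param name="search" select="'Bird'"/>
    <xsl:param name="replace" select="'Dog'"/>

    <!-- Identity template, unchanged from the stylesheet above -->
    <xsl:template match="@*|*|comment()|processing-instruction()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Assumed extension: recreate each attribute under the same name and
         namespace, with the replacement applied to its value.
         priority="1" overrides the identity template for attributes. -->
    <xsl:template match="@*" priority="1">
        <xsl:attribute name="{name()}" namespace="{namespace-uri()}">
            <xsl:analyze-string select="." regex="{$search}">
                <xsl:matching-substring><xsl:value-of select="$replace"/></xsl:matching-substring>
                <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
            </xsl:analyze-string>
        </xsl:attribute>
    </xsl:template>

    <!-- Text nodes, unchanged from the stylesheet above -->
    <xsl:template match="text()">
        <xsl:analyze-string select="." regex="{$search}">
            <xsl:matching-substring><xsl:value-of select="$replace"/></xsl:matching-substring>
            <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>
```

If attributes should stay untouched, leave the extra template out; the original stylesheet already copies them as-is.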


EDIT

The reason you're getting duplicates is that your xsl:for-each loops over the two word elements and outputs the text once per iteration. If you had 3, it would output the text 3 times.

You just need to build the regex a little differently:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:param name="list">
        <words>
            <word>
                <search>Bird</search>
                <replace>Dog</replace>
            </word>
            <word>
                <search>word</search>
                <replace>man</replace>
            </word>
        </words>
    </xsl:param>

    <xsl:template match="@*|*|comment()|processing-instruction()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="text()">
        <xsl:variable name="search" select="concat('(',string-join($list/words/word/search,'|'),')')"/>
        <xsl:analyze-string select="." regex="{$search}">
            <xsl:matching-substring>
                <xsl:value-of select="$list/words/word[search=current()]/replace"/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>

This will produce:

<something>
   <parent>
      <child>Dog is the man 1.</child>
      <child>Curd is the man 2.</child>
      <child>Nerd is the man 3.</child>
   </parent>
   <parent>
      <child>Dog is the man 4.</child>
      <child>Word is the man 5.</child>
      <child>Dog is the man 6.</child>
   </parent>
</something>
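
One caveat: each search value is passed to xsl:analyze-string as a regular expression, so a term containing characters such as `.` or `(` would not match literally. The sketch below shows one way to escape the terms before joining them. The function name my:escape-regex, the namespace URI urn:example:local, and the extra "." to "!" entry are hypothetical and only for illustration; it also assumes $list contains at least one non-empty search term, since an empty regex is an error.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:local"
    exclude-result-prefixes="xs my">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:param name="list">
        <words>
            <word>
                <search>Bird</search>
                <replace>Dog</replace>
            </word>
            <!-- Illustrative entry: "." is a regex metacharacter -->
            <word>
                <search>.</search>
                <replace>!</replace>
            </word>
        </words>
    </xsl:param>

    <!-- Hypothetical helper: put a backslash in front of every regex
         metacharacter so the term is matched as a literal string -->
    <xsl:function name="my:escape-regex" as="xs:string">
        <xsl:param name="s" as="xs:string"/>
        <xsl:sequence select="replace($s, '([\\\|\.\-\^\?\*\+\{\}\(\)\[\]\$])', '\\$1')"/>
    </xsl:function>

    <xsl:template match="@*|*|comment()|processing-instruction()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="text()">
        <!-- Join the escaped terms into one alternation, e.g. Bird|\. -->
        <xsl:variable name="regex"
            select="string-join(for $s in $list/words/word/search
                                return my:escape-regex($s), '|')"/>
        <xsl:analyze-string select="." regex="{$regex}">
            <xsl:matching-substring>
                <!-- current() is the matched text, which equals the
                     unescaped search term -->
                <xsl:value-of select="$list/words/word[search=current()]/replace"/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>
```

Without the escaping step, the "." entry would match every single character, and the lookup in the matching branch would find a replacement only for literal periods, so most of the text would be dropped.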



Answer 2:


This should do it:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:param name="findText" select="'Bird'" />
  <xsl:param name="replaceText" select="'Dog'" />

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:call-template name="string-replace-all">
      <xsl:with-param name="text" select="." />
      <xsl:with-param name="replace" select="$findText" />
      <xsl:with-param name="by" select="$replaceText" />
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="string-replace-all">
    <xsl:param name="text" />
    <xsl:param name="replace" />
    <xsl:param name="by" />
    <xsl:choose>
      <xsl:when test="contains($text, $replace)">
        <xsl:value-of select="substring-before($text,$replace)" />
        <xsl:value-of select="$by" />
        <xsl:call-template name="string-replace-all">
          <xsl:with-param name="text"
          select="substring-after($text,$replace)" />
          <xsl:with-param name="replace" select="$replace" />
          <xsl:with-param name="by" select="$by" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

Note that I have specified 'Bird' and 'Dog' as default values for the parameters so I can easily demonstrate the result, but it should be possible to pass in values for these parameters from external code. When run on your sample input, this produces:

<something>
  <parent>
    <child>Dog is the word 1.</child>
    <child>Curd is the word 2.</child>
    <child>Nerd is the word 3.</child>
  </parent>
  <parent>
    <child>Dog is the word 4.</child>
    <child>Word is the word 5.</child>
    <child>Dog is the word 6.</child>
  </parent>
</something>



Answer 3:


I think the trick is to understand that the document model is different from string parsing. Once you have that, this use case is easy enough in either XQuery or XSLT; which one you prefer is a matter of taste. Here is a crude approach in XQuery, which hard-codes the parent and child element names from the sample. A more refined solution might use recursive function calls, à la http://docs.marklogic.com/4.1/guide/app-dev/typeswitch

let $in := <something>
  <parent>
    <child>Bird is the word 1.</child>
    <child>Curd is the word 2.</child>
    <child>Nerd is the word 3.</child>
  </parent>
  <parent>
    <child>Bird is the word 4.</child>
    <child>Word is the word 5.</child>
    <child>Bird is the word 6.</child>
  </parent>
</something>
return element { node-name($in) } {
  $in/@*,
  for $n in $in/node()
  return typeswitch($n)
  case element(parent) return element { node-name($n) } {
    for $c in $n/node()
    return typeswitch($c)
    case element(child) return element { node-name($c) } {
      replace($c, 'Bird', 'Dog') }
    default return $c }
  default return $n }



Answer 4:


Here's another XQuery option...

declare function local:searchReplace($element as element()) {
  element {node-name($element)}
    {$element/@*,
     for $child in $element/node()
        return 
            if ($child instance of element())
            then
                local:searchReplace($child)
            else 
                replace($child,'Bird','Dog')
    }
};

local:searchReplace(/*)

This also produces the same output as my XSLT 2.0 answer:

<something>
      <parent>
            <child>Dog is the word 1.</child>
            <child>Curd is the word 2.</child>
            <child>Nerd is the word 3.</child>
      </parent>
      <parent>
            <child>Dog is the word 4.</child>
            <child>Word is the word 5.</child>
            <child>Dog is the word 6.</child>
      </parent>
</something>


Source: https://stackoverflow.com/questions/14596328/replacing-strings-in-various-xml-files
