Use XSLT to mark up text matching regex?

梦谈多话 2021-01-20 00:13

I am trying to use XSLT 2.0 (Saxon-PE 9.6) on an HTML document to create tags that surround all contiguous runs of characters from a specified non-Latin Unicode block (space

3 Answers
  • 2021-01-20 00:39

    Complementing the previous answers, you might like to note that you can write \p{IsDevanagari} in place of [ऀ-ॿ]
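
    A minimal sketch of that substitution, assuming the same `$textValue` variable used in the other answers. Because `regex` is an attribute value template, the curly braces have to be doubled. The capturing groups are dropped here, since the whole match is available as `.`:

    ```xml
    <!-- Sketch only: $textValue is assumed to hold the text to scan. -->
    <xsl:analyze-string select="$textValue"
                        regex="\p{{IsDevanagari}}+(\s+\p{{IsDevanagari}}+)*">
        <xsl:matching-substring>
            <!-- "." is the whole matched run, inner whitespace included -->
            <span xml:lang="hi-Deva"><xsl:value-of select="."/></span>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
            <!-- everything else is passed through unchanged -->
            <xsl:value-of select="."/>
        </xsl:non-matching-substring>
    </xsl:analyze-string>
    ```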

  • 2021-01-20 00:40

    This should work (some comments after the code):

    XSLT 2.0

    <xsl:analyze-string select="$textValue" regex="([&#x0900;-&#x097f;]+)((\s+[&#x0900;-&#x097f;]+)*)">
        <xsl:matching-substring>
            <span xml:lang="hi-Deva"><xsl:value-of select="regex-group(1)"/><xsl:value-of select="regex-group(2)"/></span>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
            <xsl:value-of select="."/>
        </xsl:non-matching-substring>
    </xsl:analyze-string>
    
    • the regex is the one from your second try (as it was correctly matching only the Hindi text fragments!), just with parentheses around the first part
    • the matching-substring branch puts the span around the Hindi text
    • the non-matching-substring branch just returns the unmodified "normal" text substring (you were returning the whole text!)
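
    As an illustration, here is how a made-up sample string (not taken from the question) would be split, assuming `$textValue` holds it:

    ```xml
    <!-- Hypothetical input text -->
    Hello नमस्ते दुनिया world

    <!-- Result of the xsl:analyze-string above -->
    Hello <span xml:lang="hi-Deva">नमस्ते दुनिया</span> world
    ```

    The two Devanagari words and the space between them form a single match, so they end up in one span. The surrounding Latin text goes through the non-matching branch.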
  • 2021-01-20 01:00

    I came up with http://xsltransform.net/jyH9rMo which just does

    <?xml version="1.0" encoding="UTF-8" ?>
    <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
        <xsl:output method="html" doctype-public="XSLT-compat" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
    
        <xsl:template match="/">
          <html>
            <head>
              <title>New Version!</title>
            </head>
            <xsl:apply-templates/>
          </html>
        </xsl:template>
    
        <xsl:template match="@*|node()">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:template>
    
        <xsl:template match="text()">
            <xsl:analyze-string select="." regex="([&#x0900;-&#x097f;]+)((\s+[&#x0900;-&#x097f;]+)*)">
                <xsl:matching-substring>
                    <span xml:lang="hi-Deva"><xsl:value-of select="."/></span>
                </xsl:matching-substring>
                <xsl:non-matching-substring>
                    <xsl:value-of select="."/>
                </xsl:non-matching-substring>
            </xsl:analyze-string>
        </xsl:template>
    </xsl:transform>
    