XSL: Include some ASCII control chars when method=“text”

别等时光非礼了梦想. Submitted on 2019-11-29 11:54:53
Martin Honnen

It is true that the Microsoft .NET Framework does not support XML 1.1, but it has its own (non-portable) way to use control characters in XML 1.0 documents: you can include one as a numeric character reference, such as &#x10;, if you set CheckCharacters to false on your XmlReaderSettings/XmlWriterSettings.

Here is an example stylesheet and some .NET code tested with .NET 3.5 that does not throw an illegal character exception:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:text>&#x10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>

 

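// Requires: using System.IO; using System.Xml; using System.Xml.Xsl;

// Let the reader accept character references that XML 1.0 forbids (such as &#x10;).
XmlReaderSettings xrs = new XmlReaderSettings();
xrs.CheckCharacters = false;

XslCompiledTransform proc = new XslCompiledTransform();
using (XmlReader xr = XmlReader.Create(@"sheet.xslt", xrs))
{
    proc.Load(xr);
}

using (XmlReader xr = XmlReader.Create(new StringReader("<foo/>")))
{
    // Start from the stylesheet's own output settings, then disable
    // character checking on the writer as well.
    XmlWriterSettings xws = proc.OutputSettings.Clone();
    xws.CheckCharacters = false;

    // The using blocks dispose (and so close) the writer and the reader.
    using (XmlWriter xw = XmlWriter.Create(@"result.txt", xws))
    {
        proc.Transform(xr, null, xw);
    }
}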

Because an XSLT stylesheet is itself an XML document, it cannot contain that character reference under XML 1.0. I don't think you can do this in a pure XSLT solution.
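As a quick way to see this restriction in practice, the sketch below feeds a small document containing the reference to an ordinary XML 1.0 parser. It uses Python's xml.etree.ElementTree (backed by Expat) purely as an illustration; the helper name parses_as_xml10 is made up for this example, and other parsers may word the error differently.

```python
import xml.etree.ElementTree as ET


def parses_as_xml10(document: str) -> bool:
    """Return True if the XML 1.0 parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        # Raised for any well-formedness error, including a character
        # reference to a code point that XML 1.0 does not allow.
        return False


if __name__ == "__main__":
    # A reference to DLE (U+0010) versus a reference to tab (U+0009).
    print(parses_as_xml10("<a>&#x10;</a>"))
    print(parses_as_xml10("<a>&#x9;</a>"))
```

The document with the DLE reference should be rejected, while the one with the tab reference should parse, which matches the rule described here.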

The ASCII character HEX 10/DEC 16 is the Data Link Escape (DLE) control character.

The XML 1.0 specification allows only three control characters, all of them whitespace: tab, carriage return, and line feed.

Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646.

Everything else under 0x20 is not allowed.

Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
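The Char production above translates directly into a range check. The following is a minimal Python sketch of that check, not code from any XML library; the function names is_xml10_char and find_illegal_chars are invented for illustration.

```python
def is_xml10_char(code_point: int) -> bool:
    """Return True if code_point matches the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= code_point <= 0xD7FF        # up to the surrogate blocks
        or 0xE000 <= code_point <= 0xFFFD      # excludes FFFE and FFFF
        or 0x10000 <= code_point <= 0x10FFFF   # supplementary planes
    )


def find_illegal_chars(text: str) -> list:
    """List (index, hex code point) pairs for characters XML 1.0 forbids."""
    return [
        (index, hex(ord(ch)))
        for index, ch in enumerate(text)
        if not is_xml10_char(ord(ch))
    ]


if __name__ == "__main__":
    print(is_xml10_char(0x10))           # DLE, the character from the question
    print(find_illegal_chars("a\x10b"))
```

A check like this can be run over text before it is placed in an XML 1.0 document, to spot characters such as DLE that a conforming parser would refuse.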

One option is to put a placeholder token value for that character in your output, and then use an external process to find/replace your token with the character.
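A minimal sketch of that post-processing step, written in Python only as an example of an "external process": the token @@DLE@@ and both function names are hypothetical, and any token that cannot occur in the real output would do. The stylesheet would emit the token as ordinary text, and the script swaps it for the control character afterwards.

```python
from pathlib import Path

# Hypothetical placeholder; pick any string that never appears in real output.
PLACEHOLDER = "@@DLE@@"
DLE = "\x10"  # Data Link Escape, U+0010


def replace_placeholder(text: str, token: str = PLACEHOLDER, char: str = DLE) -> str:
    """Return text with every occurrence of token replaced by char."""
    return text.replace(token, char)


def postprocess_file(path: str, encoding: str = "utf-8") -> None:
    """Rewrite the file in place, swapping the token for the control character.

    Works on raw bytes so that line endings in the transform output
    are left untouched.
    """
    target = Path(path)
    data = target.read_bytes()
    target.write_bytes(data.replace(PLACEHOLDER.encode(encoding), DLE.encode(encoding)))


if __name__ == "__main__":
    print(repr(replace_placeholder("header@@DLE@@payload")))
```

With this approach the stylesheet itself stays valid XML 1.0, at the cost of an extra step after the transformation; postprocess_file("result.txt") would be run once the transform has written its output.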

If you can use XML 1.1 (which allows such characters in an XML document as character references), then the following should work; at least it works for me with Sun Java 6 and Saxon 9.2:

<?xml version="1.1" encoding="UTF-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">

  <xsl:output method="text"/>

  <xsl:template name="main">
    <xsl:text>&#x10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>

In the past, I have used this technique to put a linefeed into a textarea in generated XHTML. If I didn't output at least one character, the textarea would self-close (causing browser issues). Notice that the character is wrapped in <xsl:text>. Also, the original source was on one line, but I reformatted it for readability.

<textarea name="qry" rows="4" cols="50" id="query">
 <xsl:value-of select="$qry" /><xsl:text>&#x0A;</xsl:text>
</textarea>