Why does XmlTextReader convert HTML-encoded UTF-8 characters to a UTF-8 string automatically?

长情又很酷 2021-01-21 01:18

I receive an XML file with encoding "ISO-8859-1" (Latin-1).

Within the file (among other tags) I have <OtherText>Example &quot;content&quot; And &#9472;</OtherText> ...

1 Answer
  • 2021-01-21 02:02

    I do not believe this is a problem with the encoding. What you're seeing is the XML string being un-escaped.

    The problem is that &quot; is an XML escape sequence (a predefined entity reference) and &#9472; is a numeric character reference, so XmlTextReader un-escapes both for you.

    If you change this:

    <OtherText>Example &quot;content&quot; And &#9472;</OtherText>
    

    To this:

    <OtherText>Example &amp;quot;content&amp;quot; And &amp;#9472;</OtherText>
    

    Then the reader removes only one level of escaping, and you get:

       XmlReader.Value = "Example &quot;content&quot; And &#9472;";
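
To make the difference concrete, here is a minimal sketch that reads both variants of the element. It assumes XmlReader.Create over a StringReader instead of the asker's XmlTextReader and file; the class name UnescapeDemo and the helper ReadText are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class UnescapeDemo
{
    // Hypothetical helper: returns the root element's text as the reader reports it.
    public static string ReadText(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();                      // skip to the root element
            return reader.ReadElementContentAsString();  // references are resolved here
        }
    }

    public static void Main()
    {
        // Escaped once: the parser resolves &quot; and &#9472;.
        string once = "<OtherText>Example &quot;content&quot; And &#9472;</OtherText>";
        // Escaped twice: the parser resolves only &amp;, leaving the inner escapes as text.
        string twice = "<OtherText>Example &amp;quot;content&amp;quot; And &amp;#9472;</OtherText>";

        string decoded = ReadText(once);    // Example "content" And ─   (─ is U+2500)
        string literal = ReadText(twice);   // Example &quot;content&quot; And &#9472;

        Console.WriteLine(decoded);
        Console.WriteLine(literal);
    }
}
```

Note that the file's declared encoding (ISO-8859-1) plays no part here: the character U+2500 comes from the numeric reference, not from the bytes of the file.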
    

    Alternatively, wrap the value in a CDATA section so that the parser leaves its contents untouched.
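
A short sketch of the CDATA variant, assuming you control how the file is produced. The class name CDataDemo and the helper ReadText are illustrative only, and XmlReader.Create stands in for XmlTextReader.

```csharp
using System;
using System.IO;
using System.Xml;

public static class CDataDemo
{
    // Hypothetical helper: returns the root element's text as the reader reports it.
    public static string ReadText(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            // ReadElementContentAsString concatenates text and CDATA nodes.
            return reader.ReadElementContentAsString();
        }
    }

    public static void Main()
    {
        // Inside CDATA, & has no special meaning, so nothing is un-escaped.
        string xml = "<OtherText><![CDATA[Example &quot;content&quot; And &#9472;]]></OtherText>";

        string text = ReadText(xml);   // Example &quot;content&quot; And &#9472;
        Console.WriteLine(text);
    }
}
```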

    Another option is to re-escape the string:

    using System.Security;
    // ...
    // Escape handles only <, >, ", ' and &; it will not turn
    // the decoded character back into &#9472;.
    string val = SecurityElement.Escape(xmlReader.Value);
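
If the numeric reference has to come back as well, one possible approach is to post-process the result of SecurityElement.Escape and rewrite every non-ASCII character as a decimal character reference. The sketch below is an assumption-laden illustration, not part of the original answer: the class Reescaper and the method Reescape are hypothetical names, and the ASCII cut-off (0x7F) is an arbitrary choice. The reader does not record how a character was written in the file, so this cannot reproduce the original escaping exactly (for example, every apostrophe becomes &apos;).

```csharp
using System;
using System.Globalization;
using System.Security;
using System.Text;

public static class Reescaper
{
    // Hypothetical helper: escapes the five XML special characters, then
    // replaces every non-ASCII character with a decimal character reference.
    public static string Reescape(string value)
    {
        string escaped = SecurityElement.Escape(value) ?? string.Empty;
        var sb = new StringBuilder(escaped.Length);

        for (int i = 0; i < escaped.Length; i++)
        {
            char c = escaped[i];
            if (c <= 0x7F)
            {
                sb.Append(c);                      // plain ASCII stays as-is
                continue;
            }

            int codePoint = c;
            // Combine a surrogate pair into a single code point.
            if (char.IsHighSurrogate(c) && i + 1 < escaped.Length
                && char.IsLowSurrogate(escaped[i + 1]))
            {
                codePoint = char.ConvertToUtf32(c, escaped[i + 1]);
                i++;
            }

            sb.Append("&#")
              .Append(codePoint.ToString(CultureInfo.InvariantCulture))
              .Append(';');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // The value as the reader would return it (\u2500 is the decoded &#9472;).
        string decoded = "Example \"content\" And \u2500";
        Console.WriteLine(Reescape(decoded));  // Example &quot;content&quot; And &#9472;
    }
}
```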
    