How to return XML as UTF-8 instead of UTF-16

迷失自我 2020-12-29 07:15

I am using a routine that serializes an object of type T. It works, but when the result is downloaded to the browser I see a blank page. I can view the page source or open the download

3 Answers
  • 2020-12-29 07:40

    You can use a StringWriter subclass that reports UTF-8 as its encoding. Here is one way to do it:

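    using System.IO;
    using System.Text;

    public class Utf8StringWriter : StringWriter
    {
        // Cached UTF-8 encoding that emits no BOM
        private static readonly Encoding Utf8NoBom = new UTF8Encoding(false);

        // Report UTF-8 so that XmlSerializer writes encoding="utf-8" in the XML declaration
        public override Encoding Encoding
        {
             get { return Utf8NoBom; }
        }
    }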
    

    and then use the Utf8StringWriter writer in your code.

    using (StringWriter writer = new Utf8StringWriter())
    {
        XmlSerializer xml = new XmlSerializer(typeof(T));
        xml.Serialize(writer, Data);
        // Write the serialized text explicitly rather than passing the writer object
        httpContextBase.Response.Write(writer.ToString());
    }
    

    This answer is inspired by "Serializing an object as UTF-8 XML in .NET".

  • 2020-12-29 07:46

    Encoding of the Response

    I am not very familiar with this part of the framework, but according to MSDN you can set the content encoding of an HttpResponse like this:

    httpContextBase.Response.ContentEncoding = Encoding.UTF8;
    

    Encoding as seen by the XmlSerializer

    After reading your question again I see that this is the tough part. The problem lies in the use of the StringWriter. Because .NET strings are always stored as UTF-16, the StringWriter reports UTF-16 as its encoding. Thus the XmlSerializer writes the XML declaration as

    <?xml version="1.0" encoding="utf-16"?>
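
    A minimal sketch to reproduce this, assuming a hypothetical Sample type and a helper named SerializeWithStringWriter (neither is from the question): serializing through a plain StringWriter should yield a declaration that names utf-16.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical type, used only for this illustration.
public class Sample
{
    public string Name { get; set; }
}

public static class DeclarationDemo
{
    // Serializes through a plain StringWriter and returns the XML text.
    // StringWriter reports UTF-16 as its Encoding, and XmlSerializer copies
    // that name into the XML declaration.
    public static string SerializeWithStringWriter(Sample data)
    {
        using (StringWriter writer = new StringWriter())
        {
            XmlSerializer xml = new XmlSerializer(typeof(Sample));
            xml.Serialize(writer, data);
            return writer.ToString();
        }
    }
}
```

    Calling DeclarationDemo.SerializeWithStringWriter(new Sample { Name = "x" }) and inspecting the first line of the result is a quick way to check which encoding the declaration advertises.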
    

    To work around that you can write into a MemoryStream like this:

    using (MemoryStream stream = new MemoryStream())
    using (StreamWriter writer = new StreamWriter(stream, Encoding.UTF8))
    {
        XmlSerializer xml = new XmlSerializer(typeof(T));
        xml.Serialize(writer, Data);

        // Flush so that all buffered characters reach the stream before it is read
        writer.Flush();

        // I am not 100% sure if this can be optimized
        httpContextBase.Response.BinaryWrite(stream.ToArray());
    }
    

    Other approaches

    Another edit: I just noticed this SO answer linked by jtm001. In short, the solution there is to provide the XmlSerializer with a custom XmlWriter that is configured to use UTF-8 as its encoding.

    Athari proposes to derive from the StringWriter and advertise the encoding as UTF8.

    To my understanding both solutions should work equally well. I think the take-away here is that you will need one kind of boilerplate code or another...
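
    The custom XmlWriter approach could look roughly like the sketch below. The class name Utf8XmlSketch and the method SerializeToUtf8Bytes are made up for illustration; the sketch assumes the goal is a byte array that can be passed to Response.BinaryWrite, and it uses UTF8Encoding(false) so that no BOM is written.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public static class Utf8XmlSketch
{
    // Serializes data to UTF-8 bytes. Because the XmlWriter is configured
    // with a UTF-8 encoding, the XML declaration names utf-8 as well.
    public static byte[] SerializeToUtf8Bytes<T>(T data)
    {
        XmlWriterSettings settings = new XmlWriterSettings
        {
            // UTF8Encoding(false) suppresses the byte order mark.
            Encoding = new UTF8Encoding(false),
            Indent = true
        };

        using (MemoryStream stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                XmlSerializer xml = new XmlSerializer(typeof(T));
                xml.Serialize(writer, data);
            }
            // The writer has been disposed, and therefore flushed, at this point.
            return stream.ToArray();
        }
    }
}
```

    With this helper in place, the response could be written with httpContextBase.Response.BinaryWrite(Utf8XmlSketch.SerializeToUtf8Bytes(Data)), after setting ContentEncoding to UTF-8 as described above.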

  • 2020-12-29 07:49

    To serialize as a UTF-8 string:

        private string Serialize(MyData data)
        {
            XmlSerializer ser = new XmlSerializer(typeof(MyData));
            // Using a MemoryStream to store the serialized XML as a byte array,
            // which is "encoding-agnostic"
            using (MemoryStream ms = new MemoryStream())
                // Few options here, but remember to use a signature that allows you to
                // specify the encoding. UTF8Encoding(false) omits the BOM, which
                // Encoding.UTF8 would otherwise leave at the start of the returned string.
                using (XmlTextWriter tw = new XmlTextWriter(ms, new UTF8Encoding(false)))
                {
                    tw.Formatting = Formatting.Indented;
                    ser.Serialize(tw, data);
                    // Make sure everything buffered has reached the MemoryStream
                    tw.Flush();
                    // Now we get the serialized data as a string, decoded from UTF-8
                    return Encoding.UTF8.GetString(ms.ToArray());
                }
        }
    

    To return it as XML on a web response, don't forget to set the response encoding:

        string xml = Serialize(data);
        Response.ContentType = "application/xml";
        Response.ContentEncoding = System.Text.Encoding.UTF8;
        Response.Output.Write(xml);
    