How to put an encoding attribute other than utf-16 in XML with XmlWriter?

花落未央 2020-12-01 10:33

I've got a function creating some XmlDocument:

public string CreateOutputXmlString(ICollection fields)
{
    XmlWriterSettings settings = new X         


        
5 answers
  • 2020-12-01 10:48

    Some extra explanation of why this is so.

    Strings are sequences of characters, not bytes. Strings, per se, are not "encoded", because they are using characters, which are stored as Unicode codepoints. Encoding DOES NOT MAKE SENSE at String level.

    An encoding is a mapping from a sequence of codepoints (characters) to a sequence of bytes (for storage on byte-based systems like filesystems or memory). The framework does not let you specify an encoding unless there is a compelling reason to, such as having to fit 16-bit code units onto byte-based storage.

    So when you're trying to write your XML into a StringBuilder, you're actually building an XML sequence of characters and writing them as a sequence of characters, so no encoding is performed. That is why settings.Encoding has no effect there: the declaration says utf-16 because that is what the underlying StringWriter reports as its Encoding.

    If you want to use an encoding, the XmlWriter has to write to a Stream.
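
    As a rough illustration of that difference, here is a minimal sketch (not from the original answers). The class and method names are made up for the example, and ISO-8859-1 stands in for windows-1250 because on .NET Core / .NET 5+ code-page encodings such as windows-1250 are only available after registering CodePagesEncodingProvider.

    ```csharp
    using System;
    using System.IO;
    using System.Text;
    using System.Xml;

    // Hypothetical demo class: compares the XML declaration produced for the two targets.
    public static class EncodingDeclarationDemo
    {
        // Writes <data /> into a StringBuilder; the requested encoding is set but not applied.
        public static string WriteToStringBuilder(Encoding requested)
        {
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.Encoding = requested;

            StringBuilder builder = new StringBuilder();
            using (XmlWriter writer = XmlWriter.Create(builder, settings))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("data");
                writer.WriteEndElement();
            }
            return builder.ToString();
        }

        // Writes the same document to a Stream and returns the encoded bytes.
        public static byte[] WriteToStream(Encoding requested)
        {
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.Encoding = requested;

            using (MemoryStream stream = new MemoryStream())
            {
                using (XmlWriter writer = XmlWriter.Create(stream, settings))
                {
                    writer.WriteStartDocument();
                    writer.WriteStartElement("data");
                    writer.WriteEndElement();
                }
                // The writer has been disposed (and therefore flushed) at this point.
                return stream.ToArray();
            }
        }

        public static void Main()
        {
            Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
            Console.WriteLine(WriteToStringBuilder(latin1));            // declaration names utf-16
            Console.WriteLine(latin1.GetString(WriteToStream(latin1))); // declaration names iso-8859-1
        }
    }
    ```

    The Stream version returns bytes on purpose: once you decode them back into a string, the declaration is only a label again.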

    About the solution that you found with the MemoryStream: no offense intended, but it's just flapping your arms around and moving hot air. You're encoding your codepoints with 'windows-1250' and then decoding the bytes straight back to codepoints. The only change that may occur is that characters not defined in windows-1250 get converted to a '?' character in the process.
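
    To see that round-trip effect in isolation, here is a small sketch (again not from the original thread). It uses Encoding.ASCII as a stand-in for windows-1250 so that it runs without registering any code-page provider; the class and method names are invented for the example.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical demo class: encodes a string to bytes and decodes it straight back.
    public static class RoundTripDemo
    {
        public static string RoundTrip(string text, Encoding encoding)
        {
            byte[] bytes = encoding.GetBytes(text); // characters the encoding lacks are replaced here
            return encoding.GetString(bytes);
        }

        public static void Main()
        {
            string original = "caf\u00E9"; // "café" with a precomposed é

            // ASCII has no é, so the default replacement fallback substitutes '?'.
            Console.WriteLine(RoundTrip(original, Encoding.ASCII) == "caf?");

            // UTF-8 can represent every character, so the string comes back unchanged.
            Console.WriteLine(RoundTrip(original, Encoding.UTF8) == original);
        }
    }
    ```

    Both comparisons should print True: the string survives untouched unless it contains something the chosen encoding cannot represent.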

    To me, the right solution might be the following one. Depending on what your function is used for, you could pass a Stream as a parameter to your function, so that the caller decides whether it should be written to memory or to a file. So it would be written like this:

    
    public static void WriteFieldsAsXmlDocument(ICollection<Field> fields, Stream outStream)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.Encoding = Encoding.GetEncoding("windows-1250");

        // The writer encodes directly into the caller's stream (memory, file, ...).
        using (XmlWriter writer = XmlWriter.Create(outStream, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("data");
            foreach (Field field in fields)
            {
                writer.WriteStartElement("item");
                writer.WriteAttributeString("name", field.Id);
                writer.WriteAttributeString("value", field.Value);
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
        }
    }
    
  • 2020-12-01 10:48

    I solved mine by outputting the string to a variable, then replacing any references to utf-16 with utf-8 (my app needed UTF-8 encoding). Since you're using a function, you could do something similar. I use VB.NET mostly, but I think the C# would look something like this.

    return builder.ToString().Replace("utf-16", "utf-8");
    
  • 2020-12-01 10:49

    You need to use a StringWriter with the appropriate encoding. Unfortunately StringWriter doesn't let you specify the encoding directly, so you need a class like this:

    public sealed class StringWriterWithEncoding : StringWriter
    {
        private readonly Encoding encoding;
    
        public StringWriterWithEncoding (Encoding encoding)
        {
            this.encoding = encoding;
        }
    
        public override Encoding Encoding
        {
            get { return encoding; }
        }
    }
    

    (This question is similar but not quite a duplicate.)

    EDIT: To answer the comment: pass the StringWriterWithEncoding to XmlWriter.Create instead of the StringBuilder, then call ToString() on it at the end.
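
    Put together, that might look like the sketch below. It repeats the class above so the example compiles on its own; the Demo class, the BuildXml method and the empty "data" document are placeholders, and UTF-8 is used only because it is available on every .NET runtime without extra setup.

    ```csharp
    using System;
    using System.IO;
    using System.Text;
    using System.Xml;

    public sealed class StringWriterWithEncoding : StringWriter
    {
        private readonly Encoding encoding;

        public StringWriterWithEncoding(Encoding encoding)
        {
            this.encoding = encoding;
        }

        // XmlWriter reads this property to decide what to put in the XML declaration.
        public override Encoding Encoding
        {
            get { return encoding; }
        }
    }

    // Hypothetical demo class showing the writer in use.
    public static class Demo
    {
        public static string BuildXml()
        {
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.Indent = true;

            using (StringWriterWithEncoding stringWriter = new StringWriterWithEncoding(Encoding.UTF8))
            {
                using (XmlWriter writer = XmlWriter.Create(stringWriter, settings))
                {
                    writer.WriteStartDocument();
                    writer.WriteStartElement("data");
                    writer.WriteEndElement();
                    writer.WriteEndDocument();
                }
                // The XmlWriter is disposed (flushed) before the text is read back.
                return stringWriter.ToString();
            }
        }

        public static void Main()
        {
            Console.WriteLine(BuildXml());
        }
    }
    ```

    Note that this only changes what the declaration says; the result is still a .NET string, so whoever finally writes it to bytes has to use the same encoding.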

  • 2020-12-01 10:53
    MemoryStream memoryStream = new MemoryStream();
    XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
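    // UTF-8 without a byte order mark; Encoding.UTF8 would emit one, and it
    // would end up as a stray U+FEFF at the start of the decoded string.
    xmlWriterSettings.Encoding = new UTF8Encoding(false);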
    
    XmlWriter xmlWriter = XmlWriter.Create(memoryStream, xmlWriterSettings);
    xmlWriter.WriteStartDocument();
    xmlWriter.WriteStartElement("root", "http://www.timvw.be/ns");
    xmlWriter.WriteEndElement();
    xmlWriter.WriteEndDocument();
    xmlWriter.Flush();
    xmlWriter.Close();
    
    string xmlString = Encoding.UTF8.GetString(memoryStream.ToArray());
    

    From here

  • 2020-12-01 11:06

    I actually solved the problem with MemoryStream:

    public static string CreateOutputXmlString(ICollection<Field> fields)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.Encoding = Encoding.GetEncoding("windows-1250");

        using (MemoryStream memStream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(memStream, settings))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("data");
                foreach (Field field in fields)
                {
                    writer.WriteStartElement("item");
                    writer.WriteAttributeString("name", field.Id);
                    writer.WriteAttributeString("value", field.Value);
                    writer.WriteEndElement();
                }
                writer.WriteEndElement();
            }

            // Disposing the writer flushes it, so the stream now holds the whole document.
            return settings.Encoding.GetString(memStream.ToArray());
        }
    }
    