XmlReader newline \n instead of \r\n

不思量自难忘° 2021-01-11 11:32

When I use XmlReader.ReadOuterXml(), elements are separated by \n instead of \r\n. So, for example, if I have an XmlDocument representation of


<         


        
5 Answers
  • 2021-01-11 11:33

    XmlReader automatically normalizes \r\n to \n. Although this seems unusual on Windows, it is actually required by the XML Specification (http://www.w3.org/TR/2008/REC-xml-20081126/#sec-line-ends).

    You can do a String.Replace:

    string s = reader.ReadOuterXml().Replace("\n", "\r\n");
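
    If the string might already contain some \r\n pairs (for example after being merged with other text), a plain Replace would turn them into \r\r\n. A minimal sketch of a variant that converts only bare \n characters is shown here; the sample string and class name are hypothetical. Like the Replace above, it still rewrites line feeds inside text content.

```csharp
using System;
using System.Text.RegularExpressions;

class BareLineFeedDemo
{
    static void Main()
    {
        // Hypothetical sample containing one bare \n and one existing \r\n.
        string normalized = "<a>\n<b/>\r\n</a>";

        // The negative lookbehind skips any \n that already follows a \r.
        string converted = Regex.Replace(normalized, "(?<!\r)\n", "\r\n");

        Console.WriteLine(converted == "<a>\r\n<b/>\r\n</a>"); // True
    }
}
```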
    
  • 2021-01-11 11:34

    Solution 1: Write entitized XML

    Use an XmlWriter configured with the NewLineHandling.Entitize option. Carriage returns are then written as character entities, so the XmlReader will not normalize the line endings when the file is read back.

    You can use such a custom XmlWriter even with XDocument:

    // Dispose the writer so the output is flushed and the file is closed.
    using (var writer = XmlWriter.Create(fileName, new XmlWriterSettings { NewLineHandling = NewLineHandling.Entitize }))
        xDoc.Save(writer);
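
    To see what the Entitize option does, here is a small round-trip sketch that writes to a StringBuilder instead of a file. The sample document and class name are hypothetical, and OmitXmlDeclaration is set only to keep the output short.

```csharp
using System;
using System.Text;
using System.Xml;
using System.Xml.Linq;

class EntitizeDemo
{
    static void Main()
    {
        // Hypothetical document whose text contains a Windows line ending.
        var xDoc = new XDocument(new XElement("a", "line1\r\nline2"));

        var settings = new XmlWriterSettings
        {
            NewLineHandling = NewLineHandling.Entitize,
            OmitXmlDeclaration = true
        };

        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            xDoc.Save(writer);
        }

        // Check that the carriage return was written as the reference &#xD;.
        Console.WriteLine(sb.ToString().Contains("&#xD;"));

        // XDocument.Parse reads through a normalizing XmlReader;
        // check that the \r\n survived the round trip.
        string value = XDocument.Parse(sb.ToString()).Root.Value;
        Console.WriteLine(value == "line1\r\nline2");
    }
}
```

    Both checks should print True: the reader leaves character references alone, so the entitized carriage return comes back intact.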
    

    Solution 2: Read non-entitized XML without normalization

    Solution 1 is the cleaner way. However, you may already have non-entitized XML whose creation you cannot change, and still want to prevent normalization. The accepted answer suggests a replace, but that replaces every \n occurrence blindly, even where that is not desirable. To retrieve all of the line endings exactly as they are in the file, you can try the legacy XmlTextReader class, which does not normalize XML files by default. You can use it with XDocument, too:

    var xDoc = XDocument.Load(new XmlTextReader(fileName));
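
    The difference between the two readers can be checked with a short comparison. This is only a sketch: it reads from a hypothetical in-memory string rather than a file, and the class name is made up.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

class ReaderComparisonDemo
{
    static void Main()
    {
        // Hypothetical input with a \r\n inside element text.
        string xml = "<a>line1\r\nline2</a>";

        // A reader from XmlReader.Create normalizes line endings.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            string value = XDocument.Load(reader).Root.Value;
            Console.WriteLine(value.Contains("\r\n"));
        }

        // The legacy XmlTextReader leaves Normalization off by default.
        using (var legacy = new XmlTextReader(new StringReader(xml)))
        {
            string value = XDocument.Load(legacy).Root.Value;
            Console.WriteLine(value.Contains("\r\n"));
        }
    }
}
```

    The first check should print False (the \r is gone) and the second True.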
    
  • 2021-01-11 11:35

    I had to write database data to an XML file and read it back using LINQ to XML. Some fields in a record were themselves XML strings, complete with \r characters, and these had to remain intact. I spent days trying to find something that would work, but it seems that converting \r to \n is by design on Microsoft's part.

    The following solution works for me:

    To write a loaded XDocument to the XML file keeping \r intact, where xDoc is an XDocument and filePath is a string:

    XmlWriterSettings xmlWriterSettings = new XmlWriterSettings 
        { NewLineHandling = NewLineHandling.None, Indent = true };
    using (XmlWriter xmlWriter = XmlWriter.Create(filePath, xmlWriterSettings))
    {
        xDoc.Save(xmlWriter);
        xmlWriter.Flush();
    }
    

    To read an XML file into an XElement keeping \r intact:

    using (XmlTextReader xmlTextReader = new XmlTextReader(filePath) 
       { WhitespaceHandling = WhitespaceHandling.Significant })
    {
         xmlTextReader.MoveToContent();
         xDatabaseElement = XElement.Load(xmlTextReader);
    }
    
  • 2021-01-11 11:39

    There's a quicker way if you're just trying to get to UTF-8. First create a writer:

    public class EncodedStringWriter : StringWriter
    {
        public EncodedStringWriter(StringBuilder sb, Encoding encoding)
            : base(sb)
        {
            _encoding = encoding;
        }
    
        private Encoding _encoding;
    
        public override Encoding Encoding
        {
            get
            {
                return _encoding;
            }
        }
    
    }
    

    Then use it:

    XmlDocument doc = new XmlDocument();
    doc.LoadXml("<foo><bar /></foo>");
    
    StringBuilder sb = new StringBuilder();
    XmlWriterSettings xws = new XmlWriterSettings();
    xws.Indent = true;
    
    using( EncodedStringWriter w = new EncodedStringWriter(sb, Encoding.UTF8) )
    {
        using( XmlWriter writer = XmlWriter.Create(w, xws) )
        {
            doc.WriteTo(writer);
        }
    }
    string xml = sb.ToString();
    

    Gotta give credit where credit is due.

  • 2021-01-11 11:42

    XmlReader reads files; it does not write them. If you are getting \n in your reader, it is because that's what's in the file. Both \n and \r are whitespace and are semantically the same in XML, so the difference will not affect the meaning or content of the data.

    Edit:

    That looks like C#, not Ruby. As binarycoder says, ReadOuterXml is defined to return normalized XML. Typically this is what you want. If you want the raw XML you should use Encoding.UTF8.GetString(memStream.ToArray()), not XmlReader.
