Best way to read, modify, and write XML

灰色年华 2021-02-05 10:39

My plan is to read in an XML document using my C# program, search for particular entries which I'd like to change, and then write out the modified document. However, I've bec

8 answers
  • 2021-02-05 11:00

    One fairly easy approach would be to create a new XmlDocument, then use the Load() method to populate it. Once you've got the document, you can use CreateNavigator() to get an XPathNavigator object that you can use to find and alter elements in the document. Finally, you can use the Save() method on the XmlDocument to write the changed document back out.
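
    The load, navigate, and save steps described above could look roughly like this. It is a minimal sketch, not a definitive implementation: the class name, the method name, the `//img/@src` XPath expression, and the suffix parameter are all illustrative assumptions, and it parses a string with LoadXml() rather than a file so that it stays self-contained.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

static class XmlDocumentSketch
{
    // Appends `suffix` to the src attribute of every img element and
    // returns the modified document as a string.
    // (Hypothetical helper, named for illustration only.)
    public static string AppendToImgSrc(string xml, string suffix)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // doc.Load("file.xml") would read from disk instead

        // Navigators created from an XmlDocument are editable.
        XPathNavigator nav = doc.CreateNavigator();
        foreach (XPathNavigator src in nav.Select("//img/@src"))
        {
            src.SetValue(src.Value + suffix);
        }

        return doc.OuterXml; // doc.Save("file.xml") would write to disk instead
    }
}
```

    Calling XmlDocumentSketch.AppendToImgSrc(xml, "?v=2") returns the rewritten markup; in a real program you would more likely call Save() with a file name, as the answer suggests.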

  • 2021-02-05 11:02

    For the task at hand (reading an existing document, modifying it in a formalised way, and writing it back out), I'd go with an XPathDocument run through an XslCompiledTransform.

    Where you can't formalise the changes, don't have pre-existing documents, or generally need more adaptive logic, I'd go with LINQ and XDocument, as Skeet says.

    Basically: if the task is transformation, use XSLT; if the task is manipulation, use LINQ.
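
    As a rough illustration of the XSLT route, the sketch below feeds an XPathDocument through an XslCompiledTransform. The stylesheet is an assumption made up for this example: an identity transform that copies everything unchanged, plus one template that rewrites the src attribute of img elements. The class and method names are hypothetical too.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

static class XsltSketch
{
    // Identity transform plus one override for img/@src (illustrative only).
    const string Stylesheet = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
  <xsl:template match='img/@src'>
    <xsl:attribute name='src'><xsl:value-of select='.'/>?v=2</xsl:attribute>
  </xsl:template>
</xsl:stylesheet>";

    public static string Transform(string inputXml)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader xslReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(xslReader);
        }

        // XPathDocument is a read-only, XPath-optimised view of the input.
        var input = new XPathDocument(new StringReader(inputXml));
        var output = new StringWriter();
        xslt.Transform(input, null, output);
        return output.ToString();
    }
}
```

    The appeal of this design is that the rules for what changes live in the stylesheet rather than in C# code, which suits edits that can be stated once and applied to many documents.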

  • 2021-02-05 11:08

    If it's actually valid XML, and will easily fit in memory, I'd choose LINQ to XML (XDocument, XElement etc) every time. It's by far the nicest XML API I've used. It's easy to form queries, and easy to construct new elements too.

    You can use XPath where that's appropriate, or the built-in axis methods (Elements(), Descendants(), Attributes() etc). If you could let us know what specific bits you're having a hard time with, I'd be happy to help work out how to express them in LINQ to XML.
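
    To make the XPath option concrete, here is a small sketch that selects elements with XPathSelectElements(), an extension method from System.Xml.XPath, and notes the equivalent axis-method query in a comment. The id value "logo", the alt text, and the class and method names are assumptions invented for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings XPathSelectElements() into scope

static class LinqToXmlSketch
{
    // Sets an alt attribute on every img whose id is "logo" and returns
    // the modified document as a string. (Hypothetical helper.)
    public static string TagLogoImages(string xml)
    {
        XDocument doc = XDocument.Parse(xml); // XDocument.Load("file.xml") for a file

        // Equivalent axis-method query:
        //   doc.Descendants("img").Where(e => (string) e.Attribute("id") == "logo")
        var logos = doc.XPathSelectElements("//img[@id='logo']").ToList();

        foreach (XElement img in logos)
        {
            img.SetAttributeValue("alt", "Company logo");
        }

        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

    The ToList() call materialises the query before the loop modifies the tree, which is a cautious habit whenever you change a document while enumerating over it.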

    If, on the other hand, this is HTML which isn't valid XML, you'll have a much harder time, because XML APIs generally expect to work with valid XML documents. You could use HTMLTidy first, of course, but that may have undesirable effects.

    For your specific example:

    XDocument doc = XDocument.Load("file.xml");
    foreach (var img in doc.Descendants("img"))
    {
        // src will be null if the attribute is missing
        string src = (string) img.Attribute("src");
        img.SetAttributeValue("src", src + "with-changes");
    }
    
  • 2021-02-05 11:09

    If you have smaller documents that fit in the computer's memory, you can use XmlDocument. Otherwise, you can use XmlReader to iterate through the document.

    Using XmlReader, you can find out each node's type like this:

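    // xml is an XmlReader, for example one created with XmlReader.Create("file.xml")
    while (xml.Read())
    {
        switch (xml.NodeType)
        {
            case XmlNodeType.Element:
                // Do something with the start tag
                break;
            case XmlNodeType.Text:
                // Do something with the text content
                break;
            case XmlNodeType.EndElement:
                // Do something with the end tag
                break;
        }
    }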
    
  • 2021-02-05 11:10

    Just start by reading the documentation for the System.Xml namespace on MSDN. Then, if you have more specific questions, post them here...

  • 2021-02-05 11:18

    My favorite tool for this kind of thing is HtmlAgilityPack. I use it to parse complex HTML documents into LINQ-queryable collections. It is an extremely useful tool for querying and parsing HTML (which is often not valid XML).

    For your problem, the code would look like:

    var htmlDoc = new HtmlAgilityPack.HtmlDocument();
    htmlDoc.LoadHtml(stringOfHtml);

    // SelectNodes returns null when nothing matches
    var images = htmlDoc.DocumentNode.SelectNodes("//img[@id='lookforthis']");

    if (images != null)
    {
        foreach (var node in images)
        {
            node.Attributes.Append("alt", "added an alt to lookforthis images.");
        }
    }

    htmlDoc.Save("output.html");
    