Best way to read, modify, and write XML

My plan is to read in an XML document using my C# program, search for particular entries which I'd like to change, and then write out the modified document. However, I've bec

8 Answers

渐次进展

2021-02-05 11:00

One fairly easy approach would be to create a new XmlDocument, then use the Load() method to populate it. Once you've got the document, you can use CreateNavigator() to get an XPathNavigator object that you can use to find and alter elements in the document. Finally, you can use the Save() method on the XmlDocument to write the changed document back out.
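
A minimal sketch of that approach, assuming the goal is to append a suffix to every `img` element's `src` attribute. The class name, method name, and the suffix parameter are hypothetical; it works on strings so it can be tried without files, and with files you would swap `LoadXml` for `Load(path)` and return via `Save(path)`.

```csharp
using System.Xml;
using System.Xml.XPath;

public static class NavigatorSketch
{
    // Hypothetical helper: appends `suffix` to the src attribute of every <img>.
    public static string AppendToSrc(string xml, string suffix)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);                      // doc.Load("file.xml") for a file on disk

        // Navigators created from an XmlDocument are editable.
        XPathNavigator nav = doc.CreateNavigator();
        foreach (XPathNavigator attr in nav.Select("//img/@src"))
        {
            attr.SetValue(attr.Value + suffix);
        }

        return doc.OuterXml;                   // doc.Save("out.xml") to write to disk
    }
}
```

Because the navigator edits the underlying `XmlDocument` in place, nothing needs to be copied back before saving.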

野的像风

2021-02-05 11:02

For the task at hand (read an existing document, modify it in a formalised way, and write it out), I'd go with an XPathDocument run through an XslCompiledTransform.

Where you can't formalise, don't have pre-existing docs or generally need more adaptive logic, I'd go with LINQ and XDocument like Skeet says.

Basically if the task is transformation then XSLT, if the task is manipulation then LINQ.
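
To make the XSLT route concrete, here is a rough sketch, assuming the "formalised" change is appending `-v2` to every `img/@src`. The stylesheet, the class name, and the suffix are all made up for illustration: an identity template copies everything, and one extra template overrides the attribute.

```csharp
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class XsltSketch
{
    // Hypothetical stylesheet: identity copy plus one override for img/@src.
    const string Stylesheet = @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
  <xsl:template match='img/@src'>
    <xsl:attribute name='src'><xsl:value-of select='.'/>-v2</xsl:attribute>
  </xsl:template>
</xsl:stylesheet>";

    public static string Transform(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        // XPathDocument is a read-only, XPath-optimised view of the input.
        var input = new XPathDocument(new StringReader(xml));

        var output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output, xslt.OutputSettings))
        {
            xslt.Transform(input, writer);
        }
        return output.ToString();
    }
}
```

The input document is never mutated here; the transform produces a new document, which is why this fits rule-based rewrites better than ad hoc edits.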

离开以前

2021-02-05 11:08
If it's actually valid XML, and will easily fit in memory, I'd choose LINQ to XML (XDocument, XElement etc) every time. It's by far the nicest XML API I've used. It's easy to form queries, and easy to construct new elements too.

You can use XPath where that's appropriate, or the built-in axis methods (Elements(), Descendants(), Attributes() etc). If you could let us know what specific bits you're having a hard time with, I'd be happy to help work out how to express them in LINQ to XML.

If, on the other hand, this is HTML which isn't valid XML, you'll have a much harder time, because XML APIs generally expect to work with valid XML documents. You could use HTMLTidy first of course, but that may have undesirable effects.

For your specific example:
```
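// Requires: using System.Xml.Linq;
XDocument doc = XDocument.Load("file.xml");
foreach (var img in doc.Descendants("img"))
{
    // src will be null if the attribute is missing
    string src = (string) img.Attribute("src");
    img.SetAttributeValue("src", src + "with-changes");
}
// Write the modified document back out (the output path is a placeholder)
doc.Save("file-modified.xml");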
```
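
For the XPath option mentioned above, a small sketch using the `System.Xml.XPath` extension methods on top of LINQ to XML. The class and method names and the suffix parameter are hypothetical, and it parses a string rather than loading a file so it can be run as-is.

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class LinqXPathSketch
{
    // Hypothetical helper: appends `suffix` to src on every <img> that has one.
    public static string AppendToSrc(string xml, string suffix)
    {
        XDocument doc = XDocument.Parse(xml);  // XDocument.Load("file.xml") for a file

        // ToList() takes a snapshot of the matches before anything is modified.
        foreach (XElement img in doc.XPathSelectElements("//img[@src]").ToList())
        {
            img.SetAttributeValue("src", (string) img.Attribute("src") + suffix);
        }

        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

The `[@src]` predicate skips images without the attribute, so the null case from the loop above does not arise here.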
佛祖请我去吃肉

2021-02-05 11:09
If you have smaller documents which fit in the computer's memory, you can use XmlDocument. Otherwise you can use XmlReader to iterate through the document.

Using XmlReader, you can find out each node's type like this:
```
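// Requires: using System.Xml; ("file.xml" is a placeholder path)
using (XmlReader xml = XmlReader.Create("file.xml"))
{
    while (xml.Read()) {
       switch (xml.NodeType) {
         case XmlNodeType.Element:
          // Do something
          break;
         case XmlNodeType.Text:
          // Do something
          break;
         case XmlNodeType.EndElement:
          // Do something
          break;
       }
    }
}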
```
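
XmlReader only reads, so to modify and write a large document you would typically pair it with an XmlWriter. The following is a rough sketch of that pairing, not something stated in the answer above: the class name, method name, and suffix parameter are hypothetical, and it copies only elements, attributes, and text (comments, CDATA, whitespace, and processing instructions are dropped for brevity).

```csharp
using System.IO;
using System.Xml;

public static class StreamingSketch
{
    // Hypothetical helper: streams the input to the output, rewriting img/@src.
    public static string AppendToSrc(string xml, string suffix)
    {
        var output = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Capture these before moving onto the attributes.
                        bool isEmpty = reader.IsEmptyElement;
                        bool isImg = reader.LocalName == "img";
                        writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                        while (reader.MoveToNextAttribute())
                        {
                            string value = (isImg && reader.LocalName == "src")
                                ? reader.Value + suffix
                                : reader.Value;
                            writer.WriteAttributeString(reader.Prefix, reader.LocalName, reader.NamespaceURI, value);
                        }
                        if (isEmpty) writer.WriteEndElement();   // <img ... /> has no EndElement node
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        writer.WriteFullEndElement();
                        break;
                }
            }
        }
        return output.ToString();
    }
}
```

With files, the same loop would run between `XmlReader.Create(inputPath)` and `XmlWriter.Create(outputPath)`, so only one node is held in memory at a time.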
渐次进展

2021-02-05 11:10

Just start by reading the documentation for the System.Xml namespace on MSDN. Then, if you have more specific questions, post them here...

深忆病人

2021-02-05 11:18
My favorite tool for this kind of thing is HtmlAgilityPack. I use it to parse complex HTML documents into LINQ-queryable collections. It is an extremely useful tool for querying and parsing HTML (which is often not valid XML).

For your problem, the code would look like:
```
// Requires the HtmlAgilityPack package: using HtmlAgilityPack;
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(stringOfHtml);
// SelectNodes returns null when nothing matches
var images = htmlDoc.DocumentNode.SelectNodes("//img[@id='lookforthis']");

if (images != null)
{
  foreach (HtmlNode node in images)
  {
      node.Attributes.Append("alt", "added an alt to lookforthis images.");
  }
}

htmlDoc.Save("output.html");
```