Performance: XmlReader or LINQ to XML

Submitted by 筅森魡賤 on 2019-12-31 17:39:30

Question


I have a 150 MB XML file that is used as a database in my project. Currently I'm using XmlReader to read content from it. I want to know whether XmlReader or LINQ to XML is the better choice for this scenario.

Note that I search this XML for an item and display the result, so a search can take a long time or just a moment.


Answer 1:


If you want performance, use XmlReader. It doesn't read the whole file and build a DOM tree in memory. Instead, it reads the file from disk and hands you each node as it encounters it.

A quick Google search turned up a performance comparison of XmlReader, LINQ to XML, and XDocument.Load:
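A minimal sketch of that forward-only style for a search that stops at the first hit. The question does not show its schema, so the element and attribute names are passed in as parameters, and the helper name `FindFirst` is made up for illustration. Because the method returns as soon as a match is found, the rest of the file is never read.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlSearch
{
    // Hypothetical helper (not from the original answer): returns the first
    // <elementName> element whose attribute equals the given value,
    // or null if no such element exists.
    public static XElement FindFirst(TextReader input, string elementName,
                                     string attributeName, string value)
    {
        using (var reader = XmlReader.Create(input))
        {
            // ReadToFollowing skips forward node by node; nothing is kept in memory.
            while (reader.ReadToFollowing(elementName))
            {
                if (reader.GetAttribute(attributeName) == value)
                {
                    // Materialize only the matching element as an XElement.
                    return (XElement)XNode.ReadFrom(reader);
                }
            }
        }
        return null;
    }
}
```

With a file on disk this could be called as `XmlSearch.FindFirst(File.OpenText(path), "item", "id", "42")`, where `path`, `item`, and `id` stand in for whatever the real data uses.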

https://web.archive.org/web/20130517114458/http://www.nearinfinity.com/blogs/joe_ferner/performance_linq_to_sql_vs.html




Answer 2:


Personally, I would look at using LINQ to XML with the streaming techniques outlined in the Microsoft documentation: http://msdn.microsoft.com/en-us/library/system.xml.linq.xstreamingelement.aspx#Y1392

Here's a quick benchmark that reads a 200 MB XML file with a simple filter:

var xmlFilename = "test.xml";

//create test xml file
var initMemoryUsage = GC.GetTotalMemory(true);
var timer = System.Diagnostics.Stopwatch.StartNew();
var rand = new Random();
var testDoc = new XStreamingElement("root", //in order to stream xml output XStreamingElement needs to be used for all parent elements of collection so no XDocument
    Enumerable.Range(1, 10000000).Select(idx => new XElement("child", new XAttribute("id", rand.Next(0, 1000))))
);
testDoc.Save(xmlFilename);
var outStat = String.Format("{0:f2} sec {1:n0} kb //linq to xml output streamed", timer.Elapsed.TotalSeconds, (GC.GetTotalMemory(false) - initMemoryUsage) / 1024);

//linq to xml not streamed
initMemoryUsage = GC.GetTotalMemory(true);
timer.Restart();
var col1 = XDocument.Load(xmlFilename).Root.Elements("child").Where(e => (int)e.Attribute("id") < 10).Select(e => (int)e.Attribute("id")).ToArray();
var stat1 = String.Format("{0:f2} sec {1:n0} kb //linq to xml input not streamed", timer.Elapsed.TotalSeconds, (GC.GetTotalMemory(false) - initMemoryUsage) / 1024);

//xmlreader
initMemoryUsage = GC.GetTotalMemory(true);
timer.Restart();
var col2 = new List<int>();
using (var reader = new XmlTextReader(xmlFilename))
{
    while (reader.ReadToFollowing("child"))
    {
        reader.MoveToAttribute("id");
        int value = Convert.ToInt32(reader.Value);
        if (value < 10)
            col2.Add(value);
    }
}
var stat2 = String.Format("{0:f2} sec {1:n0} kb //xmlreader", timer.Elapsed.TotalSeconds, (GC.GetTotalMemory(false) - initMemoryUsage) / 1024);

//linq to xml streamed
initMemoryUsage = GC.GetTotalMemory(true);
timer.Restart();
var col3 = StreamElements(xmlFilename, "child").Where(e => (int)e.Attribute("id") < 10).Select(e => (int)e.Attribute("id")).ToArray();
var stat3 = String.Format("{0:f2} sec {1:n0} kb //linq to xml input streamed", timer.Elapsed.TotalSeconds, (GC.GetTotalMemory(false) - initMemoryUsage) / 1024);

//util method
public static IEnumerable<XElement> StreamElements(string filename, string elementName)
{
    using (var reader = XmlReader.Create(filename))
    {
        while (reader.Name == elementName || reader.ReadToFollowing(elementName))
            yield return (XElement)XElement.ReadFrom(reader);
    }
}

And here's the processing time and memory usage on my machine:

11.49 sec 225 kb      // linq to xml output streamed

17.36 sec 782,312 kb  // linq to xml input not streamed
6.52 sec 1,825 kb     // xmlreader
11.74 sec 2,238 kb    // linq to xml input streamed



Answer 3:


Write a few benchmark tests to establish exactly what the situation is for you, and take it from there... LINQ to XML introduces a lot of flexibility...
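One way to keep such tests short is to wrap the timing and memory bookkeeping in a small helper. The sketch below is an assumption, not something from the answers: `Bench.Measure` is a hypothetical name, and it reports the same two figures as the benchmark in Answer 2 (elapsed seconds and the change in managed heap size). The memory figure is only a rough indicator, since a garbage collection can run while the action executes.

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Hypothetical helper: runs the action once and returns a line with the
    // elapsed time and the change in managed heap size, followed by the label.
    public static string Measure(string label, Action action)
    {
        // Force a collection first so the baseline is as clean as possible.
        long before = GC.GetTotalMemory(true);
        var timer = Stopwatch.StartNew();
        action();
        timer.Stop();
        long deltaKb = (GC.GetTotalMemory(false) - before) / 1024;
        return String.Format("{0:f2} sec {1:n0} kb // {2}",
                             timer.Elapsed.TotalSeconds, deltaKb, label);
    }
}
```

Each candidate approach can then be passed in as a lambda, for example `Bench.Measure("xmlreader", () => { /* read the file here */ })`, and the returned lines compared side by side.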



Source: https://stackoverflow.com/questions/2735434/performance-xmlreader-or-linq-to-xml
