Is this an inefficient way to parse XML?

佐手、 提交于 2019-12-11 03:27:56

问题


I'm probably worrying about wrong optimization, but I have this nagging thought that it's parsing the xml tree over and over and over again, maybe I read it somewhere. Can't remember.

Anyways, here's what I'm doing:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;
using System.Net;

namespace LinqTestingGrounds
{
    class Program
    {
        static void Main(string[] args)
        {
            WebClient webClient = new WebClient();
            webClient.DownloadStringCompleted += new DownloadStringCompletedEventHandler(webClient_DownloadStringCompleted);
            webClient.DownloadStringAsync(new Uri("http://www.dreamincode.net/forums/xml.php?showuser=335389"));
            Console.ReadLine();
        }

        static void webClient_DownloadStringCompleted(object sender, DownloadStringCompletedEventArgs e)
        {
            if (e.Error != null)
            {
                return;
            }

            XDocument xml = XDocument.Parse(e.Result);

            User user = new User();
            user.ID = xml.Element("ipb").Element("profile").Element("id").Value;
            user.Name = xml.Element("ipb").Element("profile").Element("name").Value;
            user.Rating = xml.Element("ipb").Element("profile").Element("rating").Value;
            user.Photo = xml.Element("ipb").Element("profile").Element("photo").Value;
            user.Reputation = xml.Element("ipb").Element("profile").Element("reputation").Value;
            user.Group = xml.Element("ipb").Element("profile").Element("group").Element("span").Value;
            user.Posts = xml.Element("ipb").Element("profile").Element("posts").Value;
            user.PostsPerDay = xml.Element("ipb").Element("profile").Element("postsperday").Value;
            user.JoinDate = xml.Element("ipb").Element("profile").Element("joined").Value;
            user.ProfileViews = xml.Element("ipb").Element("profile").Element("views").Value;
            user.LastActive = xml.Element("ipb").Element("profile").Element("lastactive").Value;
            user.Location = xml.Element("ipb").Element("profile").Element("location").Value;
            user.Title = xml.Element("ipb").Element("profile").Element("title").Value;
            user.Age = xml.Element("ipb").Element("profile").Element("age").Value;
            user.Birthday= xml.Element("ipb").Element("profile").Element("birthday").Value;
            user.Gender = xml.Element("ipb").Element("profile").Element("gender").Element("gender").Element("value").Value;

            Console.WriteLine(user.ID);
            Console.WriteLine(user.Name);
            Console.WriteLine(user.Rating);
            Console.WriteLine(user.Photo);
            Console.WriteLine(user.Reputation);
            Console.WriteLine(user.Group);
            Console.WriteLine(user.Posts);
            Console.WriteLine(user.PostsPerDay);
            Console.WriteLine(user.JoinDate);
            Console.WriteLine(user.ProfileViews);
            Console.WriteLine(user.LastActive);
            Console.WriteLine(user.Location);
            Console.WriteLine(user.Title);
            Console.WriteLine(user.Age);
            Console.WriteLine(user.Birthday);
            Console.WriteLine(user.Gender);

            //Console.WriteLine(xml);            
        }
    }
}

Is this Good Enough™ or is there a much faster way to parse the things I need?

ps. I'm doing the bulk of the operations in the DownloadStringCompleted event, should I NOT be doing this? First time using this method. Thanks!


回答1:


Don't know about efficiency, but for readability use a profile variable instead of traversing the whole things over and over again:

 User user = new User();
 var profile = xml.Element("ipb").Element("profile");
 user.ID = profile.Element("id").Value;



回答2:


I believe xml serialization is the way to go for this type of problem. As long as your properties match the xml elements, this will be trivial. Otherwise you just need to map them using the XmlElement and XmlAttribute attribute classes. Here is some simple code for common xml deserialisation into a class:

public T Deserialise(string someXml)
    {   
        XmlSerializer reader = new XmlSerializer(typeof (T));
        StringReader stringReader = new StringReader(someXml);
        XmlTextReader xmlReader = new XmlTextReader(stringReader);
        return (T) reader.Deserialize(xmlReader);
    }



回答3:


I addition to Oded's answer: the other way to improve readability is use XPathSelectElement extension method.

So your code would look like:

user.ID = xml.XPathSelectElement("ipb/profile/id").Value;



回答4:


No, it's not parsing the XML over and over; just once, when you call

XDocument.Parse(e.Result);

The calls after that just access the tree structure in the xml object.

"Parsing" means analyzing an unstructured text string (such as comes from a file) and creating data structures (such as a tree) from it. Your ... .Element("foo") calls are not parsing but accessing parts of the data structure that was built by the XDocument.Parse() call.

If you're wondering whether your code is repeating some steps redundantly, and could be optimized, then yes, you're redundantly traversing ipb/profile. This isn't parsing, but the calls to Element("foo") do have to do some work comparing the string arguments to the names of child elements. @Oded's suggestion fixes this for readability reasons but also helps efficiency.



来源:https://stackoverflow.com/questions/4015971/is-this-an-inefficient-way-to-parse-xml

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!