I can never predict XMLReader behavior. Any tips on understanding?

没有蜡笔的小新 2021-01-17 10:07

It seems every time I use an XMLReader, I end up with a bunch of trial and error trying to figure out what I'm about to read versus what I'm reading versus what I just read...

2 Answers
  • 2021-01-17 10:24

    My latest solution (which works for my current case) is to stick with Read(), IsStartElement(name) and GetAttribute(name) in implementing a state machine.

    using (System.Xml.XmlReader xr = System.Xml.XmlReader.Create(stm))
    {
       employeeSchedules = new Dictionary<string, EmployeeSchedule>();
       EmployeeSchedule emp = null;
       WeekSchedule sch = null;
       TimeRanges ranges = null;
       TimeRange range = null;
       while (xr.Read())
       {
          if (xr.IsStartElement("Employee"))
          {
             emp = new EmployeeSchedule();
             employeeSchedules.Add(xr.GetAttribute("Name"), emp);
          }
          else if (xr.IsStartElement("Unavailable"))
          {
             sch = new WeekSchedule();
             emp.unavailable = sch;
          }
          else if (xr.IsStartElement("Scheduled"))
          {
             sch = new WeekSchedule();
             emp.scheduled = sch;
          }
          else if (xr.IsStartElement("DaySchedule"))
          {
             ranges = new TimeRanges();
             sch.daySchedule[int.Parse(xr.GetAttribute("DayNumber"))] = ranges;
             ranges.Color = ParseColor(xr.GetAttribute("Color"));
             ranges.FillStyle = (System.Drawing.Drawing2D.HatchStyle)
                System.Enum.Parse(typeof(System.Drawing.Drawing2D.HatchStyle),
                xr.GetAttribute("Pattern"));
          }
          else if (xr.IsStartElement("TimeRange"))
          {
             range = new TimeRange(
                System.Xml.XmlConvert.ToDateTime(xr.GetAttribute("Start"),
                System.Xml.XmlDateTimeSerializationMode.Unspecified),
                new TimeSpan((long)(System.Xml.XmlConvert.ToDouble(xr.GetAttribute("Length")) * TimeSpan.TicksPerHour)));
             ranges.Add(range);
          }
       }
       // No explicit xr.Close() is needed: the using block disposes the reader.
    }
    

    After Read, IsStartElement returns true if you just read a start element (optionally checking the name of that element), and you can access all the attributes of that element immediately. Note that IsStartElement first calls MoveToContent, so it skips over whitespace, comments and processing instructions before testing the node. If all you need to read is elements and attributes, this is pretty straightforward.
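
    As a smaller, self-contained sketch of the same Read / IsStartElement / GetAttribute pattern, the following parses an in-memory string. The Staff, Employee and Shift element names, the Name and Day attributes, and the StartElementDemo class are all made up for illustration; they are not from the schedule code above.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Xml;

    public static class StartElementDemo
    {
        // Hypothetical sample input; element and attribute names are invented.
        public const string Sample =
            "<Staff><Employee Name=\"Ann\"><Shift Day=\"1\"/><Shift Day=\"3\"/></Employee>" +
            "<Employee Name=\"Bob\"><Shift Day=\"2\"/></Employee></Staff>";

        public static Dictionary<string, List<int>> Parse(string xml)
        {
            var result = new Dictionary<string, List<int>>();
            List<int> current = null; // state: shifts of the employee being read

            using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
            {
                while (xr.Read())
                {
                    if (xr.IsStartElement("Employee"))
                    {
                        // Attributes are available as soon as the start tag is read.
                        current = new List<int>();
                        result.Add(xr.GetAttribute("Name"), current);
                    }
                    else if (xr.IsStartElement("Shift") && current != null)
                    {
                        // IsStartElement is also true for empty elements like <Shift/>.
                        current.Add(int.Parse(xr.GetAttribute("Day")));
                    }
                }
            }
            return result;
        }

        public static void Main()
        {
            foreach (var pair in Parse(Sample))
            {
                Console.WriteLine("{0}: {1}", pair.Key, string.Join(",", pair.Value));
            }
        }
    }
    ```

    The only state carried between iterations is the `current` list, which is what makes this a (very small) state machine.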

    Edit: The new example posted in the question poses some other challenges. The correct way to read that XML seems to be this:

    using (System.IO.StringReader sr = new System.IO.StringReader(input))
    {
       using (XmlTextReader reader = new XmlTextReader(sr))
       {
          reader.WhitespaceHandling = WhitespaceHandling.None;
    
          while(reader.Read())
          {
             if (reader.Name.Equals("machine") && (reader.NodeType == XmlNodeType.Element))
             {
                Console.Write("Machine code {0}: ", reader.GetAttribute("code"));
                Console.WriteLine(reader.ReadString());
             }
             if(reader.Name.Equals("part") && (reader.NodeType == XmlNodeType.Element))
             {
                Console.Write("Part code {0}: ", reader.GetAttribute("code"));
                Console.WriteLine(reader.ReadString());
             }
          }
       }
    }
    

    You have to use ReadString rather than ReadElementString here. ReadElementString consumes the end element and leaves the reader on whatever follows it, typically the next start element, which the following Read() then skips. ReadString leaves the end element for the following Read() to step over, so the next start element is not skipped. This still seems somewhat confusing and potentially unreliable, but it works for this case.
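
    To make the difference concrete, here is a minimal sketch that runs the same Read() loop twice over a made-up input, once with each method. The `<r>`/`<a>` document and the ReadStringDemo class are hypothetical, and the input deliberately has no whitespace between elements.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Xml;

    public static class ReadStringDemo
    {
        // Hypothetical input with no whitespace between elements.
        public const string Sample = "<r><a>1</a><a>2</a><a>3</a></r>";

        public static List<string> Collect(string xml, bool useReadElementString)
        {
            var values = new List<string>();
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "a")
                    {
                        values.Add(useReadElementString
                            ? reader.ReadElementString() // ends past </a>, on the next node
                            : reader.ReadString());      // ends on </a>
                    }
                }
            }
            return values;
        }

        public static void Main()
        {
            Console.WriteLine(string.Join(",", Collect(Sample, false))); // prints 1,2,3
            Console.WriteLine(string.Join(",", Collect(Sample, true)));  // prints 1,3
        }
    }
    ```

    With ReadElementString the reader is already sitting on the second `<a>` when the loop calls Read() again, so that element's start tag is never tested and its value is lost.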

    After some additional thought, my opinion is that XmlReader is just too confusing if you use any content-reading method other than Read. I think it's much simpler to confine yourself to Read when pulling from the XML stream. Here's how that works with the new example (once again, IsStartElement, GetAttribute and Read seem to be the key methods, and you end up with a state machine):

    while(reader.Read())
    {
       if (reader.IsStartElement("machine"))
       {
          Console.Write("Machine code {0}: ", reader.GetAttribute("code"));
       }
       if(reader.IsStartElement("part"))
       {
          Console.Write("Part code {0}: ", reader.GetAttribute("code"));
       }
       if (reader.NodeType == XmlNodeType.Text)
       {
          Console.WriteLine(reader.Value);
       }
    }
    
  • 2021-01-17 10:30

    Here's the thing... I've written a fair amount of serialization code (including a lot of xml processing), and I find myself in exactly the same boat as you. I have a very simple piece of guidance, therefore: don't.

    I'll happily use XmlWriter as a way to write xml quickly, but I'd walk over hot coals before choosing to implement IXmlSerializable another time - I'd simply write a separate DTO and map the data into that; it also means the schema (for "mex", "wsdl", etc) comes for free.
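
    A minimal sketch of what the DTO route could look like, assuming XmlSerializer is the serializer in play (the answer does not name one). The StaffDto, EmployeeDto and ShiftDto classes and the XML shape they mirror are hypothetical.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Xml.Serialization;

    // Hypothetical DTOs: plain public classes that mirror the XML shape.
    public class ShiftDto
    {
        [XmlAttribute("Day")]
        public int Day { get; set; }
    }

    public class EmployeeDto
    {
        [XmlAttribute("Name")]
        public string Name { get; set; }

        [XmlElement("Shift")]
        public List<ShiftDto> Shifts { get; set; } = new List<ShiftDto>();
    }

    [XmlRoot("Staff")]
    public class StaffDto
    {
        [XmlElement("Employee")]
        public List<EmployeeDto> Employees { get; set; } = new List<EmployeeDto>();
    }

    public static class DtoDemo
    {
        public const string Sample =
            "<Staff><Employee Name=\"Ann\"><Shift Day=\"1\"/><Shift Day=\"3\"/></Employee>" +
            "<Employee Name=\"Bob\"><Shift Day=\"2\"/></Employee></Staff>";

        // The serializer does the node-by-node reading; no IXmlSerializable needed.
        public static StaffDto Load(string xml)
        {
            var serializer = new XmlSerializer(typeof(StaffDto));
            using (var reader = new StringReader(xml))
            {
                return (StaffDto)serializer.Deserialize(reader);
            }
        }

        public static void Main()
        {
            StaffDto staff = Load(Sample);
            foreach (EmployeeDto e in staff.Employees)
            {
                Console.WriteLine("{0}: {1} shift(s)", e.Name, e.Shifts.Count);
            }
        }
    }
    ```

    From here the DTOs would be mapped onto the real domain types in ordinary code, which keeps the XML-specific concerns out of those types.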
