Split XML document into chunks

慢半拍i 2021-01-07 03:26

I have a large XML document that needs to be processed 100 records at a time.

This is done within a Windows Service written in C#.

The structure is as fo

3 answers
  • 2021-01-07 03:31

    Naive, iterative, but works [EDIT: in .NET 3.5 only]

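        // Requires: using System; using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
        public List<XDocument> ChunkDocket(XDocument docket, int chunkSize)
        {
            // A non-positive chunk size would never consume any orders and loop forever.
            if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

            var newDockets = new List<XDocument>();
            // Work on a copy so the caller's document is left untouched.
            var d = new XDocument(docket);
            // Lazy query: it is re-evaluated against the copy each time it is used.
            var orders = d.Root.Elements("order");
    
            // Checking before the first pass avoids returning an empty docket when there are no orders.
            while (orders.Any())
            {
                var newDocket = new XDocument(new XElement("docket"));
                var chunk = orders.Take(chunkSize);
                // Elements that already have a parent are cloned when added.
                newDocket.Root.Add(chunk);
                // Detach this chunk from the copy so the next pass starts at the following order.
                chunk.Remove();
                newDockets.Add(newDocket);
            }
    
            return newDockets;
        }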
    
  • 2021-01-07 03:35

    Another naive solution, this time for .NET 2.0. It should give you an idea of how to approach the problem. It uses XPath expressions instead of LINQ to XML. It chunks a 100-order docket into 10 dockets in under a second on my dev box.

        // Requires: using System; using System.Collections.Generic; using System.Xml;
        public List<XmlDocument> ChunkDocket(XmlDocument docket, int chunkSize)
        {
            // A non-positive chunk size would never advance chunkStart and loop forever.
            if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

            List<XmlDocument> newDockets = new List<XmlDocument>();
            int orderCount = docket.SelectNodes("//docket/order").Count;
            int chunkStart = 0;
    
            while (chunkStart < orderCount)
            {
                XmlDocument newDocket = new XmlDocument();
                XmlElement root = newDocket.CreateElement("docket");
                newDocket.AppendChild(root);
    
                // XPath positions are 1-based: select orders chunkStart + 1 through chunkStart + chunkSize.
                XmlNodeList chunk = docket.SelectNodes(String.Format("//docket/order[position() > {0} and position() <= {1}]", chunkStart, chunkStart + chunkSize));
    
                chunkStart += chunkSize;
    
                foreach (XmlNode c in chunk)
                {
                    // A node belongs to its owner document, so deep-copy each order into the new one.
                    XmlNode targetNode = newDocket.ImportNode(c, true);
                    root.AppendChild(targetNode);
                }
    
                newDockets.Add(newDocket);
            }
    
            return newDockets;
        }
    
  • 2021-01-07 03:55

    If the reason for processing 100 orders at a time is performance, e.g. opening a big file takes too much time and memory, you can use XmlReader to process the order elements one at a time without degrading performance.

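        // Requires: using System.Xml;
        using (XmlReader reader = XmlReader.Create(@"c:\foo\Doket.xml"))
        {
            while (reader.Read())
            {
                // Match start tags only; the end tag </order> reports the same LocalName.
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "order")
                {
                    // read each child element and its value from the reader.
                    // or you can deserialize the order element by using a XmlSerializer and Order class
                }
            }
        }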
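
Since the question asks for 100 records at a time, the XmlReader approach can also be combined with batching, so that only one batch is held in memory at once. What follows is only a rough sketch, assuming the root element is docket with order children (as in the other answers' code). The names OrderBatcher and ReadBatches are made up for illustration and are not from the original thread.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

public static class OrderBatcher
{
    // Streams <order> elements from the reader and yields them in lists of
    // at most batchSize elements (the last list may be shorter).
    public static IEnumerable<List<XElement>> ReadBatches(XmlReader reader, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        var batch = new List<XElement>(batchSize);
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "order")
            {
                // XNode.ReadFrom consumes the whole element and leaves the reader
                // on the node after it, so Read() must not be called in this branch.
                batch.Add((XElement)XNode.ReadFrom(reader));
                if (batch.Count == batchSize)
                {
                    yield return batch;
                    batch = new List<XElement>(batchSize);
                }
            }
            else
            {
                reader.Read();
            }
        }

        // Emit whatever is left over after the last full batch.
        if (batch.Count > 0) yield return batch;
    }
}
```

To use it, open the file with XmlReader.Create inside a using block and loop over OrderBatcher.ReadBatches(reader, 100) with foreach. Because the method is an iterator, nothing is read until the loop starts, and each batch can be processed and dropped before the next one is read.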
    