Compare two XML documents and print the differences using LINQ

谁都会走 posted on 2019-12-18 15:54:38

Question


I am comparing two XML documents and need to print the differences. How can I achieve this using LINQ? I know I could use Microsoft's XML Diff and Patch tool, but I would prefer LINQ. If you have any other ideas, I am open to implementing them.

//First Xml

<Books>
 <book>  
  <id="20504" image="C01" name="C# in Depth">
 </book>  
 <book> 
  <id="20505" image="C02" name="ASP.NET">
 </book> 
 <book> 
  <id="20506" image="C03" name="LINQ in Action ">
 </book> 
 <book> 
  <id="20507" image="C04" name="Architecting Applications">
 </book> 
</Books>

//Second Xml

<Books>
  <book> 
    <id="20504" image="C011" name="C# in Depth">
  </book>
  <book> 
    <id="20505" image="C02" name="ASP.NET 2.0">
  </book>
  <book> 
    <id="20506" image="C03" name="LINQ in Action ">
  </book>
  <book> 
    <id="20508" image="C04" name="Architecting Applications">
  </book>
</Books>

I want to compare these two XML documents and print a result like this:

Issued   Issue Type            IssueInFirst   IssueInSecond

1        image is different    C01            C011
2        name is different     ASP.NET        ASP.NET 2.0
3        id is different       20507          20508

Answer 1:


Here is the solution:

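// Requires: using System; using System.Linq; using System.Xml.Linq;

// Sanitised XML (the markup in the question is not well-formed):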
string s1 = @"<Books>
                 <book id='20504' image='C01' name='C# in Depth'/>
                 <book id='20505' image='C02' name='ASP.NET'/>
                 <book id='20506' image='C03' name='LINQ in Action '/>
                 <book id='20507' image='C04' name='Architecting Applications'/>
                </Books>";
string s2 = @"<Books>
                  <book id='20504' image='C011' name='C# in Depth'/>
                  <book id='20505' image='C02' name='ASP.NET 2.0'/>
                  <book id='20506' image='C03' name='LINQ in Action '/>
                  <book id='20508' image='C04' name='Architecting Applications'/>
                </Books>";

XDocument xml1 = XDocument.Parse(s1);
XDocument xml2 = XDocument.Parse(s2);

// Cartesian product: pair every book in the first document with every book in the second
var result1 =   from xmlBooks1 in xml1.Descendants("book")
                from xmlBooks2 in xml2.Descendants("book")
                select new { 
                            book1 = new {
                                        id=xmlBooks1.Attribute("id").Value,
                                        image=xmlBooks1.Attribute("image").Value,
                                        name=xmlBooks1.Attribute("name").Value
                                      }, 
                            book2 = new {
                                        id=xmlBooks2.Attribute("id").Value,
                                        image=xmlBooks2.Attribute("image").Value,
                                        name=xmlBooks2.Attribute("name").Value
                                      } 
                             };

// Keep only the pairs that share at least one attribute value, but not all three
var result2 = from i in result1
                 where (i.book1.id == i.book2.id 
                        || i.book1.image == i.book2.image 
                        || i.book1.name == i.book2.name) &&
                        !(i.book1.id == i.book2.id 
                        && i.book1.image == i.book2.image 
                        && i.book1.name == i.book2.name) 
                 select i;



foreach (var aa in result2)
{
    //you do the output :D
}

Both LINQ statements could probably be merged, but I leave that as an exercise for you.
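For anyone who wants a starting point for the output step, here is a minimal sketch that folds both queries into one and formats each differing attribute roughly like the table in the question. The names `BookDiffReport` and `Build` and the column widths are my own assumptions, not part of the answer, and it only looks at the three attributes `id`, `image` and `name`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class BookDiffReport
{
    // Hypothetical helper: returns one formatted line per differing attribute.
    public static List<string> Build(string firstXml, string secondXml)
    {
        var names = new[] { "id", "image", "name" };
        var books1 = XDocument.Parse(firstXml).Descendants("book");
        var books2 = XDocument.Parse(secondXml).Descendants("book");

        // Same filter as the answer: keep pairs where at least one
        // attribute value matches, but not all of them.
        var pairs = from b1 in books1
                    from b2 in books2
                    let same = names.Count(n => (string)b1.Attribute(n) == (string)b2.Attribute(n))
                    where same > 0 && same < names.Length
                    select new { b1, b2 };

        var lines = new List<string>();
        int issue = 1;
        foreach (var p in pairs)
        {
            foreach (var n in names)
            {
                // The cast yields null when the attribute is absent.
                string v1 = (string)p.b1.Attribute(n);
                string v2 = (string)p.b2.Attribute(n);
                if (v1 != v2)
                {
                    lines.Add(string.Format("{0,-6}{1,-22}{2,-28}{3}",
                                            issue++, n + " is different", v1, v2));
                }
            }
        }
        return lines;
    }
}
```

Calling `BookDiffReport.Build(s1, s2)` with the two sanitised strings above should give three lines (image, name, id), which you can pass to `Console.WriteLine` under a header row of your choice.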




Answer 2:


For fun, a general solution to grega g's reading of the problem. To illustrate my objection to this approach, I've introduced a "correct" entry for 'PowerShell in Action'.

string s1 = @"<Books>
     <book id='20504' image='C01' name='C# in Depth'/>
     <book id='20505' image='C02' name='ASP.NET'/>
     <book id='20506' image='C03' name='LINQ in Action '/>
     <book id='20507' image='C04' name='Architecting Applications'/>
     <book id='20508' image='C05' name='PowerShell in Action'/>
    </Books>";
string s2 = @"<Books>
     <book id='20504' image='C011' name='C# in Depth'/>
     <book id='20505' image='C02' name='ASP.NET 2.0'/>
     <book id='20506' image='C03' name='LINQ in Action '/>
     <book id='20508' image='C04' name='Architecting Applications'/>
     <book id='20508' image='C05' name='PowerShell in Action'/>
    </Books>";

XDocument xml1 = XDocument.Parse(s1);
XDocument xml2 = XDocument.Parse(s2);

var res = from b1 in xml1.Descendants("book")
          from b2 in xml2.Descendants("book")
          let issues = from a1 in b1.Attributes()
                       join a2 in b2.Attributes()
                         on a1.Name equals a2.Name
                       select new
                       {
                           Name = a1.Name,
                           Value1 = a1.Value,
                           Value2 = a2.Value
                       }
          where issues.Any(i => i.Value1 == i.Value2)
          from issue in issues
          where issue.Value1 != issue.Value2
          select issue;

Which reports the following:

{ Name = image, Value1 = C01, Value2 = C011 }
{ Name = name, Value1 = ASP.NET, Value2 = ASP.NET 2.0 }
{ Name = id, Value1 = 20507, Value2 = 20508 }
{ Name = image, Value1 = C05, Value2 = C04 }
{ Name = name, Value1 = PowerShell in Action, Value2 = Architecting Applications }

Note that the last two entries are the "conflict" between the 20508 typo and the otherwise correct 20508 entry.




Answer 3:


The operation you want here is a Zip, which pairs up corresponding elements in your two sequences of books. Enumerable.Zip was added in .NET 4.0; on earlier versions we can fake it by using Select to grab the books' indices and joining on those:

var res = from b1 in xml1.Descendants("book")
                         .Select((b, i) => new { b, i })
          join b2 in xml2.Descendants("book")
                         .Select((b, i) => new { b, i })
            on b1.i equals b2.i

We'll then use a second join to compare the values of attributes by name. Note that this is an inner join; if you did want to include attributes missing from one or the other you would have to do quite a bit more work.

          select new
          {
              Row = b1.i,
              Diff = from a1 in b1.b.Attributes()
                     join a2 in b2.b.Attributes()
                       on a1.Name equals a2.Name
                     where a1.Value != a2.Value
                     select new
                     {
                         Name = a1.Name,
                         Value1 = a1.Value,
                         Value2 = a2.Value
                     }
          };

The result will be a nested collection:

foreach (var b in res)
{
    Console.WriteLine("Row {0}: ", b.Row);
    foreach (var d in b.Diff)
        Console.WriteLine(d);
}

Or to get multiple rows per book:

var report = from r in res
             from d in r.Diff
             select new { r.Row, Diff = d };

foreach (var d in report)
    Console.WriteLine(d);

Which reports the following:

{ Row = 0, Diff = { Name = image, Value1 = C01, Value2 = C011 } }
{ Row = 1, Diff = { Name = name, Value1 = ASP.NET, Value2 = ASP.NET 2.0 } }
{ Row = 3, Diff = { Name = id, Value1 = 20507, Value2 = 20508 } }
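On .NET 4.0 or later, the index join can be replaced with `Enumerable.Zip` directly. The following is only a sketch of that variant: the class name `ZipDiff`, the method `Compare` and the output format are assumptions for illustration. Note that `Zip` stops at the end of the shorter sequence, so trailing books that exist in only one document are silently ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ZipDiff
{
    // Hypothetical helper: one line per attribute that differs between
    // books at the same position in the two documents.
    public static List<string> Compare(XDocument xml1, XDocument xml2)
    {
        // Pair books by position, then attach the row index.
        var rows = xml1.Descendants("book")
            .Zip(xml2.Descendants("book"), (b1, b2) => new { b1, b2 })
            .Select((p, i) => new { Row = i, p.b1, p.b2 });

        // The attribute join sits in a nested query so that it may refer
        // to r; a join source cannot use range variables of its own query.
        var report = from r in rows
                     from d in (from a1 in r.b1.Attributes()
                                join a2 in r.b2.Attributes()
                                  on a1.Name equals a2.Name
                                where a1.Value != a2.Value
                                select new { a1.Name, V1 = a1.Value, V2 = a2.Value })
                     select string.Format("Row {0}: {1} '{2}' -> '{3}'",
                                          r.Row, d.Name, d.V1, d.V2);

        return report.ToList();
    }
}
```

With the two documents from the question, `ZipDiff.Compare(xml1, xml2)` should report rows 0, 1 and 3, matching the output above.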

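As for attributes that are missing from one book or the other, which the inner join above skips: one possible approach is to take the union of the attribute names from both elements and look each name up on both sides. This is a rough sketch, not the answerer's code; `AttributeDiff` and `Compare` are made-up names, and `null` is used here to mean "attribute absent".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AttributeDiff
{
    // Hypothetical helper: returns (name, valueInFirst, valueInSecond) for
    // every attribute whose value differs, including attributes present on
    // only one element (the missing side is reported as null).
    public static List<Tuple<string, string, string>> Compare(XElement e1, XElement e2)
    {
        // All attribute names seen on either element, first element's order first.
        var names = e1.Attributes().Select(a => a.Name)
                      .Union(e2.Attributes().Select(a => a.Name));

        return (from n in names
                let v1 = (string)e1.Attribute(n)   // null when absent
                let v2 = (string)e2.Attribute(n)   // null when absent
                where v1 != v2
                select Tuple.Create(n.ToString(), v1, v2)).ToList();
    }
}
```

For example, comparing `<book id='1' image='C01'/>` with `<book id='1' name='X'/>` should yield two entries: `image` with a null second value, and `name` with a null first value.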

Source: https://stackoverflow.com/questions/1470011/compare-two-xml-and-print-the-difference-using-linq
