How to compare XML files

你的背包 2021-02-19 11:50

I have two XML files (XSD) which are generated by some tool.
The tool doesn't preserve the order of elements, so although the content is equal, comparing it as text will resu

5 answers
  • 2021-02-19 12:30

    I had a similar problem and I eventually found: http://superuser.com/questions/79920/how-can-i-diff-two-xml-files

    That post suggests canonicalizing the XML and then running a diff. The following should work for you if you are on Linux or Mac, or on Windows with something like Cygwin installed:

    $ xmllint --c14n FileA.xml > 1.xml
    $ xmllint --c14n FileB.xml > 2.xml
    $ diff 1.xml 2.xml
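
    One caveat: as far as I know, canonical XML (c14n) normalizes attribute order and namespace declarations but keeps sibling elements in document order, so element reordering can still show up in the diff. A minimal Python sketch of sorting the children before canonicalizing, assuming Python 3.8+ (for `xml.etree.ElementTree.canonicalize`); `sort_key`, `sort_children` and `normalize` are hypothetical helper names, not part of xmllint:

```python
import xml.etree.ElementTree as ET


def sort_key(elem):
    """Order-independent key: tag, sorted attributes, stripped text, child keys."""
    return (
        elem.tag,
        sorted(elem.attrib.items()),
        (elem.text or "").strip(),
        [sort_key(child) for child in elem],
    )


def sort_children(elem):
    """Recursively sort child elements, innermost first."""
    for child in elem:
        sort_children(child)
    elem[:] = sorted(elem, key=sort_key)


def normalize(xml_text):
    """Parse, sort siblings, and return a canonical (c14n) string."""
    root = ET.fromstring(xml_text)
    sort_children(root)
    # strip_text=True drops the whitespace that only differs by indentation.
    return ET.canonicalize(ET.tostring(root, encoding="unicode"), strip_text=True)
```

    Writing `normalize(...)` of each file to `1.xml` and `2.xml` and then running `diff` as above should only report real content differences. Note that this sorts every list of siblings, which is wrong for documents where order is meaningful.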
    
  • 2021-02-19 12:38

    The XML samples are fundamentally different. Even though the content and the hierarchy may be identical, the relationships between peers are different. When XML is parsed, it is parsed into a structure called a DOM, where the relationships between units are very important. If you want to discount the relationships between peer entities, you will likely need custom software. I recommend finding a simple open-source XML-aware diff tool and adding the requirements that you need. I wrote one at http://prettydiff.com/, but I suggest you look around to see what is available before making a decision, because editing somebody else's algorithms may require a bit of heavy lifting.
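
    To illustrate the point about peer relationships, here is a small sketch using Python's standard `xml.etree.ElementTree` (my choice for illustration, not the tool mentioned above). The two documents are made-up examples with the same children in a different order:

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents: same children, different sibling order.
doc_a = ET.fromstring("<root><first/><second/></root>")
doc_b = ET.fromstring("<root><second/><first/></root>")

tags_a = [child.tag for child in doc_a]
tags_b = [child.tag for child in doc_b]

same_order = tags_a == tags_b                    # order-sensitive comparison
same_content = sorted(tags_a) == sorted(tags_b)  # order-insensitive comparison
print(same_order, same_content)  # prints: False True
```

    The parsed trees keep sibling order, so a tool has to discard that order on purpose if you want the two documents to count as equal.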

  • 2021-02-19 12:40

    You can use the Perl module XML::DifferenceMarkup (http://metacpan.org/pod/XML::DifferenceMarkup) or the xmldiff extension for PHP (pecl.php.net/xmldiff). Both will produce a human-readable XML diff document.

  • 2021-02-19 12:46

    Have a look at "Using XSLT to Assist Regression Testing", which describes a solution using XSLT.

  • 2021-02-19 12:50

    For what it's worth, I have created a Java tool (or Kotlin, actually) for efficient and configurable canonicalization of XML files.

    It will always:

    • Sort nodes and attributes by name.
    • Remove namespaces (yes, this could hypothetically be a problem).
    • Pretty-print the result.

    In addition you can tell it to:

    • Remove a given list of node names. Maybe you do not want to know that the value of a piece of metadata, say <RequestReceivedTimestamp>, has changed.
    • Sort a given list of collections in the context of the parent. Maybe you do not care that the order of <Contact> entries in <ListOfFavourites> has changed.

    It uses XSLT and does all the above efficiently using chaining.
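
    To make the two options concrete, here is a rough Python sketch of the same idea. It is not the tool's implementation (which uses XSLT); `canon`, `normalize`, `drop_tags` and `sort_lists` are hypothetical names, and it assumes Python 3.8+ for `xml.etree.ElementTree.canonicalize`:

```python
import copy
import xml.etree.ElementTree as ET


def canon(elem):
    """Canonical string for one element, ignoring whitespace-only text."""
    clone = copy.copy(elem)  # shallow copy, so the original tail is untouched
    clone.tail = None
    return ET.canonicalize(ET.tostring(clone, encoding="unicode"), strip_text=True)


def normalize(xml_text, drop_tags=(), sort_lists=()):
    """Drop the named nodes, sort the named collections, return canonical XML."""
    root = ET.fromstring(xml_text)
    nodes = list(root.iter())  # snapshot in document order
    # Pass 1: remove ignored nodes anywhere in the tree.
    for parent in nodes:
        for child in list(parent):
            if child.tag in drop_tags:
                parent.remove(child)
    # Pass 2: sort the named collections, innermost first.
    for parent in reversed(nodes):
        if parent.tag in sort_lists:
            parent[:] = sorted(parent, key=canon)
    return canon(root)
```

    For example, `normalize(text, drop_tags={"RequestReceivedTimestamp"}, sort_lists={"ListOfFavourites"})` would ignore the timestamp and the order of the `<Contact>` entries. Unlike the tool, this sketch keeps namespaces and does not pretty-print.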

    Limitations

    It does support sorting nested lists, sorting innermost lists before outer ones, but it cannot reliably sort arbitrary levels of recursively nested lists.

    If you have such needs, you can, after having used this tool, compare the sorted byte arrays of the results. They will be equal if only list sorting issues remain.
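
    The sorted-byte-array check could look something like this in Python; the function name is made up for illustration:

```python
def same_bytes_ignoring_order(data_a: bytes, data_b: bytes) -> bool:
    """True when both inputs contain exactly the same bytes, in any order."""
    return sorted(data_a) == sorted(data_b)
```

    You would feed it the two normalized outputs, e.g. `same_bytes_ignoring_order(Path("1.xml").read_bytes(), Path("2.xml").read_bytes())` with `pathlib.Path`. Keep in mind that this is only a coarse check: a `False` result means the files really differ, while a `True` result only says that the same bytes occur in both, not that the documents are equivalent.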

    Where to get it

    You can get it here: XMLNormalize
