Question
I recently ran into a problem: I have two XML files and I need to check whether their content is equal. Both files contain the same kinds of element nodes, but in a different order, and the same goes for the attributes of the nodes. Take this example:
This is file1.xml
<Car name="Ferrari" speed="420">
<Engine>V12</Engine>
<Color name="Red"/>
</Car>
<Car name="Lamborghini" speed="380">
<Engine>SV</Engine>
<Color name="White"/>
</Car>
This is file2.xml
<Car speed="380" name="Lamborghini">
<Color name="White"/>
<Engine>SV</Engine>
</Car>
<Car speed="420" name="Ferrari">
<Color name="Red"/>
<Engine>V12</Engine>
</Car>
I need something that compares these two files and returns true if they are "equal"; otherwise it should show the differences. (For the example above it must return true.)
Obviously this is just an example; the files I have to check contain 50,000+ lines of elements.
I'm open to anything: software, a library, or a manual algorithm.
Thank you very much.
Answer 1:
First, I wrapped your samples in <R> ... </R> to make them well-formed XML documents, since a document needs a single root element.
Then I used xsh to put the elements of both input files into a canonical order: I sorted all child elements by name and by their @name attribute.
my $F1 := open file1.xml ;
my $F2 := open file2.xml ;
my $nodes = ( $F1//* | $F2//* ) ;
for my $element in { reverse @$nodes } {
if ($element/*) {
xmove &{ sort :k concat(name(), '|', @name) $element/* }
append $element ;
}
}
save :f file1.out.xml $F1 ;
save :f file2.out.xml $F2 ;
It's crucial to walk the nodes in reverse order; otherwise the sorting doesn't work.
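For readers without xsh, the same idea can be sketched in Python using only the standard library. This is a minimal sketch, not a translation of the script above: the function names are hypothetical, the samples are assumed to be wrapped in <R> as described, and the sort key (element name plus @name attribute) is copied from the xsh version. Siblings that share both name and @name keep their original relative order, so documents that differ only in the order of such siblings would still be reported as different.

```python
import xml.etree.ElementTree as ET

FILE1 = """<R>
<Car name="Ferrari" speed="420"><Engine>V12</Engine><Color name="Red"/></Car>
<Car name="Lamborghini" speed="380"><Engine>SV</Engine><Color name="White"/></Car>
</R>"""

FILE2 = """<R>
<Car speed="380" name="Lamborghini"><Color name="White"/><Engine>SV</Engine></Car>
<Car speed="420" name="Ferrari"><Color name="Red"/><Engine>V12</Engine></Car>
</R>"""


def sort_key(element):
    # Same key as the xsh script: element name, then its @name attribute.
    return (element.tag, element.get("name", ""))


def sort_children(element):
    # Reorder the descendants first, then this element's own children.
    for child in element:
        sort_children(child)
    element[:] = sorted(element, key=sort_key)


def canonical_form(element):
    # Nested tuple of tag, sorted attributes, stripped text and children,
    # so attribute order and indentation do not affect the comparison.
    return (
        element.tag,
        tuple(sorted(element.attrib.items())),
        (element.text or "").strip(),
        tuple(canonical_form(child) for child in element),
    )


def equal_ignoring_order(xml_a, xml_b):
    # Parse both documents, sort them the same way, compare the results.
    root_a, root_b = ET.fromstring(xml_a), ET.fromstring(xml_b)
    sort_children(root_a)
    sort_children(root_b)
    return canonical_form(root_a) == canonical_form(root_b)


print(equal_ignoring_order(FILE1, FILE2))  # True
```

This only answers yes or no; to see where two files differ, the sorted trees still have to be serialized and diffed.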
To compare the resulting XML files, I used my old xmldiff bash script, which relies on xmllint:
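#!/bin/bash
# Usage: xmldiff [diff options] file1.xml file2.xml
a=("$@")                 # quoted, so arguments containing spaces survive
b=$#
f2=${a[$((--b))]}        # last argument: second file
f1=${a[$((--b))]}        # second-to-last argument: first file
# Any remaining leading arguments are passed through to diff.
diff "${a[@]:0:$b}" \
<(xmllint --c14n "$f1" |xmllint --format -) \
<(xmllint --c14n "$f2" |xmllint --format -)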
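If xmllint is not available, the canonicalize-then-diff step can be approximated in Python (3.8 or later). This is a rough sketch under stated assumptions, not an equivalent of the script above: it uses xml.etree.ElementTree.canonicalize, which produces C14N 2.0 output, and a crude line split in place of xmllint --format. The function names are hypothetical. Like the bash script, it normalizes attribute order only, so the elements must already be sorted.

```python
import difflib
import xml.etree.ElementTree as ET


def canonical_lines(xml_text):
    # C14N 2.0: attributes are written in sorted order; strip_text=True
    # drops whitespace-only text such as indentation between elements.
    canon = ET.canonicalize(xml_text, strip_text=True)
    # Crude stand-in for pretty-printing: break the output between tags.
    return canon.replace("><", ">\n<").splitlines()


def xml_diff(xml_a, xml_b):
    # Returns unified-diff lines; an empty list means no differences.
    return list(
        difflib.unified_diff(
            canonical_lines(xml_a),
            canonical_lines(xml_b),
            "file1",
            "file2",
            lineterm="",
        )
    )


a = '<R><Car name="Ferrari" speed="420"><Engine>V12</Engine></Car></R>'
b = '<R>\n  <Car speed="420" name="Ferrari">\n    <Engine>V12</Engine>\n  </Car>\n</R>'
for line in xml_diff(a, b.replace("V12", "V8")):
    print(line)
```

An empty result from xml_diff plays the role of diff exiting without output.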
Source: https://stackoverflow.com/questions/40735729/compare-two-xml-file-without-care-about-orders-of-elements-and-attributes