Question
Bizzaro-Diff!!!
Is there a way to do a bizzaro/inverse-diff that only displays the portions of a group of files that are the same? (i.e., way more than three files)
Odd question, I know...but I'm converting someone's ancient static pages to something a little more manageable.
Answer 1:
You want a clone detector. It detects similar code chunks across large source systems. See our CloneDR tool: http://www.semdesigns.com/Products/Clone/index.html
Answer 2:
You could try the comm command (short for "common"). It only compares two files at a time, but with some clever scripting you should be able to handle three or more; see the sketch below.
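A minimal sketch of that scripting, assuming bash (for the process-substitution syntax) and hypothetical file names: comm requires sorted input, and comm -12 prints only the lines common to both inputs, so you can intersect the first two files and then fold each additional file into the running result.

    # Intersect two files: -1 and -2 suppress the lines unique to each input,
    # leaving only the lines common to both. comm requires sorted input.
    comm -12 <(sort page1.html) <(sort page2.html) > common.txt

    # Fold in further files by intersecting the running result with each one.
    for f in page3.html page4.html; do
        comm -12 <(sort common.txt) <(sort "$f") > common.tmp && mv common.tmp common.txt
    done

Note that sorting throws away the original line order, so this tells you which lines are shared but not where they appear in each page.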
Answer 3:
You could try sim. It's been a few years since I've used it, but I recall it being very useful for finding similarities within a single file or across many different files.
Answer 4:
This is a classic problem.
If I had to quick-and-dirty it, I'd probably do something like a diff -U 1000000 (assuming a version of diff that supports it), piped through sed to just get the lines in common (and strip the leading spaces). You'd have to loop through all the files, though.
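A rough sketch of that approach, with hypothetical file names (the -U 1000000 assumes a GNU-style diff that accepts a huge context size, large enough that the whole file lands in one hunk): in unified diff output, lines common to both files start with a single space, so sed can keep just those lines and strip that leading space.

    # Start with the lines common to the first two files.
    # In unified diff output, unchanged (common) lines begin with a space.
    diff -U 1000000 page1.html page2.html | sed -n 's/^ //p' > common.txt

    # Narrow the common set down by diffing it against each remaining file.
    for f in page3.html page4.html; do
        diff -U 1000000 common.txt "$f" | sed -n 's/^ //p' > common.tmp && mv common.tmp common.txt
    done

One caveat: when two inputs are identical, diff prints nothing at all, so that case would need to be special-cased by copying the file through unchanged.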
Edit: I forgot there is also a Tcl implementation that would be slightly more versatile, but it would require more coding. You may be able to find an implementation for your language of choice.
Source: https://stackoverflow.com/questions/522221/using-diff-to-find-the-portions-of-many-files-that-are-the-same-bizzaro-diff