Are there any libraries out there for Java that will accept two strings, and return a string with formatted output as per the *nix diff command?
e.g. feed in
You can use the Apache Commons Text library to achieve this. It provides a diff capability based on a "very efficient algorithm from Eugene W. Myers".
It lets you supply your own visitor, so you can process the diff however you want, for example printing it to the console or rendering it as HTML. Here is one article which walks through a nice, simple example that outputs a side-by-side diff in HTML using the Apache Commons Text library and a little Java code.
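To make the visitor idea concrete, here is a minimal sketch, assuming Apache Commons Text is on the classpath. The class name CommonsTextDiffSketch, the method inlineDiff and the [-x] / {+x} markers are made up for illustration. Note that StringsComparator compares character by character, not line by line, so this is not a drop-in replacement for diff -y.

```java
import org.apache.commons.text.diff.CommandVisitor;
import org.apache.commons.text.diff.StringsComparator;

class CommonsTextDiffSketch {

    /**
     * Renders a character-level diff inline: kept characters are copied
     * as-is, deletions appear as [-x] and insertions as {+x}.
     * (The marker format is an arbitrary choice for this sketch.)
     */
    static String inlineDiff(String left, String right) {
        StringBuilder out = new StringBuilder();
        // getScript() computes the edit script; visit() replays it command by command
        new StringsComparator(left, right).getScript().visit(new CommandVisitor<Character>() {
            @Override
            public void visitKeepCommand(Character c) {
                out.append(c);
            }

            @Override
            public void visitDeleteCommand(Character c) {
                out.append("[-").append(c).append(']');
            }

            @Override
            public void visitInsertCommand(Character c) {
                out.append("{+").append(c).append('}');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(inlineDiff("the quick fox", "the slow fox"));
    }
}
```

A visitor that writes HTML would follow the same shape, appending markup in the three callbacks instead of the bracket markers.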
I ended up rolling my own. Not sure if it's the best implementation, and it's ugly as hell, but it passes against test input.
It uses java-diff to do the heavy diff lifting (and Apache Commons StrBuilder and StringUtils instead of the stock Java StringBuilder).
// Assumed imports: the package names depend on the library versions in use
// (java-diff from incava.org, Apache Commons Lang 2.x).
import java.util.List;
import org.apache.commons.lang.StringUtils;
import org.apache.commons.lang.text.StrBuilder;
import org.incava.util.diff.Diff;
import org.incava.util.diff.Difference;

public static String diffSideBySide(String fromStr, String toStr) {
    // this is equivalent of running unix diff -y command
    // not pretty, but it works. Feel free to refactor against unit test.
    String[] fromLines = fromStr.split("\n");
    String[] toLines = toStr.split("\n");
    List<Difference> diffs = (new Diff(fromLines, toLines)).diff();

    int padding = 3;
    int maxStrWidth = Math.max(maxLength(fromLines), maxLength(toLines)) + padding;

    StrBuilder diffOut = new StrBuilder();
    diffOut.setNewLineText("\n");
    int fromLineNum = 0;
    int toLineNum = 0;
    for (Difference diff : diffs) {
        int delStart = diff.getDeletedStart();
        int delEnd = diff.getDeletedEnd();
        int addStart = diff.getAddedStart();
        int addEnd = diff.getAddedEnd();

        boolean isAdd = (delEnd == Difference.NONE && addEnd != Difference.NONE);
        boolean isDel = (addEnd == Difference.NONE && delEnd != Difference.NONE);
        boolean isMod = (delEnd != Difference.NONE && addEnd != Difference.NONE);

        // write out unchanged lines between diffs; the loop condition (rather
        // than while(true) + break) avoids emitting an empty row when there
        // are no unchanged lines before this diff
        while (fromLineNum < delStart || toLineNum < addStart) {
            String left = "";
            String right = "";
            if (fromLineNum < delStart) {
                left = fromLines[fromLineNum];
                fromLineNum++;
            }
            if (toLineNum < addStart) {
                right = toLines[toLineNum];
                toLineNum++;
            }
            diffOut.append(StringUtils.rightPad(left, maxStrWidth));
            diffOut.append("  "); // no operator to display; two spaces keep the right column aligned
            diffOut.appendln(right);
        }

        if (isDel) {
            // write out a deletion
            for (int i = delStart; i <= delEnd; i++) {
                diffOut.append(StringUtils.rightPad(fromLines[i], maxStrWidth));
                diffOut.appendln("<");
            }
            fromLineNum = delEnd + 1;
        } else if (isAdd) {
            // write out an addition
            for (int i = addStart; i <= addEnd; i++) {
                diffOut.append(StringUtils.rightPad("", maxStrWidth));
                diffOut.append("> ");
                diffOut.appendln(toLines[i]);
            }
            toLineNum = addEnd + 1;
        } else if (isMod) {
            // write out a modification; the shorter side is padded with blanks
            while (fromLineNum <= delEnd || toLineNum <= addEnd) {
                String left = "";
                String right = "";
                if (fromLineNum <= delEnd) {
                    left = fromLines[fromLineNum];
                    fromLineNum++;
                }
                if (toLineNum <= addEnd) {
                    right = toLines[toLineNum];
                    toLineNum++;
                }
                diffOut.append(StringUtils.rightPad(left, maxStrWidth));
                diffOut.append("| ");
                diffOut.appendln(right);
            }
        }
    }

    // we've finished displaying the diffs, now we just need to run out all the remaining unchanged lines
    while (fromLineNum < fromLines.length || toLineNum < toLines.length) {
        String left = "";
        String right = "";
        if (fromLineNum < fromLines.length) {
            left = fromLines[fromLineNum];
            fromLineNum++;
        }
        if (toLineNum < toLines.length) {
            right = toLines[toLineNum];
            toLineNum++;
        }
        diffOut.append(StringUtils.rightPad(left, maxStrWidth));
        diffOut.append("  "); // no operator to display
        diffOut.appendln(right);
    }
    return diffOut.toString();
}

// length of the longest line, used to size the left-hand column
private static int maxLength(String[] lines) {
    int maxLength = 0;
    for (String line : lines) {
        if (line.length() > maxLength) {
            maxLength = line.length();
        }
    }
    return maxLength;
}
Busybox has a very lean diff implementation that should not be hard to convert to Java, but you would have to add the two-column functionality yourself.
http://c2.com/cgi/wiki?DiffAlgorithm I found this on Google, and it gives some good background and links. If you care about the algorithm beyond just finishing the project, pick up a book on basic algorithms that covers dynamic programming, or a book devoted to it. Algorithm knowledge is always good :)
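For anyone curious what the dynamic-programming approach looks like, here is a minimal sketch of a line diff built on the classic longest-common-subsequence (LCS) table, using only the standard library. The class name LcsLineDiff and the two-character markers are made up for illustration. This is the textbook O(n*m) method, not the Myers algorithm that real diff tools use, so treat it as a learning aid rather than a replacement for a library.

```java
import java.util.ArrayList;
import java.util.List;

class LcsLineDiff {

    /**
     * Returns one entry per output row: "  " for an unchanged line,
     * "< " for a line only in 'from', "> " for a line only in 'to'.
     */
    static List<String> diff(String from, String to) {
        String[] a = from.split("\n");
        String[] b = to.split("\n");

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[a.length + 1][b.length + 1];
        for (int i = a.length - 1; i >= 0; i--) {
            for (int j = b.length - 1; j >= 0; j--) {
                lcs[i][j] = a[i].equals(b[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk the table from the top-left corner, emitting one row per step
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i].equals(b[j])) {
                out.add("  " + a[i]);
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("< " + a[i++]); // dropping a[i] loses nothing from the LCS
            } else {
                out.add("> " + b[j++]);
            }
        }
        while (i < a.length) {
            out.add("< " + a[i++]);
        }
        while (j < b.length) {
            out.add("> " + b[j++]);
        }
        return out;
    }

    public static void main(String[] args) {
        diff("a\nb\nc", "a\nB\nc").forEach(System.out::println);
    }
}
```

For the inputs in main, the rows come out as "  a", "< b", "> B", "  c". Pairing adjacent "<" and ">" rows into a single "|" row would be the next step towards diff -y style output.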
The DiffUtils library for computing diffs, applying patches, and generating side-by-side views in Java
The Diff Utils library is an open-source library for performing comparison operations between texts: computing diffs, applying patches, generating unified diffs or parsing them, generating diff output for easy later display (like a side-by-side view), and so on.
The main reason for building this library was the lack of easy-to-use libraries with all the usual stuff you need while working with diff files. Originally it was inspired by the JRCS library and its nice design of the diff module.
Main Features
- computing the difference between two texts.
- capable of handling more than plain ASCII. Arrays or Lists of any type that implements hashCode() and equals() correctly can be diffed using this library
- patch and unpatch the text with the given patch
- parsing the unified diff format
- producing human-readable differences
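As a rough illustration of the features listed above, here is a minimal sketch. It assumes the maintained java-diff-utils 4.x fork, whose classes live under com.github.difflib; the original release used the difflib package, so the imports may differ for your version. The class name DiffUtilsSketch, the method names and the file names passed to the unified-diff generator are made up for illustration.

```java
import com.github.difflib.DiffUtils;
import com.github.difflib.UnifiedDiffUtils;
import com.github.difflib.patch.Patch;

import java.util.Arrays;
import java.util.List;

class DiffUtilsSketch {

    /** Computes a patch and renders it as unified-diff lines with 3 lines of context. */
    static List<String> unifiedDiff(List<String> original, List<String> revised) throws Exception {
        Patch<String> patch = DiffUtils.diff(original, revised);
        // The two file names only appear in the "---" / "+++" header lines
        return UnifiedDiffUtils.generateUnifiedDiff("original.txt", "revised.txt", original, patch, 3);
    }

    /** Computes a patch and applies it back to the original lines. */
    static List<String> roundTrip(List<String> original, List<String> revised) throws Exception {
        Patch<String> patch = DiffUtils.diff(original, revised);
        return DiffUtils.patch(original, patch);
    }

    public static void main(String[] args) throws Exception {
        List<String> original = Arrays.asList("a", "b", "c");
        List<String> revised = Arrays.asList("a", "B", "c");
        unifiedDiff(original, revised).forEach(System.out::println);
        System.out.println(roundTrip(original, revised));
    }
}
```

The methods are declared with throws Exception because the checked exceptions on diff and patch vary between releases.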