I\'d like to use \'diff\' to get a both line difference between and character difference. For example, consider:
File 1
abcde
ab
Coloured, character-level diff
ouput
Here's what you can do with the the below script and diff-highlight (which is part of git):
#!/bin/sh -eu
# Use diff-highlight to show word-level differences
diff -U3 --minimal "$@" |
sed 's/^-/\x1b[1;31m-/;s/^+/\x1b[1;32m+/;s/^@/\x1b[1;34m@/;s/$/\x1b[0m/' |
diff-highlight
(Credit to @retracile's answer for the sed
highlighting)
Python has convenient library named difflib
which might help answer your question.
Below are two oneliners using difflib
for different python versions.
python3 -c 'import difflib, sys; \
print("".join( \
difflib.ndiff( \
open(sys.argv[1]).readlines(),open(sys.argv[2]).readlines())))'
python2 -c 'import difflib, sys; \
print "".join( \
difflib.ndiff( \
open(sys.argv[1]).readlines(), open(sys.argv[2]).readlines()))'
These might come in handy as a shell alias which is easier to move around with your .${SHELL_NAME}rc
.
$ alias char_diff="python2 -c 'import difflib, sys; print \"\".join(difflib.ndiff(open(sys.argv[1]).readlines(), open(sys.argv[2]).readlines()))'"
$ char_diff old_file new_file
And more readable version to put in a standalone file.
#!/usr/bin/env python2
from __future__ import with_statement
import difflib
import sys
with open(sys.argv[1]) as old_f, open(sys.argv[2]) as new_f:
old_lines, new_lines = old_f.readlines(), new_f.readlines()
diff = difflib.ndiff(old_lines, new_lines)
print ''.join(diff)
cmp -l file1 file2 | wc
Worked well for me. The leftmost number of the result indicates the number of characters that differ.