I am trying to use difflib to produce a diff of two text files containing tweets. Here is the code:
#!/usr/bin/env python
# difflib_test
import difflib
fil
Just parse the output of the diff like this (change '- ' to '+ ' if needed):
#!/usr/bin/env python
# difflib_test
import difflib
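# Open both files; the with-block closes them when it ends
with open('/home/saad/Code/test/new_tweets', 'r') as file1, \
        open('/home/saad/PTITVProgs', 'r') as file2:
    diff = difflib.ndiff(file1.readlines(), file2.readlines())
    # Keep only lines unique to the first file, dropping the '- ' prefix
    delta = ''.join(x[2:] for x in diff if x.startswith('- '))
print(delta)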
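As a minimal sketch of the "change '- ' to '+ ' if needed" remark, the filter can be wrapped in a small helper. The name `lines_with_prefix` and the sample tweet lines are made up for illustration; they are not from the question.

```python
import difflib

def lines_with_prefix(a_lines, b_lines, prefix):
    """Return ndiff lines that start with `prefix`, with the prefix stripped."""
    delta = difflib.ndiff(a_lines, b_lines)
    return [line[2:] for line in delta if line.startswith(prefix)]

# Hypothetical sample data standing in for the two tweet files
old = ["tweet one\n", "tweet two\n", "tweet three\n"]
new = ["tweet one\n", "tweet three\n", "tweet four\n"]

removed = lines_with_prefix(old, new, "- ")  # lines only in the first input
added = lines_with_prefix(old, new, "+ ")    # lines only in the second input

print("".join(removed), end="")
print("".join(added), end="")
```

Both calls rebuild the delta, because `ndiff` returns a generator that can only be consumed once.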
There are multiple diff styles, and the difflib library provides a function for each of them: unified_diff, ndiff and context_diff.
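For comparison, here is a rough sketch of the other two functions on the same kind of input. The file labels `a.txt` and `b.txt` are placeholders, and `lineterm=""` is passed only because the sample lines carry no trailing newlines.

```python
import difflib

a = ["1", "2", "3", "4", "5"]
b = ["1", "3", "4", "5", "6"]

# unified_diff: '---'/'+++' header, then '@@ ... @@' hunks with line ranges
unified = list(difflib.unified_diff(a, b, fromfile="a.txt", tofile="b.txt", lineterm=""))

# context_diff: '***'/'---' header, then separate before/after sections
context = list(difflib.context_diff(a, b, fromfile="a.txt", tofile="b.txt", lineterm=""))

print("\n".join(unified))
print()
print("\n".join(context))
```

Both styles include line-range markers for each hunk, which is the part that ndiff leaves out.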
If you don't want the line number summaries, the ndiff function gives a Differ-style delta:
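import difflib
f1 = '''1
2
3
4
5'''
f2 = '''1
3
4
5
6'''
# ndiff compares sequences of lines, so split the strings first;
# passing the raw strings would compare them character by character
diff = difflib.ndiff(f1.splitlines(), f2.splitlines())
for l in diff:
    print(l)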
Output:
  1
- 2
  3
  4
  5
+ 6
EDIT:
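You could also parse the diff to extract only the changes, if that's what you want. Note that ndiff returns a generator, so build it again (or store it in a list) if you have already iterated over it:
>>> diff = difflib.ndiff(f1.splitlines(), f2.splitlines())
>>> changes = [l for l in diff if l.startswith('+ ') or l.startswith('- ')]
>>> for c in changes:
...     print(c)
...
- 2
+ 6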
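One detail worth keeping in mind: when two lines are similar but not identical, ndiff also emits hint lines starting with '? ' that mark the differing characters. The prefix filter above skips them, as this small sketch with made-up sample lines suggests:

```python
import difflib

before = ["hello world", "unchanged line"]
after = ["hello wurld", "unchanged line"]

# Materialize the generator so it can be inspected more than once
delta = list(difflib.ndiff(before, after))

# '? ' lines are intraline hints, not content from either input
hints = [l for l in delta if l.startswith("? ")]
changes = [l for l in delta if l.startswith(("+ ", "- "))]

for c in changes:
    print(c)
```

`str.startswith` accepts a tuple of prefixes, which is a slightly shorter way to write the two-way check.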