Question
I'd like to compare differences between two lists of strings. For my purposes, whitespace is noise and these differences do not need to be shown. According to difflib's documentation, "the default [for charjunk] is module-level function IS_CHARACTER_JUNK(), which filters out whitespace characters". Perfect, except I don't see it working, or making much difference (<- pun!).
import difflib
# A has two spaces between 3 and 4; B has one.
A = ['3  4\n']
B = ['3 4\n']
# charjunk is left at its default, difflib.IS_CHARACTER_JUNK
print(''.join(difflib.ndiff(A, B)))
outputs:
- 3  4
?  -
+ 3 4
I've tried a few other linejunk options, but none of them actually ignores differences caused by whitespace. Have I misunderstood what charjunk is for?
As a side note, I can sidestep this limitation by pre-processing my strings, replacing each run of whitespace characters with a single space using re.sub(r'\s+', ' ', 'foo\t bar').
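The pre-processing workaround could be sketched roughly as follows. The helper name normalize_ws is hypothetical (it is not part of difflib), and the sample inputs are assumptions chosen to mirror the example above. The pattern matches only spaces and tabs, so each line keeps its trailing newline; a broader pattern such as \s+ or \W+ would also consume the newline (and, for \W+, punctuation).

```python
import difflib
import re


def normalize_ws(line):
    """Collapse each run of spaces/tabs into one space (hypothetical helper)."""
    # Only spaces and tabs are matched, so the trailing newline survives.
    return re.sub(r'[ \t]+', ' ', line)


# Assumed sample data: the lists differ only in the amount of whitespace.
A = ['3  4\n', 'foo\t bar\n']
B = ['3 4\n', 'foo bar\n']

# Normalize both sides first, then diff the normalized lines.
delta = list(difflib.ndiff([normalize_ws(s) for s in A],
                           [normalize_ws(s) for s in B]))

# ndiff prefixes changed lines with '- ', '+ ' or '? '.
changed = [d for d in delta if d[:2] in ('- ', '+ ', '? ')]

print(''.join(delta))
print('changed lines:', changed)
```

After normalization the two lists compare equal line by line, so no changed lines are reported. The trade-off is that the delta shows the normalized text rather than the original lines.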
Answer 1:
I ran into the same issue using difflib.Differ().
This link explains the issue with Differ(); I imagine the issue with ndiff() is similar.
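As a quick check of that guess, the sketch below feeds the question's input through both entry points. The two-space string in A is an assumption reconstructed from the output shown in the question, and passing charjunk=difflib.IS_CHARACTER_JUNK to Differ is meant to mirror ndiff()'s default.

```python
import difflib

A = ['3  4\n']  # assumed: two spaces between 3 and 4
B = ['3 4\n']

# ndiff() with its default charjunk (IS_CHARACTER_JUNK).
via_ndiff = list(difflib.ndiff(A, B))

# Differ with the same charjunk passed explicitly.
via_differ = list(
    difflib.Differ(charjunk=difflib.IS_CHARACTER_JUNK).compare(A, B)
)

print(via_ndiff == via_differ)
print(''.join(via_differ))
```

Both calls still report the whitespace-only change as a removed and an added line. That seems consistent with junk influencing how difflib aligns the sequences rather than removing characters from the comparison.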
Source: https://stackoverflow.com/questions/14696476/can-difflibs-charjunk-be-used-to-ignore-whitespace