Python - getting just the difference between strings

心在旅途 2021-01-12 08:12

What's the best way of getting just the difference from two multiline strings?

a = 'testing this is working \n testing this is working 1 \n'
b = 'testi


        
6 Answers
  • 2021-01-12 08:25
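    import itertools as it

    # Compare character by character; this assumes a is a prefix of b.
    "".join(y for x, y in it.zip_longest(a, b) if x != y)
    # ' testing this is working 2'


    Alternatively, compare line counts:

    import collections as ct

    ca = ct.Counter(a.split("\n"))
    cb = ct.Counter(b.split("\n"))

    # Counter subtraction keeps only the lines that occur more often in b.
    diff = cb - ca
    "".join(diff.keys())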
    
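    The Counter approach above can be wrapped in a small function. This is a minimal sketch, not part of the original answer: the name added_lines is hypothetical, and it uses splitlines() and Counter.elements() so that a line repeated more often in b than in a is reported once per extra occurrence.

```python
import collections as ct


def added_lines(a, b):
    """Return the lines that occur more often in b than in a (hypothetical helper)."""
    ca = ct.Counter(a.splitlines())
    cb = ct.Counter(b.splitlines())
    # Subtracting Counters drops zero and negative counts,
    # so only the surplus lines of b remain.
    return list((cb - ca).elements())


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(added_lines(a, b))
```

    Unlike a set-based comparison, this keeps track of how many times each line appears.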
  • 2021-01-12 08:29

    The easiest hack, credit to @Chris, is to use split().

    Note: you need to determine which string is longer and call split() on that one. This only works when the shorter string occurs verbatim inside the longer one.

    if len(a)>len(b): 
       res=''.join(a.split(b))             #get diff
    else: 
       res=''.join(b.split(a))             #get diff
    
    print(res.strip())                     #remove whitespace on either sides
    

    # driver values

    IN : a = 'testing this is working \n testing this is working 1 \n' 
    IN : b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
    
    OUT : testing this is working 2
    

    EDIT: thanks to @ekhumoro for another hack using replace(), which avoids the join entirely.

    if len(a)>len(b): 
        res=a.replace(b,'')             #get diff
    else: 
        res=b.replace(a,'')             #get diff
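    Both hacks silently return the longer string unchanged when the shorter one is not contained in it. The sketch below guards against that; the helper name remainder and the choice to return None in that case are assumptions made for illustration, not part of the original answer.

```python
def remainder(a, b):
    """Remove the shorter string from the longer one (hypothetical helper).

    Returns None when the shorter string is not a substring of the longer one.
    """
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    if shorter not in longer:
        return None
    # The third argument limits replace() to the first occurrence only.
    return longer.replace(shorter, '', 1).strip()


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(remainder(a, b))
```

    Note that this variant removes only the first occurrence, whereas the replace() call above removes every occurrence.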
    
  • 2021-01-12 08:37
    a = 'testing this is working \n testing this is working 1 \n'
    b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
    
    splitA = set(a.split("\n"))
    splitB = set(b.split("\n"))
    
    diff = splitB.difference(splitA)
    diff = ", ".join(diff)  # ' testing this is working 2, more things if there were...'
    

    Essentially, this turns each string into a set of lines and takes the set difference, i.e. everything in B that is not in A, then joins the result into one string. Because sets are unordered, the order of the joined lines is not guaranteed.

    Edit: This is a convoluted way of saying what @ShreyasG said: [x for x in b if x not in a]...

  • 2021-01-12 08:38

    You could use the following function:

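    def __slave(a, b):
        # Return the index of the first occurrence of character b in string a, or -1.
        for i, l_a in enumerate(a):
            if b == l_a:
                return i
        return -1

    def diff(a, b):
        # Walk through a and strip its characters, in order, out of a copy of b.
        t_b = b
        c_i = 0
        for c in a:
            t_i = __slave(t_b, c)
            if t_i != -1 and t_i >= c_i:
                c_i = t_i
                t_b = t_b[:c_i] + t_b[c_i+1:]

        # Do the same in the other direction: strip the characters of b out of a.
        t_a = a
        c_i = 0
        for c in b:
            t_i = __slave(t_a, c)
            if t_i != -1 and t_i >= c_i:
                c_i = t_i
                t_a = t_a[:c_i] + t_a[c_i+1:]

        # What is left of b, followed by what is left of a.
        return t_b + t_a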
    

    Usage sample: print(diff(a, b))
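    For comparison, a character-level diff can also be sketched with the standard library's difflib.SequenceMatcher. This is a different technique from the hand-written function above, not a rewrite of it; the name char_diff and the (removed, added) return shape are assumptions for illustration.

```python
import difflib


def char_diff(a, b):
    """Return (removed, added): text found only in a and only in b (hypothetical helper)."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # 'replace' contributes to both sides; 'equal' contributes to neither.
        if tag in ('delete', 'replace'):
            removed.append(a[i1:i2])
        if tag in ('insert', 'replace'):
            added.append(b[j1:j2])
    return ''.join(removed), ''.join(added)


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(char_diff(a, b))
```

    Keeping the two sides separate makes it clear which string each leftover piece came from.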

  • 2021-01-12 08:44

    This is basically @Godron629's answer, but since I can't comment, I'm posting it here with a slight modification: replacing difference with symmetric_difference so that the order of the sets doesn't matter.

    a = 'testing this is working \n testing this is working 1 \n'
    b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
    
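    # splitlines() avoids an empty-string entry caused by a's trailing newline,
    # which split("\n") would add to the symmetric difference.
    splitA = set(a.splitlines())
    splitB = set(b.splitlines())

    diff = splitB.symmetric_difference(splitA)
    diff = ", ".join(diff)  # ' testing this is working 2, some more things...'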
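    A symmetric difference does not say which string each line came from. If that matters, the two one-sided differences can be returned separately. This is a minimal sketch; the name line_diff_by_side is hypothetical, and it uses splitlines() so that a trailing newline does not produce an empty-string entry.

```python
def line_diff_by_side(a, b):
    """Return (only_in_a, only_in_b) as sorted lists of lines (hypothetical helper)."""
    lines_a = set(a.splitlines())
    lines_b = set(b.splitlines())
    # Sorting gives a stable result, since sets have no defined order.
    return sorted(lines_a - lines_b), sorted(lines_b - lines_a)


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(line_diff_by_side(a, b))
```

    The union of the two returned lists contains the same lines as the symmetric difference.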
    
  • 2021-01-12 08:45

    Building on @Chris_Rands' comment, you can also use splitlines() (if your strings are multi-line and you want the lines that are present in one but not in the other):

    b_s = b.splitlines()
    a_s = a.splitlines()
    [x for x in b_s if x not in a_s]
    

    Expected output is:

    [' testing this is working 2']
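    The same line-by-line comparison can be sketched with difflib.ndiff from the standard library, which also reports lines that were removed and keeps them in document order. This is an alternative technique, not the list comprehension above; the helper name added_and_removed is an assumption for illustration.

```python
import difflib


def added_and_removed(a, b):
    """Return (removed, added) lines between a and b (hypothetical helper)."""
    removed, added = [], []
    for line in difflib.ndiff(a.splitlines(), b.splitlines()):
        # ndiff prefixes each line with a two-character code:
        # '- ' only in a, '+ ' only in b, '  ' in both, '? ' hint lines.
        if line.startswith('- '):
            removed.append(line[2:])
        elif line.startswith('+ '):
            added.append(line[2:])
    return removed, added


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(added_and_removed(a, b))
```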
    