Get difference between two lists

傲寒 2020-11-21 11:36

I have two lists in Python, like these:

temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

I need to create a third

27 answers
  • 2020-11-21 11:40

    This is another solution:

    def diff(a, b):
        xa = [i for i in set(a) if i not in b]
        xb = [i for i in set(b) if i not in a]
        return xa + xb
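
    The membership tests above (`i not in b`) scan the whole list on every check. A minimal sketch of the same idea, assuming the items are hashable, builds both sets once and also keeps the order of first appearance. The name `ordered_diff` is made up for this illustration.

```python
def ordered_diff(a, b):
    """Items that appear in exactly one of the two lists, in order of first appearance."""
    set_a, set_b = set(a), set(b)  # built once, so each membership test is O(1) on average
    seen = set()
    result = []
    for item in list(a) + list(b):
        in_one_only = (item in set_a) != (item in set_b)
        if in_one_only and item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(ordered_diff(['One', 'Two', 'Three', 'Four'], ['One', 'Two']))  # ['Three', 'Four']
```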
    
  • 2020-11-21 11:40

    I am a little late to the game, but you can compare the performance of some of the approaches mentioned above with the script below. The two fastest contenders are:

    list(set(x).symmetric_difference(set(y)))
    list(set(x) ^ set(y))
    

    I apologize for the elementary level of the code.

    import time
    import random
    from itertools import filterfalse
    
    # 1 - performance (print the time taken by each method)
    # 2 - correctness (expected symmetric difference: 1, 4, 5, 6)
    # Note: methods 4, 5 and 6 only return the items of x that are missing from y.
    # choose the mode here
    performance = 1
    numberoftests = 7
    
    def answer(x, y, z):
        # time.perf_counter() is used because time.clock() was removed in Python 3.8
        if z == 0:
            start = time.perf_counter()
            # difference in both directions (the second term must be y - x, not y - y)
            lists = str(list(set(x) - set(y)) + list(set(y) - set(x)))
            times = "1 = " + str(time.perf_counter() - start)
            return (lists, times)
    
        elif z == 1:
            start = time.perf_counter()
            lists = str(list(set(x).symmetric_difference(set(y))))
            times = "2 = " + str(time.perf_counter() - start)
            return (lists, times)
    
        elif z == 2:
            start = time.perf_counter()
            lists = str(list(set(x) ^ set(y)))
            times = "3 = " + str(time.perf_counter() - start)
            return (lists, times)
    
        elif z == 3:
            start = time.perf_counter()
            # filterfalse is lazy; wrap it in list() so the work is actually timed
            lists = list(filterfalse(set(y).__contains__, x))
            times = "4 = " + str(time.perf_counter() - start)
            return (lists, times)
    
        elif z == 4:
            start = time.perf_counter()
            lists = tuple(set(x) - set(y))
            times = "5 = " + str(time.perf_counter() - start)
            return (lists, times)
    
        elif z == 5:
            start = time.perf_counter()
            lists = [tt for tt in x if tt not in y]
            times = "6 = " + str(time.perf_counter() - start)
            return (lists, times)
    
        else:
            start = time.perf_counter()
            Xarray = [iDa for iDa in x if iDa not in y]
            Yarray = [iDb for iDb in y if iDb not in x]
            lists = str(Xarray + Yarray)
            times = "7 = " + str(time.perf_counter() - start)
            return (lists, times)
    
    n = numberoftests
    
    if performance == 2:
        a = [1, 2, 3, 4, 5]
        b = [3, 2, 6]
        for c in range(0, n):
            d = answer(a, b, c)
            print(d[0])
    
    elif performance == 1:
        for tests in range(0, 10):
            print("Test Number" + str(tests + 1))
            a = random.sample(range(1, 900000), 9999)
            b = random.sample(range(1, 900000), 9999)
            for c in range(0, n):
                #if c not in (1,4,5,6):
                d = answer(a, b, c)
                print(d[1])
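
    Timing a single run with a wall-clock timer is noisy. As a hedged alternative, the standard `timeit` module can repeat each candidate several times and keep the best result. The sketch below assumes the same input sizes as the script above, and the helper name `time_candidates` is hypothetical. No timings are quoted here because they depend on the machine.

```python
import random
import timeit


def time_candidates(x, y, repeats=5):
    """Return {label: best time in seconds} for three symmetric-difference variants."""
    candidates = {
        "difference both ways": lambda: list(set(x) - set(y)) + list(set(y) - set(x)),
        "symmetric_difference": lambda: list(set(x).symmetric_difference(y)),
        "xor operator": lambda: list(set(x) ^ set(y)),
    }
    # timeit.repeat calls each function `number` times per repeat; keep the fastest repeat
    return {
        label: min(timeit.repeat(func, number=10, repeat=repeats))
        for label, func in candidates.items()
    }


if __name__ == "__main__":
    a = random.sample(range(1, 900000), 9999)
    b = random.sample(range(1, 900000), 9999)
    for label, seconds in time_candidates(a, b).items():
        print(f"{label}: {seconds:.6f} s")
```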
    
  • 2020-11-21 11:42

    The difference between two lists (say list1 and list2) can be found using the following simple function.

    def diff(list1, list2):
        c = set(list1).union(set(list2))  # or c = set(list1) | set(list2)
        d = set(list1).intersection(set(list2))  # or d = set(list1) & set(list2)
        return list(c - d)
    

    or

    def diff(list1, list2):
        return list(set(list1).symmetric_difference(set(list2)))  # or return list(set(list1) ^ set(list2))
    

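    With either function, the difference can be found by calling diff(temp2, temp1) or diff(temp1, temp2). Both calls return the same two elements, 'Three' and 'Four'; their order in the resulting list is not guaranteed, because sets are unordered. The argument order does not matter, since the symmetric difference contains the items that appear in exactly one of the two lists.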
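
    As a quick check of the claim that argument order does not matter, the sketch below compares both call orders. It sorts the results before printing because the order of a list built from a set is not guaranteed.

```python
def diff(list1, list2):
    # Symmetric difference: items that are in exactly one of the two lists
    return list(set(list1) ^ set(list2))


temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

print(sorted(diff(temp1, temp2)))  # ['Four', 'Three']
print(sorted(diff(temp2, temp1)))  # ['Four', 'Three']
```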

    Python doc reference

  • 2020-11-21 11:42

    I wanted something that would take two lists and could do what diff in bash does. Since this question pops up first when you search for "python diff two lists" and is not very specific, I will post what I came up with.

    Using SequenceMatcher from difflib, you can compare two lists the way diff does. None of the other answers tell you the position where the difference occurs, but this one does. Some answers give the difference in only one direction. Some reorder the elements. Some don't handle duplicates. This solution gives you a true difference between two lists:

    a = 'A quick fox jumps the lazy dog'.split()
    b = 'A quick brown mouse jumps over the dog'.split()
    
    from difflib import SequenceMatcher
    
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == 'equal':
            print('both have', a[i1:i2])
        if tag in ('delete', 'replace'):
            print('  1st has', a[i1:i2])
        if tag in ('insert', 'replace'):
            print('  2nd has', b[j1:j2])
    

    This outputs:

    both have ['A', 'quick']
      1st has ['fox']
      2nd has ['brown', 'mouse']
    both have ['jumps']
      2nd has ['over']
    both have ['the']
      1st has ['lazy']
    both have ['dog']
    

    Of course, if your application makes the same assumptions the other answers make, you will benefit from them the most. But if you are looking for a true diff functionality, then this is the only way to go.

    For example, none of the other answers could handle:

    a = [1,2,3,4,5]
    b = [5,4,3,2,1]
    

    But this one does:

      2nd has [5, 4, 3, 2]
    both have [1]
      1st has [2, 3, 4, 5]
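
    If the positions are needed as data rather than as printed text, the opcodes can be collected into a list. This is only a small sketch: the function name `positional_diff` and the tuple layout are choices made for this example and are not part of difflib.

```python
from difflib import SequenceMatcher


def positional_diff(a, b):
    """Return (tag, index in a, items from a, index in b, items from b) for each non-equal block."""
    changes = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag != 'equal':
            changes.append((tag, i1, a[i1:i2], j1, b[j1:j2]))
    return changes


a = 'A quick fox jumps the lazy dog'.split()
b = 'A quick brown mouse jumps over the dog'.split()
for change in positional_diff(a, b):
    print(change)
```

    Each tuple carries the start index in both lists, so a caller can report where a change happened as well as what changed.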
    
  • 2020-11-21 11:42

    A single-line version of arulmr's solution:

    def diff(listA, listB):
        # union of the two one-way differences, i.e. the symmetric difference
        return (set(listA) - set(listB)) | (set(listB) - set(listA))
    
  • 2020-11-21 11:43

    This can be solved in one line. The question is: given two lists (temp1 and temp2), return their difference in a third list (temp3).

    temp3 = list(set(temp1).difference(set(temp2)))
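
    The set-based one-liner does not keep the original order of temp1 and drops duplicates. If either matters, one possible variant (assuming the items are hashable) filters temp1 against a set built from temp2:

```python
temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

exclude = set(temp2)  # set lookup keeps each membership test O(1) on average
temp3 = [item for item in temp1 if item not in exclude]

print(temp3)  # ['Three', 'Four']
```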
    