Get difference between two lists

傲寒 2020-11-21 11:36

I have two lists in Python, like these:

temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

I need to create a third list with the items from the first list that aren't in the second one.

27 answers
  • 2020-11-21 11:55

    Here's a Counter answer for the simplest case.

    This is shorter than the one above that does two-way diffs because it only does exactly what the question asks: generate a list of what's in the first list but not the second.

    from collections import Counter
    
    lst1 = ['One', 'Two', 'Three', 'Four']
    lst2 = ['One', 'Two']
    
    c1 = Counter(lst1)
    c2 = Counter(lst2)
    diff = list((c1 - c2).elements())
    

    Alternatively, depending on your readability preferences, it makes for a decent one-liner:

    diff = list((Counter(lst1) - Counter(lst2)).elements())
    

    Output:

    ['Three', 'Four']
    

    Note that you can remove the list(...) call if you are just iterating over it.

    Because this solution uses counters, it handles duplicate items properly, unlike the many set-based answers. For example, on this input:

    lst1 = ['One', 'Two', 'Two', 'Two', 'Three', 'Three', 'Four']
    lst2 = ['One', 'Two']
    

    The output is:

    ['Two', 'Two', 'Three', 'Three', 'Four']
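
To make the contrast with the set-based answers concrete, here is a minimal sketch that runs both approaches on the same input. The helper name `multiset_difference` is hypothetical and only wraps the one-liner from this answer.

```python
from collections import Counter


def multiset_difference(first, second):
    """Return the items of `first` left after cancelling one copy per item in `second`."""
    # Counter subtraction drops any count that falls to zero or below.
    return list((Counter(first) - Counter(second)).elements())


lst1 = ['One', 'Two', 'Two', 'Two', 'Three', 'Three', 'Four']
lst2 = ['One', 'Two']

counter_diff = multiset_difference(lst1, lst2)
set_diff = set(lst1) - set(lst2)

print(counter_diff)      # two of the three 'Two' entries survive
print(sorted(set_diff))  # ['Four', 'Three'] -- 'Two' is gone entirely
```

Which behaviour is right depends on whether the second list should cancel items one for one or remove every occurrence.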
    
  • 2020-11-21 11:55

    We can calculate the union minus the intersection of the lists (the symmetric difference):

    temp1 = ['One', 'Two', 'Three', 'Four']
    temp2 = ['One', 'Two', 'Five']
    
    set(temp1+temp2)-(set(temp1)&set(temp2))
    
    Out: set(['Four', 'Five', 'Three']) 
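
The expression above computes the symmetric difference by hand. As a small sketch, the built-in `^` operator on sets should give the same result in one step; the variable names `by_hand` and `with_operator` are only illustrative.

```python
temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two', 'Five']

# Everything in either list, minus what the two lists share.
by_hand = set(temp1 + temp2) - (set(temp1) & set(temp2))

# The built-in symmetric difference operator.
with_operator = set(temp1) ^ set(temp2)

print(by_hand == with_operator)  # True
print(sorted(with_operator))     # ['Five', 'Four', 'Three']
```

Note that this is a two-way difference: it also reports 'Five', which is only in the second list.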
    
  • 2020-11-21 11:56

    Here is a simple way to find the items that differ between two lists, whatever their contents. The result is shown below:

    >>> l1 = ['xvda', False, 'xvdbb', 12, 'xvdbc']
    >>> l2 = ['xvda', 'xvdbb', 'xvdbc', 'xvdbd', None]
    >>>
    >>> # The built-in set replaces the removed `sets` module; output order may vary.
    >>> set(l1).symmetric_difference(l2)
    {False, 'xvdbd', None, 12}
    

    Hope this helps.

  • 2020-11-21 11:57

    You could use a naive method if both lists are sorted, contain no duplicates, and the second list is a prefix of the first:

    list1 = [1, 2, 3, 4, 5]
    list2 = [1, 2, 3]
    
    # Slicing is only correct because list2 is a prefix of list1.
    print(list1[len(list2):])
    

    or with native set methods:

    subset = set(list1).difference(list2)
    
    print(subset)
    
    import timeit
    init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
    print("Naive solution: ", timeit.timeit('temp1[len(temp2):]', init, number=100000))
    print("Native set solution: ", timeit.timeit('set(temp1).difference(temp2)', init, number=100000))
    
    

    Naive solution: 0.0787101593292

    Native set solution: 0.998837615564
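
One caveat about the timing comparison: on the benchmark data the two statements do not compute the same thing, because `temp2` is not a prefix of `temp1`. A short sketch to check this, using only the values from the setup string above:

```python
temp1 = list(range(100))
temp2 = [i * 2 for i in range(50)]

sliced = temp1[len(temp2):]               # just drops the first 50 items
set_based = set(temp1).difference(temp2)  # removes the even numbers

print(sliced[:5])                # [50, 51, 52, 53, 54]
print(sorted(set_based)[:5])     # [1, 3, 5, 7, 9]
print(set(sliced) == set_based)  # False

# With a true prefix, as in the first example, the two agree.
list1 = [1, 2, 3, 4, 5]
list2 = [1, 2, 3]
print(set(list1[len(list2):]) == set(list1).difference(list2))  # True
```

So the speed advantage of slicing only counts when the prefix assumption actually holds.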

  • 2020-11-21 11:58

    The existing solutions all offer either one or the other of:

    • Faster than O(n*m) performance.
    • Preserve order of input list.

    But so far no solution has both. If you want both, try this:

    s = set(temp2)
    temp3 = [x for x in temp1 if x not in s]
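
As a minimal sketch, the two lines above can be wrapped in a reusable function. The name `ordered_difference` is hypothetical, and the function assumes the items of the second list are hashable.

```python
def ordered_difference(first, second):
    """Items of `first` that are not in `second`, in their original order."""
    excluded = set(second)  # built once; membership tests are O(1) on average
    return [x for x in first if x not in excluded]


temp1 = ['One', 'Two', 'Three', 'Four', 'Three']
temp2 = ['One', 'Two']
print(ordered_difference(temp1, temp2))  # ['Three', 'Four', 'Three']
```

Unlike a plain set subtraction, this keeps both the order and any duplicates from the first list.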
    

    Performance test

    import timeit
    init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
    print(timeit.timeit('list(set(temp1) - set(temp2))', init, number=100000))
    print(timeit.timeit('s = set(temp2);[x for x in temp1 if x not in s]', init, number=100000))
    print(timeit.timeit('[item for item in temp1 if item not in temp2]', init, number=100000))
    

    Results:

    4.34620224079 # ars' answer
    4.2770634955  # This answer
    30.7715615392 # matt b's answer
    

    Besides preserving order, the method I presented is also (slightly) faster than the set subtraction, because it doesn't construct an unnecessary set from the first list. The performance difference is more noticeable when the first list is considerably longer than the second and when hashing is expensive. Here's a second test demonstrating this:

    init = '''
    temp1 = [str(i) for i in range(100000)]
    temp2 = [str(i * 2) for i in range(50)]
    '''
    

    Results:

    11.3836875916 # ars' answer
    3.63890368748 # this answer (3 times faster!)
    37.7445402279 # matt b's answer
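
To reproduce the second test, a harness along these lines might work. It assumes the same three statements as in the first test were timed, which this answer does not state explicitly; the labels and the small `number` value are chosen only to keep the run short. Absolute timings will differ by machine and Python version.

```python
import timeit

setup = '''
temp1 = [str(i) for i in range(100000)]
temp2 = [str(i * 2) for i in range(50)]
'''

# Assumed to be the same three statements as in the first test.
statements = {
    'set subtraction': 'list(set(temp1) - set(temp2))',
    'set + comprehension': 's = set(temp2); [x for x in temp1 if x not in s]',
    'list comprehension': '[item for item in temp1 if item not in temp2]',
}

timings = {}
for label, stmt in statements.items():
    # number is kept small here; raise it for more stable measurements.
    timings[label] = timeit.timeit(stmt, setup, number=5)
    print(f'{label}: {timings[label]:.4f} s')
```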
    
  • 2020-11-21 11:59
    temp3 = [item for item in temp1 if item not in temp2]
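
This version scans `temp2` once per item, so it is slower on large inputs, but it compares with `==` and therefore also works when the items cannot be hashed. A small sketch with made-up dictionary data:

```python
# Illustrative data: dicts are unhashable, so they cannot go into a set.
temp1 = [{'id': 1}, {'id': 2}, {'id': 3}]
temp2 = [{'id': 2}]

# Membership on a list uses ==, so no hashing is needed.
temp3 = [item for item in temp1 if item not in temp2]
print(temp3)  # [{'id': 1}, {'id': 3}]

try:
    set(temp1)
except TypeError as exc:
    print('set() fails here:', exc)
```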
    