Remove duplicated lists in list of lists in Python

后端 未结 4 1842
忘了有多久
忘了有多久 2021-01-13 12:10

I\'ve seen some questions here very related but their answer doesn\'t work for me. I have a list of lists where some sublists are repeated but their elements may be disorder

相关标签:
4条回答
  • 2021-01-13 12:54

    I would convert each element in the list to a frozenset (which is hashable), then create a set out of it to remove duplicates:

    >>> g = [[1, 2, 3], [3, 2, 1], [1, 3, 2], [9, 0, 1], [4, 3, 2]]
    >>> set(map(frozenset, g))
    set([frozenset([0, 9, 1]), frozenset([1, 2, 3]), frozenset([2, 3, 4])])
    

    If you need to convert the elements back to lists:

    >>> map(list, set(map(frozenset, g)))
    [[0, 9, 1], [1, 2, 3], [2, 3, 4]]
    
    0 讨论(0)
  • 2021-01-13 13:06

    What about using mentioned by roippi frozenset this way:

    >>> g = [list(x) for x in set(frozenset(i) for i in [set(i) for i in g])]
    
    [[0, 9, 1], [1, 2, 3], [2, 3, 4]]
    
    0 讨论(0)
  • 2021-01-13 13:11

    If you don't care about the order for lists and sublists (and all items in sublists are unique):

    result = set(map(frozenset, g))
    

    If a sublist may have duplicates e.g., [1, 2, 1, 3] then you could use tuple(sorted(sublist)) instead of frozenset(sublist) that removes duplicates from a sublist.

    If you want to preserve the order of sublists:

    def del_dups(seq, key=frozenset):
        seen = {}
        pos = 0
        for item in seq:
            if key(item) not in seen:
                seen[key(item)] = True
                seq[pos] = item
                pos += 1
        del seq[pos:]
    

    Example:

    del_dups(g, key=lambda x: tuple(sorted(x)))
    

    See In Python, what is the fastest algorithm for removing duplicates from a list so that all elements are unique while preserving order?

    0 讨论(0)
  • 2021-01-13 13:12

    (ab)using side-effects version of a list comp:

    seen = set()
    
    [x for x in g if frozenset(x) not in seen and not seen.add(frozenset(x))]
    Out[4]: [[1, 2, 3], [9, 0, 1], [4, 3, 2]]
    

    For those (unlike myself) who don't like using side-effects in this manner:

    res = []
    seen = set()
    
    for x in g:
        x_set = frozenset(x)
        if x_set not in seen:
            res.append(x)
            seen.add(x_set)
    

    The reason that you add frozensets to the set is that you can only add hashable objects to a set, and vanilla sets are not hashable.

    0 讨论(0)
提交回复
热议问题