Remove duplicate dict in list in Python

Asked by 太阳男子 on 2020-11-22 09:10

I have a list of dicts, and I'd like to remove the dicts with identical key and value pairs.

For this list: [{'a': 123}, {'b': 123}, {'a': 123}]

12 Answers
  • 2020-11-22 09:48

    You can use a set, but you need to turn the dicts into a hashable type.

    seq = [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}, {'a': 123, 'b': 1234}]
    unique = set()
    for d in seq:
        t = tuple(d.items())  # d.iteritems() on Python 2
        unique.add(t)
    

    unique now equals:

    set([(('a', 3222), ('b', 1234)), (('a', 123), ('b', 1234))])
    

    To get dicts back:

    [dict(x) for x in unique]
    
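On Python 3.7 and later, where dicts remember insertion order, the same idea can be condensed into a dict comprehension keyed on a hashable tuple of the items; this also keeps the first-seen order. A sketch, not from the original answer, assuming all values are hashable:

```python
# Dedupe a list of dicts while preserving first-seen order (Python 3.7+).
# Each dict is keyed by an order-independent, hashable tuple of its items;
# duplicates map to the same key and collapse.
seq = [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}, {'a': 123, 'b': 1234}]

unique = list({tuple(sorted(d.items())): d for d in seq}.values())
print(unique)  # [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]
```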
  • 2020-11-22 09:48

    I know it might not be as elegant as other answers, but how about trying this:

    arts = [...]  # your list of dicts

    arts_alt = []
    for art in arts:
        if art not in arts_alt:
            arts_alt.append(art)


    arts_alt is what you need

  • 2020-11-22 09:49

    Not so short but easy to read:

    list_of_data = [{'a': 123}, {'b': 123}, {'a': 123}]
    
    list_of_data_uniq = []
    for data in list_of_data:
        if data not in list_of_data_uniq:
            list_of_data_uniq.append(data)
    

    Now, the list list_of_data_uniq will contain only unique dicts.

  • 2020-11-22 09:50

    If you are using Pandas in your workflow, one option is to feed the list of dictionaries directly to the pd.DataFrame constructor, then use the drop_duplicates and to_dict methods to get the required result.

    import pandas as pd
    
    d = [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}, {'a': 123, 'b': 1234}]
    
    d_unique = pd.DataFrame(d).drop_duplicates().to_dict('records')
    
    print(d_unique)
    
    [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]
    
  • 2020-11-22 09:53

    Other answers would not work if you're operating on nested dictionaries such as deserialized JSON objects. For this case you could use:

    import json
    set_of_jsons = {json.dumps(d, sort_keys=True) for d in X}
    X = [json.loads(t) for t in set_of_jsons]
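
For instance, with nested dictionaries the tuple-of-items trick fails because the inner dicts are unhashable, whereas the canonical JSON strings (sort_keys=True) are hashable and comparable. The sample data below is illustrative; note that, like the other set-based answers, this does not preserve order:

```python
import json

# Sample list with nested (unhashable) values -- illustrative data only.
X = [{'a': {'x': 1}}, {'b': 123}, {'a': {'x': 1}}]

# Canonical JSON strings are hashable, so a set removes the duplicates.
set_of_jsons = {json.dumps(d, sort_keys=True) for d in X}
X = [json.loads(t) for t in set_of_jsons]

print(len(X))  # 2 unique dicts remain (order not guaranteed)
```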
    
  • 2020-11-22 09:55

    Try this:

    [dict(t) for t in {tuple(d.items()) for d in l}]
    

    The strategy is to convert the list of dictionaries to a list of tuples, where each tuple contains the items of one dictionary. Since tuples can be hashed, you can remove duplicates using set (with a set comprehension here; an older Python alternative would be set(tuple(d.items()) for d in l)) and then re-create the dictionaries from the tuples with dict.

    where:

    • l is the original list
    • d is one of the dictionaries in the list
    • t is one of the tuples created from a dictionary

    Edit: If you want to preserve ordering, the one-liner above won't work since a set doesn't preserve it. However, with a few lines of code, you can do that too:

    l = [{'a': 123, 'b': 1234},
         {'a': 3222, 'b': 1234},
         {'a': 123, 'b': 1234}]
    
    seen = set()
    new_l = []
    for d in l:
        t = tuple(d.items())
        if t not in seen:
            seen.add(t)
            new_l.append(d)
    
    print(new_l)
    

    Example output:

    [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]
    

    Note: As pointed out by @alexis, it might happen that two dictionaries with the same keys and values don't result in the same tuple. That can happen if they went through a different history of key additions and removals. If that's the case for your problem, consider sorting d.items() as he suggests.
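
A minimal sketch of that fix (assuming the keys are mutually comparable, e.g. all strings): sorting the items makes the tuple independent of the dicts' insertion history:

```python
# Two equal dicts built in different key orders (Python 3.7+ dicts keep
# insertion order, so their .items() sequences differ).
d1 = {'a': 123, 'b': 1234}
d2 = {'b': 1234, 'a': 123}
assert d1 == d2

print(tuple(d1.items()) == tuple(d2.items()))                  # False
print(tuple(sorted(d1.items())) == tuple(sorted(d2.items())))  # True

# Sorting the items before building the set fixes the one-liner:
l = [d1, d2]
unique = [dict(t) for t in {tuple(sorted(d.items())) for d in l}]
print(unique)  # [{'a': 123, 'b': 1234}]
```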
