I have a list of dicts, and I'd like to remove the dicts with identical key and value pairs.
For this list: [{'a': 123}, {'b': 123}, {'a': 123}]
You can use a set, but you need to turn the dicts into a hashable type.
seq = [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}, {'a': 123, 'b': 1234}]
unique = set()
for d in seq:
    # dicts are unhashable, so store each one as a tuple of its items
    t = tuple(d.items())
    unique.add(t)
unique now equals
{(('a', 3222), ('b', 1234)), (('a', 123), ('b', 1234))}
To get dicts back:
[dict(x) for x in unique]
I know it might not be as elegant as other answers, but how about trying this:
arts = [{'a': 123}, {'b': 123}, {'a': 123}]  # your list of dicts
arts_alt = []
for art in arts:
    # keep only the first occurrence of each dict
    if art not in arts_alt:
        arts_alt.append(art)
arts_alt is what you need.
Not so short but easy to read:
list_of_data = [{'a': 123}, {'b': 123}, {'a': 123}]
list_of_data_uniq = []
for data in list_of_data:
    if data not in list_of_data_uniq:
        list_of_data_uniq.append(data)
Now list_of_data_uniq will hold only the unique dicts.
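The loop above compares dicts with ==, so it needs no hashing and also works when the values are lists or nested dicts. Here is a minimal sketch that wraps it in a function (the name dedupe_by_equality is only illustrative). The trade-off is that each membership test scans the result list, so it gets slow on long inputs:

```python
def dedupe_by_equality(dicts):
    """Return the dicts in first-seen order, dropping later equal ones."""
    unique = []
    for d in dicts:
        # `in` compares with ==, so unhashable values (lists, nested dicts) are fine
        if d not in unique:
            unique.append(d)
    return unique


data = [{'a': [1, 2]}, {'b': {'c': 3}}, {'a': [1, 2]}]
print(dedupe_by_equality(data))
# [{'a': [1, 2]}, {'b': {'c': 3}}]
```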
If you are using Pandas in your workflow, one option is to feed the list of dictionaries directly to the pd.DataFrame constructor, then use the drop_duplicates and to_dict methods to get the required result.
import pandas as pd
d = [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}, {'a': 123, 'b': 1234}]
d_unique = pd.DataFrame(d).drop_duplicates().to_dict('records')
print(d_unique)
[{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]
The set-based answers will not work if you're operating on nested dictionaries such as deserialized JSON objects, because nested dicts and lists are not hashable. For this case, with X as your list of dicts, you could use:
import json
set_of_jsons = {json.dumps(d, sort_keys=True) for d in X}
X = [json.loads(t) for t in set_of_jsons]
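The set comprehension above loses the original order and rebuilds every dict through json.loads. If that matters, a possible variant uses the JSON string only as a lookup key and keeps the original objects. This is a sketch assuming every value is JSON-serializable; the function name dedupe_json is made up for illustration:

```python
import json


def dedupe_json(dicts):
    """Drop duplicate (possibly nested) dicts, keeping first-seen order."""
    seen = set()
    result = []
    for d in dicts:
        # sort_keys=True makes the string independent of key insertion order
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(d)  # keep the original object, not a round-tripped copy
    return result


X = [{'a': {'x': 1, 'y': [1, 2]}}, {'a': {'y': [1, 2], 'x': 1}}, {'b': 2}]
print(dedupe_json(X))
# [{'a': {'x': 1, 'y': [1, 2]}}, {'b': 2}]
```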
Try this:
[dict(t) for t in {tuple(d.items()) for d in l}]
The strategy is to convert the list of dictionaries to a list of tuples where the tuples contain the items of the dictionary. Since the tuples can be hashed, you can remove duplicates using set (using a set comprehension here; an older Python alternative would be set(tuple(d.items()) for d in l)) and, after that, re-create the dictionaries from the tuples with dict.
where:
l is the original list
d is one of the dictionaries in the list
t is one of the tuples created from a dictionary
Edit: If you want to preserve ordering, the one-liner above won't work since set won't do that. However, with a few lines of code, you can also do that:
l = [{'a': 123, 'b': 1234},
     {'a': 3222, 'b': 1234},
     {'a': 123, 'b': 1234}]
seen = set()
new_l = []
for d in l:
    t = tuple(d.items())
    if t not in seen:
        seen.add(t)
        new_l.append(d)
print(new_l)
Example output:
[{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]
Note: As pointed out by @alexis, two dictionaries with the same keys and values might not produce the same tuple. That can happen if they went through a different history of adding and removing keys, so their items come out in a different order. If that's the case for your problem, then consider sorting d.items() as he suggests.
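The sorting suggestion could look like the sketch below. It assumes the keys of each dict can be compared with one another (for example, all strings) and that the values are hashable; the function name dedupe_sorted_items is only illustrative:

```python
def dedupe_sorted_items(dicts):
    """Drop duplicate dicts regardless of key order, keeping first-seen order."""
    seen = set()
    result = []
    for d in dicts:
        # sorting the items gives equal dicts the same tuple
        key = tuple(sorted(d.items()))
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


l = [{'a': 123, 'b': 1234}, {'b': 1234, 'a': 123}, {'a': 3222, 'b': 1234}]
print(dedupe_sorted_items(l))
# [{'a': 123, 'b': 1234}, {'a': 3222, 'b': 1234}]
```

If the keys are not mutually comparable, using frozenset(d.items()) as the key is another option that also ignores item order.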