I have a list of lists in Python:
k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]
And I want to remove the duplicate elements from it. Were it a flat list of hashable items I could just use set(k), but a list is not hashable, so that doesn't work directly.
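For reference, this is the error you hit if you try the set directly:

>>> set(k)
Traceback (most recent call last):
  ...
TypeError: unhashable type: 'list'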
Create a dictionary with each sublist (converted to a tuple) as the key, then turn the keys back into lists:
k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]
dict_tuple = {tuple(item): index for index, item in enumerate(k)}
print([list(itm) for itm in dict_tuple.keys()])
# prints [[1, 2], [4], [5, 6, 2], [3]] (dicts preserve insertion order on Python 3.7+)
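A related sketch of my own (not part of the answer above): dict.fromkeys builds the same tuple-keyed dictionary in one call and also keeps first-seen order on Python 3.7+:

k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]
# dict keys are unique, and insertion order is preserved on Python 3.7+
dedup = [list(t) for t in dict.fromkeys(tuple(item) for item in k)]
print(dedup)  # [[1, 2], [4], [5, 6, 2], [3]]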
Strangely, the answers above remove the 'duplicates', but what if I want to remove the duplicated values as well, i.e. keep only the entries whose key appears exactly once? The following should be useful; it works mostly in place, with only the final filtering step building a new list:
def dictRemoveDuplicates():
    # assumes the list is grouped by its first element (the key)
    a = [[1, 'somevalue1'], [1, 'somevalue2'], [2, 'somevalue1'],
         [3, 'somevalue4'], [5, 'somevalue5'], [5, 'somevalue1'],
         [5, 'somevalue1'], [5, 'somevalue8'], [6, 'somevalue9'],
         [6, 'somevalue0'], [6, 'somevalue1'], [7, 'somevalue7']]
    print(a)
    temp = 0
    position = -1
    for pageNo, item in a:
        position += 1
        if pageNo != temp:
            # first time this key is seen: remember it and move on
            temp = pageNo
        else:
            # repeated key: zero out this entry and the one before it
            a[position] = 0
            a[position - 1] = 0
    a = [x for x in a if x != 0]  # drop the zeroed-out entries
    print(a)

dictRemoveDuplicates()
and the output is:
[[1, 'somevalue1'], [1, 'somevalue2'], [2, 'somevalue1'], [3, 'somevalue4'], [5, 'somevalue5'], [5, 'somevalue1'], [5, 'somevalue1'], [5, 'somevalue8'], [6, 'somevalue9'], [6, 'somevalue0'], [6, 'somevalue1'], [7, 'somevalue7']]
[[2, 'somevalue1'], [3, 'somevalue4'], [7, 'somevalue7']]
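For comparison, a minimal sketch of the same idea using collections.Counter; unlike the loop above it doesn't require the list to be grouped by key (counts and unique_only are names I made up):

from collections import Counter

a = [[1, 'somevalue1'], [1, 'somevalue2'], [2, 'somevalue1'],
     [3, 'somevalue4'], [5, 'somevalue5'], [5, 'somevalue1'],
     [5, 'somevalue1'], [5, 'somevalue8'], [6, 'somevalue9'],
     [6, 'somevalue0'], [6, 'somevalue1'], [7, 'somevalue7']]

# count how often each key (the first element) appears
counts = Counter(pageNo for pageNo, _ in a)

# keep only the pairs whose key is not duplicated
unique_only = [pair for pair in a if counts[pair[0]] == 1]
print(unique_only)  # [[2, 'somevalue1'], [3, 'somevalue4'], [7, 'somevalue7']]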
A bit of background: I just started with Python and recently learned comprehensions.
k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]
# join each sublist into a string so it becomes hashable, dedupe via a set,
# then split and convert back to ints
joined = {'.'.join(str(int_elem) for int_elem in _list) for _list in k}
dedup = [[int(s) for s in elem.split('.')] for elem in joined]
>>> k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]
>>> k = sorted(k)
>>> k
[[1, 2], [1, 2], [3], [4], [4], [5, 6, 2]]
>>> dedup = [k[i] for i in range(len(k)) if i == 0 or k[i] != k[i-1]]
>>> dedup
[[1, 2], [3], [4], [5, 6, 2]]
I don't know if it's necessarily faster, but you don't have to use tuples and sets.
Even your "long" list is pretty short. Also, did you choose them to match the actual data? Performance will vary with what these data actually look like. For example, you have a short list repeated over and over to make a longer list. This means that the quadratic solution is linear in your benchmarks, but not in reality.
For actually-large lists, the set code is your best bet—it's linear (although space-hungry). The sort and groupby methods are O(n log n) and the loop in method is obviously quadratic, so you know how these will scale as n gets really big. If this is the real size of the data you are analyzing, then who cares? It's tiny.
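To make that concrete, here is a minimal sketch of the sort-and-groupby and loop-with-'in' methods referred to above (the set version appears just below):

from itertools import groupby

k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]

# O(n log n): sort, then collapse runs of equal neighbours with groupby
dedup_groupby = [key for key, _ in groupby(sorted(k))]

# O(n^2): the loop-with-'in' method; every 'in' test rescans the result so far
dedup_loop = []
for item in k:
    if item not in dedup_loop:
        dedup_loop.append(item)

print(dedup_groupby)  # [[1, 2], [3], [4], [5, 6, 2]]
print(dedup_loop)     # [[1, 2], [4], [5, 6, 2], [3]]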
Incidentally, I'm seeing a noticeable speedup if I don't form an intermediate list to make the set, that is to say if I replace
kt = [tuple(i) for i in k]
skt = set(kt)
with
skt = set(tuple(i) for i in k)
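If you want to check that on your own data, a rough timeit sketch (the * 1000 repetition is only there to make the input big enough to time):

from timeit import timeit

k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]] * 1000

# set built from an intermediate list vs. straight from a generator expression
t_list = timeit(lambda: set([tuple(i) for i in k]), number=200)
t_gen = timeit(lambda: set(tuple(i) for i in k), number=200)
print(t_list, t_gen)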
The real solution may depend on more information: Are you sure that a list of lists is really the representation you need?
A list of tuples and a set comprehension ({}) can be used to remove duplicates:
>>> [list(tupl) for tupl in {tuple(item) for item in k}]
[[1, 2], [5, 6, 2], [3], [4]]
>>>