I have a python list where elements can repeat.
>>> a = [1,2,2,3,3,4,5,6]
I want to get the first n
unique elements from
a = [1,2,2,3,3,4,5,6]
from collections import defaultdict
def function(lis,n):
dic = defaultdict(int)
sol=set()
for i in lis:
try:
if dic[i]:
pass
else:
sol.add(i)
dic[i]=1
if len(sol)>=n:
break
except KeyError:
pass
return list(sol)
print(function(a,3))
output
[1, 2, 3]
Why not use something like this?
>>> a = [1, 2, 2, 3, 3, 4, 5, 6]
>>> list(set(a))[:5]
[1, 2, 3, 4, 5]
There are really amazing answers for this question, which are fast, compact and brilliant! The reason I am putting here this code is that I believe there are plenty of cases when you don't care about 1 microsecond time loose nor you want additional libraries in your code for one-time solving a simple task.
a = [1,2,2,3,3,4,5,6]
res = []
for x in a:
if x not in res: # yes, not optimal, but doesnt need additional dict
res.append(x)
if len(res) == 5:
break
print(res)
Here is a Pythonic approach using itertools.takewhile()
:
In [95]: from itertools import takewhile
In [96]: seen = set()
In [97]: set(takewhile(lambda x: seen.add(x) or len(seen) <= 4, a))
Out[97]: {1, 2, 3, 4}
You can adapt the popular itertools unique_everseen recipe:
def unique_everseen_limit(iterable, limit=5):
seen = set()
seen_add = seen.add
for element in iterable:
if element not in seen:
seen_add(element)
yield element
if len(seen) == limit:
break
a = [1,2,2,3,3,4,5,6]
res = list(unique_everseen_limit(a)) # [1, 2, 3, 4, 5]
Alternatively, as suggested by @Chris_Rands, you can use itertools.islice to extract a fixed number of values from a non-limited generator:
from itertools import islice
def unique_everseen(iterable):
seen = set()
seen_add = seen.add
for element in iterable:
if element not in seen:
seen_add(element)
yield element
res = list(islice(unique_everseen(a), 5)) # [1, 2, 3, 4, 5]
Note the unique_everseen
recipe is available in 3rd party libraries via more_itertools.unique_everseen or toolz.unique, so you could use:
from itertools import islice
from more_itertools import unique_everseen
from toolz import unique
res = list(islice(unique_everseen(a), 5)) # [1, 2, 3, 4, 5]
res = list(islice(unique(a), 5)) # [1, 2, 3, 4, 5]
Using set
with sorted+ key
sorted(set(a), key=list(a).index)[:5]
Out[136]: [1, 2, 3, 4, 5]