I have two flat lists where one of them contains duplicate values. For example,
array1 = [1,4,4,7,10,10,10,15,16,17,18,20]
array2 = [4,6,7,8,9,10]
I need to find the values in array1 that are also in array2, KEEPING THE DUPLICATES in array1. The desired outcome is
result = [4,4,7,10,10,10]
I want to avoid loops, as the actual arrays will contain millions of values. I have tried various set and intersect combinations, but just couldn't keep the duplicates.
Any help will be greatly appreciated!
What do you mean you don't want to use loops? You're going to have to iterate over it one way or another. Just take in each item individually and check if it's in array2 as you go:
items = set(array2)
found = [i for i in array1 if i in items]
Furthermore, depending on how you are going to use the result, consider having a generator:
found = (i for i in array1 if i in items)
so that you won't have to have the whole thing in memory all at once.
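As a minimal sketch of how such a generator might be consumed (the use of itertools.islice and the names first_three and rest are illustrative assumptions, not part of the answer above), values are only produced when something asks for them:

```python
from itertools import islice

array1 = [1, 4, 4, 7, 10, 10, 10, 15, 16, 17, 18, 20]
array2 = [4, 6, 7, 8, 9, 10]

items = set(array2)
# Nothing is filtered yet; the generator does its work lazily.
found = (i for i in array1 if i in items)

# Pull just the first three matches.
first_three = list(islice(found, 3))
print(first_three)  # [4, 4, 7]

# The generator remembers its position, so this yields the remainder.
rest = list(found)
print(rest)  # [10, 10, 10]
```

Note that a generator can be iterated only once; if the result is needed several times, the list comprehension is probably the simpler choice.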
The following will do it:
array1 = [1,4,4,7,10,10,10,15,16,17,18,20]
array2 = [4,6,7,8,9,10]
set2 = set(array2)
print([el for el in array1 if el in set2])
It keeps the order and repetitions of elements in array1.
It turns array2 into a set for faster lookups. Note that this is only beneficial if array2 is sufficiently large; if array2 is small, it may be more performant to keep it as a list.
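One way to check where that trade-off falls for your own data is to time both variants. The following is only a rough sketch: the helper name keep_common and the data sizes are made up for illustration, and the actual timings will depend on the machine and Python version.

```python
import timeit

def keep_common(values, allowed):
    """Return the items of `values` found in `allowed`, keeping order and duplicates."""
    return [v for v in values if v in allowed]

# Illustrative data only; the sizes are arbitrary assumptions.
values = list(range(1000)) * 2
small_lookup = [4, 7, 10]
large_lookup = list(range(0, 1000, 2))

for label, lookup in [("small", small_lookup), ("large", large_lookup)]:
    lookup_set = set(lookup)
    # Both containers must give the same answer; only the speed differs.
    assert keep_common(values, lookup) == keep_common(values, lookup_set)
    t_list = timeit.timeit(lambda: keep_common(values, lookup), number=5)
    t_set = timeit.timeit(lambda: keep_common(values, lookup_set), number=5)
    print(f"{label} array2: list {t_list:.4f}s, set {t_set:.4f}s")
```

Membership tests on a list scan the elements one by one, while a set uses hashing, so the gap is expected to widen as array2 grows.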
Following on from @Alex's answer, if you also want the index of each matching value, here's how:
found = [[index,i] for index,i in enumerate(array1) if i in array2]
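For large inputs, the same idea can be combined with a set lookup. This is a sketch under that assumption; the names items and indices are illustrative, and tuples are used here instead of two-element lists purely as a matter of taste:

```python
array1 = [1, 4, 4, 7, 10, 10, 10, 15, 16, 17, 18, 20]
array2 = [4, 6, 7, 8, 9, 10]

items = set(array2)  # hashing keeps each membership test cheap
# Pair every matching value with its position in array1.
found = [(index, value) for index, value in enumerate(array1) if value in items]
print(found)  # [(1, 4), (2, 4), (3, 7), (4, 10), (5, 10), (6, 10)]

# If only the positions are needed:
indices = [index for index, _ in found]
print(indices)  # [1, 2, 3, 4, 5, 6]
```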
Source: https://stackoverflow.com/questions/26663371/python-intersection-of-two-lists-keeping-duplicates