Python list intersection with non-unique items

Submitted by 随声附和 on 2019-12-17 19:19:53

Question


I have two strings and I would like to compute their intersection, including duplicate items:

str_a = "aabbcc"
str_b = "aabd"

list(set(str_a) & set(str_b))
>> ['a', 'b']  (order may vary)

I would like to have it return:

>> "aab"

Any ideas?


Answer 1:


Multisets are implemented in Python 2.7 and later as (mutable) Counter objects. They support many of the same operations as sets, such as union, intersection, and difference (the - operator drops counts that fall to zero or below, whereas the subtract() method can leave negative counts):

from collections import Counter as mset

Solution:

(mset("aabbcc") & mset("aabd")).elements()

More details:

>>> intersection = mset("aabbcc") & mset("aabd")
>>> intersection
Counter({'a': 2, 'b': 1})

>>> list(intersection.elements())
['a', 'a', 'b']

>>> ''.join(intersection.elements())
'aab'

Use ''.join if you want a string, or list() if you want a list, though I would just keep the result as the iterable returned by intersection.elements().
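The steps above can be wrapped in a small helper. This is only a sketch: the function name multiset_intersection is made up for illustration and is not part of the standard library, and the order of the returned items should not be relied on.

```python
from collections import Counter


def multiset_intersection(a, b):
    """Return the items common to both iterables, keeping duplicates.

    Hypothetical helper name; it simply wraps Counter's & operator.
    """
    common = Counter(a) & Counter(b)  # per-item minimum of the two counts
    return list(common.elements())    # expand the counts back into items


if __name__ == "__main__":
    print(multiset_intersection("aabbcc", "aabd"))
    print(multiset_intersection([1, 1, 2, 3], [1, 1, 1, 3, 3]))
```

Because the helper accepts any iterable of hashable items, it works for lists as well as strings; wrap the result in ''.join(...) when a string is needed.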




Answer 2:


Use collections.Counter for each string and treat the counters as multisets:

>>> from collections import Counter
>>> str_a, str_b = 'aabbcc', 'aabd'
>>> Counter(str_a) & Counter(str_b)
Counter({'a': 2, 'b': 1})
>>> ''.join((Counter(str_a) & Counter(str_b)).elements())
'aab'

The Counter is a dict subclass, but one that counts all the elements of a sequence you initialize it with. Thus, "aabbcc" becomes Counter({'a': 2, 'b': 2, 'c': 2}).

Counters act like multisets: when you intersect two of them as above, each count is set to the minimum value found in the two counters, and any element whose count drops to 0 is discarded. If you were to compute their union instead, the maximum counts would be used.
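To make the minimum/maximum rule concrete, here is a short sketch that compares the & and | operators with a hand-written per-key minimum. The variable names (a, b, manual_min) are chosen for illustration only.

```python
from collections import Counter

a = Counter("aabbcc")  # Counter({'a': 2, 'b': 2, 'c': 2})
b = Counter("aabd")    # Counter({'a': 2, 'b': 1, 'd': 1})

intersection = a & b   # per-key minimum; keys missing from either side are dropped
union = a | b          # per-key maximum; keys from both sides are kept

# The same intersection written out by hand, for comparison.
manual_min = {key: min(a[key], b[key]) for key in a.keys() & b.keys()}

print(intersection)
print(union)
print(manual_min == dict(intersection))
```

Here 'c' and 'd' disappear from the intersection because each occurs in only one of the two strings, while the union keeps them with their largest count.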



Source: https://stackoverflow.com/questions/12253361/python-list-intersection-with-non-unique-items
