Python intersection of arrays in dictionary

Question

I have a dictionary of arrays like this:

y_dict= {1: np.array([5, 124, 169, 111, 122, 184]),
         2: np.array([1, 2, 3, 4, 5, 6, 111, 184]), 
         3: np.array([169, 5, 111, 152]), 
         4: np.array([0, 567, 5, 78, 90, 111]),
         5: np.array([]),
         6: np.array([])}

I need to find the intersection of the arrays in my dictionary, y_dict. As a first step I removed the empty arrays from the dictionary, like this:

dic = {i:j for i,j in y_dict.items() if np.array(j).size != 0}

So dic now looks like this:

dic = { 1: np.array([5, 124, 169, 111, 122, 184]),
        2: np.array([1, 2, 3, 4, 5, 6, 111, 184]), 
        3: np.array([169, 5, 111, 152]), 
        4: np.array([0, 567, 5, 78, 90, 111])}

To find the intersection I tried a tuple-based approach, like this:

result_dic = list(set.intersection(*({tuple(p) for p in v} for v in dic.values())))

The actual result is an empty list: []

Expected result should be: [5, 111]

Could you please help me find the intersection of the arrays in the dictionary? Thanks.

Answer 1:

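The code you posted is overly complex, and it is wrong because there's one extra inner iteration that needs to go: for p in v walks over the individual elements of each array instead of treating the array as a whole. You want to do: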

result_dic = list(set.intersection(*(set(v) for v in dic.values())))

or with map and without a for loop:

result_dic = list(set.intersection(*(map(set,dic.values()))))

Result:

[5, 111]

Iterate over the values (ignore the keys).
Convert each numpy array to a set (the arrays after the first could be left as any iterable, such as tuples, since intersection converts its arguments to sets anyway, but the first one must be a set).
Pass the lot to intersection with argument unpacking.
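The steps above can be sketched as a decomposed version of the one-liner. This is only an illustration: the intermediate names (sets, common) and the final sorted call, which is there purely to make the output order deterministic, are assumptions and not part of the original answer.

```python
import numpy as np

# The filtered dictionary from the question (empty arrays already removed).
dic = {1: np.array([5, 124, 169, 111, 122, 184]),
       2: np.array([1, 2, 3, 4, 5, 6, 111, 184]),
       3: np.array([169, 5, 111, 152]),
       4: np.array([0, 567, 5, 78, 90, 111])}

# Iterate over the values only and convert each array to a set.
sets = [set(v) for v in dic.values()]

# Argument unpacking hands every set to set.intersection at once.
common = set.intersection(*sets)

# Sets are unordered, so sort before presenting the result (an added step).
result_dic = sorted(common)
print(result_dic)
```

The elements of result_dic are NumPy integers rather than plain Python ints; they compare equal to 5 and 111.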

We can even skip the question's first step (removing the empty arrays) by creating a set from every array in y_dict and filtering out the empty ones using filter:

result_dic = list(set.intersection(*(filter(None,map(set,y_dict.values())))))

That's for the sake of a one-liner, but in real code such expressions are better decomposed so they're more readable and easier to comment. Decomposing also lets us avoid the crash that occurs when set.intersection is called with no arguments (because there were no non-empty sets), a weakness of this otherwise neat way of intersecting sets (first described in "Best way to find the intersection of multiple sets?").
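To make the failure mode concrete, here is a small sketch using a hypothetical dictionary, all_empty, in which every array is empty. It is not from the original answer; it only shows that unpacking an empty list into set.intersection raises a TypeError.

```python
import numpy as np

# Hypothetical input: every array is empty.
all_empty = {5: np.array([]), 6: np.array([])}

# Empty sets are falsy, so filter(None, ...) drops all of them.
non_empty_sets = list(filter(None, map(set, all_empty.values())))

try:
    # Equivalent to calling set.intersection() with no arguments at all.
    set.intersection(*non_empty_sets)
except TypeError as exc:
    failed = True
    print("intersection failed:", exc)
else:
    failed = False
```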

Just create the list beforehand, and call intersection only if the list is not empty. If it is empty, use an empty set instead, so that the result is a list either way:

non_empty_sets = [set(x) for x in y_dict.values() if x.size]
result_dic = list(set.intersection(*non_empty_sets) if non_empty_sets else set())

Answer 2:

You should be using numpy's own intersection (np.intersect1d) here rather than doing it with plain Python sets. You'll also need to add special handling for the case where there is nothing to intersect.

>>> intersection = None
>>> for a in y_dict.values(): 
...     if a.size: 
...         if intersection is None: 
...             intersection = a 
...             continue 
...         intersection = np.intersect1d(intersection, a) 
...
>>> if intersection is not None: 
...     print(intersection)
...
[  5 111]

If intersection is still None after the loop, all of the arrays in y_dict had size zero (no elements). In that case the intersection is not well defined, and you have to decide for yourself what the code should do: probably raise an exception, but it depends on the use case.
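The same loop can be written more compactly with functools.reduce. The sketch below is one possible packaging, not the answer's own code: the helper name intersect_arrays and the choice of ValueError for the all-empty case are assumptions made for illustration.

```python
from functools import reduce

import numpy as np


def intersect_arrays(arrays):
    """Intersect all non-empty arrays (hypothetical helper)."""
    non_empty = [a for a in arrays if a.size]
    if not non_empty:
        # Assumed policy: treat "nothing to intersect" as an error.
        raise ValueError("no non-empty arrays to intersect")
    # np.intersect1d returns the sorted, unique common values of two arrays;
    # reduce folds it pairwise over the list. With a single non-empty array,
    # reduce returns that array unchanged, just like the explicit loop.
    return reduce(np.intersect1d, non_empty)


y_dict = {1: np.array([5, 124, 169, 111, 122, 184]),
          2: np.array([1, 2, 3, 4, 5, 6, 111, 184]),
          3: np.array([169, 5, 111, 152]),
          4: np.array([0, 567, 5, 78, 90, 111]),
          5: np.array([]),
          6: np.array([])}

result = intersect_arrays(y_dict.values())
print(result)
```

Unlike the set-based version, the result here stays a NumPy array and comes back sorted.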

Source: https://stackoverflow.com/questions/59269849/python-intersection-of-arrays-in-dictionary

Tags

python

numpy

dictionary

set

interception