python reduce to find the union of sets

怎甘沉沦 提交于 2020-03-15 21:56:30

问题


I am trying to find the union of set of sets. Specifically I want the union of the list of nodes for each key in the dictionary of networkx graphs called periodic_gs. I would like to use the reduce function as it seems reasonable to take the union of all periodic_gs[x].nodes() where x is a key of periodic_gs.

Here is my attempt:

reduce(lambda x,y: set(periodic_gs[x].nodes()).union(set(periodic_gs[y].nodes())), periodic_gs.keys(), {})

To me, this says take the union of the nodes across each graph in the dictionary. For some reason, python tells me: TypeError: unhashable type: 'dict' I do not see this TypeError, because periodic_gs.keys() is a list of the keys (they are strings but I do not see how this would matter), and when substituted in for the arguments to the lambda function will work.

What is causing the type error and how do I fix it?


回答1:


You can use set.union like this:

>>> lis = [{1, 2, 3, 4}, {3, 4, 5}, {7, 3, 6}]
>>> set().union(*lis)
set([1, 2, 3, 4, 5, 6, 7])

It's possible to do this using reduce, but don't:

>>> reduce(set.union, lis)
set([1, 2, 3, 4, 5, 6, 7])

because this reduce takes quadratic time due to all the intermediate sets it builds and discards:

In [1]: from functools import reduce

In [2]: sets = [{x} for x in range(1000)]

In [3]: %timeit set().union(*sets)
40.9 µs ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [4]: %timeit reduce(set.union, sets)
4.09 ms ± 587 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

That's a 100x slowdown on this test case, and it can easily be even worse.

For your code, this should do it:

set().union(*(x.nodes() for x in periodic_gs.values()))



回答2:


{} is an empty dictionary, not a set. Use set() to create an empty set.

However, I think you are misinterpreting how reduce() works here; x is the previous return value of the lambda, and y is the next value from the sequence. Because you return a set, x is always a set here, and you cannot use that as a key to periodic_gs.

If you want the union of all nodes in the graph, use itertools.chain.from_iterable() and set():

from itertools import chain

set(chain.from_iterable(periodic_gs[key].nodes() for key in periodic_gs))

This creates one set from each of the nodes() calls.

To use reduce() you'd have to take into account that the first argument is always a set:

reduce(lambda res, key: res.union(periodic_gs[key].nodes()),  periodic_gs, set())

I am assuming here that periodic_gs is iterable (yielding keys) just like a regular dictionary; if not, use periodic_gs.keys().

A quick demo with a regular dictionary:

>>> example = {'foo': [1,2,3], 'bar': [3, 4, 1]}
>>> reduce(lambda res, key: res.union(example[key]), example, set())
set([1, 2, 3, 4])



回答3:


There are a couple of problems with your code. The initializer you supplied to reduce, {}, is an empty dict, and not a set as you seem to be assuming, thus leading to the first type error. However, even after you fix this, there's still a bigger problem: the lambda evaluates to a set object containing nodes, and this set is then used as the x value for the next call to the lambda, which tries to use it as as though it were a key of periodic_gs, thereby causing the second type error. To top it all off, there's no point in iterating over the keys of a dict if you're only going to use them to access the values; just iterate over the values in the first place! There's also no point in converting a bunch of lists to sets and then taking their union when you could just chain them all together and take the union of the result.

I would recommend rewriting the code entirely as:

set(n for val in periodic_gs.values() for n in val.nodes())



回答4:


You can use reduce to get the union of multiple sets as shown here:

>>> import operator
>>> a = set([1, 3, 5])
>>> b = set([2, 4, 6])
>>> c = set([0, 7, 8])
>>> reduce(operator.or_, [a, b, c])
set([0, 1, 2, 3, 4, 5, 6, 7, 8])



回答5:


You can't have a set of set because elements of set need to be hashable. You can make a set of frozenset however.

{frozenset(n) for val in periodic_gs.values() for n in val.nodes()}


来源:https://stackoverflow.com/questions/19965680/python-reduce-to-find-the-union-of-sets

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!