I have a "combination" problem: finding clusters of different keys, for which I am trying to find an optimized solution.
I have this list of lists "l":
l
This is a typical use case for the union-find algorithm / disjoint-set data structure. There's no implementation in the Python standard library AFAIK, but I always tend to have one nearby, as it's so useful...
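l = [[1, 5], [5, 7], [4, 9], [7, 9], [50, 90], [100, 200], [90, 100],
     [2, 90], [7, 50], [9, 21], [5, 10], [8, 17], [11, 15], [3, 11]]

from collections import defaultdict

# maps each element to its leader; None means the element is itself a leader
leaders = defaultdict(lambda: None)

def find(x):
    leader = leaders[x]
    if leader is not None:
        # path compression: point x directly at the root leader
        leaders[x] = find(leader)
        return leaders[x]
    return x

# union all elements that transitively belong together
for a, b in l:
    root_a, root_b = find(a), find(b)
    # skip pairs already in the same group, otherwise a leader would point to itself
    if root_a != root_b:
        leaders[root_a] = root_b

# get groups of elements with the same leader
groups = defaultdict(set)
for x in leaders:
    groups[find(x)].add(x)
print(*groups.values())
# {1, 2, 4, 5, 100, 7, 200, 9, 10, 50, 21, 90} {8, 17} {3, 11, 15}  (element order within each set may vary)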
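The runtime complexity of this should be about O(n log n) for n nodes: thanks to path compression, each find needs roughly log n steps (amortized) to get to (and update) the leader. Adding union by size or rank on top of that would bring it down to nearly linear time.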
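If you want something reusable, the same idea can be wrapped in a small class. The following is only a sketch, not a library API: the names `DisjointSet`, `union`, and `groups` are my own, and it assumes the elements are hashable. It uses an iterative `find` (so long chains cannot hit the recursion limit) and union by size in addition to path compression.

```python
from collections import defaultdict


class DisjointSet:
    """Union-find with path compression and union by size (illustrative sketch)."""

    def __init__(self):
        self.parent = {}  # element -> parent; a root is its own parent
        self.size = {}    # root -> number of elements in its set

    def find(self, x):
        # Register unseen elements as singleton sets.
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
            return x
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return  # already in the same set
        # Attach the smaller tree below the larger one.
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]

    def groups(self):
        # Collect all elements under their root.
        result = defaultdict(set)
        for x in self.parent:
            result[self.find(x)].add(x)
        return list(result.values())


pairs = [[1, 5], [5, 7], [4, 9], [7, 9], [50, 90], [100, 200], [90, 100],
         [2, 90], [7, 50], [9, 21], [5, 10], [8, 17], [11, 15], [3, 11]]

ds = DisjointSet()
for a, b in pairs:
    ds.union(a, b)
clusters = ds.groups()
print(*clusters)
```

For the list from the question this yields the same three clusters as above. Unlike the quick version, repeated or reversed pairs such as [1, 5] followed by [5, 1] are harmless here, because `union` returns early when both elements already share a root.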