What I have so far is largely based off page 571 of \"Introduction To Algorithms\" by Cormen et al.
I have a Node class in python that represents a set:
I don't have the latest edition of the book, but this doesn't look quite like a disjoint-set forest.
I think your mistake is to think that the forest has to be stored in a collection and that you have to traverse this collection to do the operations on the nodes. Remove set
from Merge()
and Find()
and implement Find()
as
def Find(n):
if n != n.parent:
n.parent = Find(n.parent)
return n.parent
just like in the book. Then add a MakeSet()
that returns a single correctly initialized node, and maybe a SameSet()
function too:
def SameSet(n1, n2):
return Find(n1) == Find(n2)
You now have a working disjoint set implementation.
Presuming that each node is initialised to be its own parent:
def Find(node):
while node is not node.parent:
node = node.parent
return node
The Wikipedia page provides pseudocode for the basic operations on disjoint-set data structure. Here's a direct port to Python (employs path compression and union by rank):
class Node:
"""Represents an element of a set."""
def __init__(self, id):
self.id = id
self.parent = self
self.rank = 0
self.size = 1
def __repr__(self):
return 'Node({!r})'.format(self.id)
def Find(x):
"""Returns the representative object of the set containing x."""
if x.parent is not x:
x.parent = Find(x.parent)
return x.parent
def Union(x, y):
"""Combines the sets x and y belong to."""
xroot = Find(x)
yroot = Find(y)
# x and y are already in the same set
if xroot is yroot:
return
# x and y are not in same set, so we merge them
if xroot.rank < yroot.rank:
xroot, yroot = yroot, xroot # swap xroot and yroot
# merge yroot into xroot
yroot.parent = xroot
xroot.size += yroot.size
if xroot.rank == yroot.rank:
xroot.rank = xroot.rank + 1
Demo:
>>> a, b, c = map(Node, 'abc')
>>> Find(a), Find(b), Find(c)
(Node('a'), Node('b'), Node('c'))
>>> Find(a).size
1
>>> Union(a, b)
>>> Union(b, c)
>>> Find(a), Find(b), Find(c)
(Node('a'), Node('a'), Node('a'))
>>> Find(a).size
3
Have you looked at any other existing implementations?
Using this implementation as a starting point, I've created a new python class to handle disjoint sets, which also supports persistence using a MongoDb.
With my class you should be able, for example, to:
UnionFind()
, which uses python build-in dictionaries by default, and decide later to consolidate()
your results in a MongoDb collection You might want to check the code on github.
Cheers, Simone