Question
I'm implementing a disjoint-set structure in Python, but I've hit a wall. I'm using a tree (forest) representation and writing Find(), Merge() and Create() functions for it.
I am also using union by rank and path compression for efficiency.
The catch is that these functions must take the collection of disjoint sets as a parameter, which makes traversal awkward.
    class Node(object):
        def __init__(self, value):
            self.parent = self
            self.value = value
            self.rank = 0

    def Create(values):
        l = [Node(value) for value in values]
        return l
The Create function takes a list of values and returns a list of singleton Nodes, one per value.
I'm thinking the Merge function would look similar to this,
    def Merge(set, value1, value2):
        value1Root = Find(set, value1)
        value2Root = Find(set, value2)
        if value1Root == value2Root:
            return
        if value1Root.rank < value2Root.rank:
            value1Root.parent = value2Root
        elif value1Root.rank > value2Root.rank:
            value2Root.parent = value1Root
        else:
            value2Root.parent = value1Root
            value1Root.rank += 1
but I'm not sure how to implement the Find() function since it is required to take the list of Nodes and a value (not just a node) as the parameters. Find(set, value) would be the prototype.
I understand how to implement path compression when a Node is taken as a parameter for Find(x), but this method is throwing me off.
Any help would be greatly appreciated. Thank you.
Edited for clarification.
Answer 1:
Clearly, the merge function should be applied to a pair of nodes. So the find function should take a single node parameter and look like this:
    def find(node):
        if node.parent != node:
            node.parent = find(node.parent)
        return node.parent
Wikipedia also has pseudocode that is easy to translate to Python.
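If the Find(set, value) prototype from the question really is fixed, one possible way to reconcile it with the node-based find above is a thin adapter that first looks up the node holding the value. This is only a sketch: the adapter, the KeyError for a missing value, and the assumption that values are unique are not from the original posts, and the linear scan makes each lookup O(n).

```python
class Node(object):
    def __init__(self, value):
        self.parent = self  # a root is its own parent
        self.value = value
        self.rank = 0


def Create(values):
    return [Node(value) for value in values]


def find(node):
    # Node-based find with path compression, as in the answer above.
    if node.parent is not node:
        node.parent = find(node.parent)
    return node.parent


def Find(nodes, value):
    # Assumed adapter: locate the node holding `value` by a linear scan,
    # then delegate to the node-based find.
    for node in nodes:
        if node.value == value:
            return find(node)
    raise KeyError(value)


nodes = Create(["a", "b", "c"])
nodes[0].parent = nodes[1]  # a -> b
nodes[1].parent = nodes[2]  # b -> c
root = Find(nodes, "a")     # compresses the path a -> b -> c
```

After the call, the node for "a" points directly at the root, so later lookups on it take a single step once the node has been located.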
Answer 2:
The implementation of this data structure becomes simpler when you realize that the operations union and find can also be implemented as methods of a disjoint set forest class, rather than on the individual disjoint sets.
If you can read C++, then have a look at my take on the data structure; it hides the actual sets from the outside world, representing them only as numeric indices in the API. In Python, it would be something like
    class DisjSets(object):
        def __init__(self, n):
            # list(...) is needed on Python 3, where range() is immutable
            self._parent = list(range(n))
            self._rank = [0] * n

        def find(self, i):
            if self._parent[i] == i:
                return i
            else:
                # path compression: point i directly at its root
                self._parent[i] = self.find(self._parent[i])
                return self._parent[i]

        def union(self, i, j):
            root_i = self.find(i)
            root_j = self.find(j)
            if root_i != root_j:
                # union by rank: attach the lower-rank root under the other
                if self._rank[root_i] < self._rank[root_j]:
                    self._parent[root_i] = root_j
                elif self._rank[root_i] > self._rank[root_j]:
                    self._parent[root_j] = root_i
                else:
                    self._parent[root_i] = root_j
                    self._rank[root_j] += 1
(Not tested.)
If you choose not to follow this path, the client of your code will indeed need to know about Nodes, and Find must take a Node argument.
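The index-based class assumes the items are the integers 0 to n-1. If the items are arbitrary values instead, the same idea might be sketched with dictionaries keyed by value. The class name DictDisjSets is hypothetical, and the sketch assumes the values are distinct and hashable.

```python
class DictDisjSets(object):
    """Disjoint-set forest keyed by hashable values (hypothetical name)."""

    def __init__(self, values):
        self._parent = {v: v for v in values}  # every value starts as a root
        self._rank = {v: 0 for v in values}

    def find(self, x):
        # Path compression: point x directly at its root.
        if self._parent[x] != x:
            self._parent[x] = self.find(self._parent[x])
        return self._parent[x]

    def union(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return
        # Union by rank: make root_x the root with the larger (or equal) rank.
        if self._rank[root_x] < self._rank[root_y]:
            root_x, root_y = root_y, root_x
        self._parent[root_y] = root_x
        if self._rank[root_x] == self._rank[root_y]:
            self._rank[root_x] += 1


sets = DictDisjSets(["a", "b", "c", "d"])
sets.union("a", "b")
sets.union("c", "d")
same_ab = sets.find("a") == sets.find("b")  # True
same_ac = sets.find("a") == sets.find("c")  # False
```

Callers only ever pass values in and get a representative value back, so no Node objects leak out of the class.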
Answer 3:
Find is always done on an item. Find(item) is defined as returning the set to which the item belongs. Merge, as such, need not take nodes; it always takes two items or sets. Merge (or union) of item1 and item2 must first call find(item1) and find(item2), which return the sets to which the two items belong. After that, the shorter up-tree must be attached under the root of the taller one. Whenever a find is issued, always retrace the path and compress it.
A tested implementation with path compression is here.
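The description above could be sketched roughly as follows, keeping the question's Find(set, value) and Merge(set, value1, value2) prototypes. This is an illustrative sketch rather than the linked implementation: it assumes values are unique, locates nodes with a linear scan, and uses a two-pass iterative find (walk up to the root, then retrace the path and repoint each node).

```python
class Node(object):
    def __init__(self, value):
        self.parent = self  # a root is its own parent
        self.value = value
        self.rank = 0


def Find(nodes, value):
    # Locate the item's node (linear scan; assumes the value is present).
    node = next(n for n in nodes if n.value == value)
    # First pass: walk up to the root.
    root = node
    while root.parent is not root:
        root = root.parent
    # Second pass: retrace the path and point every node at the root.
    while node is not root:
        next_node = node.parent
        node.parent = root
        node = next_node
    return root


def Merge(nodes, value1, value2):
    root1 = Find(nodes, value1)
    root2 = Find(nodes, value2)
    if root1 is root2:
        return
    # Union by rank: attach the lower-rank root under the other one.
    if root1.rank < root2.rank:
        root1, root2 = root2, root1
    root2.parent = root1
    if root1.rank == root2.rank:
        root1.rank += 1


nodes = [Node(v) for v in [1, 2, 3, 4]]
Merge(nodes, 1, 2)
Merge(nodes, 3, 4)
Merge(nodes, 2, 4)
```

After the three merges all four items share the root that holds 1, and a later Find on 4 flattens its path so that it points at that root directly.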
Source: https://stackoverflow.com/questions/9488284/disjoint-set-forests-in-python-alternate-implementation