Implementing Disjoint Set System In Python
Solution 1
I don't have the latest edition of the book, but this doesn't look quite like a disjoint-set forest.
I think your mistake is to think that the forest has to be stored in a collection and that you have to traverse this collection to do the operations on the nodes. Remove set
from Merge()
and Find()
and implement Find()
as
def Find(n):
if n != n.parent:
n.parent = Find(n.parent)
return n.parent
just like in the book. Then add a MakeSet()
that returns a single correctly initialized node, and maybe a SameSet()
function too:
def SameSet(n1, n2):
return Find(n1) == Find(n2)
You now have a working disjoint set implementation.
Solution 2
Have you looked at any other existing implementations?
Solution 3
Presuming that each node is initialised to be its own parent:
def Find(node):
while node is not node.parent:
node = node.parent
return node
Jordan
Updated on August 22, 2022Comments
-
Jordan over 1 year
What I have so far is largely based off page 571 of "Introduction To Algorithms" by Cormen et al.
I have a Node class in python that represents a set:
class Node: def __init__(self, parent, rank = 0): self.parent = parent self.rank = rank
This implementation is going to use a
List
of Nodes as the forest (I am open to better ways to store the sets).Initialize()
returns a list of Nodes, that I will store in variableset
and pass into the other functions.Find
searches through the forest for the value and returns the set that it appears in. I chose to usefor s in range(len(set)):
so that in the recursion I could shrink the list of sets being passed in byset[s:]
.def Find(set, value): for s in range(len(set)): if value != set[s].parent: set[s].parent = Find(set[s:], set[s].parent) return set[s]
Merge
merges sets in the forest by finding them and promoting the higher ranked set.def Merge(set, value1, value2): set_val_1 = Find(set, value1) set_val_2 = Find(set, value2) if set_val_1.rank > set_val_2.rank: set_val_2.parent = set_val_1 else: set_val_1.parent = set_val_2 if set_val_1.rank == set_val_2.rank: set_val_2.rank += 1
I am getting some errors when I do
Find
s andMerge
s, namelyFind
is not returning the proper set, and so I am not sure ifMerge
is working properly as well. I would appreciate some help to make sure the functions are implemented properly. -
John Machin about 13 yearsIn a real-world "working disjoint set implementation", there'd be a facility to iterate over all the sets, a facility to return all the members of the set that includes a given node, and the Node class would have some kind of "payload" attribute.
-
Jordan about 13 years@John Machin: How might I iterate through my entire collection of sets to find which set node is in?
-
antonakos about 13 years@John By "working" I just mean that it works. Brandon links to two implementations, including one from a Python cookbook. Neither of these store payloads nor do they support iteration over the sets or retrieval of all members of a set. But they are still useful for the typical use-case of keeping track of connected components.
-
John Machin about 13 yearsnodes as defined by the OP have only a parent and a rank; what components does this keep track of??? The "one from a Python cookbook" implements the payload in the sample code (see "label"), and the mention of itertools.groupby suggests that it might be doing something useful. In Eppstein's code, the "hashable objects" are the "payload".
-
John Machin about 13 years@Yoel: If you mean the set leader:
leader = Find(node)
. If you mean a list of fellow member nodes:leader = Find(node); return [x for x in list_of_nodes if Find(x) is leader]
-
antonakos about 13 years@John My reading of Brandon's examples was a little sloppy, sorry. The nodes can carry a payload as in the Recipe demo, and Epstein's implementation (which wasn't from a real book, my mistake!) does allow iteration through all keys. But the disjoint set is about connectedness, not values. Store the set within the value, not the other way around. Then you can ask if two values are connected or join two values together. That's what is being kept track of.
-
antonakos about 13 yearsBeware that this does not do path compression.
-
Jordan about 13 years@John Machin: +1 for being the most helpful so far. What I meant by the collection of sets is, the list that contains all of the sets in my problem i.e. what
Initialize
returns. -
John Machin about 13 years@Yoel: Thanks for the +1. I know what you mean by the collection of sets; I called it "list_of_nodes". My question was: what ANSWER do you require to your "which set node is in" query? The target set's representative/leader node, or a container-load of the nodes in the target set?
-
Jordan about 13 years@John Machin: I need the representative/leader node.
-
John Machin about 13 years@Yoel: We seem to be going around in a loop. The whole point of my first comment was that you DON'T need to iterate through the collection to get the representative node of the set that z belongs to; it is simply
Find(z)
; that is the whole purpose of the Find() function. -
Jordan about 13 yearsI looked over my implementation and played around with a few things, the basic issue I was having was that I was comparing results of
Find
operations which returnNode
objects. Really I needed to returnNode.parent
which are ints (in my case) and actually make sense to compare (as opposed to the references ofNode
objects. Bottom line you get the accept. -
Jordan about 13 years@John Machin: Just wanted to say thank you for your help, ultimately I needed to tweek some things that @antonakos suggested but you were also really helpful.