Details
A merge-find set is a data structure that contains disjoint subsets of a master set indexed by integers 1…n. It supports both fast finding of canonical subset elements and fast merging of subsets.
A merge-find set is also called a disjoint-set or a union-find set.
Merge-find sets are used in graph partitioning algorithms to keep track of unconnected components as they are merged. One such application is in efficient implementations of Kruskal's minimum spanning tree algorithm.
ResourceFunction["MergeFindSet"] only works with explicit indices 1…n. A more general one can handle arbitrary set elements, though at the expense of speed and memory requirements.
The initialization can specify either a positive integer
n or a disjoint partition of
Range[n]. In the former case, the initial subsets are singletons containing the individual integers from 1 to
n.
Each subset in a merge-find set has a canonical leader element. Elements that are not leaders have parents.
A leader for a given element is located by following the parent path beginning at that element. A leader is its own parent and this gives a termination condition for locating an elements leader.
A merge-set with n elements is comprised of n pairs of integers. The kth element is {pk,sk}, where pk is the parent of k. When k is its own parent, sk gives the total size of its subset; it is not used in other cases.
In "Merge" operations, smaller subsets are merged into larger.
Whenever a "Find" is performed, elements in the parent path that do not point directly to the leader will be changed to point directly to it.
ResourceFunction["MergeFindSet"] with one argument will automatically assume the operation is "Initialize".
ResourceFunction["MergeFindSet"][mfs,j] for integer j will automatically assume the operation is "Find".
ResourceFunction["MergeFindSet"][mfs,j,k] for integers j and k will automatically assume the operation is "Merge".
The "Find" operation will thread over its second argument.
ResourceFunction["MergeFindSet"] has the attribure
HoldFirst.