Disjoint Sets in Java: A Comprehensive Tutorial


Introduction
In the field of computer science, Disjoint Sets, also known as Union-Find data structures, play a vital role in solving various problems efficiently. Whether it's for graph algorithms, network connectivity, or image processing, understanding Disjoint Sets is essential for software engineering students and professionals.
This tutorial aims to provide a comprehensive explanation of Disjoint Sets in Java, along with code examples to help you grasp the concept and implement it in your own projects.
What are Disjoint Sets?
A Disjoint Set is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. Each subset is represented by a representative element, which is typically chosen as the root of a tree-like structure.
The key operations on Disjoint Sets are:
- MakeSet(x): Creates a new set with a single element x.
- Union(x, y): Merges the sets containing elements x and y into a single set.
- Find(x): Returns the representative element of the set containing x.
Implementation of Disjoint Sets in Java
Let's dive into the implementation of Disjoint Sets in Java. We'll start by defining a class called DisjointSet
that represents the data structure.
public class DisjointSet {
private int[] parent;
private int[] rank;
public DisjointSet(int size) {
parent = new int[size];
rank = new int[size];
// Initialize each element as a separate set
for (int i = 0; i < size; i++) {
parent[i] = i;
rank[i] = 0;
}
}
public int find(int x) {
// Path compression: Make the parent of x directly point to the representative element
if (parent[x] != x) {
parent[x] = find(parent[x]);
}
return parent[x];
}
public void union(int x, int y) {
int rootX = find(x);
int rootY = find(y);
// Union by rank: Attach the smaller tree to the root of the larger tree
if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
}
}
The DisjointSet
class consists of two arrays: parent
and rank
. The parent
array stores the parent of each element, while the rank
array keeps track of the rank (or depth) of each set's representative element.
In the constructor, we initialize each element as a separate set by assigning itself as its parent and setting the rank to 0.
The find
method implements path compression, which optimizes the find
operation by making the parent of each visited node directly point to the representative element. This reduces the tree height and speeds up subsequent find
operations.
The union
method performs the union of two sets by attaching the smaller tree to the root of the larger tree. This helps maintain a balanced tree structure and ensures efficient find
operations.
Using Disjoint Sets
Now that we have implemented the Disjoint Set data structure, let's explore how to use it to solve a common problem: finding connected components in an undirected graph.
Consider the following graph:
0 -- 1 3 -- 4
| |
2 5
In this graph, the connected components are {0, 1, 2} and {3, 4, 5}. We can use Disjoint Sets to find these connected components efficiently.
Here's the Java code to find the connected components using Disjoint Sets:
public class ConnectedComponents {
public static List<List<Integer>> findConnectedComponents(int[][] edges, int n) {
List<List<Integer>> components = new ArrayList<>();
DisjointSet disjointSet = new DisjointSet(n);
// Perform union of all edges
for (int[] edge : edges) {
int u = edge[0];
int v = edge[1];
disjointSet.union(u, v);
}
// Group elements by their representative element
Map<Integer, List<Integer>> groups = new HashMap<>();
for (int i = 0; i < n; i++) {
int representative = disjointSet.find(i);
groups.computeIfAbsent(representative, k -> new ArrayList<>()).add(i);
}
// Convert groups to components
for (List<Integer> group : groups.values()) {
components.add(group);
}
return components;
}
}
The findConnectedComponents
method takes an array of edges and the number of vertices in the graph as input. It creates a new instance of the DisjointSet
class and performs the union of all edges using the union
method.
Next, it groups the elements by their representative element using a HashMap
. Finally, it converts the groups to components and returns the result.
Let's test the findConnectedComponents
method with the given graph:
public class Main {
public static void main(String[] args) {
int[][] edges = {{0, 1}, {1, 2}, {3, 4}, {3, 5}};
int n = 6;
List<List<Integer>> components = ConnectedComponents.findConnectedComponents(edges, n);
for (List<Integer> component : components) {
System.out.println(component);
}
}
}
The output of the above code will be:
[0, 1, 2]
[3, 4, 5]
As you can see, the connected components {0, 1, 2} and {3, 4, 5} are correctly identified using Disjoint Sets.
Conclusion
Disjoint Sets, or Union-Find data structures, are powerful tools for solving various graph-related problems efficiently. In this tutorial, we covered the basics of Disjoint Sets and provided a detailed implementation in Java. We also demonstrated how to use Disjoint Sets to find connected components in an undirected graph.
By understanding and implementing Disjoint Sets, software engineering students and professionals can enhance their problem-solving skills and tackle complex algorithms and data structures with confidence.
Now that you have a solid understanding of Disjoint Sets in Java, feel free to explore more advanced topics such as path compression optimizations, weighted union, or applying Disjoint Sets to other problem domains.
Happy coding!