Disjoint Sets in Java: A Comprehensive Tutorial

Jacob Jonas

2/16/20243 min read

Introduction

In the field of computer science, Disjoint Sets, also known as Union-Find data structures, play a vital role in solving various problems efficiently. Whether it's for graph algorithms, network connectivity, or image processing, understanding Disjoint Sets is essential for software engineering students and professionals.

This tutorial aims to provide a comprehensive explanation of Disjoint Sets in Java, along with code examples to help you grasp the concept and implement it in your own projects.

What are Disjoint Sets?

A Disjoint Set is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. Each subset is represented by a representative element, which is typically chosen as the root of a tree-like structure.

The key operations on Disjoint Sets are:

MakeSet(x): Creates a new set with a single element x.
Union(x, y): Merges the sets containing elements x and y into a single set.
Find(x): Returns the representative element of the set containing x.

Implementation of Disjoint Sets in Java

Let's dive into the implementation of Disjoint Sets in Java. We'll start by defining a class called DisjointSet that represents the data structure.

public class DisjointSet {
    private int[] parent;
    private int[] rank;
    
    public DisjointSet(int size) {
        parent = new int[size];
        rank = new int[size];
        
        // Initialize each element as a separate set
        for (int i = 0; i < size; i++) {
            parent[i] = i;
            rank[i] = 0;
        }
    }
    
    public int find(int x) {
        // Path compression: Make the parent of x directly point to the representative element
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }
    
    public void union(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);
        
        // Union by rank: Attach the smaller tree to the root of the larger tree
        if (rank[rootX] < rank[rootY]) {
            parent[rootX] = rootY;
        } else if (rank[rootX] > rank[rootY]) {
            parent[rootY] = rootX;
        } else {
            parent[rootY] = rootX;
            rank[rootX]++;
        }
    }
}

The DisjointSet class consists of two arrays: parent and rank. The parent array stores the parent of each element, while the rank array keeps track of the rank (or depth) of each set's representative element.

In the constructor, we initialize each element as a separate set by assigning itself as its parent and setting the rank to 0.

The find method implements path compression, which optimizes the find operation by making the parent of each visited node directly point to the representative element. This reduces the tree height and speeds up subsequent find operations.

The union method performs the union of two sets by attaching the smaller tree to the root of the larger tree. This helps maintain a balanced tree structure and ensures efficient find operations.

Using Disjoint Sets

Now that we have implemented the Disjoint Set data structure, let's explore how to use it to solve a common problem: finding connected components in an undirected graph.

Consider the following graph:

  0 -- 1    3 -- 4
  |         |
  2         5

In this graph, the connected components are {0, 1, 2} and {3, 4, 5}. We can use Disjoint Sets to find these connected components efficiently.

Here's the Java code to find the connected components using Disjoint Sets:

public class ConnectedComponents {
    public static List<List<Integer>> findConnectedComponents(int[][] edges, int n) {
        List<List<Integer>> components = new ArrayList<>();
        DisjointSet disjointSet = new DisjointSet(n);
        
        // Perform union of all edges
        for (int[] edge : edges) {
            int u = edge[0];
            int v = edge[1];
            disjointSet.union(u, v);
        }
        
        // Group elements by their representative element
        Map<Integer, List<Integer>> groups = new HashMap<>();
        for (int i = 0; i < n; i++) {
            int representative = disjointSet.find(i);
            groups.computeIfAbsent(representative, k -> new ArrayList<>()).add(i);
        }
        
        // Convert groups to components
        for (List<Integer> group : groups.values()) {
            components.add(group);
        }
        
        return components;
    }
}

The findConnectedComponents method takes an array of edges and the number of vertices in the graph as input. It creates a new instance of the DisjointSet class and performs the union of all edges using the union method.

Next, it groups the elements by their representative element using a HashMap. Finally, it converts the groups to components and returns the result.

Let's test the findConnectedComponents method with the given graph:

public class Main {
    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {3, 4}, {3, 5}};
        int n = 6;
        
        List<List<Integer>> components = ConnectedComponents.findConnectedComponents(edges, n);
        
        for (List<Integer> component : components) {
            System.out.println(component);
        }
    }
}

The output of the above code will be:

[0, 1, 2]
[3, 4, 5]

As you can see, the connected components {0, 1, 2} and {3, 4, 5} are correctly identified using Disjoint Sets.

Conclusion

Disjoint Sets, or Union-Find data structures, are powerful tools for solving various graph-related problems efficiently. In this tutorial, we covered the basics of Disjoint Sets and provided a detailed implementation in Java. We also demonstrated how to use Disjoint Sets to find connected components in an undirected graph.

By understanding and implementing Disjoint Sets, software engineering students and professionals can enhance their problem-solving skills and tackle complex algorithms and data structures with confidence.

Now that you have a solid understanding of Disjoint Sets in Java, feel free to explore more advanced topics such as path compression optimizations, weighted union, or applying Disjoint Sets to other problem domains.

Happy coding!