1 definition

A parallel set is a very different tree structure, with children pointing to their parents. It is often used to answer the Connectivity Problem. For example, in the figure below, it is difficult for us to observe the connection between the point at the upper left corner and the point at the lower right corner. Therefore, such data structure is needed to help us solve the Problem.

The network is an abstract concept:

Users of social software, commodity information of shopping websites, traffic system and so on can all be abstracted into nodes in the network, so many practical problems can be solved with the help of this powerful data structure.

In addition to solving the connection problem, and search set is the concrete realization of the concept of set in mathematics, and search set “union” word is actually the concept of mathematical set union.

Two actions are supported:

// Merge the two data
union(p,q);
Copy the code

// check whether p and q are connected
isConnected(p,q);
Copy the code

The program framework (interface) is as follows:

public interface UF {
    int getSize(a);
    boolean isConnected(int p, int q);
    void unionElements(int p, int q);
}
Copy the code

Where p and q represent id, the index of the array

2 Quick Find implementation

Each data index P or Q can be assigned a numbered ID within a parallel lookup set

To determine whether P and Q belong to the same set, you just need to check whether the corresponding ids are the same.

We can define a function find(p) to find the grouping for which P corresponds, and the corresponding isConnected method calls find

The time complexity of this design is O(1), so it is called Quick Find

public class UnionFind1 implements UF {

    private int[] id;    // Is essentially an array

    public UnionFind1(int size) {
        id = new int[size];
        // Initialize, each id[I] refers to itself, no merged elements
        for (int i = 0; i < size; i++)
            id[i] = i;
    }

    @Override
    public int getSize(a){
        return id.length;
    }

    // find the set number corresponding to element p
    // O(1) complexity
    private int find(int p) {
        if(p < 0 || p >= id.length)
            throw new IllegalArgumentException("p is out of bound.");
        return id[p];
    }

    // Check whether elements P and q belong to a collection
    // O(1) complexity
    @Override
    public boolean isConnected(int p, int q) {
        return find(p) == find(q);
    }

    // Merge the set of elements p and q
    // O(n) complexity
    @Override
    public void unionElements(int p, int q) {
        int pID = find(p);
        int qID = find(q);
        if (pID == qID)
            return;
        // The merge process takes a walk through all elements and merges the collection numbers of the two elements
        for (int i = 0; i < id.length; i++)
            if(id[i] == pID) id[i] = qID; }}Copy the code

UnionElements (int p, int q); unionElements(int p, int q); unionElements(int p, int q)

3 Quick Union

But when we actually implement and look up sets, we treat each element as a node, and there’s a tree between the nodes, except that the tree is a child node pointing to the parent node.

For example, a tree with root 2:

If the other node 1 is merged with node 3, point node 1 to the root node of node 3

If you have another tree with root 5 where node 7 is merged with node 3, the result is that node 5 points to node 2

We can still use arrays in this case, because each node has only one pointer to someone else. Define a parent array where the value of parent[I] represents the node to which I points. The initial shape is as follows:

Now if you want to do the union(4,3) operation

Corresponds to what’s in the array

Corresponds to what’s in the array

In the corresponding array

When I do union(9, 4), I need to look it up, and the parent of 4 is 3, and the parent of 3 is 8, and 8 is the root, so 9 points to 8

Time complexity:

Union process: O(h), where h is the height of the tree and h << n. The time complexity of the corresponding query process is also O(h).

public class UnionFind2 implements UF {

    // Use an array to build a tree pointing to the parent node
    // parent[I] indicates the parent node to which the first element points
    private int[] parent;

    // constructor
    public UnionFind2(int size){
        parent = new int[size];
        // Each parent[I] points to itself, indicating that each element is a collection of its own
        for( int i = 0 ; i < size ; i ++ )
            parent[i] = i;
    }

    @Override
    public int getSize(a){
        return parent.length;
    }

    // Find the set number corresponding to element p
    // O(h) complexity, h is the height of the tree
    private int find(int p){
        if(p < 0 || p >= parent.length)
            throw new IllegalArgumentException("p is out of bound.");
        // Query your parent node until you reach the root node
        // Parent [p] == p
        while(p ! = parent[p]) p = parent[p];return p;
    }

    // Check whether elements P and q belong to a collection
    // O(h) complexity, h is the height of the tree
    @Override
    public boolean isConnected( int p , int q ){
        return find(p) == find(q);
    }

    // Merge the set of elements p and q
    // O(h) complexity, h is the height of the tree
    @Override
    public void unionElements(int p, int q){
        int pRoot = find(p);
        int qRoot = find(q);
        if( pRoot == qRoot )
            return; parent[pRoot] = qRoot; }}Copy the code

The rest of the code is optimized based on this.

4 Optimization based on size

In the above example, if you execute union(0, 1), union(0, 2), union(0, 3), union(0, 4)… A simple solution is to consider how many nodes there are in the current tree.

Suppose and look up the set as follows:

If you point 8 to 9, you’ll end up with a tree of depth 4, but if you point 9 to 8 you’ll end up with a tree of height 3.

Define an sZ array to record the number of elements in the set with root I. Union is to point the root node of the set with small sz to the root node of the set with large sz.

Core code:

public void unionElements(int p, int q){
        int pRoot = find(p);
        int qRoot = find(q);
        if(pRoot == qRoot)
            return;
            
        // Determine the merge direction based on the number of elements in the tree
        // Merge the set with fewer elements into the set with more elements
        if(sz[pRoot] < sz[qRoot]){
            parent[pRoot] = qRoot;
            sz[qRoot] += sz[pRoot];
        }
        else{ // sz[qRoot] <= sz[pRoot]parent[qRoot] = pRoot; sz[pRoot] += sz[qRoot]; }}Copy the code

5 Rank based optimization

Rank is the height of the tree.

For the following example:

Based on the code based on size optimization above, the tree with a small number of nodes points to the tree with a large number of nodes. Union (4, 2) will make 8 point to 7 and form a tree with a height of 4. A more reasonable combination scheme is to point 7 to 8 with a depth of 3.

So we should use depth instead of size to determine who points to whom when we actually merge.

Core code:

public void unionElements(int p, int q){
        int pRoot = find(p);
        int qRoot = find(q);

        if( pRoot == qRoot )
            return;

        // Determine the merge direction according to the rank of the tree where the two elements reside
        // merge the lower rank set into the higher rank set
        if(rank[pRoot] < rank[qRoot])
            parent[pRoot] = qRoot;
        else if(rank[qRoot] < rank[pRoot])
            parent[qRoot] = pRoot;
        else{ // rank[pRoot] == rank[qRoot]
            parent[pRoot] = qRoot;
            rank[qRoot] += 1;   // Maintain the rank value}}Copy the code

6 Path Compression

It is still possible for the above code to degenerate into linked lists

Since the height of the tree determines and looks up the performance of the set, consider using path compression to try to reduce the height of the tree.

Path compression:

The taller tree becomes the shorter tree.

The path compression process can be designed to perform path compression during the find operation. Path compression can be performed in the following ways:

parent[p] = parent[parent[p]];
Copy the code

Set p’s parent node to its grandparent node.

When find(4), set the parent of 4 to 2:

Further up, the parent of 2 becomes 0

The depth of the tree has been reduced from 5 to 3.

Core code:

private int find(int p){
        if (p < 0 || p >= parent.length)
            throw new IllegalArgumentException("p is out of bound");

        while(p ! = parent[p]){ parent[p] = parent[parent[p]];< span style = "max-width: 100%; clear: both; min-height: 1px
            p = parent[p];
        }
        return p;
    }
Copy the code

At this point, our optimization of the union lookup set is complete.

Complete code:

/* Quick Union optimized path compression based on rank */
public class UnionFind implements UF{
    private int[] parent;
    private int[] rank;   // sz[I] specifies the number of elements in the set whose root is I

    public UnionFind(int size){
        parent = new int[size];
        rank = new int[size];
        for (int i = 0; i < size; i ++){
            parent[i] = i;
            rank[i] = 1; }}@Override
    public int getSize(a){
        return parent.length;
    }

    // find the set number corresponding to p
    // O(h), h is the height of the tree
    private int find(int p){
        if (p < 0 || p >= parent.length)
            throw new IllegalArgumentException("p is out of bound");

        while(p ! = parent[p]){ parent[p] = parent[parent[p]];< span style = "max-width: 100%; clear: both; min-height: 1px
            p = parent[p];
        }
        return p;
    }

    @Override
    public boolean isConnected(int p, int q){
        return find(p) == find(q);
    }

    // Merge the collection to which pq belongs
    // O(h), h is the height of the tree
    @Override
    public void unionElements(int p, int q){
        int pRoot = find(p);
        int qRoot = find(q);

        if (pRoot == qRoot)
            return;
        // Set the root node of the tree with the lowest rank to the root node of the tree with the highest rank
        if (rank[pRoot] < rank[qRoot]){
            parent[pRoot] = qRoot;
            // The root of a tree with a lower height points to the root of a tree with a higher height
        }
        else if (rank[qRoot] < rank[pRoot]){
            parent[qRoot] = pRoot;
        }
        else {  // rank[qRoot] == rank[pRoot]
            parent[qRoot] = pRoot;
            rank[pRoot] += 1; }}}Copy the code