Demand analysis

Suppose you have N villages, some of which have roads connecting them, some of which have no roads connecting them

  • Query if there is a road between two villages
  • Connecting the two villages

And lookup sets are ideal for solving such “join” related problems

implementation

Parallel lookup Set, also called Disjoint Set, has two core operations:

  • Find: Find the collection of elements to Find which village the road is in
  • Union: Merge the set of two elements into one set, and merge the villages of two roads into one village

How to store data

If the data type being processed by the parallel lookup set is an integer, then an integer array can be used to store the data, as shown in the figure below:Thus, a look-up set is a tree structure that can be implemented using arrays.

Element initialization

When initialized, each element belongs to a separate collection, as shown below:Code:

    private int[] array;
    public UnionFind_QF(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("Invalid parameter");
        }
        array = new int[capacity];
        for (int i = 0; i < capacity; i++) { array[i] = i; }}Copy the code

Quick Find

The process of the Union

For example, union(v1,v2) means that all elements of the set v1 refer to the root node of V2.

Code:

/** * union(v1,v2) */
    public void union(int v1, int v2) {
        int parentV1 = find(v1);
        int parentV2 = find(v2);
        if (parentV1 == parentV2) {
            return;
        }
        for (int i = 0; i < array.length; i++) {
            if(array[i] == parentV1) { array[i] = parentV2; }}}public int find(int v) {
        return array[v];
    }
Copy the code

Time complexity: O(n)

The process of the Find

Code:

    public int find(int v) {
        return array[v];
    }
Copy the code

Time complexity: O(1)

Quick Union

The process of the Union

So union of v1,v2, for example, means that the root of v1 points to the root of v2.Code:

     private int[] array;
    public QuickUnion_QU(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("Invalid parameter");
        }

        array = new int[capacity];

        for (int i = 0; i < capacity; i++) { array[i] = i; }}public void union(int v1, int v2) {
        // find the root node of v1
        int p1 = find(v1);
        // Find the root node of v2
        int p2 = find(v2);
        if (p1 == p2) {
            return;
        }
        // Set v1 root to v2 root
        array[p1] = p2;
    }
Copy the code

Time complexity: O(logn)

The process of the Find

Code:

    private int find(int v) {
        while(v ! = array[v]) { v = array[v]; }return v;
    }
Copy the code

Time complexity: O(logn)

Quick Union optimization

In the process of Union, tree imbalance may occur and may degenerate into linked list, as shown in the figure:If reduced to a linked list, find() time complexity becomes O(n), which is inefficient. Two optimization schemes:

  1. Size-based optimization, grafting a few elements onto a tree with many elements
  2. Rank based optimization, grafting of low height onto high height tree
Quick Union- Size based optimization

Graft the branch with fewer elements onto the branch with more elements, as shown in the figure below:Code:

/** * QuickUnion size based optimization **@author jun.zhang6
 * @date2020/9/30 * /
public class QuickUnion_size {
    /**
     * 存数据的数组
     */
    private int[] array;

    /** * hold the size of the array */
    private int[] sizes;

    public QuickUnion_size(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("Capacity is invalid");
        }
        array = new int[capacity];
        sizes = new int[capacity];
        for (int i = 0; i < capacity; i++) {
            array[i] = i;
            sizes[i] = 1; }}/** * find the root node */
    public int find(int v) {
        while(v ! = array[v]) { v = array[v]; }return v;
    }

    /** * graft the root of v1 onto the root of v2 */
    public void union(int v1, int v2) {
        int p1 = find(v1);
        int p2 = find(v2);
        if (p1 == p2) {
            return;
        }
        if (sizes[p1] < sizes[p2]) {
            array[p1] = p2;
            sizes[p2] += sizes[p1];
        } else{ array[p2] = p1; sizes[p1] += sizes[p2]; }}}Copy the code

However, size-based optimization still has the problem of tree imbalance, as shown in the figure:

Quick Union- Rank based optimization

Graft a low-height tree onto a high-height tree, as shown in the picture below:Code:

/** * QuickUnion based optimization **@author jun.zhang6
 * @date2020/9/30 * /
public class QuickUnion_ranking {
    /**
     * 存数据的数组
     */
    private int[] array;

    /** * hold the height of the array */
    private int[] ranks;

    public QuickUnion_ranking(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("Capacity is invalid");
        }
        array = new int[capacity];
        ranks = new int[capacity];
        for (int i = 0; i < capacity; i++) {
            array[i] = i;
            ranks[i] = 1; }}public int find(int v) {
        while(v ! = array[v]) { v = array[v]; }return v;
    }

    /** * graft the root of v1 onto the root of v2 */
    public void union(int v1, int v2) {
        int p1 = find(v1);
        int p2 = find(v2);
        if (p1 == p2) {
            return;
        }
        if (ranks[p1] < ranks[p2]) {
            array[p1] = p2;
        } else if (ranks[p1] > ranks[p2]) {
            array[p2] = p1;
        } else{ array[p1] = p2; ranks[p2]++; }}}Copy the code
Quick Union- Path compression

Combining sets based on rank reduces the imbalance of the tree, but the more data is merged, the higher the tree is bound to be, resulting in a time-consuming find() operation that has to work its way up until the root node is found. Path compression: When find(), all nodes on the path point to the root node, reducing the height of the tree. As shown in figure:Code:

 /** * path compression, reduce the height of the tree, recursive */
    @Override
    public int find(int v) {
        if(v ! = array[v]) { array[v] = find(array[v]); }return array[v];
    }
Copy the code

Path compression is expensive to implement because all nodes on the find() path point to the root node, and can be further optimized using path splitting and path halving.

Quick Union- Path splitting

Make each node on the path point to its grandfather, as shown:Code:

  /** * Path split */
    @Override
    public int find(int v) {
        while(v ! = array[v]) {int parent = array[v];//parent = 2
            array[v] = array[parent];//array[1] = 3
            v = parent;//v = 2
        }
        return v;
    }
Copy the code
Quick Union- The path is halved

Make every other node on the path point to its grandfather node, as shown in the figure:Code:

 /**
     * 路径减半
     */
    @Override
    public int find(int v) {//v=1
        while(v ! = array[v]) { array[v] = array[array[v]];//array[1] = array[2]=3
            v = array[v];//v=3
        }
        return v;
    }
Copy the code

We usually use a combination of QuickUnion + Rank optimization + path splitting/path halving.

Custom type

Before we based on integer type to implement, if the custom type also want to use and look up set, how to implement? Directly on the code:

package com.ymm.zhangj.dataStructure;

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** * a custom query set **@author jun.zhang6
 * @date2020/9/30 * /
public class CustomUnionFind<V> {
    /** * Each value corresponds to a node */
    private Map<V, Node<V>> nodes = new HashMap<>();

    /** * initializes */
    public void init(V v) {
        if (nodes.containsKey(v)) {
            return;
        }
        nodes.put(v, new Node<>(v));
    }

    /** * find the root node of the node corresponding to v, the path is halved */
    public Node<V> findNode(V v) {
        // Get the node corresponding to v
        Node<V> node = nodes.get(v);
        if (node == null) {
            return null;
        }
        while(! Objects.equals(node.value, node.parent.value)) { node.parent = node.parent.parent; node = node.parent; }return node;
    }

    /** * find the value of v corresponding to the root node */
    public V find(V v) {
        Node<V> node = findNode(v);
        return node == null ? null : node.value;
    }

    public void union(V v1, V v2) {
        // find the root node corresponding to v1
        Node<V> p1 = findNode(v1);
        Node<V> p2 = findNode(v2);
        if (p1 == null || p2 == null) {
            return;
        }
        if (Objects.equals(p1.value, p2.value)) {
            return;
        }
        if (p1.rank < p2.rank) {
            p1.parent = p2;
        } else if (p1.rank > p2.rank) {
            p2.parent = p1;
        } else {
            p1.parent = p2;
            p2.rank += 1; }}/** * Determine whether two nodes belong to the same set */
    public boolean isSame(V v1, V v2) {
        return Objects.equals(find(v1), find(v2));
    }


    /** * Internal node object */
    private static class Node<V> {
        V value;
        Node<V> parent = this;
        int rank = 1;

        public Node(V value) {
            this.value = value; }}}Copy the code