Abstract
A parallel lookup set is a very efficient data structure for solving the problem of whether data sets are connected or not. For example, on a network, different terminals may be connected. If there are terminals A, B, and C, A connects TO B and A connects to C. So we can say that A, B, and C belong to the same connected component. This kind of connection relation analysis of parallel lookup set will be our common tools. Based on application scenarios and collection, the following problems can be solved
- Give me two points, and tell me are they connected? (If connected, there is no need to give a specific path, the case of a specific path needs to be based on DFS algorithm)
- Give two points and connect the two points
(Please go directly to the bottom of the article.)
Keywords: parallel search set, connectivity analysis, path compression
Thought analysis
Given a set of data, there are six points of difference, such as [[0,3],[1,2],[2,5], where each subarray represents a connected relationship between two points. The details are shown in the following figure
and | check | set |
---|---|---|
0_a | 1_b | 2_b |
3_a | 4_c | 5_b |
It can be easily seen from the figure that the six points are divided into three groups a, B and C according to the connection relation. So how can we represent this connection? So we need to build a data structure to solve the following problems
- Find the grouping of nodes in the lookup set
- Determine whether two nodes are connected
- Associate the two nodes
- Get the total number of groups
// To solve the problem of whether different nodes have the same root
class UnionFind {
constructor(n) {
this.parent = []
this.count = 0
this.init(n)
}
// Initialize a parallel set of size n
init(n) {}
// Find the grouping of nodes in the lookup set
find(node) {}
// Check whether two nodes are connected
some(left, right) {
return this.find(left) === this.find(right)
}
// Associate the two nodes
union(left, right) {
let index_l = this.find(left)
let index_r = this.find(right)
if(index_l ! == index_r) {/* 这里待补充将两个点连接起来的代码
.......
*/
this.count--
return true
}
return false
}
// Get the total number of groups
getCount() {
return this.count
}
}
Copy the code
Intuitively we can just use a one-dimensional array to represent the grouping of points. That [‘ a ‘, ‘b’, ‘b’, ‘a’, ‘c’, ‘b’]. We can also select the subscript of a point based on connectivity as a grouping marker, such as parent= [0,1,1,0,4,1]. That is, the parent of the I node is parent[I]. Indicates that I and parent[I] are connected. So our init() method can be designed like this
// Initializes a set and looks it up. Each node is associated with itself
init(n) {
this.parent.length = 0
for (let i = 0; i < n; i++) {
this.parent[i] = i
}
this.count=n
}
Copy the code
Next, we’ll discuss step by step how to design the other apis in this data structure. There are the following design methods
The quick – find algorithm
As you can see from the proposed UnionFind class above, the find() method is used so frequently that every critical method needs to use the find() method first. Therefore, we need to design find() as efficiently as possible. Intuitively, we are most efficient when the time complexity is O(1) each time we call find(). So we can design find() as follows.
find(node) {
return this.parent[node]
}
Copy the code
All of these calls to find() take O(1) time. The corresponding union() method is
union(left, right) {
let index_l = this.find(left)
let index_r = this.find(right)
if(index_l ! == index_r) {for(let i=0; i<this.parent.length; i++){// Change the grouping of all points belonging to the index_L group to index_r
if(this.parent[i]==index_l){
this.parent[i]=index_r
}
}
this.count--
return true
}
reurn false
}
Copy the code
The union() method above iterates through and looks up the set each time it is called, finding the nodes that need to be modified and changing the grouping that the nodes belong to. Therefore, if the number of new paths is M and the size of search set is N, the time complexity is O(m*n). When the data scale is large, the complexity of square order is not ideal, so we need to explore a more efficient union() method.
Quick-union && weighted quick-Union algorithm
According to the discussion of The Quick-union algorithm, we find that the time complexity of union () method in the Quick-union algorithm is too high, so another method, quick-union, is proposed here. This algorithm design mainly uses the concept of tree, each point stores its parent node. When looking for groups, keep looking up until the parent node is itself. If the parent = [0, 2). When we look up the group of the node whose index is 2, we first look up its parent node whose index is 1. Because the parent of a node with index 1 is not itself, you keep looking up until you find a node with index 0. In this way, we can determine that the nodes in index 2 are in the same group as those in index 0.
find(node){
while(this.parent[node]! ==node){ node=this.parent[node]
}
return node
}
Copy the code
union(left, right) {
let index_l = this.find(left)
let index_r = this.find(right)
if(index_l ! == index_r) {this.parent[index_l]=index_r
this.count--
return true
}
return false }
Copy the code
In the design of quick-union algorithm, union () is very efficient, but the efficiency of find () is obviously problematic. The problem is that if the height of the tree is high, the while loop inside find() will be called many times. So is there a way to reduce the height of the tree so that the whole tree is balanced. Looking at the union () method, we see that the code is hard-coded in that we always set the parent of index_L to index_R. That is, we always treat the left tree as a subtree attached to the right tree. Consider the following (left image)
When the number of left trees is large, if we still connect the left tree as a subtree to the right tree. So the height of the whole tree is going to be higher. Another way to think about it is that every time we do a merge, we choose to use it based on the size of each tree
This.parent [index_l]=index_r or this.parent[index_r]= index_L. That is, we always connect small trees as subtrees to large trees. As shown on the right in the figure above. The height of the entire tree will be balanced, which will improve the efficiency of find (). We can use a size array at initialization to store how many nodes are under each node. The initial value is 1. Int () and union () are encoded as follows
// initialize a set and look up the set
init(n) {
this.parent.length = 0
for (let i = 0; i < n; i++) {
this.parent[i] = i
}
this.count=n
this.size = new Array(n).fill(1)}Copy the code
// Associate two nodes, that is, they share a root node. The sum and union join the root nodes of the two nodes
union(left, right) {
let l = this.find(left)
let r = this.find(right)
if(l ! = r) {// The left side is smaller, so merge the left side into the right tree
if (this.size[l] < this.size[r]) {
this.parent[l] = r
this.size[r] += this.size[l]
} else {
this.parent[r] = l
this.size[l] += this.size[r]
}
this.count--
return true
}
return false
}
Copy the code
Path to the compression
After the above analysis, weighted Quick-Union algorithm is already a good design. Not only does this make union () less complex, but it also flattens the tree of relationships between nodes to help find () execute faster. Based on this, we should think that the best case is that each tree has two layers, which is flattened into a tree of height 2, so that we can use O (1) complexity quickly every time we find(). So how do you do that? It’s very simple. Look at the code
// Find the root node of the node in the parallel lookup set
find(node) {
while(node ! =this.parent[node]) {
// Path compression, set the parent of the child node to the parent node on each lookup. This will continuously flatten the query tree.
this.parent[node] = this.parent[this.parent[node]]
node = this.parent[node]
}
return node
}
Copy the code
The important third line is that every time we look for a parent node we set the parent node of the current node to its grandfather node. So it’s directly connected to grandpa. This reduces the height by one layer. When implemented to a certain extent. The whole tree must have only two floors. This is much more efficient for find ().
Conclusion code, js and lookup set
class UnionFind {
// To solve the problem of whether different nodes have the same root
constructor(n) {
this.parent = [] / / and set
this.size = [] // Total number of nodes under each node
this.count = 0
this.init(n)
}
// initialize a set and look up the set
init(n) {
this.parent.length = 0
for (let i = 0; i < n; i++) {
this.parent[i] = i
}
this.count = n
this.size = new Array(n).fill(1)}// Find the root node of the node in the parallel lookup set
find(node) {
while(node ! =this.parent[node]) {
// Path compression, set the parent of the child node to the parent node on each lookup. This will continuously flatten the query tree.
this.parent[node] = this.parent[this.parent[node]]
node = this.parent[node]
}
return node
}
// Check whether two nodes have the same root node
some(left, right) {
return this.find(left) == this.find(right)
}
// Associate two nodes, that is, they share a root node. The sum and union join the root nodes of the two nodes
union(left, right) {
let l = this.find(left)
let r = this.find(right)
if(l ! = r) {// The left side is smaller, so merge the left side into the right tree
if (this.size[l] < this.size[r]) {
this.parent[l] = r
this.size[r] += this.size[l]
} else {
this.parent[r] = l
this.size[l] += this.size[r]
}
// The connected component decreases by 1
this.count--
return true
}
return false
}
// Get the total number of groups
getCount() {
return this.count
}
}
Copy the code
The resources
Dm_vincent’s blog post