1. Introduction
In our actual development, we often have the need to find a specified data in a pile of data, and the common implementation methods to support efficient search algorithm are as follows:
- An array of
- The list
- Binary search tree
- Balanced binary tree
- Red and black tree
- Jump table
Scenario analysis
If we wanted to maintain an ordered set of ints with a data structure that could be inserted, deleted, found, and so on as quickly as possible, what data structure would you use?
The following efficient query algorithms are compared and analyzed
- Ordered array: This type of storage structure, the advantage is to support random access to data, and can use binary search algorithm to reduce the complexity of the search operation. The disadvantage is also obvious. When inserting or deleting data, it requires a lot of data movement to keep the elements in order.
- Binary search tree: If you need a data structure that supports both efficient binary search algorithm and fast insertion and deletion operations, the first is the binary search tree. The disadvantage is that in some extreme cases, the binary lookup tree may become a linear linked list.
- Balanced binary tree: binary tree is not satisfied, so based on the advantages of binary search tree, its disadvantages are improved, and the concept of balance is introduced. According to the different balancing algorithms, the specific implementation of AVL Tree, B Tree (B-tree), B+Tree (B+Tree), red black Tree and so on. But the realization of balanced binary tree is more complicated and difficult to understand.
- Skip list: Also supports the efficient search of data, insert and delete data operation is relatively simple, most importantly, to achieve a more balanced binary tree is a few orders of magnitude light. The disadvantage is that there is some data redundancy.
Table 2. The jump
A SkipList is a data structure that can replace a balanced tree. Skip lists allow sorted data to be distributed in a multi-layer linked list structure. By default, Key values are arranged in ascending order, with random values ranging from 0 to 1 to determine whether a data can climb to a high-level linked list. By allowing a certain amount of data redundancy, it achieves the purpose of “space for time”.
Quickly understand skip lists
2.1 find
Because the complexity of inserting and deleting lists is O(1), but the search speed is still O(n). So to improve the search speed of linked lists can use skip lists. For the list belowIf we want to look for element 9, we’re going to have to go through eight nodes from the beginning to find element 9. Is there some strategy that will allow us to find element 9 within 5 iterations? Why don’t you take a minute to think about how to do that?
Because the elements are ordered, we can add some paths to speed up the search. For example,This way, we only need to iterate 5 times to find element 9 (the red line is the path).
Can we speed up the search?
The answer is yes, add one more layer, so it only needs 4 times to find, it is like when we take the subway, to a station, there are fast line and slow line several routes, through the fast line + slow line collocation, we can get to a station faster.
Of course, we can add one more layer. Based on this method, for a list of n elements, we can take the form of ** (logn + 1) layer pointer path **, and we can find a target element in order of logn time. This data structure is also called skip list. Skip list can also be a deformation of the linked list, but it has the function of binary search.
Insert and Delete
In the example above, we have 9 nodes and 4 layers, which is an ideal skip list, but as we insert/delete nodes into the skip list, the number of nodes on the skip list changes, which means that the number of layers on the skip list changes dynamically.
Here we have a problem, which is how many layers should the newly inserted node span?
This problem have been solved for us for sure, flip a coin to decide the new strategy is through the insert node across the number of layers: every time we have to insert a node, to flip a coin, if thrown out is positive, continued to cast, until a negative, statistical appeared in the process of the positive number, the number of times a node across the number of layers.
In this way, you can get as close to the ideal number of layers as possible. And you can think about why that is?
2.2 insert
For example, if we want to insert nodes 3 and 4, we know by flipping a coin that 3 and 4 span 0, 2 layers (the number of layers starts from 0), then the insertion process is as follows:
Insert 3, across layer 0.
Insert 4, spanning 2 levels.
2.3 delete
Now that we’re done inserting, let’s look at deletion, and deletion is a little bit easier, so if we want to delete 4, let’s just delete 4 and all the layers that it spans.
Skip lists are as efficient as AVL, and search, add, insert, and delete operations can all be performed in order (LogN) complexity. With that said, let’s jump right into the subject and take a detailed look at how skip lists are implemented.
3. Implementation of skip list
The figure above is an example of a skip list. First, describe the construction characteristics of a skip list:
- A skip list should consist of several Level linked lists.
- The lowest linked list in the skip list contains all the data; At each level of the list, the data is ordered;
- If an element X appears at layer I, then all layers smaller than I contain element X;
- Elements at level I pass a pointer to elements at the next level that have the same value;
- In each layer, the elements -∞ and +∞ occur (INT_MIN and INT_MAX, respectively);
- The head pointer points to the first element at the highest level.
3.1 Define the node model in the linked list
Java code implementation is as follows:
public class SkipListEntry<T> {
// data
public Integer key;
public T value;
// links
public SkipListEntry up;
public SkipListEntry down;
public SkipListEntry left;
public SkipListEntry right;
// constructor
public SkipListEntry(Integer key, T value) {
this.key = key;
this.value = value;
}
// methods...
}
Copy the code
As you can see, the node model is divided into two main parts.
-
The data section contains the actual stored data. To avoid any other clutter, use Integer as the type of key and Object as the type of value.
-
The Links section contains four Pointers — up, Down, left, and Right — that you can tell by their names.
3.2 Skip list model
Public class SkipList {// Number of nodes public int n; Public int h; // The first node SkipListEntry head; // The last node SkipListEntry tail; public Random r; }Copy the code
An instance object r of the Random class is used to determine whether a newly added node can ascend to a higher level of the list.
3.3 Initializing skip Lists
The constructor will initialize an empty skip list that looks like this:
The Java code for the constructor:
Public SkipList() {// create the head node this.head = new SkipListEntry(integer.min_value, null); This. tail = new SkipListEntry(integer.max_value, null); // Put the right pointer of the head node to the tail node this.head.right = tail; // Set the left pointer of the tail node to the head node this.tail.left = head; // Set the left pointer of the tail node to the head node this.tail.left = head; this.h = 0; this.n = 0; this.r = new Random(); }Copy the code
3.4 Basic Operations on skip Lists
Skip lists need to implement the basic operations of find, insert, and remove:
- Get (Integer key) : Searches for an element based on the key value
- Put (Integer key, Object value) : Inserts a new element. If the element already exists, the operation is a modification operation
- Remove (Integer key) : deletes an element based on the key value
These may seem like three different operations, but in essence, to do all three operations, you have to either find an element or locate an element so that you can insert a new element at the next location. So, let’s implement this findEntry method first.
The diagram above shows the process of finding the key value 50 in a SkipList with a purple arrow. The brief is as follows:
- Start from head, because head points to the start node of the top level list, which is equivalent to searching from the top level.
- Move to the node pointed to by the right pointer of the current node until the key value of the right node is greater than the key value of the search;
- If there is a lower level of the list, it moves to the next level of the current node (down), if it is already at the lowest level, it exits;
Repeat steps 2 and 3 until the node where the key is located is found, or it does not exist and exit the search.
Java code implementation is as follows:
Private SkipListEntry findEntry(Integer key) {// look up SkipListEntry from the head node p = head; While (true) {// look from left to right until the key value of the right node is greater than the key value to look for while (p.ray.key <= key) {p = p.ray.key; } // Move to lower level if (p.own! = null) { p = p.down; } else { break; } // return p,! (p <= key) return p; }Copy the code
Note the following:
- If the passed key value exists in the skip list, the findEntry returns the object’s underlying node;
- If the passed key does not exist in the skip list, findEntry returns the bottom node whose key value is less than the key and whose key value is the least different from the key value.
For example, looking for an element node with key = 42 in a skip list returns a node with key = 39. As shown in the figure below:
With the findEntry method, we can easily implement some of the operations described above.
- The get method
public Object get(Integer key) { SkipListEntry p = findEntry(key); if (p.key.equals(key)) { return p.value; } else { return null; }}Copy the code
- Put method
Some steps to watch out for:
- If the key value of put exists in the skip list, the modification operation is performed.
- If the key value of put does not exist in the jump list, the operation of adding a node needs to be performed, and the height (maximum level) of the new node needs to be determined by the random number.
- When the height of the newly added node reaches the maximum level of the skip list, a blank layer needs to be added (no other nodes except -oo and +oo).
Let’s see the process of inserting a node step by step through the diagram:
The first step is to find suitable slots for insertion
The second step is to insert the new node q after the found node P
In the third step, repeat the following, using a random number to determine the height of the new node
- Start with the p node and move to the left until you find a node with a higher level node.
- Move the p pointer up one level;
- Create a node that is the same as the q node data. Insert the node to the right of p and above q in the skip list.
- Until the random number does not satisfy the condition of ascending;
The picture is as follows:
As long as the random number satisfies the condition, the node with key = 42 will climb until its level is equal to the height of the skip list. At this time, we need to add a blank layer at the top of the skip list, and the jump list height+1, to meet the next operation to add a node.Java code implementation is as follows:
public Object put(Integer key, Object value) { SkipListEntry p, q; int i = 0; P = findEntry(key); If (p.key.equals(key)) {Object oldValue = p.value; // If (p.key.equals(key)) {Object oldValue = p.value; p.value = value; return oldValue; }} q = new SkipListEntry(key, value); q.left = p; q.right = p.right; p.right.left = q; p.right = q; While (r.nextdouble () < 0.5) {if (I >= h) {addEmptyLevel(); } while (p.up= = null) {p = p.left; } p = p.up; = new SkipListEntry(key, null); // Add a new node object that has the same key value as the q pointer. z.left = p; z.right = p.right; p.right.left = z; p.right = z; z.down = q; q.up = z; q = z; i = i + 1; } n = n + 1; // Return null, no old node value return null; }Copy the code
- The remove method
This is the end of the principle and implementation of skip lists.
It is also important to note that the result of each run of the skip list is different, which is why the skip list is a randomized data structure. (Caused by the existence of Random)
4. Application of skip list in Java
- ConcurrentSkipListMap: corresponds to HashTable, HashMap, TreeMap in function;
- ConcurrentSkipListSet: functionally corresponds to HashSet;
To be precise, SkipList is more like Java’s TreeMap, which is based on a red-black tree (a self-balancing binary lookup tree) with an average time complexity of order log n.
HashMap is implemented based on a hash table, with an average lookup time of O(1). ConcurrentSkipListMap is based on skip lists, with an average search time of order log n.
ConcurrentSkipListMap has SkipList properties and is suitable for concurrent access to large amounts of data. Multiple threads can safely perform insert, remove, update, and access operations concurrently. It has an advantage over other data structures with locking mechanisms under tremendous pressure.
TreeMap insert data to balance the tree using strict rotation operations (such as balancing binary trees have left and right rotation) to ensure balance, so SkipList is easier to achieve, and compared to the balance tree has a higher efficiency.
Related articles
- Skip List Principles and Implementation (Java)
- Juejin. Cn/post / 684490…