Wechat search “Bigsai” to follow this interesting programmer, your three likes must be very important to me! The article has been included in my Github Bigsai-algorithm welcome star
preface
Skip table is a data structure often asked in interviews. It has been applied in many middleware and languages, and we are familiar with the Redis skip table. And in many scenarios of the interview may be asked, and occasionally will ask you to write it by hand (skipping table may make writing by hand, red black tree is not possible), this is not, to restore the scene:
But don’t panic. Don’t be afraid to meet a mushroom head, because you’re reading this article (proud 😏), and you’re not as embarrassed as a panda.
For a data structure or algorithm, the number of people in the crowd tends to dip from hearing the name, understanding the rationale, knowing the execution process, and being able to write by hand. The core principles of many data structures and algorithms may be simple, but it takes brains to understand the implementation process. However, if it can be written out, it is necessary to design and implement it step by step. It may take a long time to actually write it, and it may require a lot of research.
This article in the front of the introduction of the table, the following part of the detailed design and implementation of the table, to understand the table, this is really enough.
Quickly understand the table
Skip lists were invented in 1989 by William Pugh, an American computer scientist. In his paper Skip Lists: A Probabilistic Alternative to Balanced Trees, he introduced in detail the data structure of Skip tables and the operations such as insertion and deletion.
SkipList (SkipList) is a data structure used for quick search and lookup of ordered element sequence. SkipList is a random data structure, which is in essence an ordered linked list that can carry out binary search. In order to achieve fast search by index, a multilevel index is added to the original list. Skip tables can not only improve the performance of search, but also improve the performance of insert and delete operations. It’s on par with red-black and AVL trees in terms of performance, but the principle is very simple and the implementation is much simpler than red-black trees.
Here you can see some keywords: linked list (ordered list), index, binary search. You’ve got a rough idea in your head, but you’re probably not sure how diao this “skipping list” is, and you might even wonder what randomization has to do with it. You’ll find out soon enough below!
Looking back at linked lists, we know that linked lists and ordered lists (arrays) usually love and kill each other. They come in pairs, and each has its pros and cons. The advantage of linked lists is more efficient insertion and deletion. The pain point is the query is very slow! Each query is an O(n) complexity of the operation, the list estimated their own gas want to cry 😢.
This is a linked list of leading nodes (which is a fixed entry point and does not store any meaningful values). Each lookup requires an enumeration, which is quite slow. Can we optimize it a bit and make it skip a bit? The answer is yes, we know that many algorithms and data structures in space for time, we add a layer of index above, so that some nodes in the upper layer can be directly located, so that the query time of the list is almost halved, although the list itself is not happy, but it wants to cry face.
In this way, when querying a node, it will first quickly locate a range where the node is located from the upper layer. If the specific range is found, the search cost is very small. Of course, a downward index (pointer) will be added in the structural design of the table to find and determine the underlying node. The average search speed is O(n/2). But when you have a large number of nodes, it’s still very, very slow. We all know that binary search cuts the search range in half every time, and it would be perfect if ordered lists could do the same. Yes, skip table can make the list has close to the efficiency of binary search of a data structure, the principle is still to add a number of layers of index, optimize the search speed.
As you can see from the figure above, searching for an ordered list with such a data structure can achieve nearly binary performance. You maintain multiple levels of indexes on top, first looking for the last position less than the element you are looking for in the highest-level index, and then moving to the second-level index until you get to the lowest level, which is very close to the element you are looking for (if the element exists). Because more than one element can be skipped at a time according to the index, skip searches are faster.
For an ideal hop table, the number of inodes at each upper layer is 1/2 of the number at the next layer. So if n nodes increase the number of nodes (1/2+1/4+…) < n. And the low level of the search effect has little impact. But with such a structure, you might wonder, does such a perfect structure really exist? Large probability does not exist, because as a linked list, it is necessary to add and delete some of the operation of the search. Deletion and insertion may change the entire structure, so these are ideal structures. Whether the upper index is added at the time of insertion is a matter of probability (1/2 probability), which will be explained in detail later.
Jump table to add, delete and check
Above a little understanding of what the jump table is, so here to talk about the jump table to add, delete and check the process. In order to facilitate the operation, we set the key of the head node (head) of the hop table to the minimum value of int (must satisfy the left small and the right large to facilitate comparison).
For each node, set it to the SkipNode class. To prevent beginners from confusing next down or right, set the right and Down Pointers directly.
class SkipNode<T>
{
int key;
T value;
SkipNode right,down;// The pointer in the next right direction
public SkipNode (int key,T value) {
this.key=key;
this.value=value; }}Copy the code
The structure and initialization of the hop table are also important. Its main parameters and initialization method are:
public class SkipList <T> {
SkipNode headNode;// Head node, entry
int highLevel;// The number of index layers in the current hop table
Random random;// Used to flip a coin
final int MAX_LEVEL = 32;// The largest layer
SkipList(){
random=new Random();
headNode=new SkipNode(Integer.MIN_VALUE,null);
highLevel=0;
}
// Other methods
}
Copy the code
Query operation
A lot of times a linked list can be linked like this just by a certain element or a key as a criterion for order. So it’s possible that there are some values inside the list. But modify and query are actually one operation, find the key number. And the search process is very simple, set a temporary node team=head. When team is not null, the process is roughly as follows:
(1) Start from team node, if the key of the current node is equal to the key of the query, then return the current node (if it is a modification operation, then continue to modify the value downward).
(2) If the keys are not equal and the right side is null, then the proof can only be down (the result may appear in the lower right direction), in which case team=team.down
(3) If the keys are not equal, the right node key is not NULL, and the right node key is smaller than the key to be queried. So you can go to the right, so team=team.right
(4) (otherwise) If the keys are not equal, the right node key is not NULL, and the right node key is larger than the key to be queried. If there is a result, it will be between this index and the next index, where team=team.down.
The correct node or null will be returned after this step.
For example, to query 12 nodes in the figure above, the first step is to start from head and find that the right side is not empty, and 7<12, go to the right. The second step is null down on the right; Step 3 If 10 is less than 12 on the right of node 7, go to the right. In step 4, the right side of step 10 is null. Step five, 12 on the right is less than or equal to the right. Step 6 Start discover equality return end of node.
And the code for this is pretty easy:
public SkipNode search(int key) {
SkipNode team=headNode;
while(team! =null) {
if(team.key==key)
{
return team;
}
else if(team.right==null)// The right side is gone
{
team=team.down;
}
else if(team.right.key>key)// Need to drop to find
{
team=team.down;
}
else // The right side is smaller to the right{ team=team.right; }}return null;
}
Copy the code
Delete operation
The delete operation is slightly more complex than the query, but simpler than the insert operation. Deletion changes the structure of the list so you need to handle the connections between nodes. Here are some things to keep in mind about deleting:
(1) Delete the current node and the node before and after the relationship
(2) After the node of the current layer is deleted, the node of the key at the next layer should also be deleted, until the node of the lowest layer is deleted
If the current node is found, how to find the node before it? You can’t go through this again! Some use a four-direction pointer (up, down, left, right) to find the left node. Yes, but we can do something special here. Instead of directly determining and operating the node, first find the node to the left of the node to be deleted. This node is used to complete the deletion, and then the node directly goes down to the next layer of left node to be deleted. Set a temporary node team=head, when team is not null, the specific loop process is:
(1) if the right side of team is null, then team=team.down
(2) If the right side of team is not null, and the right side key is equal to the key to be deleted, then delete the node first, then team down team=team.down in order to delete the lower layer node.
(3) If the right side of team is not null, and the right key is less than the key to be deleted, then team=team.right.
(4) If the right side of team is not NULL, and the right side key is greater than the key to be deleted, then team goes down to team=team.down, and continue to search for and delete nodes in the lower layer.
For example, if you delete 10 nodes in the figure above, first team=head starts from team, and 7<10 goes to the right (team=team.right omitted); Step 2: If the right side is null, you can only go down; The right side of part 3 is 10. Delete 10 nodes at the current layer and continue to search for 10 nodes at the next layer. Step 4:8<10 to the right; Step 5 Delete the node with 10 on the right and team down. If team is NULL, the deletion is complete and the loop exits.
Delete operation to achieve the code is as follows:
public void delete(int key)// Delete without considering the number of layers
{
SkipNode team=headNode;
while(team! =null) {
if (team.right == null) {// If the right side is missing, this layer is found
team=team.down;
}
else if(team.right.key==key)// Find the node. The node to be deleted is on the right
{
team.right=team.right.right;// Delete the node on the right
team=team.down;// Continue to search for delete
}
else if(team.right.key>key)// The right side is not possible
{
team=team.down;
}
else { // The node is still on the rightteam=team.right; }}}Copy the code
The insert
Insert operations are the most cumbersome to implement and require the most consideration. Review queries without indexing; Review deletions, if any, for each index level. But insert is different. Insert needs to consider whether to insert index, how many levels to insert and so on. It is impossible to maintain a perfectly ideal index structure because of the insertion and deletion required, because it is too expensive. But we use randomization to determine whether to insert the index at the top. That is, a random number [0-1] is generated. If the number is less than 0.5, the index is inserted upward. After the insertion, the random number is used to determine whether the index is inserted upward. With good luck, this value may have multiple indexes, and with bad luck, only the lowest level is inserted (which is 100% inserted). However, the height of the index must be limited. We usually set the maximum value of the index. If the value is greater than this, the index will not be added.
Let’s take a step by step analysis of how to do it. What’s the process
(1) First, find the left node to be inserted through the above method. If you insert, you must insert at the lowest level, so you insert the node through the linked list.
(2) After this layer is inserted, we need to consider whether the previous layer is inserted. First, judge the current index level. If it is greater than the maximum value, then stop (for example, it has reached the highest index level). Otherwise, set a random number with a probability of 1/2 to insert a layer of index up (because the ideal state is to create an index node up every 2).
(3) Continue the operation of (2) until the probability exits or the number of index layers is greater than the maximum index layer.
There are essentially very important details to consider when it comes to the specific upward insertion. How do I first find the upper-layer nodes to be inserted?
This implementation method may be different, if there are left and up Pointers, then we can find the upper node to insert to the left and up, but if there are only right and down Pointers, we can also use the clever query process to record the falling nodes. Since the node that has been dropped in reverse order is the node that needs to be inserted, the lowest level is no exception (because no match will drop to null to end the loop). In this case, I’m using a data structure called a stack, but I can also use a List. Here is an illustration of an insert.
Second, what if this layer is the highest level index and the index needs to be built up?
First of all, the jump table must have no index at first, and then slowly add nodes to create layer 1 and layer 2 indexes, but what if the node is added to the index beyond the current level?
Create a new ListNode as the new head, set its down point to the old head, and add this head node to the stack. If you are lucky enough to create a new layer of nodes, It’s going to look something like this.
When inserting the upper layer, make sure that all nodes are created (copied), except for the “right” pointing down. The “down” pointing to the previous node can be used as a temporary precursor node. If the number of layers exceeds the current highest level, the head node (entry) needs to change.
More details about this section are explained in the code comment, which reads:
public void add(SkipNode node)
{
int key=node.key;
SkipNode findNode=search(key);
if(findNode! =null)// If there is a node with this key
{
findNode.value=node.value;
return;
}
Stack<SkipNode>stack=new Stack<SkipNode>();// Stores the downward nodes, which may insert nodes on the right
SkipNode team=headNode;// Look for the node to be inserted.
while(team! =null) {// Perform the lookup operation
if(team.right==null)// The right side is gone
{
stack.add(team);// Make a note of the nodes that went down
team=team.down;
}
else if(team.right.key>key)// Need to drop to find
{
stack.add(team);// Make a note of the nodes that went down
team=team.down;
}
else / / to the right{ team=team.right; }}int level=1;// The current number of layers is added from the first layer (the first layer must be added, add first to judge)
SkipNode downNode=null;// Keep the precursors (i.e., down points, initially null)
while(! stack.isEmpty()) {// Insert node into this layer
team=stack.pop();// Throw the left node to be inserted
SkipNode nodeTeam=new SkipNode(node.key, node.value);// The node needs to be recreated
nodeTeam.down=downNode;// Handle the vertical direction
downNode=nodeTeam;// Mark the new node for next use
if(team.right==null) {// If the right side is NULL, insert at the end
team.right=nodeTeam;
}
// Horizontal processing
else {// There is also a node on the right, inserted between the two
nodeTeam.right=team.right;
team.right=nodeTeam;
}
// Consider whether you need to move up
if(level>MAX_LEVEL)// The highest level node has been reached
break;
double num=random.nextDouble();/ / [0, 1] random Numbers
if(num>0.5)// Bad luck ends
break;
level++;
if(level>highLevel)// The head node is higher than the current maximum height but still within the allowed range
{
highLevel=level;
// A new node needs to be created
SkipNode highHeadNode=new SkipNode(Integer.MIN_VALUE, null);
highHeadNode.down=headNode;
headNode=highHeadNode;/ / change the head
stack.add(headNode);// Throw the head next time}}}Copy the code
conclusion
For that, the complete analysis of skip lists is over. Of course, you may see implementations of different types of skip lists, as well as arrays representing the relationships between the upper and lower levels, which can also work, but this article only defines the right and Down directions of the list for a more pure explanation of skip lists.
For jumpers and their competitors, red-black trees, why does Redis use jumpers for zsets? Because a skip table is about as efficient as a red-black tree in terms of look-and-insert maintenance, it’s a linked list, and it can determine the range, and the range problem might not be so easy to query on a tree. And the JDK skip list ConcurrentSkipListSet and ConcurrentSkipListMap. Interested also can consult the source code.
For learning, the complete code is very important, here I posted the complete code, you need to take.
import java.util.Random;
import java.util.Stack;
class SkipNode<T>
{
int key;
T value;
SkipNode right,down;// Left, right, up and down four directions of the pointer
public SkipNode (int key,T value) {
this.key=key;
this.value=value; }}public class SkipList <T> {
SkipNode headNode;// Head node, entry
int highLevel;/ / layer
Random random;// Used to flip a coin
final int MAX_LEVEL = 32;// The largest layer
SkipList(){
random=new Random();
headNode=new SkipNode(Integer.MIN_VALUE,null);
highLevel=0;
}
public SkipNode search(int key) {
SkipNode team=headNode;
while(team! =null) {
if(team.key==key)
{
return team;
}
else if(team.right==null)// The right side is gone
{
team=team.down;
}
else if(team.right.key>key)// Need to drop to find
{
team=team.down;
}
else // The right side is smaller to the right{ team=team.right; }}return null;
}
public void delete(int key)// Delete without considering the number of layers
{
SkipNode team=headNode;
while(team! =null) {
if (team.right == null) {// If the right side is missing, this layer is found
team=team.down;
}
else if(team.right.key==key)// Find the node. The node to be deleted is on the right
{
team.right=team.right.right;// Delete the node on the right
team=team.down;// Continue to search for delete
}
else if(team.right.key>key)// The right side is not possible
{
team=team.down;
}
else { // The node is still on the rightteam=team.right; }}}public void add(SkipNode node)
{
int key=node.key;
SkipNode findNode=search(key);
if(findNode! =null)// If there is a node with this key
{
findNode.value=node.value;
return;
}
Stack<SkipNode>stack=new Stack<SkipNode>();// Stores the downward nodes, which may insert nodes on the right
SkipNode team=headNode;// Look for the node to be inserted.
while(team! =null) {// Perform the lookup operation
if(team.right==null)// The right side is gone
{
stack.add(team);// Make a note of the nodes that went down
team=team.down;
}
else if(team.right.key>key)// Need to drop to find
{
stack.add(team);// Make a note of the nodes that went down
team=team.down;
}
else / / to the right{ team=team.right; }}int level=1;// The current number of layers is added from the first layer (the first layer must be added, add first to judge)
SkipNode downNode=null;// Keep the precursors (i.e., down points, initially null)
while(! stack.isEmpty()) {// Insert node into this layer
team=stack.pop();// Throw the left node to be inserted
SkipNode nodeTeam=new SkipNode(node.key, node.value);// The node needs to be recreated
nodeTeam.down=downNode;// Handle the vertical direction
downNode=nodeTeam;// Mark the new node for next use
if(team.right==null) {// If the right side is NULL, insert at the end
team.right=nodeTeam;
}
// Horizontal processing
else {// There is also a node on the right, inserted between the two
nodeTeam.right=team.right;
team.right=nodeTeam;
}
// Consider whether you need to move up
if(level>MAX_LEVEL)// The highest level node has been reached
break;
double num=random.nextDouble();/ / [0, 1] random Numbers
if(num>0.5)// Bad luck ends
break;
level++;
if(level>highLevel)// The head node is higher than the current maximum height but still within the allowed range
{
highLevel=level;
// A new node needs to be created
SkipNode highHeadNode=new SkipNode(Integer.MIN_VALUE, null);
highHeadNode.down=headNode;
headNode=highHeadNode;/ / change the head
stack.add(headNode);// Throw the head next time}}}public void printList(a) {
SkipNode teamNode=headNode;
int index=1;
SkipNode last=teamNode;
while(last.down! =null){
last=last.down;
}
while(teamNode! =null) {
SkipNode enumNode=teamNode.right;
SkipNode enumLast=last.right;
System.out.printf("%-8s"."head->");
while(enumLast! =null&&enumNode! =null) {
if(enumLast.key==enumNode.key)
{
System.out.printf("%-5s",enumLast.key+"- >");
enumLast=enumLast.right;
enumNode=enumNode.right;
}
else{
enumLast=enumLast.right;
System.out.printf("%-5s".""); } } teamNode=teamNode.down; index++; System.out.println(); }}public static void main(String[] args) {
SkipList<Integer>list=new SkipList<Integer>();
for(int i=1; i<20; i++) { list.add(new SkipNode(i,Awesome!));
}
list.printList();
list.delete(4);
list.delete(8); list.printList(); }}Copy the code
Put the meter to the test and it looks perfect (brag).
Original is not easy, Bigsai please dig friends to help with two things:
-
Like, watch, share and support, you must be the source of my creation power.
-
Wechat search “Bigsai”, follow my public account, not only free to send you e-books, I will also the first time to share knowledge and technology in the public account. Can also pull you into the force buckle punch group together to punch LeetCode.
See you next time!