preface

Ordered sets are very common in our daily life, such as ranking students by their grades, or ranking gamers by their scores. For the underlying implementation of ordered collections, we can use arrays, linked lists, balanced trees, and so on. Arrays are not easy to insert or remove elements; Linked list query efficiency is low, need to traverse all elements; Structures such as balanced trees or red-black trees are efficient but complex to implement.

Therefore, Redis adopted a new data structure – skip list. Skip lists are as efficient as red-black trees, but they are far simpler to implement than red-black trees.

Let’s start our learning journey today!

Basic knowledge of

In accordance with the convention, we will be involved in the following concepts, here to do a brief introduction to facilitate later learning.

&0xFFFF

0x is an identifier used to indicate hexadecimal.

F is the hexadecimal 15, whose binary representation is 1111.

FFFF is 1111 1111 1111 1111 1111.

&0xFFFF is a bit operation with 0xFFFF, just the lower 16 bits.

Introduction to the

Skip lists are the underlying data structures of zsets (ordered sets). Skip list can efficiently keep the elements in order, and achieve a relatively simple and intuitive balanced tree. Redis’s Skip lists are based on the algorithm described by William Pugh in Skip Lists: A Probabilistic Alternative to Balanced Trees. A few changes have been made:

  1. Repeated scores are allowed;
  2. Comparisons involve not only the keys, but also possibly node data (if the keys are equal).
  3. There is a backward pointer, so it is a two-way linked list for easy implementationzrevrangeSuch as the order.

Evolution of skip lists

Skiplist, first of all it’s a list. In fact, it was developed on the basis of ordered linked lists.

Ordinary ordered linked list

Let’s first look at an ordered list. An ordered list is a data structure in which all elements are arranged in ascending or descending order, with each node having a next pointer to the next node and the last node having a NEXT pointer to NULL. An increasing ordered chain is represented as follows:

As shown in the figure, if we want to query the element whose value is 61, we need to start from the first element and search and compare backward in order of 1 -> 11 -> 21 -> 31 -> 41 -> 51 -> 61, with a total of 7 comparisons, and the time complexity is O(N). The insertion and deletion of the ordered linked list need to find the appropriate position before modifying the next pointer. The modification operation basically does not consume time, so the time of insertion, deletion and modification of the ordered linked list mainly lies in the search of elements.

The first evolution of an ordinary ordered list

If we add a pointer to each of the two adjacent nodes, let the pointer point to the next node, as shown in the figure below:

The new pointer connects to a new list, but it contains only half as many nodes (1,21,41,61). Now when we want to look for 61, we look along the new list (in the direction of the green pointer). The search order is 1 -> 21 -> 41 -> 61, a total of 4 comparisons, the number of comparisons is about half of the original.

The second evolution of ordinary ordered linked lists

In the same way, we can continue to add a pointer to each of the two adjacent nodes on the newly generated list at the upper level to view the third level list, as shown in the figure below:

The newly added pointer connects to a new list that contains only half the number of nodes (1,41) of the second layer. Now when we want to find 61, we look along the new list (in the direction of the red pointer). The next pointer to 41 points to null, and the next pointer to 41 points to null. The next pointer to 41 points to null, and the next pointer to 41 points to null. That’s one less search than the last one. When the data volume is large, this advantage is even more obvious.

Regular ordered chain show becomes Redis skiplist

Skiplist was inspired by this idea of a multilayer linked list.

In the way the list is generated above, the number of nodes in each layer of the list is half of the number of nodes in the next layer, so that the search process is very similar to a binary search, so that the time complexity of the search can be reduced to O(logN).

But the insertion of a new node disrupts the strict 2:1 relationship between the number of nodes on the list. If this correspondence is to be maintained, all nodes following the newly inserted node (including the newly inserted node) must be readjusted, which reduces the time complexity back to O(N). The same goes for deleting data.

To avoid this problem, Skiplist does not require a strict mapping between the number of nodes in the two adjacent layers of the linked list. Instead, it randomly selects a level for each node. The newly inserted node will determine whether the node is on the linked list of this layer according to its own level.

Skip list nodes and structures

As we can see from the above, the skip list is made up of multiple nodes, each node is made up of many layers, and each layer has a pointer to the next node in the layer.

The skip list mainly involves server.h and t_zset.c, in which the data structure of the skip list is defined in server.h, and the section operation of the skip list is defined in t_zset.c.

Next, let’s take a look at how skip lists are implemented.

Skip list node

typedef struct zskiplistNode {
    sds ele;
    double score;
    struct zskiplistNode *backward;
    struct zskiplistLevel {
        struct zskiplistNode *forward;
        unsigned long span;
    } level[];
} zskiplistNode;
Copy the code

This structure contains the following properties:

  • Ele: SDS type, used to store string data.
  • Score: The score used to store the sort.
  • Backward: Backward pointer that points only to the node before the lowest level of the current node, the head node, and the first node — Backward points to NULL and is used to traverse the skip list from back to forward.
  • Level: the number of layers of nodes, which is a flexible array. Each node has a different array length (because the number of layers is different). When generating skip list nodes, values ranging from 1 to 64 are randomly generated, and the higher the value, the lower the probability of occurrence.

Each item in the level array contains the following two elements:

  • Forward: points to the next node in this layer. The forward of the tail node points to NULL.
  • Span: the number of elements between the node pointed to by forward and this node. The greater the span value, the more nodes to be skipped. (Between two adjacent nodes, the span of the previous node is 1)

Skip list is one of the underlying implementations of Redis ordered collection. Therefore, ele of each node stores the member member value of the ordered collection, and Score stores the member Score value. The points of all nodes are sorted in descending order. When the members of an ordered collection have the same points, the nodes are sorted in lexicographical order of members.

Skip list structure

typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;
    int level;
} zskiplist;
Copy the code

The structure contains the following properties:

  • Header: Points to the skip list header node. The head node is a special node in the skip list that has 64 elements in the level array. The head node does not store any member and Score values in the ordered collection. The ele value is NULL and the score value is 0. It doesn’t count the total skip list length. When the header node is initialized, 64 elements have a forward value of NULL and a span value of 0.
  • Tail: indicates the tail node of the skip table.
  • Length: indicates the length of the skip list, indicating the total number of nodes except the head node.
  • Level: indicates the height of the skip list.

Through the attributes of skip list structure, we can see that the program can quickly obtain the head node, tail node, length and height of skip list under the time complexity of O(1).

Basic operation

Now that we know the definition of skip list nodes and skip list structures, let’s look at the create, insert, find, and delete operations.

Creating skip Lists

zskiplist *zslCreate(void) {
    int j;
    zskiplist *zsl;

    zsl = zmalloc(sizeof(*zsl)); // Initialize the memory space
    zsl->level = 1; // Set the number of layers to a minimum of 1
    zsl->length = 0; // Set length to 0
    zsl->header = zslCreateNode(ZSKIPLIST_MAXLEVEL,0.NULL); ZSKIPLIST_MAXLEVEL=64; // Create skip list header node with ZSKIPLIST_MAXLEVEL=64
    for (j = 0; j < ZSKIPLIST_MAXLEVEL; j++) { // Assign a value to each layer of the header node in turn
        zsl->header->level[j].forward = NULL;
        zsl->header->level[j].span = 0;
    }
    zsl->header->backward = NULL; // Set the fallback pointer of the header to NULL
    zsl->tail = NULL; // Set the last node to NULL
    return zsl;
}
Copy the code

As you can see, the skip list is created as follows:

First declare a memory space of size sizeof(zskiplist).

Then set the layer level to 1 and skip list length to 0. Then create the header node, where ZSKIPLIST_MAXLEVEL is defined as follows:

#define ZSKIPLIST_MAXLEVEL 64 
Copy the code

The highest number of layer nodes is 64, and our head node is the highest number of layers.

The head node is a special node that does not store member information for ordered collections. The header node is the first node inserted into the skip list, and each entry in the level array has a NULL forward value and a span value of 0.

We then set the backward pointer backward and tail of the head node to NULL.

It makes sense to initialize memory and then set the default values for each member of the skip list structure in turn.

Create a skip list node

Create skip list node code as follows:

zskiplistNode *zslCreateNode(int level, double score, sds ele) {
    zskiplistNode *zn =
        zmalloc(sizeof(*zn)+level*sizeof(struct zskiplistLevel)); // Apply zskiplistNode + flexible array size space
    zn->score = score; // Set node branch
    zn->ele = ele; // Set node data
    return zn;
}
Copy the code

The code for creating skip list nodes is also easy to understand.

First, allocate the memory space, which is the size of the zskiplistNode and the size of the level array.

The last element of the structure of zskiplistNode is a flexible array. The size of the flexible array needs to be specified when applying for memory. The memory size occupied by a node is the sum of the memory size of zskiplistNode and the memory size of each zskiplistLevel.

Then assign the score and ele of the node respectively.

Insert the node

The insertion node is a little bit more important, and a little bit more difficult, so let’s look at it carefully.

First attach the insert node code.

zskiplistNode *zslInsert(zskiplist *zsl, double score, sds ele) {
    zskiplistNode *update[ZSKIPLIST_MAXLEVEL], *x; // update[] the array is used to store the previous node in each layer of the node to be inserted
    unsigned int rank[ZSKIPLIST_MAXLEVEL]; // The rank[] array records the step size of the current layer from the header node to the update[I] node.
    inti, level; serverAssert(! isnan(score)); x = zsl->header;// Iterates over the nodes, as a result of finding the previous node on each layer of the inserted node
    for (i = zsl->level- 1; i >= 0; i--) { // Iterate from top to bottom
        rank[i] = i == (zsl->level- 1)?0 : rank[i+1]; // Assign the initial value of the array to rank[]. The top layer starts with the header node, so 0, and the bottom layer starts with the last node, so rank[I +1].
        while (x->level[i].forward &&
                (x->level[i].forward->score < score ||
                    (x->level[i].forward->score == score &&
                    sdscmp(x->level[i].forward->ele,ele) < 0))) / / forward rules exist and forward node (score less than to be inserted into the forward node score | | (score equal to be inserted into the forward node score && forward node element dictionary value is less than to be inserted into the node element dictionary))
        {
            rank[i] += x->level[i].span; // plus the span of x
            x = x->level[i].forward; // The node moves forward
        }
        update[i] = x; // Record the previous node in the current layer of the inserted node in the update[] array
    }
    level = zslRandomLevel(); // Generate a random layer
    if (level > zsl->level) { // The layer height of the new generated node is significantly higher than the current jump surface
        for (i = zsl->level; i < level; i++) { // Update only the higher parts
            rank[i] = 0; // It's 0 because it's a head node
            update[i] = zsl->header; // This layer only has header nodes
            update[i]->level[i].span = zsl->length; // Since forward points to NULL, span should be all nodes of the skip list, so span is the length of the skip list
        }
        zsl->level = level; // Update the height of the skip list
    }
    x = zslCreateNode(level,score,ele); // x is assigned to the newly created node
    for (i = 0; i < level; i++) {
        x->level[i].forward = update[i]->level[i].forward; // Update the forward pointer in the level[I] layer of the x node
        update[i]->level[i].forward = x; // Update the forward pointer in the level[I] layer of the update[I] node

        x->level[i].span = update[i]->level[i].span - (rank[0] - rank[i]); // Update the level[I] layer span of the x node
        update[i]->level[i].span = (rank[0] - rank[i]) + 1; Update [I] node level[I] layer span
    }

    for (i = level; i < zsl->level; i++) { // When the height of the newly inserted node is smaller than the height of the skip list, the span of the update[] node needs to be updated with fewer layers, i.e. +1
        update[i]->level[i].span++;
    }

    x->backward = (update[0] == zsl->header) ? NULL : update[0]; // Update x backward pointer, NULL if update[0] is the head node, otherwise update[0]
    if (x->level[0].forward) // Update the x node with a subsequent node at layer 0, then backward points to x node, otherwise x node is the last node, and then the tail pointer is pointed to x node
        x->level[0].forward->backward = x;
    else
        zsl->tail = x;
    zsl->length++; // Skip list length +1
    return x;
}
Copy the code

Let’s take a look at the skip list implementation diagram:

Let’s say we want to insert a node 45. We first need to find the insertion location, and then change the locations affected by the node insertion, such as the level of the skip list, the forward pointer at each level of the previous node, and so on.

In the image below, I’ve highlighted in red which areas are affected and need to be modified.

Therefore, the steps for inserting a node are as follows:

  1. Find the position to insert;
  2. Adjust skip list height;
  3. Insert the node and adjust the forward pointer and span for each layer of the affected node;
  4. Adjust the backward.

Now let’s consider the following questions:

  1. Why do you need to find where to insert first and then adjust the height of the skip list?

    Because we look for nodes based on the height of the skip list, we first have to find the highest level, and then go down layer by layer until we find the node. If the level of the newly inserted node is larger than the level of the jump list, we will adjust the height of the jump list first, then we will start from the adjusted height, and then look backwards, but the forward pointer of the layer points to NULL, we will not find the node.

  2. How do I adjust the forward pointer and span for each layer of the affected node and the newly inserted node?

    1. We should search for the affected nodes by layer, that is, the last node in each layer before the node is inserted. We need to record these nodes for later modification, which is recorded as the update[] array in the code.
    2. The forward pointer of each layer of the newly inserted node points to the same node as the forward pointer of the previous node in that layer.
    3. The forward pointer of the affected node at each level points to the newly inserted node.
    4. We need to be able to calculate the distance between the update[] node and the newly inserted node X. If this is not easy to calculate, we will calculate the distance between the update[] node and the node X minus 1 before the newly inserted node X, plus 1 to get the distance to X.
      1. To calculate the distance between update[I] and update[0], we can calculate the distance between update[I] and update[0] based on header.

With that in mind, let’s step through the insert node code.

Variable definitions

    zskiplistNode *update[ZSKIPLIST_MAXLEVEL], *x;
    unsigned int rank[ZSKIPLIST_MAXLEVEL];
    int i, level;
Copy the code

Define two arrays:

  • Update [] : When a node is inserted, the previous node on each layer of the node to be inserted needs to be updated. Since nodes are updated differently at each layer, the nodes to be updated at each layer are recorded in Update [I].
  • Rank [] : records the step of the current layer from the header node to the update[I] node. This is used when updating the span of update[I] and setting the span of new inserted nodes.

Find insertion position

    x = zsl->header;
    for (i = zsl->level- 1; i >= 0; i--) { // Start at the top and go down
        rank[i] = i == (zsl->level- 1)?0 : rank[i+1]; / / statistics
        while (x->level[i].forward &&
                (x->level[i].forward->score < score ||
                    (x->level[i].forward->score == score &&
                    sdscmp(x->level[i].forward->ele,ele) < 0)))
        {
            rank[i] += x->level[i].span;
            x = x->level[i].forward;
        }
        update[i] = x;
    }
Copy the code

According to the above code logic, the search route for nodes with values 25, 35 and 45 to find the insertion position is shown in the figure below:

Let’s go through the code step by step.

    for (i = zsl->level- 1; i >= 0; i--) 
Copy the code

For (ZSL ->level-1); for (ZSL ->level-1);

        rank[i] = i == (zsl->level- 1)?0 : rank[i+1];
Copy the code

The rank[] array is used to record the step from the header node to the Update [I] node in the current layer.

From the figure above, we can see that the node lookup route is “right -> down -> right -> down”.

Rank [ZSL ->level-1] [ZSL ->level-1] [ZSL ->level-1]

When we go down, we actually look up the last node from the upper layer, for example, when we look up the 45 node in the figure above, and when we go down from the fourth layer to the third layer, we look up the 41 node, and the value of rank[2] is the same as the value of rank[3] at the fourth layer.

        while (x->level[i].forward &&
                (x->level[i].forward->score < score ||
                    (x->level[i].forward->score == score &&
                    sdscmp(x->level[i].forward->ele,ele) < 0)))
        {
            rank[i] += x->level[i].span;
            x = x->level[i].forward;
        }
        update[i] = x;
Copy the code

This code illustrates our two comparison principles for finding the insertion location of a node:

  1. The next node exists on the current node && The score of the next node is smaller than that of the node to be inserted
  2. There is a next node at the current node layer && the score of the next node is equal to the score of the node to be inserted && the field sort of the value of the next node is less than the dictionary sort of the value of the node to be inserted

That’s what we’re talking about, comparing scores when they’re not equal, comparing scores when they’re equal, lexicographical sorting of values, so there’s no such thing as two things being equal.

Rank [I] = = rank[I] = = = = = = = = = = = = = = = = = = = = = = = = = = =

The node moves forward to the next node.

When a layer completes the loop, two conditions should be satisfied:

  1. x->forward == NULL
  2. x->forward ! = NULL && (x->forward.score > score || (x->forward.score == score && sdscmp(x->level[i].forward->ele,ele) > 0))

Update [I] = x; update[I] = x;

Loop until the end of the first layer, at which point we have found the position to insert, and record the previous node of each layer of node insertion in the update[] array, as well as the step of each node in the Update [] array to the header node.

Select * from update where length=3 and level=2; select * from update where rank =3 and level=2;

Obtain the layer height of the new node

    level = zslRandomLevel();
Copy the code

The height of each node is randomly generated, that is, the so-called probability balance, rather than forced balance, so the insertion and deletion of nodes is more concise and efficient than the traditional balanced tree algorithm.

The generation method is as follows:

int zslRandomLevel(void) {
    int level = 1;
    while ((random()&0xFFFF) < (ZSKIPLIST_P * 0xFFFF)) / / ZSKIPLIST_P = 0.25
        level += 1;
    return (level<ZSKIPLIST_MAXLEVEL) ? level : ZSKIPLIST_MAXLEVEL;
}
Copy the code

In the above code, level starts with a value of 1. Through the while loop, we generate a random value each time. We take the lower 16 bits of this value as x, and increse the value by 1 when x is less than 0.25 times 0XFFFFFF. Otherwise exit the while loop, eventually returning the minimum of both level and ZSKIPLIST_MAXLEVEL.

The expected height of the node is calculated below. Suppose p = ZSKIPLIST_P;

  1. The probability of node layer 1 is (1-p).
  2. The probability of node layer 2 is P (1-p).
  3. The probability of node layer 3 is P ^2(1-p).
  4. The probability that the node layer is N is p^n-1(1-p).

Therefore, the expected height of the node is:

When p=0.25, the expected height of skip table nodes is 1/(1-0.25)≈1.33.

Update jump surface height and the update[] and rank[] arrays

    if (level > zsl->level) {
        for (i = zsl->level; i < level; i++) {
            rank[i] = 0;
            update[i] = zsl->header;
            update[i]->level[i].span = zsl->length;
        }
        zsl->level = level;
    }
Copy the code

This operation is performed only when the height of the node to be inserted is higher than that of the current skip list.

zsl->level = level; It is ok to assign the height of the skip list to the highest height.

Let’s take this graph as an example:

I = ZSL ->level; I = ZSL ->level; , start at the second floor. Layer 2 only needs to update the header node, so update[I] = ZSL ->header. Rank [I] is 0.

Update [2]->level[2]. Span = span; update[2]->level[2]. Update [2]->level[2].span = update[2]->level[2].span = update[2]->level[2].span

As to whyupdate[2]->level[2].spanThe value of is set to the total length of the skip list that we can start fromspanTo think about the definition of theta.spanIs the number of elements between the node pointed to by forward and this node. whileupdate[2]->level[2].forwardIt points to the NULL node and is separated by all nodes of the skip list, so it is assigned to the total length of the skip list.

The jump list after adjusting the height is shown in the figure below:

Insert the node

    x = zslCreateNode(level,score,ele);
    for (i = 0; i < level; i++) {
        x->level[i].forward = update[i]->level[i].forward; 
        update[i]->level[i].forward = x;

        x->level[i].span = update[i]->level[i].span - (rank[0] - rank[i]);
        update[i]->level[i].span = (rank[0] - rank[i]) + 1;
    }
    
    for (i = level; i < zsl->level; i++) {
        update[i]->level[i].span++;
    }
Copy the code

The change to the forward value is easy to understand as simply inserting nodes into a linked list.

Update [I]->level[I]. Span – (rank[0] -rank [I]) + 1 update[I]->level[I]. Span – (rank[0] -rank [I]) + 1

Let’s look at the figure below to get a deeper understanding of the span assignment process.

First, we should know something about the following points:

  1. Rank [I] indicates the step from the header node to the Update [I] node at the current layer.
  2. Rank [0] indicates the step from the header node to the update[0] node at layer 0.
  3. Rank [1] indicates the step from the header node to the update[1] node at layer 1.
  4. The span in level[0] should always be 1 (meaning that 1 refers to the nodes pointed to by forward, but not the nodes themselves).

Let’s take the Update [1] node as an example, and the same goes for other nodes.

So, rank [0] – rank the update [1] [1] is actually node to update [0] distance (score = 1 nodes to score = 21 node distance)

Update [1]->level[1]. Span = “update[1]->level[1]. Span”; The remaining distance is between the newly inserted node and the update[1]->level[1]. Forward node.

Update [0] -rank [1] + 1 = update[1]->level[1].span = rank[0] -rank [1] + 1 = update[1]->level[1].span = = update[1]->level[1].span

Let’s abstract the problem:

Let’s say I have nodes A and B, and I insert X between them,

  • rank[0] - rank[i]It’s the distance from A to the node before X, X minus 1;
  • update[i]->level[i].spanIt’s the distance from A to B;
  • update[i]->level[i].span - (rank[0] - rank[i])We just calculate the distance from X to B.
  • update[i]->level[i].span = (rank[0] - rank[i]) + 1It’s the distance from A to X minus 1 plus 1, so it’s the distance from A to X.

The principle of calculation is to open left and close right.

Using the above algorithm, let’s actually walk through the insertion process. The value of level is 3, so the for loop can be executed three times. The insert process is as follows:

  1. The first for loop

    1. The forward of level[0] of x is the forward node of level[0] of update[0], that is, x->level[0]. The forward node is score=41.
    2. The next node in level[0] of update[0] is the newly inserted node.
    3. Rank [0]-rank[0]=0, update[0]->level[0].span=1, so x->level[0].span=1.
    4. The update [0] – > level [0]. Span = 0 + 1 = 1.

    The jump list after inserting the node and updating Layer 0 looks like this:

  2. Second for loop

    1. The forward of level[1] of x is the forward node of level[1] of update[1], that is, x->level[1]. The forward node is NULL.
    2. The next node of level[1] of update[1] is the newly inserted node.
    3. Rank [0]-rank[1]=1, update[1]->level[1].span=2, so x->level[1].span=1.
    4. The update [1] – > level [1]. The span = 1 + 1 = 2.

    The jump list after inserting the node and updating layer 1 looks like this:

  3. Third for loop

    1. The forward of level[2] of x is the forward node of level[2] of update[2], that is, x->level[2]. The forward node is NULL.
    2. The next node in level[2] of update[2] is the newly inserted node.
    3. Update [2]->level[2].span=3, so x->level[2].span=1.
    4. The update [2] – > level [2]. The span = 2 + 1 = 3.

    After inserting the node and updating layer 2, the skip list looks like this:

The height of the newly inserted node is greater than the height of the original skip list, so the following code will not run. Update [I] = ZSL ->level-1; update[I] = ZSL ->level-1; update[I] = ZSL ->level-1 Just add one to the span of each level.

    for (i = level; i < zsl->level; i++) {
        update[i]->level[i].span++;
    }
Copy the code

Adjust the backward

    x->backward = (update[0] == zsl->header) ? NULL : update[0];
    if (x->level[0].forward)
        x->level[0].forward->backward = x;
    else
        zsl->tail = x;
    zsl->length++;
    return x;
Copy the code

According to the assignment process of update, the previous node of the newly inserted node must be update[0]. Since there is only one backward pointer for each node, which has nothing to do with the number of layers of this node, when the inserted node is not the last node, the backward of the inserted node needs to be updated to point to update[0]. If the new insert node is the last node, the end node of the skip list needs to be updated to the new insert node. After insertion and order, the length of the update skip list is increased by 1.

The jump list after the new node is inserted looks like the figure below:

Remove nodes

With the learning from inserting nodes above, it should be easier to understand how to delete nodes.

We can simply delete a node in two steps:

  1. Find the node to be deleted.
  2. Set span and forward.

Delete node code as follows:

void zslDeleteNode(zskiplist *zsl, zskiplistNode *x, zskiplistNode **update) {
    int i;
    for (i = 0; i < zsl->level; i++) {
        if (update[i]->level[i].forward == x) { Update [I]. Level [I] = level[I]
            update[i]->level[i].span += x->level[i].span - 1;
            update[i]->level[i].forward = x->level[i].forward;
        } else {// update[I]. Level [I] forward node is not x, only update span
            update[i]->level[i].span -= 1; }}if (x->level[0].forward) { // Update backward if x is not a tail node
        x->level[0].forward->backward = x->backward;
    } else { // Otherwise update the tail node
        zsl->tail = x->backward;
    }
    while(zsl->level > 1 && zsl->header->level[zsl->level- 1].forward == NULL)
        zsl->level--; // Update skip table level
    zsl->length--; // Update skip list length
}
Copy the code

Find the node that you want to delete

To find the node to be deleted, we need to use the update array. The assignment method of the array is the same as the assignment method of the Update in the inserted node. Update [2]=header, update[1] = score=1, update[0] =21 The jump list before deleting a node is as follows:

Set span and forward

The code for setting span and forward is as follows:

    for (i = 0; i < zsl->level; i++) {
        if (update[i]->level[i].forward == x) {
            update[i]->level[i].span += x->level[i].span - 1;
            update[i]->level[i].forward = x->level[i].forward;
        } else {
            update[i]->level[i].span -= 1; }}Copy the code

Let’s look at the span assignment first. The value assigned to span when deleting a node is as shown in the figure below:

Suppose we want to delete nodes with score=21, then update[0] and UPDATE [1] should be nodes with Score =1, and update[2] should be the head node.

There are two cases of updating the span and forward of the node:

  1. Update [I]->level[0] ->level[0]

    1. update[0].level[0].spanSelect * from update[0];
    2. x.level[0].spanIs the distance between x and x. lever [0].forward;
    3. update[0].level[0].span + x.level[0].spanIs the distance between update[0] and x.leeve [0]. Forward;
    4. update[0].level[0].span + x.level[0].span - 1Is the distance between update[0] and x.reuel [0]. Forward after deleting x;
    5. update[0].level[0].forwardIs the x.l evel [0]. Forward.
  2. Update [I]->level[1]

    1. Update [I]. Level [I]. Forward = NULL;
    2. Note The height of update[I] is higher than that of the x node. Therefore, you do not need to change the forward value, but only need to change the span -1.

With span and forward set, the skip list looks like this:

After the update node is completed, update the backward pointer, jump list height and length, and if x is not the last node, assign the value of backward of the node after level 0 to x. Otherwise, simply point the tail pointer of the skip list to x’s backward node. The code is as follows:

    if (x->level[0].forward) {
        x->level[0].forward->backward = x->backward;
    } else {
        zsl->tail = x->backward;
    }
Copy the code

When the deleted x node is the highest node in the skip list, and no other node has the same height as the X node, the height of the skip list needs to be reduced by 1.

Since a node has been deleted, the length of the skip list needs to be subtracted by 1.

The jump list after the node is deleted is as follows:

Delete skip list

Deleting skip lists is a little easier. After the skip list object is acquired, it starts at layer 0 of the node and iterates backwards through the forward pointer, freeing the memory without encountering a node. When the memory of all nodes is released, the skip list object is released, that is, the hop list deletion operation is completed. The following code

void zslFree(zskiplist *zsl) {
    zskiplistNode *node = zsl->header->level[0].forward, *next;

    zfree(zsl->header);
    while(node) {
        next = node->level[0].forward;
        zslFreeNode(node);
        node = next;
    }
    zfree(zsl);
}
Copy the code

Skip list application

In Redis, skip lists are primarily applied to the underlying implementation of ordered collections (another implementation of ordered collections is compressed lists).

There are two configurations for the underlying implementation of ordered collections in redis.conf:

zset-max-ziplist-entries 128 // Zset specifies the maximum number of elements in a compressed list. The default value is 128.
zset-max-ziplist-value 64 // Zset specifies the maximum length of each element in a compressed list. The default is 64.
Copy the code

The main logic for adding elements to zset is in the zaddGenericCommand function in t_zset.c. When zset inserts the first element, it determines two conditions:

  • zset-max-ziplist-entriesWhether the value of is equal to 0;
  • zset-max-ziplist-valueIs less than the string length of the element to be inserted.

If any condition is met, Redis will adopt skip list as the underlying implementation, otherwise adopt compressed list as the underlying implementation.

        if (server.zset_max_ziplist_entries == 0 ||
            server.zset_max_ziplist_value < sdslen(c->argv[scoreidx+1]->ptr))
        {
            zobj = createZsetObject(); // Create a skip list structure
        } else {
            zobj = createZsetZiplistObject(); // Create a compressed list structure
        }
Copy the code

Normally, zset_max_ziplist_entries are not configured to be 0, and elements do not have long strings, so when creating ordered collections, the default is to have an underlying implementation of compressed lists. When zset inserts a new element, it determines two conditions:

  • The number of elements in the Zset is greater thanzset_max_ziplist_entries;
  • The length of the string to insert the element is greater thanzset_max_ziplist_value.

When slow as any condition, Redis will change the underlying implementation of zset from compressed list to skip list, code as follows:

            if (zzlLength(zobj->ptr) > server.zset_max_ziplist_entries ||
                sdslen(ele) > server.zset_max_ziplist_value)
                zsetConvert(zobj,OBJ_ENCODING_SKIPLIST);
Copy the code

It is worth noting that after the zset is converted to a skip list, it does not return to a compressed list even if the elements are gradually removed.

conclusion

In this chapter, we introduce the evolution, basic principles, and implementation of skip lists.

The evolution process is to extract some points at interval on the basis of the linked list and form a new linked list at the top, which is similar to dichotomy and achieves the effect of half time, but different from dichotomy, because the layer height of newly inserted nodes is randomly generated, which is the so-called probability balance. This ensures that the average complexity of skip list query, insert and delete is O(logN).

The implementation of skip lists, we focused on insertion nodes, where we introduced two arrays, the Update [] and the Rank [] arrays, which we need to understand in order to understand the insertion process.

With that in mind, let’s ask ourselves a few questions:

  1. What is skip list? How did skip lists evolve from ordered lists? What’s the time complexity?
  2. How do skip lists maintain the balance of linked lists? (Key point: random functions generate layers, the higher the number of layers, the lower the probability)
  3. How do skip lists insert nodes? (Key points: update[] and rank[] array, update[] array records each layer of nodes before insertion, rank[] array records the distance between the head node and the update[] node)
  4. Skip list structure? (Key points: Length, level, header, tail)
  5. Skip list node structure? (Key points: score, backward, ele, level)
  6. What data structures are used in the Implementation of Zset in Redis? When do you use a compressed list? When do you use skip lists? (Key: Entries =0, value>128)
  7. Why did Redis choose skip lists over other balance structures? (Key: As efficient as a red-black tree, but easier to implement)

If you can figure out the answers to these questions, you should be familiar with skip lists.

Reference documentation

  • Redis skiplist (skiplist) implementation
  • Redis 5 Design and source code Analysis – edited by Lei Chen
  • Design and Implementation of Redis. By Huang Jianhong