Briefly record the basic data structure of REDis and the principle of internal implementation. In fact, each structure can be written separately, there is time to break down the details.

String

Equivalent to an ArrayList in Java. Inside is a character array (dynamic array).

C defines an array of strings “redis”. The structure of the array is as follows:

[' r ', 'e', 'd', 'I', 's', '\0']Copy the code

The last ‘\0’ is a null character, indicating the end of the string.

Redis does not directly use the string structure of C, but encapsulates it, using the abstract type of Simple Dynamic String (SDS).

In Redis, SDS is its own string by default. C strings are used only in places where string variation is unlikely.

SDS

The definition is as follows:

struct sdshdr {
    // The length of the string
    int len;
    // The number of free bytes
    int free;
    // A byte array to store strings
    char buf[];
};
Copy the code

It’s got len and it’s got free.

Note: the size of buf is len+free+1, where the extra byte is used to store ‘\0’.

The structure of the storage is:

Why not use THE C language string implementation instead of SDS? What are the benefits of this implementation?

  • Constant complexity gets the length of the string. Get the length directly from len. Getting too long arrays is not a redis bottleneck.
  • Prevent buffer overflows. When the string is modified, use free to check whether the space is sufficient. If not, expand the space.
  • Reduce the number of memory reallocations for modifying strings.
    • C does not record the length of the string. Therefore, if you want to modify the string, you must reallocate the memory (free and then apply). If you do not reallocate the string, the memory buffer will overflow when the string length increases, and the memory will leak when the string length decreases.
    • For SDS, due to the existence of len attribute and free attribute, two strategies of space pre-allocation and lazy space release are implemented for SDS
      • Space preallocation: When you expand a string, you expand more memory than you need. This reduces the number of memory reallocations required for successive string growth operations.
      • Lazy space free: When a string is shortened, the program does not immediately use memory reallocation to reclaim the shortened extra bytes. Instead, it uses the free property to record the number of bytes for later use.
  • Binary security, the C string is marked by a null character as the end of the string. Some special binary file contents may contain null characters, so the C string cannot be accessed correctly. SDS is judged by LEN to be final.
  • Compatible with some C string functions
    • Although SDS is binary safe, it also follows the convention that every string ends in an empty string, which makes it possible to reuse some functions from the C library

      .

capacity

If the data is less than 1 MB, double the current space.

If the data is larger than 1 MB, expand 1 MB space at a time.

The maximum length of the string is 512 MB, that is, the maximum length can be expanded to 512 MB.

List

Similar to LinkedList in Java. Inside is a bidirectional linked list.

Linked list nodes are defined as follows:

typedef  struct listNode{
       // The front node
       struct listNode *prev;
       // back node
       struct listNode *next;
       // Node value
       void *value;  
}listNode
Copy the code

Bidirectional linked list definition:

typedef struct list{
     // Table header node
     listNode *head;
     // Table end node
     listNode *tail;
     // The number of nodes in the linked list
     unsigned long len;
     // Node value copy function
     void (*free) (void *ptr);
     // Node value release function
     void (*free) (void *ptr);
     // Node value comparison function
     int (*match) (void *ptr,void *key);
}list
Copy the code

There’s nothing to be said for bidirectional lists, but Redis actually stores more than a simple LinkedList internally. It’s a QuickList structure. And when the list has fewer elements, it is optimized to a ZipList (compressed list) structure.

ZipList (Compressed list)

In the case of a small number of elements, to save space, a contiguous memory space is used to store the list. This is called ziplist.

A ZIplist is a sequential storage structure composed of a series of specially encoded contiguous memory blocks. Similar to an array, a ZIplist is stored contiguously in memory, but unlike an array, the size of memory occupied by each element of a ziplist can be different to save memory.

Ziplist structure is shown as follows:

  • Zlbytes: indicates the ziplist length (in bytes), which is a 32-bit unsigned integer

  • Zltail: The offset of the last ziplist node, useful when traversing the ziplist or pop tail node backwards.

  • Zllen: indicates the number of ziplist entries

  • Entry: node. A place to store elements.

  • Zlend: the value 0xFF is used to mark the end of a ziplist

The most important implementation is the implementation of Entry.

entry

The structure of entry is shown in the figure.

  • The previous_entry_length field indicates the length of the previous element in 1 or 5 bytes.
    • If the length of the current element is less than 254 bytes, the previous_entry_length field is represented by one byte.
    • If the length of an element is greater than or equal to 254 bytes, it cannot be represented by a single byte.

  • Encoding: The encoding stores the content type and length of the node’s content. There are two types of encoding, one is a byte array and the other is an integer. The length of the encoding field can be 1, 2, or 5 bytes.
  • Content: The content field is used to hold the content of the node. The content type and length of the node are determined by Encoding.

QuickList

QuickList is simply a series of ziplists linked together using bidirectional Pointers.

Its internal structure is as follows:

Hash

Similar to a HashMap in Java. Inside is the array + zipper way. It’s just that the dictionary in Redis can only be strings.

The Hash type is used to store objects in Redis.

The hash table is defined as follows

typedef struct dictht{
     // Hash table array
     dictEntry **table;
     // Hash table size
     unsigned long size;
     // Hash table size mask, used to calculate index values
     // always equal to size-1
     unsigned long sizemask;
     // The number of existing nodes in the hash table
     unsigned long used;
 
}dictht
Copy the code

Hash table consists of array table, in which each element points to dict.h/dictEntry structure, which is defined as follows:

typedef struct dictEntry{
     void *key;
     union{
          void *val;
          uint64_tu64;
          int64_ts64;
     }v;
     / / the next
     struct dictEntry *next;
}dictEntry
Copy the code

The internal structure is shown as follows:

Here are some points that are as important as hashMap.

The hash algorithm

#1Hash = dict->type->hashFunction(key); #2Sizemask index = hash & dict->ht[x].sizemask;Copy the code

Arithmetic and mask the hash value of the key and the length of the array.

Expansion & contraction

capacity

Trigger condition: The program will extend the hash table if any of the following conditions are met:

  • The server is not currently running bgsave or bgrewriteaof, and the hash table load factor is >=1
  • The server is currently running bgsave or bgrewriteaof with hash table load factor >=5

Each extension creates another hash table by doubling the space already used by the original hash table.

shrinkage

Trigger condition: When the load factor is less than 0.1, the program will shrink the hash table

Each shrink creates a new hash table based on doubling the size of the used space.

Incremental rehash

This is a very different point from hashMap. Redis uses progressive hash.

Because rehash is a time-consuming operation, if the dictionary is large it will block subsequent operations during rehash. Progressive rehash preserves both the old and new hash structures. In the subsequent scheduled tasks and hash operation commands, data from the old hash table is slowly migrated to the new hash structure.

The hash process is shown below.

Create a new Hash table ht[1].

Data from the old table is migrated step by step to the new table, and the old table is automatically reclaimed.

Ht [0], HT [1],ht[0],ht[1]

  • If a key is found, it is first searched in HT [0] and then in HT [1].
  • New key-value pairs added to the dictionary are stored in HT [1] instead of HT [0].

Set

Similar to a HashSet in Java. The inside is like a special dictionary, except that the values of the dictionary are null.

ZSet

Similar to Java HashMap and SortSet combination.

ZSet ensures that the elements of the set are unique and can guarantee a certain order. ZSet assigns a score to each value, and we sort by score.

There are two underlying storage structures for Zsets in Redis: Ziplist and skiplist.

ziplist

Ziplist is used to store elements when the following two conditions are met.

  • The number of saved elements is less than 128
  • The length of all saved elements is less than 64 bytes

Since we are inserting into the linked list in fractional order, space and time are compromised, with the advantage of space > advantage of time.

skiplist

Skiplist is used when the above two conditions are not met (space advantage < time advantage).

The core of zset’s Skiplist consists primarily of a dict object and a Skiplist object.

Structure definition:

typedef struct zset{
     / / jump table
     zskiplist *zsl;
     / / a dictionary
     dict *dice;
} zset
Copy the code
  • The dictionary is used to store the mapping between value and score.

  • Skiplist is a linked list (skiplist) of ordered elements.

The two data structures share the members and points of the same element through Pointers, so there are no duplicate members and points, resulting in a waste of memory.

Structure diagram of skip table:

Skip table is a convenient data structure for searching ordered linked list. The time complexity of finding a single key, skiplist, and balanced tree is O(log n), roughly the same.

Procedure for creating a hop table:

reference

  • Redis Design Implementation
  • Redis Deep Adventure
  • Some images source: link