Introduction to redis database

Redis is a key-value database. Redis is also known as a non-relational database, as opposed to relational databases such as MySQL. For relational databases like MySQL, the table structure is complex and contains many fields. You can use SQL statements to achieve very complex query requirements. Redis only contains two parts: “key” and “value”. You can only query “value” by “key”. Because of this simple storage structure, Redis is also very efficient in reading and writing.

In addition, Redis is primarily used as an in-memory database, that is, data is stored in memory. Although it is often used as an in-memory database, it also supports storing data on hard disk, known as persistence.

In Redis, the data type of key is string. However, in order to enrich the way of data storage and facilitate the use of developers, there are many data types of value. The commonly used data types are string, column table, dictionary, set and ordered set.

String (string)

Let’s take a look at the data structure of Redis RedisObject, as follows:

typedef struct redisObject {
    // External type string list set Hash zset 4 bits
    unsigned type:4;
    // The underlying storage mode is 4bit
    unsigned encoding:4;
    // LRU time 24bit
    unsigned lru:LRU_BITS;
    // Reference count 4 bytes
    int refcount;
    // The pointer to the object is 8byte
    void *ptr;
} robj;
Copy the code

As we all know, Redis is written in C. In C, the standard string form is terminated by the null character \0, but the string in Redis does not use the C string directly. The main reason is that the standard function strlen can be called in C language to obtain the string length. The time complexity of this function is O(N). Since Redis is single-threaded, the time complexity cost is a little high.

Redis uses different types to store different objects. There are different types of encoding stored for the same type. For strings, there are three basic encoding methods: int, embstr, and RAW.

  • Int: If the string to be stored is full of numbers, the string is stored in int mode.
  • Embstr: If the string contains less than 44 characters, embstr is used to store the string.
  • Raw: When the string length is greater than 44 characters, the raw mode is used to store data.

Use object Encoding key to view the encoding type of the key, as shown below:

For the embstr and RAW encoding types, the storage is not quite the same. For the EMbSTR type, the RedisObject header and SDS are linked together, but for the RAW type, the memory addresses are not consecutive. As shown below:

SDS

When we introduced the string storage type, we said that embSTR and RAW are stored differently, but PTR Pointers end up pointing to an SDS structure. So what is SDS? The strings in Redis are called Simple Dynamic Strings, or SDS for short.

Compared with the original string structure of ordinary C language, SDS has more header information of SDSHDR. The basic SDSHDR data structure is shown as follows:

struct sdsshr<T>{
    T len;// Array length
    T alloc;// Array capacity
    unsigned  flags;/ / SDSHDR type
    char buf[];// The contents of the array
}
Copy the code

As you can see, the structure of SDS is somewhat similar to ArrayList in Java. Buf [] represents the actual stored string contents, alloc represents the length of the allocated array, len represents the actual length of the string, and thanks to len, Redis can get the array length in O(1) time complexity.

Advantages of SDS:

  • Length Time complexity O(1);
  • Avoid frequent memory allocation;
  • Binary security;

List

Let’s look at the list first. The list data type supports storing a set of data. This data type corresponds to two implementations, one is a ziplist (ziplist) and the other is a bidirectional circular list (actually optimized for QuickList).

When the amount of data stored in the list is relatively small, the list can be implemented in the form of compressed list. The following two conditions should be met:

  • The single data stored in the list (possibly as a string) is less than 64 bytes;
  • The number of data in the list is less than 512.

Let’s take a look at the structure of redis list with C language:

// Define the structure of the list node
typedf struct listNode{
    // The previous node
    struct listNode *prev;
    // The last node
    struct listNode *next;
    // A pointer to the value of the current node
    void *value;
}listNode;

Copy the code

ziplist

I’ll explain a little bit about compressed lists. It is not an underlying data structure, but a data storage structure designed by Redis itself. It’s kind of like an array, where you store data in a contiguous piece of memory. However, it differs from arrays in that it allows different sizes of data to be stored. The specific storage structure is also very simple, as you can see in the picture I draw below.

Ziplist structure source code as follows:

typedf struct ziplist<T>{
    // The number of characters to compress the list
    int32 zlbytes;
    // The offset from the start of the last element, used to quickly locate the last node
    int32 zltail_offset;
    // Number of elements
    int16 zllength;
    // Element content
    T[] entries;
    // End bit 0xFF
    int8 zlend;
}ziplist
Copy the code

Now let’s see, how to understand the word “compression” in the compression list? When you hear the word “compression,” the immediate response is to save memory. The reason why this storage structure saves memory is compared to the array storage idea. As we know, arrays require each element to be the same size, so if you want to store strings of different lengths, you need to use the maximum string size as the element size (say 20 bytes). When you store strings less than 20 bytes long, you waste some storage space. It sounds like a mouthful, so let me draw a picture to explain it.

On the one hand, compressed list can save memory, on the other hand, it can support the storage of different types of data. Also, because the data is stored in a contiguous memory space, it is very efficient to read data with list values by keys.

quicklist

When the amount of data stored in a list is too large to satisfy both conditions, the list is implemented using a two-way loop (qiuckList) table.

In the linked list, we talked about the data structure of a two-way circular linked list, so if you don’t remember, you can go back and review it. Here we focus on Redis bidirectional linked list encoding implementation.

Redis this kind of bidirectional linked list implementation way, is worth learning from. In addition, it defines a list structure to organize the first and last Pointers of the list, as well as the length. In this way, it will be very convenient when using.

QucikList is a hybrid of zipList and bidirectional linkedList. It splits the linkedList into segments, each of which is stored compactly using a zipList, and connects multiple Ziplists using a bidirectional pointer. The schematic diagram is as follows:

The QuickList structure is defined as follows:

typedf struct quicklist{
    // point to the header
    quicklistNode* head;
    // point to the tail node
    quicklistNode* tail;
    // Total number of elements
    long count;
    // Number of QuickListNodes
    int nodes;
    // Compress algorithm depth
    intcompressDepth; . }quickListCopy the code

Dict

Dictionary types are used to store a set of data pairs. Each data pair contains two key and value parts. Dictionary types can also be implemented in two ways. One is the compressed list that I just talked about, and the other is the hash table.

Similarly, Redis uses compressed lists to implement dictionary types only when the amount of data stored is small. Two conditions need to be met:

  • The keys and values stored in the dictionary should be smaller than 64 bytes;
  • The number of key-value pairs in the dictionary should be less than 512.

When both conditions are not met, Redis implements dictionary types using hash tables. Redis uses MurmurHash2, a fast and random hash algorithm, as a hash function. Redis uses linked lists to solve hash conflicts. Redis also supports dynamic expansion and shrinkage of hash tables.

As data increases dynamically, the load factor of the hash table keeps growing. To prevent performance degradation of the hash table, when the load factor is greater than 1, Redis triggers expansion, expanding the hash table to about twice its original size (the exact value needs to be calculated, you can read the source code if you are interested).

When data is dynamically reduced, to save memory, Redis triggers scaling down to approximately twice the size of the dictionary when the load factor is less than 0.1 (this value is also calculated, you can also read the source code if you are interested).

Dict data structure is shown as follows:

Set

Collection is a data type used to store a non-repeating set of data. There are also two implementations of this data type, one based on ordered arrays and one based on hash tables.

When the data to be stored meets the following two conditions at the same time, Redis uses ordered arrays to realize the data type of collection.

  • The data stored is all integers;
  • A maximum of 512 data elements are stored.

When both conditions cannot be met, Redis uses hash tables to store the data in the collection.

Ordered set (Zset)

Ordered sets are a data type that we’ve talked about in detail in the skip chart. It is used to store a set of data, and each data is given a score. By the size of the score, we organize the data into a data structure such as a jump table to support the rapid acquisition of data by score value and score interval.

In fact, as with other Redis data types, ordered collections are not just implemented as hoppers. When the amount of data is small, Redis uses compressed lists to implement ordered collections. Specifically, the premise of using compressed lists to achieve ordered sets is as follows:

  • All data should be less than 64 bytes in size;
  • The number of elements must be less than 128.

conclusion

The above is the data structure introduction of the five data types of Redis, and some mind brain maps are also summarized. Due to the limited space of this article, only list mind brain maps are included here. If you want to obtain other summarized mind brain maps, you can follow the public account [Go Keyboard Man] and reply to “brain maps”.