concept

Redis is an open source non-relational database written in C, based on excellent CRUD efficiency, often used for the cache of software systems, it itself provides the following five data formats:

  • String: string
  • List the list:
  • Hash: hash table
  • Set: unordered set
  • Zset: Ordered set

Next, we will analyze the underlying structure of these five data structures

The version chosen here is Redis-5.0.4, so there may be a lot of differences with other blogs on the web today, which I will point out in this post

string

Because Redis uses C language development, so naturally there is no Java and C ++ those String class library, in Redis, its own definition of a String format, called SDS (Simple Dynamic String), that is, Simple Dynamic String

This structure is defined in SDS.h:

typedef char *sds;
Copy the code

But this SDS type is only used as a parameter and return value, and is not really used as an operation type. The real core part is the following classes:

struct __attribute__(((packed__)) sdshdr5 {
    unsigned char flags; 
    char buf[];
};
struct __attribute__(((packed__)) sdshdr8 {
    uint8_t len; 
    uint8_t alloc; 
    unsigned char flags; 
    char buf[];
};
struct __attribute__(((packed__)) sdshdr16 {
    uint16_t len;
    uint16_t alloc; 
    unsigned char flags;
    char buf[];
};
struct __attribute__(((packed__)) sdshdr32 {
    uint32_t len;
    uint32_t alloc; 
    unsigned char flags; 
    char buf[];
};
struct __attribute__(((packed__)) sdshdr64 {
    uint64_t len; 
    uint64_t alloc;
    unsigned char flags; 
    char buf[];
};
Copy the code

Excluding the first structure (deprecated), SDS specific types of structures can be divided into the following sections:

  • Len: Used length, that is, the actual length of the string
  • Alloc: Length after removing the header and terminator (‘\0’)
  • Flags: The lower 3 bits indicate the string type, the remaining 5 bits are unused (I can’t see where Redis uses this property yet)
  • Buf [] : stores character data

Here is a comparison with the old version, because I only have 4.x and 5.x versions at hand, their SDS implementation is consistent, but according to others, the implementation of the previous VERSION of SDS is different, I will download it sometime and have a look, which divides the string into the following parts:

  • Len: The length already held in buf (represents the actual length of this string)
  • Free: unused buffer length in BUF
  • Buf [] : where the string data is actually stored

Redis has also written and rewritten a large number of methods related to SDS types, so why redis put so much effort into it? There are four advantages:

  • Reduced the time complexity of getting string length to O(1)
  • Reduced the number of memory reallocations when modifying strings
  • Compatibility with C strings while improving the efficiency of some string tool methods
  • Binary security (data is written in the same format as it is read)

list

If we look at the source file, we can see that there are two lists, one is ziplist, which literally means compressed list, and the other is quicklist, which literally means fast list, and in Redis we use quicklist directly, but let’s look at ziplist first

ziplist

Ziplist is not a class name, but is structured like this:











The meanings of each part are as follows:

  • Zlbytes: 4 bytes (32bits), indicating the total number of bytes occupied by ziplist
  • Zltail: 4 bytes (32bits), indicating the offset bytes of the last ziplist node in ziplist
  • Entries: two bytes (16bits), indicating the number of elements in a ziplist
  • Entry: Indicates data in ziplist with variable length
  • Zlend: 1 byte (8bits) for the end flag, which is fixed to ff (255)

The data is small endian storage, so some people may not be able to see the binary stream of data and its meaning, but actually read the data in the wrong way

Ziplist is stored in the way of data compression, which is not the key. From a macro point of view, Ziplist is similar to a encapsulated array. Zltail can easily append and delete tail data, and use entries to easily calculate the length

But it still has the disadvantage of arrays, which cause frequent data movement when inserting and deleting data, hence the QuickList data type

quicklist

Its core data structure is as follows:

typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;        /* The number of ziplist nodes */
    unsigned long len;          /* Number of QuickListNodes */
    int fill : 16;              /* Fill factor for a single node */
    unsigned int compress : 16; /* Compress the depth of the end node */
} quicklist;
Copy the code

We can obviously see that QuickList is a bidirectional linked list structure, but ziplist is involved inside. We can say that, on the macro level, QuickList is a bidirectional linked list, and on the micro level, each quicklist node is a Ziplist

In redis.conf, you can use the following two parameters to optimize:

  • List-max-ziplist-size: indicates the byte size of each quicklistNode. The default value is 2, indicating 8KB
  • List-compressed-depth: indicates whether to compress the quicklistNode node. The default value is 0, indicating no compression

The advantages of this storage approach are the same as the advantages of linked lists: high efficiency of insertion and deletion, and the efficiency of linked list queries is compensated by Ziplist, so QuickList becomes the preferred list data structure

hash

Hash is most commonly used in Redis, where hash is represented in two ways: zipmap and dict

zipmap

The zipmap format looks like this:


“foo”


“bar”

“hello”


“world”






The meanings of each part are as follows:

  • Zmlen: 1 byte, indicating the total number of bytes of zipMap
  • Len: 1 to 5 bytes, indicating the length of the string to be stored next
  • Free: 1 byte, an unsigned 8-digit number that represents the number of free unused bytes after the string, resulting from modifying the value corresponding to the key

The two strings next to each other are the key and the value, such as “foo” => “bar” and “hello” => “world” in the example above

The downside of this approach is that the time complexity of the lookup is O(n), so it can only be used as a lightweight Hashmap

dict

This is suitable for storing large amounts of data in the following format:

typedef struct dict {
    dictType *type;	/* A pointer to a custom type that can store various types of data */
    void *privdata; /* Pointer to private data */
    dictht ht[2];	/* Two hash tables, usually only valid for h[0], h1[1] only valid for rehash */
    long rehashidx; /* -1: no rehash is being performed. If the value is greater than or equal to 0, it indicates the number of rehash steps */
    unsigned long iterators; 	/* Number of iterators being traversed */
} dict;
Copy the code

If we don’t want to go any further than that, the actual data store is the dictEntry structure, as follows:

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;
Copy the code

It’s obviously a linked list, and we know that the chain structure is enough for storage

This approach consumes a lot of memory, so lightweight ZipMap is generally used when data is low

set

In Redis, we can look at the intset.h file, which is a collection of stored integers with the following structure:

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;
Copy the code

The meanings of the fields are as follows:

  • Encoding: Data encoding format in which each data element is stored in several bytes (the values can be 2, 4, and 8)
  • Length: indicates the number of elements
  • Contents: flexible array, which is allocated separately and not included in intSet

Specific operation, we won’t open the should have this kind of data structure was very clear, we here, intset have the concept of a data update, for example we have set a 16-bit integer, then insert a 32-bit integer, so cause the entire collection is upgraded to a 32-bit integer, but not in turn, That’s where flexible arrays come in

If the collection is too large, it is stored as a dict

zset

Zset, also called sorted set in many places, is a structure of key-value pairs. Its key is called member, which is the set element (zset is still set, so member cannot be the same). Its corresponding value is called score, which is a floating point number, which can be understood as priority, and is used to arrange the order of Zsets

There are also two storage methods, one is the Ziplist/Zipmap format, which we won’t talk about more, just need to know that this format will arrange the data in order of score

Another storage format is skiplist, which can be regarded as an array of balanced tree maps. The time complexity of the search is almost the same as that of the balanced tree, but the implementation is simpler, such as the following structure (figure source skip table principle) :