I. Introduction and application

Redis is a high-performance, networking-enabled, persistent K-K in-memory database written in ANSI C, and provides apis in multiple languages. The commonly used types are String, List, Hash, Set, and ZSet.

String: cache, traffic limiting, counter, distributed lock, distributed Session

Hash: Stores user information, page views, and combined query

List: Twitter follower timeline List, simple queue

Set: like, step, tag, friend relationship

Zset: Leaderboard

For another example, e-commerce will use some special designs to ensure the stability of the system during big sales promotion. The following designs can be considered for inventory deduction:

From the application scenario above, we can see that Redis is very efficient and stable, then how to implement the underlying Redis?

2. Redis object redisObject

When we execute the set hello world command, we have the following data model:

SDS: key The key “Hello” is stored in SDS (a simple dynamic string). Details are described later.

RedisObject: The value val “world” is stored in redisObject. In fact, redis commonly used 5 types are stored as redisObject; The type field in redisObject indicates the type of the Value object, and the PTR field points to the address of the object.

RedisObject object is very important, Redis object type, internal coding, memory collection, shared object and other functions, all need redisObject support. The advantage of this design is that different data structures can be set for common types in 5 according to different usage scenarios, so as to optimize the use efficiency of objects in different scenarios.

Both dictEntry objects, redisObjects, and SDS objects require a memory allocator (such as Jemalloc) to allocate memory for storage. Jemalloc, the default memory allocator for Redis, does a relatively good job of reducing fragmentation. For example, Jemalloc divides the memory space into small, large, and huge ranges on a 64-bit system. Each range is divided into a number of small memory block units; When Redis stores data, it selects the most suitable memory block for storage.

As mentioned earlier, each Redis object is represented by a redisObject structure whose PTR pointer points to the underlying implementation’s data structure, which is determined by the Encoding attribute. For example, if we run the following command to get the code for storing “hello” :

Third, the String

The underlying implementation of a string object can be int, RAW, or embstr (the table above is described by its name). Embstr encoding allocates a contiguative block of space by calling the memory allocation function once, whereas RAW requires two calls.

Simple dynamic strings (SDS), which are more like C++ strings or Java arraylists, are dynamically variable in length:

struct sdshdr {
    // The length of occupied space in buf
    int len;
    // The length of the remaining free space in buf
    int free;
    // Data space
    char buf[]; // '\0' is null character ending
};
Copy the code

Get: sdsrange– -o (n) set: sdscpy –O(n) create: sdsnew– -o (1) len: sdslen– -o (1) set: sdscpy –O(n) create: sdsnew– -o (1) len: sdslen– -o (1)

Constant complexity gets string length: because SDS records length in len, the time complexity for getting an SDS length is only O(1).

Prespace allocation: If you modify an SDS, there are two conditions:

1. If the SDS length (len value) is less than 1MB, then the program allocates the same amount of unused space as the len attribute. In this case, free and len attributes have the same value. For example, if the LEN of SDS becomes 15 bytes, the program will also allocate 15 bytes of unused space, and the actual length of the BUF array of SDS becomes 15+15+1=31 bytes (an extra byte for the user to save null characters).

2, the SDS length (len value) is greater than or equal to 1MB, the program allocates 1MB of unused space. For example, after modification, the LEN of SDS is 30MB, so its actual length is 30MB+1MB+1byte.

Lazy space release: When an SDstrim is executed, the SDS does not immediately release the extra space. If the string is concatenated the next time and the concatenation is not as large as the freed space, the unused space will be used. Lazy freeing of space avoids memory reallocation operations that operate on strings in certain cases.

Prevent buffer overflows: When using a C string operation, it is easy to overflow the buffer if the length of the string increases (e.g., strcat) and memory is not reallocated. Since the SDS records the length, the corresponding operation will automatically reallocate memory when the buffer overflow is caused, preventing the buffer overflow.

Fourth, the List

The underlying implementation of the List object is quicklist (a quicklist, which is a combination of ziplist compressed lists and linkedlists). Lists in Redis support insertion and eject at both ends, and can get elements at a specified location (or range) that can act as arrays, queues, stacks, etc.

typedef struct listNode {
     // Front node
    struct listNode *prev;
    // Rear node
    struct listNode *next;
    // The value of the node
    void *value;
 } listNode;

 typedef struct list {
     // Header node
    listNode *head;
    // Table tail node
    listNode *tail;
    // Copy the value of the node
    void *(*dup)(void *ptr);
    // The node value release function
    void (*free)(void *ptr);
     // Compare the values of nodes
    int (*match)(void *ptr, void *key);
     // The number of nodes in the list
    unsigned long len;
 } list;
Copy the code

rpush: listAddNodeHead —O(1) lpush: listAddNodeTail —O(1) push: listInsertNode —O(1) index : listIndex —O(N) pop: ListFirst/listLast —O(1) llen: listLength —O(N)

4.1 linkedlist (double-ended linkedlist)

This structure is more like Java LinkedList, you can read the source code interested.

Compared with the double-ended linked list, the compressed list can save the memory space, but the complexity of modification, addition and deletion is higher. So when the number of nodes is small, you can use a compressed list; However, with a large number of nodes, it is still cost-effective to use a dual-end linked list.

4.2 Ziplist (Compressed list)

Redis uses ziplist as the underlying implementation of list keys when a list key contains a small number of list items and is a small integer value or a short string.

Quicklist replaces Ziplist and LinkedList in the new version:

Five, the Hash

The underlying implementation of a Hash object can be either a ziplist (compressed list) or a Hashtable (dictionary or also called a Hash table).

Hashtable can implement O(1) complexity of read and write operations, so it is very efficient. The source code is as follows:

typedef struct dict {
    // Type specific functions
    dictType *type;
     // Private data
    void *privdata;
     / / a hash table
    dictht ht[2];
    / / rehash index
    // When rehash is not in progress, the value is -1
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */
     // The number of security iterators currently running
    int iterators; /* number of iterators currently running */
 } dict;
 typedef struct dictht {
    // Hash table array
    dictEntry **table;
     // Hash table size
    unsigned long size;
    // Hash table size mask, used to calculate the index value
    // is always equal to size minus 1
    unsigned long sizemask;
    // The number of existing nodes in the hash table
    unsigned long used;
} dictht;
typedef struct dictEntry {
    void *key;
    union {void*val; uint64_t u64; int64_t s64; } v;// Point to the next hash table node to form a linked list
    struct dictEntry *next;
 } dictEntry;
 typedef struct dictType {
     // The function that calculates the hash value
    unsigned int (*hashFunction)(const void *key);
     // The function that copies the key
    void *(*keyDup)(void *privdata, const void *key);
     // Copy the value of the function
    void *(*valDup)(void *privdata, const void *obj);
     // Compare the key function
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    // Destroy key function
    void (*keyDestructor)(void *privdata, void *key);
    // Destroy the value of the function
    void (*valDestructor)(void *privdata, void *obj);
} dictType;
Copy the code

The above source code can be simplified into the following structure:

Dictionaries in Redis use hashtable as the underlying implementation, and each dictionary will carry two hash tables, one for regular use and one for rehashing only. As you operate on the hash table, the keys gradually increase or decrease. In order to keep the load factor of the hash table within a reasonable range, Redis will expand or shrink the size of the hash table (rehash), that is, all key pairs in HT [0] will be divided into multiple, progressive rehash to HT [1].

Sixth, the Set

The underlying implementation of a Set object can be an intset (Set of integers) or a hashtable (dictionary or also known as a hashtable).

typedef struct intset {
    // Encoding mode
    uint32_t encoding;
    // The number of elements the collection contains
    uint32_t length;
    // Hold an array of elements
    int8_t contents[];
} intset;
Copy the code

sadd: intsetAdd—O(1) smembers: intsetGetO(1)—O(N) srem: intsetRemove—O(N) slen: Intsetlen — -o (1) intset The underlying implementation is an ordered, non-repeating array that holds the collection elements. Intset the int array in this structure can be of type 16, 32, or 64 bits. If all the integers in the array are 16-bit long, and if a new 32-bit integer is added, the entire 16-bit array is upgraded to a 32-bit array. The upgrade improves intSet flexibility and saves memory, but it is irreversible.

7.ZSet

The underlying implementation of ZSet ordered collection objects can be ziplist or skiplist.

typedef struct zskiplist {
     // The header node and the tail node
    struct zskiplistNode *header, *tail;
    // Number of nodes in the table
    unsigned long length;
    // The number of layers of the node with the largest number of layers in the table
    int level;
 } zskiplist;
typedef struct zskiplistNode {
    // Member objects
    robj *obj;
    / / score
    double score;
     // Back pointer
    struct zskiplistNode *backward;
    / / layer
    struct zskiplistLevel {
        // Forward pointer
        struct zskiplistNode *forward;
         // Span - the distance between the forward pointer and the current node
        unsigned int span;
    } level[];
} zskiplistNode;
Copy the code

Zadd –zslinsert– average O(logN), worst O(N) Zrem –zsldelete– average O(logN), worst O(N) ZRank –zslGetRank– average O(logN), worst O(N) zadd– zslinsert– average O(logN), worst O(N)