Briefly record the basic data structure of REDis and the principle of internal implementation. In fact, each structure can be written separately, there is time to break down the details.
String
Equivalent to an ArrayList in Java. Inside is a character array (dynamic array).
C defines an array of strings “redis”. The structure of the array is as follows:
[' r ', 'e', 'd', 'I', 's', '\0']Copy the code
The last ‘\0’ is a null character, indicating the end of the string.
Redis does not directly use the string structure of C, but encapsulates it, using the abstract type of Simple Dynamic String (SDS).
In Redis, SDS is its own string by default. C strings are used only in places where string variation is unlikely.
SDS
The definition is as follows:
struct sdshdr {
// The length of the string
int len;
// The number of free bytes
int free;
// A byte array to store strings
char buf[];
};
Copy the code
It’s got len and it’s got free.
Note: the size of buf is len+free+1, where the extra byte is used to store ‘\0’.
The structure of the storage is:
Why not use THE C language string implementation instead of SDS? What are the benefits of this implementation?
- Constant complexity gets the length of the string. Get the length directly from len. Getting too long arrays is not a redis bottleneck.
- Prevent buffer overflows. When the string is modified, use free to check whether the space is sufficient. If not, expand the space.
- Reduce the number of memory reallocations for modifying strings.
- C does not record the length of the string. Therefore, if you want to modify the string, you must reallocate the memory (free and then apply). If you do not reallocate the string, the memory buffer will overflow when the string length increases, and the memory will leak when the string length decreases.
- For SDS, due to the existence of len attribute and free attribute, two strategies of space pre-allocation and lazy space release are implemented for SDS
- Space preallocation: When you expand a string, you expand more memory than you need. This reduces the number of memory reallocations required for successive string growth operations.
- Lazy space free: When a string is shortened, the program does not immediately use memory reallocation to reclaim the shortened extra bytes. Instead, it uses the free property to record the number of bytes for later use.
- Binary security, the C string is marked by a null character as the end of the string. Some special binary file contents may contain null characters, so the C string cannot be accessed correctly. SDS is judged by LEN to be final.
- Compatible with some C string functions
- Although SDS is binary safe, it also follows the convention that every string ends in an empty string, which makes it possible to reuse some functions from the C library
.
- Although SDS is binary safe, it also follows the convention that every string ends in an empty string, which makes it possible to reuse some functions from the C library
capacity
If the data is less than 1 MB, double the current space.
If the data is larger than 1 MB, expand 1 MB space at a time.
The maximum length of the string is 512 MB, that is, the maximum length can be expanded to 512 MB.
List
Similar to LinkedList in Java. Inside is a bidirectional linked list.
Linked list nodes are defined as follows:
typedef struct listNode{
// The front node
struct listNode *prev;
// back node
struct listNode *next;
// Node value
void *value;
}listNode
Copy the code
Bidirectional linked list definition:
typedef struct list{
// Table header node
listNode *head;
// Table end node
listNode *tail;
// The number of nodes in the linked list
unsigned long len;
// Node value copy function
void (*free) (void *ptr);
// Node value release function
void (*free) (void *ptr);
// Node value comparison function
int (*match) (void *ptr,void *key);
}list
Copy the code
There’s nothing to be said for bidirectional lists, but Redis actually stores more than a simple LinkedList internally. It’s a QuickList structure. And when the list has fewer elements, it is optimized to a ZipList (compressed list) structure.
ZipList (Compressed list)
In the case of a small number of elements, to save space, a contiguous memory space is used to store the list. This is called ziplist.
A ZIplist is a sequential storage structure composed of a series of specially encoded contiguous memory blocks. Similar to an array, a ZIplist is stored contiguously in memory, but unlike an array, the size of memory occupied by each element of a ziplist can be different to save memory.
Ziplist structure is shown as follows:
-
Zlbytes: indicates the ziplist length (in bytes), which is a 32-bit unsigned integer
-
Zltail: The offset of the last ziplist node, useful when traversing the ziplist or pop tail node backwards.
-
Zllen: indicates the number of ziplist entries
-
Entry: node. A place to store elements.
-
Zlend: the value 0xFF is used to mark the end of a ziplist
The most important implementation is the implementation of Entry.
entry
The structure of entry is shown in the figure.
- The previous_entry_length field indicates the length of the previous element in 1 or 5 bytes.
- If the length of the current element is less than 254 bytes, the previous_entry_length field is represented by one byte.
- If the length of an element is greater than or equal to 254 bytes, it cannot be represented by a single byte.
- Encoding: The encoding stores the content type and length of the node’s content. There are two types of encoding, one is a byte array and the other is an integer. The length of the encoding field can be 1, 2, or 5 bytes.
- Content: The content field is used to hold the content of the node. The content type and length of the node are determined by Encoding.
QuickList
QuickList is simply a series of ziplists linked together using bidirectional Pointers.
Its internal structure is as follows:
Hash
Similar to a HashMap in Java. Inside is the array + zipper way. It’s just that the dictionary in Redis can only be strings.
The Hash type is used to store objects in Redis.
The hash table is defined as follows
typedef struct dictht{
// Hash table array
dictEntry **table;
// Hash table size
unsigned long size;
// Hash table size mask, used to calculate index values
// always equal to size-1
unsigned long sizemask;
// The number of existing nodes in the hash table
unsigned long used;
}dictht
Copy the code
Hash table consists of array table, in which each element points to dict.h/dictEntry structure, which is defined as follows:
typedef struct dictEntry{
void *key;
union{
void *val;
uint64_tu64;
int64_ts64;
}v;
/ / the next
struct dictEntry *next;
}dictEntry
Copy the code
The internal structure is shown as follows:
Here are some points that are as important as hashMap.
The hash algorithm
#1Hash = dict->type->hashFunction(key); #2Sizemask index = hash & dict->ht[x].sizemask;Copy the code
Arithmetic and mask the hash value of the key and the length of the array.
Expansion & contraction
capacity
Trigger condition: The program will extend the hash table if any of the following conditions are met:
- The server is not currently running bgsave or bgrewriteaof, and the hash table load factor is >=1
- The server is currently running bgsave or bgrewriteaof with hash table load factor >=5
Each extension creates another hash table by doubling the space already used by the original hash table.
shrinkage
Trigger condition: When the load factor is less than 0.1, the program will shrink the hash table
Each shrink creates a new hash table based on doubling the size of the used space.
Incremental rehash
This is a very different point from hashMap. Redis uses progressive hash.
Because rehash is a time-consuming operation, if the dictionary is large it will block subsequent operations during rehash. Progressive rehash preserves both the old and new hash structures. In the subsequent scheduled tasks and hash operation commands, data from the old hash table is slowly migrated to the new hash structure.
The hash process is shown below.
Create a new Hash table ht[1].
Data from the old table is migrated step by step to the new table, and the old table is automatically reclaimed.
Ht [0], HT [1],ht[0],ht[1]
- If a key is found, it is first searched in HT [0] and then in HT [1].
- New key-value pairs added to the dictionary are stored in HT [1] instead of HT [0].
Set
Similar to a HashSet in Java. The inside is like a special dictionary, except that the values of the dictionary are null.
ZSet
Similar to Java HashMap and SortSet combination.
ZSet ensures that the elements of the set are unique and can guarantee a certain order. ZSet assigns a score to each value, and we sort by score.
There are two underlying storage structures for Zsets in Redis: Ziplist and skiplist.
ziplist
Ziplist is used to store elements when the following two conditions are met.
- The number of saved elements is less than 128
- The length of all saved elements is less than 64 bytes
Since we are inserting into the linked list in fractional order, space and time are compromised, with the advantage of space > advantage of time.
skiplist
Skiplist is used when the above two conditions are not met (space advantage < time advantage).
The core of zset’s Skiplist consists primarily of a dict object and a Skiplist object.
Structure definition:
typedef struct zset{
/ / jump table
zskiplist *zsl;
/ / a dictionary
dict *dice;
} zset
Copy the code
-
The dictionary is used to store the mapping between value and score.
-
Skiplist is a linked list (skiplist) of ordered elements.
The two data structures share the members and points of the same element through Pointers, so there are no duplicate members and points, resulting in a waste of memory.
Structure diagram of skip table:
Skip table is a convenient data structure for searching ordered linked list. The time complexity of finding a single key, skiplist, and balanced tree is O(log n), roughly the same.
Procedure for creating a hop table:
reference
- Redis Design Implementation
- Redis Deep Adventure
- Some images source: link