SDS (Simple Dynamic String)
struct sdshdr {
unsigned int len; // Assign the length
unsigned int free; // Unused length
char buf[]; / / data
};
Copy the code
- capacity
- If free is long enough, use it directly
- If the length of the new string is less than SDS_MAX_PREALLOC (1M), the expanded length is twice the length of the new string
- Otherwise, the expanded length is the new string length plus SDS_MAX_PREALLOC (1M).
sds sdsMakeRoomFor(sds s, size_t addlen) {
struct sdshdr *sh, *newsh;
size_t free = sdsavail(s);
size_t len, newlen;
if (free >= addlen) return s;
len = sdslen(s);
sh = (void*) (s-(sizeof(struct sdshdr)));
newlen = (len+addlen);
if (newlen < SDS_MAX_PREALLOC)
newlen *= 2;
else
newlen += SDS_MAX_PREALLOC;
newsh = zrealloc(sh, sizeof(struct sdshdr)+newlen+1);
if (newsh == NULL) return NULL;
newsh->free = newlen - len;
return newsh->buf;
}
Copy the code
- Advantages over C strings
- Constant complexity gets the length of the string
- Prevent buffer overflow
- Space preallocation reduces the number of memory reallocations that change the string length
- Binary security
Dict (dictionary)
/* This is our hash table structure. Every dictionary has two of this as we * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
dictEntry **table; / / a hash table
unsigned long size; / / size
unsigned long sizemask; //size-1 is used to calculate the index
unsigned long used; // The number of existing nodes in the hash table
} dictht;
Copy the code
Hash conflict
- Conflicts are resolved using the chained address method, that is, when a conflict occurs, an entry is placed at the end of a single linked list at the location of the conflict.
capacity
- When the data is too long and the length of the hash table is too small, there will be many conflicts, which will lead to the excessively long single linked list and the query performance of the hash table will sharply decline. Therefore, the hash table should be expanded at an appropriate time.
- Expansion conditions
- Load factor: number of entries in the hash table/length of the hash table
- When Redis is not executing commands related to Fork (BGSAVE, BGREWRITEAOF), the load factor is greater than or equal to 1.
- When Redis is executing commands related to Fork (BGSAVE, BGREWRITEAOF), the load factor is greater than or equal to 5.
Why are load factors for capacity expansion different under different conditions?
- BGSAVE and BGREWRITEAOF Fork the child of the current server process, and most operating systems use copy-on — write to optimize the time to Fork a child and save memory, so increasing the load factor while the child is running. Minimize writing operations to memory by the parent process and save memory as much as possible.
- When you write write copy popular understanding, when not using replication technology, create a copy of a process (Fork out a child process) will be the memory space of the parent also create a copy to the child to use, but it will waste of memory space is large and if the parent process of the Fork will be a long time, and using written replication technology, When Fork a child process, you do not need to create a copy of the parent process space. Instead, the child process shares the memory space with the parent process. Only when the parent process performs a write operation, the parent process creates a copy of the page modified by the parent process.
- Size after expansion
- After the expansion, the size is 2^n, where 2^n is the first number greater than or equal to the number of entries in the current hash table x 2
/* Our hash table capability is a power of two */
static unsigned long _dictNextPower(unsigned long size)
{
unsigned long i = DICT_HT_INITIAL_SIZE;
if (size >= LONG_MAX) return LONG_MAX;
// Find the first 2^n (size=d->ht[0]. Used *2)
while(1) {
if (i >= size)
return i;
i *= 2; }}/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
/* Incremental rehashing already in progress. Return. */
if (dictIsRehashing(d)) return DICT_OK;
/* If the hash table is empty expand it to the initial size. */
if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);
/* If we reached the 1:1 ratio, and we are allowed to resize the hash * table (global setting) or we should avoid it but the ratio between * elements/buckets is over the "safe" threshold, we resize doubling * the number of buckets. */
if (d->ht[0].used >= d->ht[0].size &&
(dict_can_resize ||
d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
{
return dictExpand(d, d->ht[0].used*2); // The size passed is twice the size of the current hash element
}
return DICT_OK;
}
Copy the code
- Shrinkage capacity
- The load factor is less than 0.1
- The size after reduction is 2^n, where 2^n is the first number greater than or equal to the number of entries in the current hash table x 2
Progressive rehash
- Set rehashidx to 0 to start rehash.
- During the rehash, every time the dictionary is added, deleted, changed, or searched, in addition to the specified operation, the entry in the REhashidx bucket of HT [0] is rehash to HT [1]. After the rehash is complete, the rehashidx is added by 1.
- After rehash is complete, rehashidx is reset to -1.
— — — — — — — — to be continued — — — — — — — —