A dictionary is also known as a hash table. Redis is a non-relational database, and the data structure of the entire database can be thought of as one very large dictionary.
Common dictionary commands
// Add a hash field, key: city, field/value: {hangzhou: "xihu"}
127.0.0.1:6379> hset city hangzhou "xihu"
(integer) 1
127.0.0.1:6379> hset city chongqing "huoguo"
(integer) 1
127.0.0.1:6379> hget city hangzhou
"xihu"
127.0.0.1:6379> hget city chongqing
"huoguo"
// Get all fields and values of the key
127.0.0.1:6379> hgetall city
1) "hangzhou"
2) "xihu"
3) "chongqing"
4) "huoguo"
// Updating an existing field returns 0
127.0.0.1:6379> hset city hangzhou "dongporou"
(integer) 0
127.0.0.1:6379> hgetall city
1) "hangzhou"
2) "dongporou"
3) "chongqing"
4) "huoguo"
// Batch add
127.0.0.1:6379> hmset city chengdu "chuanchuan" shanxi "roujiamo"
OK
// Hash fields also support increment
127.0.0.1:6379> hset age peter 25
(integer) 1
127.0.0.1:6379> hincrby age peter 1
(integer) 26
The data structure
The Redis dictionary data structure is very similar to Java's HashMap. (We will not discuss the ziplist encoding that Redis uses for small hashes here.)
typedef struct dictht {
    // Bucket array
    dictEntry **table;
    // Size of the table array
    unsigned long size;
    // Mask, always size - 1
    unsigned long sizemask;
    // Number of existing nodes
    unsigned long used;
} dictht;
The used field above counts the existing nodes (across the array and its collision chains), so used may be greater than size. The sizemask field is size - 1, which lets an index be computed efficiently with a single bitwise operation: index = hash & sizemask.
typedef struct dictEntry {
    // Key
    void *key;
    // The value is a union: either a pointer to a concrete value,
    // an unsigned integer, a signed integer, or a double
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    // Next entry in the chain (separate chaining on hash collision)
    struct dictEntry *next;
} dictEntry;
Redis then wraps the hash table in a higher-level dict structure.
typedef struct dict {
    // Type-specific functions
    dictType *type;
    // Private data passed to those functions
    void *privdata;
    // Two hash tables; ht[1] is only used during rehash
    dictht ht[2];
    // Rehash progress; -1 when no rehash is in progress
    long rehashidx;
    // Number of iterators currently running on the dictionary
    unsigned long iterators;
} dict;
typedef struct dictType {
// The hash function corresponding to the dictionary
uint64_t (*hashFunction)(const void *key);
// The key duplication function
void *(*keyDup)(void *privdata, const void *key);
// The value duplication function
void *(*valDup)(void *privdata, const void *obj);
// The key comparison function
int (*keyCompare)(void *privdata, const void *key1, const void *key2);
// The key destruction function
void (*keyDestructor)(void *privdata, void *key);
// The destruction function of the value
void (*valDestructor)(void *privdata, void *obj);
} dictType;
Expansion
- Allocate a new block of memory. On the first allocation the default capacity is 4; afterwards the capacity is doubled
- Assign the new memory to ht[1]
- Change rehashidx from -1 to 0, signalling that the rehash operation is ready to begin
static int dictExpand(dict *ht, unsigned long size) {
    // Define a new hash table
    dict n;
    // Recalculate the capacity after expansion
    unsigned long realsize = _dictNextPower(size), i;

    // If the existing elements outnumber the requested capacity, return an error
    if (ht->used > size)
        return DICT_ERR;

    _dictInit(&n, ht->type, ht->privdata);
    n.size = realsize;
    n.sizemask = realsize - 1;
    n.table = calloc(realsize, sizeof(dictEntry*));
    n.used = ht->used;

    for (i = 0; i < ht->size && ht->used > 0; i++) {
        dictEntry *he, *nextHe;

        if (ht->table[i] == NULL) continue;
        /* For each hash entry on this slot... */
        he = ht->table[i];
        while (he) {
            unsigned int h;

            nextHe = he->next;
            // Recalculate the element's index with the new mask
            h = dictHashKey(ht, he->key) & n.sizemask;
            // Header insertion: place the element in the new table
            he->next = n.table[h];
            n.table[h] = he;
            ht->used--;
            he = nextHe;
        }
    }
    assert(ht->used == 0);
    free(ht->table);

    *ht = n;
    return DICT_OK;
}
Shrinking
When fewer than 10% of the slots are in use, the table is shrunk.
void tryResizeHashTables(int dbid) {
    // The main key space dictionary
    if (htNeedsResize(server.db[dbid].dict))
        dictResize(server.db[dbid].dict);
    // The dictionary holding key expiration times
    if (htNeedsResize(server.db[dbid].expires))
        dictResize(server.db[dbid].expires);
}
int dictResize(dict *d)
{
    int minimal;

    if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;
    minimal = d->ht[0].used;
    if (minimal < DICT_HT_INITIAL_SIZE)
        minimal = DICT_HT_INITIAL_SIZE;
    // Note that this still calls the expansion function -- just with a
    // smaller target size, so in effect it shrinks the table
    return dictExpand(d, minimal);
}
Progressive rehash
Imagine a large dictionary with millions of keys that needs to be expanded: moving every element to the new table in one go would stall Redis for far too long.
Instead, while the server is processing commands, each operation rehashes just one bucket, migrating its entries to the new table.
static void _dictRehashStep(dict *d) {
    // Take a single rehash step, but only when no iterators are running
    if (d->iterators == 0) dictRehash(d, 1);
}
int dictRehash(dict *d, int n) {
    // Visit at most n*10 empty buckets per call, so a sparse table
    // cannot stall the server
    int empty_visits = n * 10;
    if (!dictIsRehashing(d)) return 0;

    while (n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while (d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx];
        // Move the whole chain from this bucket into the new table
        while (de) {
            uint64_t h;

            nextde = de->next;
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        // rehashidx advances by one each time a bucket has been migrated
        d->rehashidx++;
    }

    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        // All elements have been migrated; reset rehashidx to -1
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
When the server is idle, rehashing runs in larger batches: each call to dictRehash migrates up to 100 buckets, within a time budget of a few milliseconds.
int dictRehashMilliseconds(dict *d, int ms) {
    long long start = timeInMilliseconds();
    int rehashes = 0;

    // Keep rehashing in batches of 100 buckets until the time budget runs out
    while (dictRehash(d, 100)) {
        rehashes += 100;
        if (timeInMilliseconds() - start > ms) break;
    }
    return rehashes;
}
The book Redis Design and Implementation walks through the progressive rehash process step by step: buckets are migrated one at a time until the whole table has been moved.
Plain iterator
typedef struct dictIterator {
    // The dictionary being iterated
    dict *d;
    // Index of the bucket the iteration has reached
    long index;
    // table: whether we are iterating ht[0] or ht[1];
    // safe: whether this is a safe iterator
    int table, safe;
    // Current node and next node
    dictEntry *entry, *nextEntry;
    // Fingerprint of the dictionary; it changes if the dictionary is modified
    long long fingerprint;
} dictIterator;
- Traversal first checks how the hash object is encoded. It is stored as a ziplist rather than a dictionary when both of the following hold:
  - Every key and value string held by the hash object is shorter than 64 bytes
  - The hash object holds fewer than 512 key-value pairs
- Creating a plain iterator sets safe to 0
- The elements are then traversed
The iterator's fingerprint is initialized on the first traversal. When the iterator is released, the fingerprint is compared against the dictionary's current one, and an exception is raised if they differ, which catches modifications made during the iteration. The entry and nextEntry pointers reference the current node and its successor within a collision chain; in safe mode, if the entry node is deleted, the cached nextEntry guarantees that the rest of the chain is not lost in subsequent iterations.
Safe iterator
- A safe iterator has the same structure as a plain one, but with safe set to 1 when it is obtained
- Iterating over an element calls the dictNext function, just as with a plain iterator
- Each dictionary operation still calls _dictRehashStep, but that function performs no rehash while iterators are registered, ensuring no rehash happens during the iteration
- When the iterator is released, the dictionary's iterators counter is decremented by one so that the deferred rehashing can resume
The source for the dictionary's remaining add, delete, and lookup operations is not reproduced here; the overall design is similar to Java's HashMap.