As a mature data-storage middleware, Redis offers comprehensive data-management features, such as the data expiration covered earlier and the data eviction (evict) strategy discussed today. Before introducing Redis's eviction strategy, let me ask a few questions to help you understand it more deeply.
- What is data eviction, and why does Redis need an eviction policy?
- Which data should be evicted, i.e. what selection criteria should be used?
- How is Redis's eviction strategy implemented?
What is Evict
To answer the first question: data eviction in Redis means cleaning out some data to reclaim memory when memory runs out. Redis already has an expiration policy that removes stale data, but imagine this scenario: what if there still isn't enough memory after all expired data has been cleaned up? Can we delete even more data? In a caching scenario the answer is yes, because data missing from Redis can still be fetched from the underlying data source. So in such business scenarios we can discard some old data in Redis to make room for new data.
How to Evict
On to the second question: given that we need an eviction mechanism, which data should an implementation choose to evict? There are many possible strategies, but the common idea is to maximize the total value of what remains. Not all data is equally valuable, so when weeding data out we should pick the low-value entries.
So what data is low value? Here I have to mention a principle that runs through computer science, the principle of locality, which in a caching scenario manifests as two phenomena:

1. The most recently accessed data has a higher probability of being accessed next.
2. The more often data has been accessed, the higher the probability it will be accessed again.

Roughly speaking, the higher the probability of access, the greater the value. These two phenomena lead to two strategies: evict the data that has gone unaccessed the longest, or evict the data accessed least often. They have well-known names: LRU (Least Recently Used) and LFU (Least Frequently Used).
Evict policy in Redis
Besides LRU and LFU, random eviction is also possible: treat all data equally and evict a random subset. Redis in fact implements all three of these policies, and you can configure whichever suits your data. On top of them, Redis also provides a TTL policy for data with an expiration time, which evicts the keys with the smallest remaining TTL. Note also that an eviction policy can apply either globally or only to keys that have an expiration time, so Redis ends up with the following eight configurations.
Configuration item | Meaning |
---|---|
MAXMEMORY_VOLATILE_LRU | LRU eviction only among keys that have an expiration time |
MAXMEMORY_VOLATILE_LFU | LFU eviction only among keys that have an expiration time |
MAXMEMORY_VOLATILE_TTL | Among keys with an expiration time, evict those with the smallest remaining TTL first |
MAXMEMORY_VOLATILE_RANDOM | Random eviction only among keys that have an expiration time |
MAXMEMORY_ALLKEYS_LRU | LRU eviction across all keys |
MAXMEMORY_ALLKEYS_LFU | LFU eviction across all keys |
MAXMEMORY_ALLKEYS_RANDOM | Random eviction across all keys |
MAXMEMORY_NO_EVICTION | Evict nothing; when memory is full, writes that need more memory fail with an error |
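The names above are Redis's internal constants; in redis.conf they map to the maxmemory-policy directive. Below is a minimal configuration sketch (the 100mb limit is just an example value):

```
# Cap memory usage; once the limit is hit, the eviction policy applies.
maxmemory 100mb

# One of: volatile-lru, volatile-lfu, volatile-ttl, volatile-random,
#         allkeys-lru, allkeys-lfu, allkeys-random, noeviction
maxmemory-policy allkeys-lru

# How many keys to sample per eviction step; this is the accuracy/speed
# knob of the approximate LRU/LFU described below (default 5).
maxmemory-samples 5
```

The policy can also be switched at runtime with `CONFIG SET maxmemory-policy allkeys-lru`.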
Source analysis
The MAXMEMORY_VOLATILE_* and MAXMEMORY_ALLKEYS_* variants share the same implementations; they just operate on different dicts (db->expires vs. db->dict). The random policies are fairly simple, so we won't go into them here. Let's focus on LRU and LFU.
LRU implementation
The essence of LRU is to evict the data that has gone unaccessed the longest. One way to implement it is with a linked list: whenever data is accessed it moves to the head of the list, so the tail is always the least recently used entry. Java's LinkedHashMap, for example, works this way. Redis does not adopt this approach. Instead, it simply records each key's last-access timestamp and finds old data by comparing timestamps. Of course, sorting all keys by timestamp would be far too slow, so Redis samples a batch of keys each time and applies the eviction policy within that batch. The advantage is high performance; the disadvantage is that the result is only locally, not globally, optimal. A minimal sketch of the idea follows.
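This is not Redis's actual code, just a sketch of sampling-based eviction under the LRU policy. dictGetRandomKey and estimateObjectIdleTime are real helpers from the Redis source (the latter is shown a bit further down); pickEvictionCandidate is a hypothetical name:

```c
/* Sketch only: approximate LRU by sampling. Instead of sorting the whole
 * keyspace, sample a few random keys and evict the one idle the longest. */
sds pickEvictionCandidate(dict *d, int samples) {
    dictEntry *best = NULL;
    unsigned long long best_idle = 0;

    for (int i = 0; i < samples; i++) {
        dictEntry *de = dictGetRandomKey(d);    /* random sample */
        if (de == NULL) break;
        unsigned long long idle = estimateObjectIdleTime(dictGetVal(de));
        if (best == NULL || idle > best_idle) { /* keep the coldest key */
            best = de;
            best_idle = idle;
        }
    }
    return best ? dictGetKey(best) : NULL;
}
```

The real implementation additionally keeps a small pool of good candidates across calls (EvictionPoolLRU), so promising keys found in one round are not thrown away in the next; we will see that in performEvictions.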
```c
typedef struct redisObject {
    unsigned type:4;        /* Object type: string, list, hash... */
    unsigned encoding:4;    /* Internal encoding of that type */
    unsigned lru:LRU_BITS;  /* LRU_BITS is 24: LRU timestamp, or for LFU
                             * 16 bits of access time plus an 8-bit counter */
    int refcount;
    void *ptr;
} robj;
```
So where is the LRU information stored? As mentioned in the earlier article on redisObject, the struct has a 24-bit lru field. Under the LRU policy it holds the access timestamp at second granularity. Of course, 24 bits cannot hold a full Unix timestamp, so it stores a clock that wraps around.
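For reference, this is how the clock and the idle-time estimate look in the Redis source (server.h/server.c/evict.c, lightly trimmed). LRU_CLOCK_RESOLUTION is 1000 ms, so the clock ticks once per second and the 24-bit value wraps after roughly 194 days, which estimateObjectIdleTime compensates for:

```c
#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1<<LRU_BITS)-1) /* Max value of obj->lru */
#define LRU_CLOCK_RESOLUTION 1000       /* LRU clock resolution in ms */

unsigned int getLRUClock(void) {
    return (mstime()/LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;
}

/* Given an object, return the minimum number of milliseconds the object
 * was never requested, handling the case where the clock wrapped. */
unsigned long long estimateObjectIdleTime(robj *o) {
    unsigned long long lruclock = LRU_CLOCK();
    if (lruclock >= o->lru) {
        return (lruclock - o->lru) * LRU_CLOCK_RESOLUTION;
    } else {
        return (lruclock + (LRU_CLOCK_MAX - o->lru)) *
                    LRU_CLOCK_RESOLUTION;
    }
}
```

The lru field itself is refreshed on every key lookup: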
```c
robj *lookupKey(redisDb *db, robj *key, int flags) {
    dictEntry *de = dictFind(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);

        if (!hasActiveChildProcess() && !(flags & LOOKUP_NOTOUCH)) {
            if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                updateLFU(val);
            } else {
                val->lru = LRU_CLOCK(); /* Update the LRU timestamp here */
            }
        }
        return val;
    } else {
        return NULL;
    }
}
```
LFU implementation
The LFU policy reuses the same lru field, which is why the lookupKey code above also updates it under LFU. LFU came later, introduced in Redis 4.0, so the existing 24-bit field was repurposed. The natural way to implement LFU is to record how many times each key is accessed and evict the least-accessed keys. But there is a problem to consider: new keys arrive all the time, and old keys naturally accumulate more accesses. A new key with few accesses so far isn't necessarily going to be accessed rarely in the future; if we ignored this, new data would be evicted almost as soon as it arrived, which is clearly unreasonable.
Redis addresses this by splitting the 24 bits into two parts: the high 16 bits hold a minute-granularity timestamp of the last access, and the low 8 bits hold a counter. Every new key's counter starts at a small initial value (LFU_INIT_VAL, i.e. 5) so it can survive its first rounds, and the counter decays over time so that old but currently unused data still has a chance to be evicted. Let's look at the implementation code.
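The two halves are packed and refreshed together by updateLFU (in db.c): first the counter is decayed by the elapsed time, then it gets a probabilistic increment, and finally the current time in minutes is stored in the upper 16 bits.

```c
void updateLFU(robj *val) {
    unsigned long counter = LFUDecrAndReturn(val); /* decay first */
    counter = LFULogIncr(counter);                 /* then log-increment */
    val->lru = (LFUGetTimeInMinutes()<<8) | counter;
}
```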
LFU counter growth
The counter is only 8 bits, maxing out at 255; how can that be enough? Redis, of course, does not keep an exact count but an approximate one: the counter grows probabilistically, and the larger its value, the slower it grows. The growth logic is as follows:
```c
/* Increment the LFU counter. The counter is not an exact count but a
 * probabilistic one: the larger its value, the slower it grows. It
 * reflects access heat within a time window, not the exact number of
 * accesses. */
uint8_t LFULogIncr(uint8_t counter) {
    if (counter == 255) return 255;
    double r = (double)rand()/RAND_MAX;
    double baseval = counter - LFU_INIT_VAL; /* LFU_INIT_VAL is 5 */
    if (baseval < 0) baseval = 0;
    /* server.lfu_log_factor is configurable and defaults to 10 */
    double p = 1.0/(baseval*server.lfu_log_factor+1);
    if (r < p) counter++;
    return counter;
}
```
As the code shows, the larger the counter, the slower it grows, so with a large lfu_log_factor even 8 bits can represent a very large number of accesses. The chart referenced here, from "Redis 4.0 hot key discovery mechanism based on LFU", plots counter growth for different lfu_log_factor values and access frequencies.
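To get a feel for the saturation you can run a small standalone simulation; this is not Redis source, just a local replica of the LFULogIncr logic printing the final counter after a given number of hits:

```c
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define LFU_INIT_VAL 5

/* Local copy of LFULogIncr's logic, for simulation only. */
static uint8_t lfu_log_incr(uint8_t counter, int lfu_log_factor) {
    if (counter == 255) return 255;
    double r = (double)rand()/RAND_MAX;
    double baseval = counter - LFU_INIT_VAL;
    if (baseval < 0) baseval = 0;
    double p = 1.0/(baseval*lfu_log_factor+1);
    if (r < p) counter++;
    return counter;
}

int main(void) {
    long hits[] = {100, 1000, 100000, 1000000};
    int factors[] = {1, 10, 100};

    for (size_t f = 0; f < sizeof(factors)/sizeof(*factors); f++) {
        for (size_t h = 0; h < sizeof(hits)/sizeof(*hits); h++) {
            uint8_t counter = LFU_INIT_VAL; /* every new key starts here */
            for (long i = 0; i < hits[h]; i++)
                counter = lfu_log_incr(counter, factors[f]);
            printf("factor=%3d hits=%8ld -> counter=%3u\n",
                   factors[f], hits[h], counter);
        }
    }
    return 0;
}
```

With a small factor the counter saturates quickly, while with factor 100 even a million hits stays well below 255.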
LFU counter decay
If the counter only ever grew, then even at a very slow rate it would eventually hit the maximum of 255 for many keys, and it could no longer discriminate between them. So a decay strategy is needed: the counter decays as time passes without access. The code is as follows:
```c
/* Decay the LFU counter and return it. lfu_decay_time is the number of
 * minutes it takes for the counter to decay by 1; e.g. with
 * lfu_decay_time == 10 the counter drops by 1 every 10 minutes of
 * inactivity. When lfu_decay_time is 0, the counter does not decay. */
unsigned long LFUDecrAndReturn(robj *o) {
    unsigned long ldt = o->lru >> 8;      /* high 16 bits: access time, minutes */
    unsigned long counter = o->lru & 255; /* low 8 bits: counter */
    unsigned long num_periods = server.lfu_decay_time ?
                                LFUTimeElapsed(ldt) / server.lfu_decay_time : 0;
    if (num_periods)
        counter = (num_periods > counter) ? 0 : counter - num_periods;
    return counter;
}
```
server.lfu_decay_time is also configurable (the lfu-decay-time directive); it defaults to 1, meaning the counter drops by 1 for every minute the key goes unaccessed.
Evict execution process
When evict is executed
Every time Redis processes a command it checks memory usage and may try to run an eviction cycle. Some situations must skip eviction, as isSafeToPerformEvictions shows:
```c
static int isSafeToPerformEvictions(void) {
    /* Don't evict while a Lua script has timed out or while loading data. */
    if (server.lua_timedout || server.loading) return 0;

    /* A replica told to ignore maxmemory should not evict;
     * only the master drives eviction. */
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return 0;

    /* No eviction while clients are paused, since the dataset
     * cannot change. */
    if (checkClientPauseTimeoutAndReturnIfPaused()) return 0;

    return 1;
}
```
evict.c
The eviction code lives in evict.c; performEvictions below drives a complete eviction cycle.
```c
int performEvictions(void) {
    if (!isSafeToPerformEvictions()) return EVICT_OK;

    int keys_freed = 0;
    size_t mem_reported, mem_tofree;
    long long mem_freed; /* May be negative */
    mstime_t latency, eviction_latency;
    long long delta;
    int slaves = listLength(server.slaves);
    int result = EVICT_FAIL;

    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
        return EVICT_OK;

    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        return EVICT_FAIL; /* We need to free memory, but policy forbids. */

    unsigned long eviction_time_limit_us = evictionTimeLimitUs();

    mem_freed = 0;

    latencyStartMonitor(latency);

    monotime evictionTimer;
    elapsedStart(&evictionTimer);

    while (mem_freed < (long long)mem_tofree) {
        int j, k, i;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            struct evictionPoolEntry *pool = EvictionPoolLRU;

            while (bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                /* We don't want to make local-db choices when expiring keys,
                 * so to start we populate the eviction pool sampling keys
                 * from every DB. */
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */

                /* Select the best key from the pool, going backward from
                 * the best candidate to the worst. */
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[pool[k].dbid].dict,
                            pool[k].key);
                    } else {
                        de = dictFind(server.db[pool[k].dbid].expires,
                            pool[k].key);
                    }

                    /* Remove the entry from the eviction pool. */
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;

                    /* If the key exists, it is our pick. Otherwise it is
                     * a ghost and we need to try the next element. */
                    if (de) {
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }

        /* volatile-random and allkeys-random policy */
        else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                 server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
        {
            /* When evicting a random key, the static variable next_db
             * tracks which DB to visit next, so all DBs are covered
             * incrementally. */
            for (i = 0; i < server.dbnum; i++) {
                j = (++next_db) % server.dbnum;
                db = server.db+j;
                dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                        db->dict : db->expires;
                if (dictSize(dict) != 0) {
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetKey(de);
                    bestdbid = j;
                    break;
                }
            }
        }

        /* Finally remove the selected key. */
        if (bestkey) {
            db = server.db+bestdbid;
            robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
            propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
            /* We compute the amount of memory freed by db*Delete() alone.
             * Propagating the deletion to the AOF and replicas may actually
             * take more memory than we are freeing (deleting the key),
             * which would be misleading if we accounted for it. The same
             * applies to the CSC invalidation messages generated by
             * signalModifiedKey(). AOF and output buffer memory will be
             * released eventually, so we only care about the memory used
             * by the key space. */
            delta = (long long) zmalloc_used_memory();
            latencyStartMonitor(eviction_latency);
            if (server.lazyfree_lazy_eviction)
                dbAsyncDelete(db,keyobj);
            else
                dbSyncDelete(db,keyobj);
            latencyEndMonitor(eviction_latency);
            latencyAddSampleIfNeeded("eviction-del",eviction_latency);
            delta -= (long long) zmalloc_used_memory();
            mem_freed += delta;
            server.stat_evictedkeys++;
            signalModifiedKey(NULL,db,keyobj);
            notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                keyobj, db->id);
            decrRefCount(keyobj);
            keys_freed++;

            if (keys_freed % 16 == 0) {
                /* When the memory to free starts to be big enough, we may
                 * spend too much time here and fail to deliver data to the
                 * replicas fast enough, so we force the transmission here
                 * inside the loop. */
                if (slaves) flushSlavesOutputBuffers();

                /* Normally our stop condition is the ability to release a
                 * fixed, pre-computed amount of memory. But when we are
                 * deleting objects in another thread, it's better to check
                 * from time to time whether the target memory has already
                 * been reached, since "mem_freed" is only computed across
                 * the dbAsyncDelete() call while the lazyfree thread can
                 * free memory at any time. */
                if (server.lazyfree_lazy_eviction) {
                    if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                        break;
                    }
                }

                /* After some time, exit the loop early even if the memory
                 * limit hasn't been reached: if we suddenly need to free a
                 * lot of memory, don't spend too much time here. */
                if (elapsedUs(evictionTimer) > eviction_time_limit_us) {
                    // We still need to free memory - start eviction timer proc
                    if (!isEvictionProcRunning) {
                        isEvictionProcRunning = 1;
                        aeCreateTimeEvent(server.el, 0,
                                evictionTimeProc, NULL, NULL);
                    }
                    break;
                }
            }
        } else {
            goto cant_free; /* nothing to free... */
        }
    }
    /* At this point the memory is OK, or we reached the time limit. */

    result = (isEvictionProcRunning) ? EVICT_RUNNING : EVICT_OK;

cant_free:
    if (result == EVICT_FAIL) {
        /* At this point, we have run out of evictable items. It's possible
         * that some items are being freed in the lazyfree thread. Perform a
         * short wait here if such jobs exist, but don't wait long. */
        if (bioPendingJobsOfType(BIO_LAZY_FREE)) {
            usleep(eviction_time_limit_us);
            if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                result = EVICT_OK;
            }
        }
    }

    latencyEndMonitor(latency);
    latencyAddSampleIfNeeded("eviction-cycle",latency);
    return result;
}
```
The execution flow can be roughly divided into three steps: first, populate the eviction pool (EvictionPoolLRU, whose default size EVPOOL_SIZE is 16) by sampling keys from each database according to the configured policy; second, walk the pool from best to worst candidate and pick the first key that still exists; third, delete that key (synchronously or via lazy-free) and repeat until enough memory has been freed or the time limit is hit.
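What lets one pool serve LRU, LFU, and TTL alike is a single "idle" score computed in evictionPoolPopulate: the higher the score, the better the eviction candidate. The relevant excerpt from evict.c (lightly trimmed):

```c
/* Inside evictionPoolPopulate(): compute the candidate's "idle" score.
 * A bigger score means a better eviction candidate, whatever the policy. */
if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
    idle = estimateObjectIdleTime(o); /* LRU: time since last access */
} else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
    /* LFU: invert the frequency so less-used keys score higher. */
    idle = 255-LFUDecrAndReturn(o);
} else if (server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) {
    /* TTL: keys closer to expiring score higher. */
    idle = ULLONG_MAX - (long)dictGetVal(de);
}
```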
Conclusion
As you can see, Redis sacrifices the accuracy of LRU and LFU for the sake of performance; its versions are only approximations of the textbook algorithms, yet they are entirely sufficient in practice. After all, Redis has weathered countless projects over the years and still stands firm. Its approach also offers a useful lesson in software design: trading a little accuracy for a lot of performance.
This article is part of my Redis source code analysis series of blog posts, which comes with a corresponding Chinese-annotated fork of the Redis source; if you want to study Redis further, stars and follows are welcome. Redis Chinese annotated repository: github.com/xindoo/Redi… Redis source code analysis column: zxs. IO /s/1h If you found this article useful, likes, comments, and shares are appreciated.