Redis performance issues
- The same command sometimes runs fast and sometimes slow?
- Simple commands such as SET and DEL take a long time to execute?
- Latency spikes suddenly, then returns to normal?
- Stable for a long time, then suddenly starts slowing down?
The higher the traffic, the more obvious the performance problem becomes.
Three broad suspects
Is it a network problem, a Redis problem, or an underlying hardware problem?
Troubleshooting approach
Diagnostic commands
See the official latency monitor documentation (https://redis.io/topics/latency-monitor). Enable it with **CONFIG SET latency-monitor-threshold 100**: any event that takes longer than the threshold (here, 100 milliseconds) is recorded by the latency monitor for later inspection.
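Once the threshold is set, recorded events can be read back with LATENCY LATEST. A minimal sketch; the event name and values shown are illustrative, not real output:

```
127.0.0.1:6379> LATENCY LATEST
1) 1) "command"              # event name
   2) (integer) 1593763337   # unix time of the latest spike
   3) (integer) 250          # latest latency (ms)
   4) (integer) 1000         # all-time max latency (ms)
```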
- Run Redis Server on physical machines rather than virtual machines
- Avoid frequent reconnects; use long-lived connections
- Prefer aggregate commands (MSET/MGET) over pipelines
- Prefer pipelines over sending commands one by one (multiple network round trips)
- Consider Lua scripts for commands that don't fit a pipeline well (see the sketch after this list)
- Periodically send PING to establish a baseline: measure the performance of a known-good Redis instance and compare it with the target instance
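For multi-step operations where a later command depends on an earlier result (so they can't simply be pipelined), a short Lua script keeps everything in one round trip. A minimal sketch; the key name stats:hits is hypothetical:

```bash
# Atomically read a counter and reset it, in a single round trip,
# instead of pipelining GET + SET from the client
redis-cli EVAL "local v = redis.call('GET', KEYS[1]) redis.call('SET', KEYS[1], 0) return v" 1 stats:hits
```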
**Maximum response latency of an instance within 60 seconds**
```
$ redis-cli -h 127.0.0.1 -p 6379 --intrinsic-latency 60
Max latency so far: 1 microseconds.
Max latency so far: 15 microseconds.
Max latency so far: 17 microseconds.
Max latency so far: 18 microseconds.
Max latency so far: 31 microseconds.
Max latency so far: 32 microseconds.
Max latency so far: 59 microseconds.
Max latency so far: 72 microseconds.

1428669267 total runs (avg latency: 0.0420 microseconds / 42.00 nanoseconds per run).
Worst run took 1429x longer than the average latency.
```
Result analysis: the maximum response latency is 72 microseconds. Next, view the minimum, maximum, and average access latency of Redis over a period of time:
```
$ redis-cli -h 127.0.0.1 -p 6379 --latency-history -i 1
min: 0, max: 1, avg: 0.13 (100 samples) -- 1.01 seconds range
min: 0, max: 1, avg: 0.12 (99 samples) -- 1.01 seconds range
min: 0, max: 1, avg: 0.13 (99 samples) -- 1.01 seconds range
min: 0, max: 1, avg: 0.10 (99 samples) -- 1.01 seconds range
min: 0, max: 1, avg: 0.13 (98 samples) -- 1.00 seconds range
min: 0, max: 1, avg: 0.08 (99 samples) -- 1.01 seconds range
...
```
This samples the average operation latency of Redis every second; here the results fall between 0.08 and 0.13 ms.
- **Query the latest slow log**
The slow log shows at what point in time which commands were slow to execute. Set the slow log thresholds with the following configuration:
```
# Commands slower than 5000 microseconds (5 ms) are recorded
slowlog-log-slower-than 5000
# Keep the most recent 500 slow log entries
slowlog-max-len 500
```
Query the latest slow logs:

```
127.0.0.1:6379> SLOWLOG get 5
1) 1) (integer) 32693            # slow log ID
   2) (integer) 1593763337       # unix timestamp
   3) (integer) 5299             # execution time in microseconds
   4) 1) "LRANGE"                # command and arguments
      2) "user_list:2000"
      3) "0"
      4) "-1"
2) 1) (integer) 32692
   2) (integer) 1593763337
   3) (integer) 5044
   4) 1) "GET"
      2) "user_info:1000"
...
```
Analysis from the business perspective
Are complex commands being used?
Use the slow log: a record of command execution times that shows when each slow query ran.
Analysis:
- Computations that consume CPU
- Assembling the response data and transmitting it over the network takes time
- Command queuing: command execution is single-threaded (Redis 6.0 only added IO threads for socket reads/writes, on top of IO multiplexing), so commands queue up behind a slow one
Solution:
- Move aggregate operations to the client side (the application) for computation
- For O(n) commands, keep n below 300
BigKey operations
Symptoms
- SET/DEL are very slow
- Allocating or releasing memory takes a long time
- Examples: a String value larger than 10 KB, or a Hash with 20,000 fields
How to avoid
- Avoid bigkeys (keep values under 10 KB)
- Use UNLINK instead of DEL (Redis 4.0+ lazyfree)
- Redis provides a command to scan for bigkeys; to scan the distribution of bigkeys in an instance, run:
```
$ redis-cli -h 127.0.0.1 -p 6379 --bigkeys -i 0.01
...
-------- summary -------

Sampled 829675 keys in the keyspace!
Total key length in bytes is 10059825 (avg len 12.13)

Biggest string found 'key:291880' has 10 bytes
Biggest   list found 'mylist:004' has 40 items
Biggest    set found 'myset:2386' has 38 members
Biggest   hash found 'myhash:3574' has 37 fields
Biggest   zset found 'myzset:2704' has 42 members

36313 strings with 363130 bytes (04.38% of keys, avg size 10.00)
787393 lists with 896540 items (94.90% of keys, avg size 1.14)
1994 sets with 40052 members (00.24% of keys, avg size 20.09)
1990 hashs with 39632 fields (00.24% of keys, avg size 20.92)
1985 zsets with 39750 members (00.24% of keys, avg size 20.03)
```
Principle: internally, Redis runs the SCAN command to traverse all keys in the instance, then, according to each key's type, runs STRLEN, LLEN, HLEN, SCARD, or ZCARD to get the length of Strings and the element count of container types (List, Hash, Set, ZSet). Friendly reminders:
- Redis OPS rises sharply while a bigkey scan runs against a live instance. To reduce the impact on Redis, control the scan frequency with the -i parameter, which sets the sleep interval between scan batches, in seconds
- For container keys (List, Hash, Set, ZSet), the scan only reports the key with the most elements of each type. More elements does not necessarily mean more memory usage; evaluate actual memory consumption against the business
Solution:
- Service applications should avoid writing bigkeys
- If you are using Redis 4.0 or higher, use the UNLINK command instead of the DEL command. This command can reduce the impact on Redis by putting the key memory release operation into the background thread
- If you are using Redis 6.0 or higher, you can enable lazy-free (lazyfree-lazy-user-del = yes). When del is executed, the memory will be freed in the background thread
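A minimal redis.conf sketch of the lazy-free options mentioned above (all available in Redis 4.0+, except lazyfree-lazy-user-del, which requires 6.0+):

```
# Free the memory of evicted keys in a background thread
lazyfree-lazy-eviction yes
# Free the memory of expired keys in a background thread
lazyfree-lazy-expire yes
# Make a plain DEL behave like UNLINK (Redis 6.0+)
lazyfree-lazy-user-del yes
```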
Focus on key expiration
To understand expiration, first look at Redis's redisDb and dict structures.
```c
/* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB: key -> value */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP) */
    dict *ready_keys;           /* Blocked keys that received a PUSH */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    int id;                     /* Database ID */
    long long avg_ttl;          /* Average TTL, just for stats */
} redisDb;
```
dict
```c
typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];            /* Two hash tables, for incremental rehash */
    long rehashidx;          /* rehashing not in progress if rehashidx == -1 */
    unsigned long iterators; /* number of iterators currently running */
} dict;
```
Each dict contains two hash tables, ht[0] and ht[1]; rehashidx tracks rehash progress, and normally only the first table ht[0] is in use. dictht (dict.h/dictht):
```c
/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;      /* Array of buckets */
    unsigned long size;     /* Size of the table array */
    unsigned long sizemask; /* size - 1, used to compute a bucket index */
    unsigned long used;     /* Number of entries stored */
} dictht;
```
dictEntry(dict.h/dictEntry)
```c
typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next; /* Next entry in the same bucket (chaining) */
} dictEntry;
```
(Diagram: a redisDb instance)
Symptoms of expiration-related latency:
- Latency spikes at fixed points in time (e.g., on the hour)
- Spikes at fixed intervals; the slow log records nothing, while expired_keys bursts upward for short periods
Expiry policies
Periodic deletion: a scheduled task, by default running every 100 ms, randomly samples keys that have a TTL and deletes the expired ones. Lazy deletion: whenever a key is fetched, expireIfNeeded checks the key and deletes it if it has expired.
```c
int expireIfNeeded(redisDb *db, robj *key) {
    mstime_t when = getExpire(db,key);
    mstime_t now;

    if (when < 0) return 0; /* No expire for this key */

    /* Don't expire anything while loading. It will be done later. */
    if (server.loading) return 0;

    /* If we are in the context of a Lua script, we claim that time is
     * blocked to when the Lua script started. This way a key can expire
     * only the first time it is accessed and not in the middle of the
     * script execution, making propagation to slaves / AOF consistent.
     * See issue #1525 on Github for more information. */
    now = server.lua_caller ? server.lua_time_start : mstime();

    /* If we are running in the context of a slave, return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    if (server.masterhost != NULL) return now > when;

    /* Return when this key has not expired */
    if (now <= when) return 0;

    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    notifyKeyspaceEvent(NOTIFY_EXPIRED, "expired",key,db->id);
    return server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                         dbSyncDelete(db,key);
}
```
Eviction policies
Eviction is triggered when memory is insufficient to hold newly written data.
- noeviction: no space left; writes that insert data return an error
- allkeys-lru: evict the least recently used key, considering all keys
- allkeys-random: evict a random key, considering all keys
- volatile-lru: evict the least recently used key, considering only keys with an expiration time set
- volatile-random: when memory is insufficient, evict a random key, considering only keys with an expiration time set
- volatile-ttl: evict the key with the nearest expiration time, considering only keys with an expiration time set
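A minimal redis.conf sketch for capping memory and selecting one of the policies above (the values are illustrative):

```
maxmemory 4gb
maxmemory-policy allkeys-lru
# LRU/TTL eviction is approximate; this sets how many keys are sampled
# per eviction decision
maxmemory-samples 5
```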
In Redis 6, expiration no longer relies purely on random sampling: keys with TTLs are additionally tracked in a radix tree sorted by expiration time; see the Redis data structure internals for details.
CPU binding
Much of the time, to improve performance and reduce the cost of context switches across CPU cores, we bind the service process to specific CPUs at deployment. Besides the main thread that serves client requests, Redis Server also creates child processes and child threads: child processes handle data persistence, while child threads handle time-consuming work such as asynchronously releasing fds, asynchronously flushing the AOF, and asynchronous lazy-free.

If the Redis process is bound to a single logical CPU core, then when Redis persists, the forked child inherits the parent's CPU affinity. The child consumes a lot of CPU for persistence (scanning all instance data is CPU-heavy), so it competes with the main process for the same core, which delays the main process's handling of client requests and increases access latency. This is the performance pitfall of binding Redis to a single CPU.
Symptoms
- Redis is bound to a single core
- RDB saves and AOF rewrites are slow
Below, CPU Socket is abbreviated as S.
- On a multi-CPU architecture, an application can be scheduled across processors: it may run on S1 for a while, storing data in S1's local memory, then get scheduled onto S2. If it then accesses the data still resident in S1's memory, that is a remote memory access, which increases latency. This is the Non-Uniform Memory Access (NUMA) effect: a program hopping between sockets pays remote-access costs for the other socket's memory.
Solution: bind the network interrupt handler and the Redis instance to the same CPU socket, so that the Redis instance reads network data directly from local memory. Mind the NUMA core numbering scheme so you don't bind to the wrong core; run lscpu to check how the logical cores are numbered.
**Context switch**: the cost paid when a thread is moved between cores.
- When the running core changes, the runtime state recorded on one core must be synchronized over to the other core.
- The L1 and L2 caches on the new core do not contain the instructions and data the Redis instance was frequently using, so they must be reloaded from the L3 cache or even from main memory. This reloading takes time.
Solution:
Bind Redis to one CPU core with taskset:

```bash
# Bind to core 0
taskset -c 0 ./redis-server
```
- Our systems mostly run Linux; set the CPU governor to performance (high-performance mode)
Since Redis 6.0, the main thread, background threads, the background RDB process, and the AOF rewrite process can each be bound to fixed logical CPU cores:
```
# Redis Server and IO threads bound to CPU cores 0,2,4,6
server_cpulist 0-7:2
# Background (bio) child threads bound to CPU cores 1,3
bio_cpulist 1,3
# Background AOF rewrite process bound to CPU cores 8-11
aof_rewrite_cpulist 8-11
# Background RDB process bound to CPU cores 1,10,11
bgsave_cpulist 1,10-11
```
Command usage
- Disable the KEYS command.
- Instead of fetching all members at once, use the SCAN command for batched, cursor-style traversal (see the sketch after this list).
- Strictly cap the number of elements in Hash, Set, Sorted Set, and other container structures at the business level.
- Perform sort, union, intersection, and similar aggregations on the client side to reduce load on the Redis server.
- Deleting (DEL) a large object can take a long time; use the asynchronous UNLINK instead, which lets a background thread free the target data without blocking the Redis main thread.
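A minimal sketch of cursor-style traversal with SCAN; the key pattern user:* and the COUNT value are illustrative:

```bash
# Iterate the keyspace in batches of ~100 without blocking the server
cursor=0
while :; do
  reply=$(redis-cli SCAN "$cursor" MATCH 'user:*' COUNT 100)
  cursor=$(echo "$reply" | head -1)   # first line is the next cursor
  echo "$reply" | tail -n +2          # remaining lines are this batch's keys
  [ "$cursor" = "0" ] && break        # cursor 0 means the scan is complete
done
```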
Memory reaches maxmemory
After an instance reaches maxmemory, you may notice that latency increases on every subsequent write. Reason: once Redis hits maxmemory, it must first evict some data to bring the instance back below the limit before new data can be written. The eviction policies were covered above. Optimizations:
- Avoid storing bigkeys, reducing the time spent freeing memory
- Switch the eviction policy to random eviction, which is much faster than LRU (evaluate against business needs)
- Split the instance, spreading the key-eviction burden over multiple instances
- If on Redis 4.0 or higher, enable lazy-free (lazyfree-lazy-eviction = yes) so evicted keys are freed in a background thread
Rehash
Symptoms
- Writing keys occasionally shows latency spikes
- Rehash combined with maxmemory can trigger mass eviction! For example:
  - maxmemory = 6GB
  - current used memory = 5.8GB
  - the expansion triggered by rehash needs another 512MB
  - this exceeds maxmemory and triggers mass eviction
A rehash allocates a new hash table of double the capacity. For example, growing a table from 2^26 to 2^27 buckets allocates a new pointer array of 2^27 × 8 bytes = 1 GB before any entry is migrated.
How to control it:
- Keep the number of keys below 100 million
- Starting with Redis 6.0, the hash table is not expanded (rehashed) if the expansion would push memory past maxmemory
Let’s talk about the Rehash details
For performance, Redis splits rehashing into two modes, which together run until the rehash completes:
- lazy: a small rehash step piggybacked on normal dictionary operations
- active: driven by the server cron, rehashing in bounded time slices
Source: redis-3.0-annotated-unstable/src/dict.c
```c
/* This function performs just a step of rehashing, and only if there are
 * no safe iterators bound to our hash table. When we have iterators in the
 * middle of a rehashing we can't mess with the two hash tables otherwise
 * some element can be missed or duplicated.
 *
 * A dictionary cannot be rehashed while a safe iterator is active, because
 * the interleaved iteration and modification would corrupt it.
 *
 * This function is called by common lookup or update operations in the
 * dictionary so that the hash table automatically migrates from H1 to H2
 * while it is actively used.
 *
 * T = O(1) */
static void _dictRehashStep(dict *d) {
    if (d->iterators == 0) dictRehash(d,1);
}
```
```c
/* Performs N steps of incremental rehashing. Returns 1 if there are still
 * keys to move from the old to the new hash table, otherwise 0 is returned.
 *
 * Note that a rehashing step consists in moving a bucket (that may have more
 * than one key as we use chaining) from the old to the new hash table.
 *
 * T = O(N) */
int dictRehash(dict *d, int n) {
    /* Can only run while a rehash is in progress */
    if (!dictIsRehashing(d)) return 0;

    while(n--) {
        dictEntry *de, *nextde;

        /* Check if we already rehashed the whole table... */
        if (d->ht[0].used == 0) {
            /* Free the old table and promote the new one */
            zfree(d->ht[0].table);
            d->ht[0] = d->ht[1];
            /* Reset the old hash table */
            _dictReset(&d->ht[1]);
            d->rehashidx = -1;
            /* Return 0 to tell the caller the rehash is complete */
            return 0;
        }

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(d->ht[0].size > (unsigned)d->rehashidx);

        /* Skip empty buckets */
        while(d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;

        de = d->ht[0].table[d->rehashidx];
        /* Move all keys in this bucket from the old to the new hash HT */
        while(de) {
            unsigned int h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            /* Update counters */
            d->ht[0].used--;
            d->ht[1].used++;
            /* Proceed to the next node */
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        /* Advance the rehash index */
        d->rehashidx++;
    }
    return 1;
}
```
- _dictRehashStep calls dictRehash with n = 1, so it moves only one bucket from ht[0] to ht[1] at a time. Because _dictRehashStep is invoked by dictGetRandomKey, dictFind, dictGenericDelete, and dictAdd, every read, add, or delete on the dict advances the rehash, which speeds up the overall migration.
- dictRehash migrates n buckets per call. Since ht[1] was already sized when the resize was triggered, the main job is to traverse ht[0], take each key, and rehash it into ht[1] according to ht[1]'s bucket count. Once everything has moved, ht[0] is pointed at ht[1] and ht[1] is reset. The key piece of state is rehashidx, the index in ht[0] up to which the rehash has progressed.
Active rehashing call chain: serverCron -> databasesCron -> incrementallyRehash -> dictRehashMilliseconds -> dictRehash
- **serverCron**
- databasesCron
- incrementallyRehash
- dictRehashMilliseconds
- dictRehash
[1] serverCron
```c
/* This is our timer interrupt, called server.hz times per second.
 *
 * Here is where we do a number of things that need to be done asynchronously.
 * For instance:
 *
 * - Active expired keys collection (it is also performed in a lazy way on
 *   lookup).
 * - Software watchdog.
 * - Update some statistic.
 * - Incremental rehashing of the DBs hash tables.
 * - Triggering BGSAVE / AOF rewrite, and handling of terminated children.
 * - Clients timeout of different kinds.
 * - Replication reconnection.
 * - Many more...
 *
 * Everything directly called here will be called server.hz times per second,
 * so in order to throttle execution of things we want to do less frequently
 * a macro is used: run_with_period(milliseconds) { .... }
 */
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    int j;
    REDIS_NOTUSED(eventLoop);
    REDIS_NOTUSED(id);
    REDIS_NOTUSED(clientData);
/* Software watchdog: deliver the SIGALRM that will reach the signal
* handler if we don't return here fast enough. */
if (server.watchdog_period) watchdogScheduleSignal(server.watchdog_period);
/* Update the time cache. */
updateCachedTime();
    // Record the number of commands the server has executed
run_with_period(100) trackOperationsPerSecond();
/* We have just REDIS_LRU_BITS bits per object for LRU information.
* So we use an (eventually wrapping) LRU clock.
*
* Note that even if the counter wraps it's not a big problem,
* everything will still work but some object will appear younger
* to Redis. However for this to happen a given object should never be
* touched for all the time needed to the counter to wrap, which is
* not likely.
     *
     * Note that you can change the resolution altering the
     * REDIS_LRU_CLOCK_RESOLUTION define.
*/
server.lruclock = getLRUClock();
/* Record the max memory used since the server was started. */
    // Record the server's peak memory usage
if (zmalloc_used_memory() > server.stat_peak_memory)
server.stat_peak_memory = zmalloc_used_memory();
/* Sample the RSS here since this is a relatively slow call. */
server.resident_set_size = zmalloc_get_rss();
/* We received a SIGTERM, shutting down here in a safe way, as it is
* not ok doing so inside the signal handler. */
    // The server process received SIGTERM: shut down
if (server.shutdown_asap) {
        // Try to shut down the server
if (prepareForShutdown(0) == REDIS_OK) exit(0);
        // If shutdown failed, log it and clear the shutdown flag
redisLog(REDIS_WARNING,"SIGTERM received but errors trying to shut down the server, check the logs for more information");
server.shutdown_asap = 0;
}
/* Show some info about non-empty databases */
    // Log key-value statistics for each database
run_with_period(5000) {
for (j = 0; j < server.dbnum; j++) {
long long size, used, vkeys;
            // Number of available key-value slots
size = dictSlots(server.db[j].dict);
            // Number of key-value pairs in use
used = dictSize(server.db[j].dict);
            // Number of keys with an expiration time set
vkeys = dictSize(server.db[j].expires);
            // Log the counts
if (used || vkeys) {
redisLog(REDIS_VERBOSE,"DB %d: %lld keys (%lld volatile) in %lld slots HT.",j,used,vkeys,size);
/* dictPrintStats(server.dict); */
}
}
}
/* Show information about connected clients */
    // If not running in Sentinel mode, log client connection info
if (!server.sentinel_mode) {
run_with_period(5000) {
redisLog(REDIS_VERBOSE,
"%lu clients connected (%lu slaves), %zu bytes in use",
listLength(server.clients)-listLength(server.slaves),
listLength(server.slaves),
zmalloc_used_memory());
}
}
/* We need to do a few operations on clients asynchronously. */
    // Check clients: close timed-out ones and free excess client buffers
clientsCron();
/* Handle background operations on Redis databases. */
    // Perform background operations on the databases
databasesCron();
/* Start a scheduled AOF rewrite if this was requested by the user while
* a BGSAVE was in progress. */
    // If neither BGSAVE nor BGREWRITEAOF is running
    // and a BGREWRITEAOF is scheduled, start the BGREWRITEAOF
if (server.rdb_child_pid == -1 && server.aof_child_pid == -1 &&
server.aof_rewrite_scheduled)
{
rewriteAppendOnlyFileBackground();
}
/* Check if a background saving or AOF rewrite in progress terminated. */
    // Check whether a BGSAVE or BGREWRITEAOF has finished
if (server.rdb_child_pid != -1 || server.aof_child_pid != -1) {
int statloc;
pid_t pid;
        // Reap the child process's exit status, non-blocking
if ((pid = wait3(&statloc,WNOHANG,NULL)) != 0) {
int exitcode = WEXITSTATUS(statloc);
int bysignal = 0;
if (WIFSIGNALED(statloc)) bysignal = WTERMSIG(statloc);
            // BGSAVE finished
if (pid == server.rdb_child_pid) {
backgroundSaveDoneHandler(exitcode,bysignal);
            // BGREWRITEAOF finished
} else if (pid == server.aof_child_pid) {
backgroundRewriteDoneHandler(exitcode,bysignal);
} else {
redisLog(REDIS_WARNING,
"Warning, detected child with unmatched pid: %ld",
(long)pid);
}
updateDictResizePolicy();
}
} else {
/* If there is not a background saving/rewrite in progress check if
* we have to save/rewrite now */
        // No BGSAVE or BGREWRITEAOF in progress: check whether one is needed
        // Iterate over all save conditions to see if a BGSAVE is required
for (j = 0; j < server.saveparamslen; j++) {
struct saveparam *sp = server.saveparams+j;
/* Save if we reached the given amount of changes,
* the given amount of seconds, and if the latest bgsave was
* successful or if, in case of an error, at least
* REDIS_BGSAVE_RETRY_DELAY seconds already elapsed. */
            // Check whether some save condition has been met
if (server.dirty >= sp->changes &&
server.unixtime-server.lastsave > sp->seconds &&
(server.unixtime-server.lastbgsave_try >
REDIS_BGSAVE_RETRY_DELAY ||
server.lastbgsave_status == REDIS_OK))
{
redisLog(REDIS_NOTICE,"%d changes in %d seconds. Saving...",
sp->changes, (int)sp->seconds);
                // Run BGSAVE
rdbSaveBackground(server.rdb_filename);
break;
}
}
/* Trigger an AOF rewrite if needed */
        // Trigger BGREWRITEAOF
if (server.rdb_child_pid == -1 &&
server.aof_child_pid == -1 &&
server.aof_rewrite_perc &&
            // Current AOF size exceeds the minimum required for BGREWRITEAOF
server.aof_current_size > server.aof_rewrite_min_size)
{
            // AOF file size recorded after the last completed rewrite
long long base = server.aof_rewrite_base_size ?
server.aof_rewrite_base_size : 1;
            // Growth of the current AOF size relative to base, as a percentage
long long growth = (server.aof_current_size*100/base) - 100;
            // If growth exceeds the configured percentage, run BGREWRITEAOF
if (growth >= server.aof_rewrite_perc) {
redisLog(REDIS_NOTICE,"Starting automatic rewriting of AOF on %lld%% growth",growth);
                // Run BGREWRITEAOF
rewriteAppendOnlyFileBackground();
}
}
}
    // According to the AOF policy, decide whether the contents of the
    // AOF buffer need to be written out to the AOF file
/* AOF postponed flush: Try at every cron cycle if the slow fsync
* completed. */
if (server.aof_flush_postponed_start) flushAppendOnlyFile(0);
/* AOF write errors: in this case we have a buffer to flush as well and
* clear the AOF error in case of success to make the DB writable again,
* however to try every second is enough in case of 'hz' is set to
* an higher frequency. */
run_with_period(1000) {
if (server.aof_last_write_status == REDIS_ERR)
flushAppendOnlyFile(0);
}
/* Close clients that need to be closed asynchronous */
    // Close clients that are flagged for asynchronous close
freeClientsInAsyncFreeQueue();
/* Clear the paused clients flag if needed. */
clientsArePaused(); /* Don't check return value, just use the side effect. */
/* Replication cron function -- used to reconnect to master and
* to detect transfer failures. */
    // Replication cron: reconnect to the master, send ACKs to the master,
    // detect transfer failures, drop timed-out slaves, and so on
run_with_period(1000) replicationCron();
/* Run the Redis Cluster cron. */
    // If running in cluster mode, run the cluster cron
run_with_period(100) {
if (server.cluster_enabled) clusterCron();
}
/* Run the Sentinel timer if we are in sentinel mode. */
    // If running in Sentinel mode, run the Sentinel main timer
run_with_period(100) {
if (server.sentinel_mode) sentinelTimer();
}
/* Cleanup expired MIGRATE cached sockets. */
run_with_period(1000) {
migrateCloseTimedoutSockets();
}
    // Increment the loop counter
server.cronloops++;
return 1000/server.hz;
}
```
[2] databasesCron — performs background operations on the databases: deleting expired keys, resizing, and active incremental rehashing.

```c
void databasesCron(void) {
    /* Expire keys by random sampling. Not required for slaves
     * as master will synthesize DELs for us. */
    /* If the server is not a slave, run the active expiration cycle */
    if (server.active_expire_enabled && server.masterhost == NULL)
        /* CYCLE_SLOW mode clears as many expired keys as possible */
        activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);

    /* Perform hash tables rehashing if needed, but only if there are no
     * other processes saving the DB on disk. Otherwise rehashing is bad
     * as will cause a lot of copy-on-write of memory pages. */
    if (server.rdb_child_pid == -1 && server.aof_child_pid == -1) {
        /* We use global counters so if we stop the computation at a given
         * DB we'll be able to start from the successive in the next
         * cron loop iteration. */
        static unsigned int resize_db = 0;
        static unsigned int rehash_db = 0;
        unsigned int dbs_per_call = REDIS_DBCRON_DBS_PER_CALL;
        unsigned int j;

        /* Don't test more databases than we have */
        if (dbs_per_call > server.dbnum) dbs_per_call = server.dbnum;

        /* Resize */
        for (j = 0; j < dbs_per_call; j++) {
            tryResizeHashTables(resize_db % server.dbnum);
            resize_db++;
        }

        /* Rehash: incremental rehash of the dictionaries */
        if (server.activerehashing) {
            for (j = 0; j < dbs_per_call; j++) {
                int work_done = incrementallyRehash(rehash_db % server.dbnum);
                rehash_db++;
                if (work_done) {
                    /* If the function did some work, stop here, we'll do
                     * more at the next cron loop. */
                    break;
                }
            }
        }
    }
}
```
[3] incrementallyRehash

```c
/* Our hash table implementation performs rehashing incrementally while
 * we write/read from the hash table. Still if the server is idle, the hash
 * table will use two tables for a long time. So we try to use 1 millisecond
 * of CPU time at every call of this function to perform some rehashing.
 *
 * If the server does not execute commands for a long time, a dictionary's
 * rehash might otherwise never finish, so the database is actively
 * rehashed here.
 *
 * The function returns 1 if some rehashing was performed, otherwise 0
 * is returned. */
int incrementallyRehash(int dbid) {
    /* Keys dictionary */
    if (dictIsRehashing(server.db[dbid].dict)) {
        dictRehashMilliseconds(server.db[dbid].dict,1);
        return 1; /* already used our millisecond for this loop... */
    }
    /* Expires */
    if (dictIsRehashing(server.db[dbid].expires)) {
        dictRehashMilliseconds(server.db[dbid].expires,1);
        return 1; /* already used our millisecond for this loop... */
    }
    return 0;
}
```
[4] dictRehashMilliseconds

```c
/* Rehash for an amount of time between ms milliseconds and ms+1 milliseconds,
 * in steps of 100 buckets at a time.
 *
 * T = O(N) */
int dictRehashMilliseconds(dict *d, int ms) {
    long long start = timeInMilliseconds();
    int rehashes = 0;

    while(dictRehash(d,100)) {
        rehashes += 100;
        /* Stop once the time budget is exhausted */
        if (timeInMilliseconds()-start > ms) break;
    }
    return rehashes;
}
```
[5] dictRehash — already shown in full above: it performs up to N steps of incremental rehashing, returning 1 while keys remain to be moved from ht[0] to ht[1] and 0 once the whole table has been migrated; each step moves one bucket, which may hold multiple chained nodes.
That completes the walk through the rehash source. Why does rehash affect performance? Because rehashing adds extra data-movement work on top of normal request handling.
When does Redis do rehash?
Redis uses the load factor to decide whether a rehash is needed: load factor = number of entries in the hash table / number of hash buckets (ht[0].used / ht[0].size). Redis triggers a rehash under either of two conditions: the load factor is >= 1 and rehashing is currently allowed; or the load factor is >= 5.
- In the first case, if the load factor equals 1 and key-value pairs are distributed evenly across the buckets, the hash table can still avoid chained hashing, since each bucket holds exactly one entry. But any further writes force chaining, which hurts query performance. Rehashing is disallowed while an RDB is being generated or an AOF rewrite is running, to avoid inflating their copy-on-write cost. If neither is in progress, the rehash can proceed; otherwise it is deferred, and lookups fall back to the slower chained search as new data is written.
- In the second case, a load factor >= 5 means the stored data far exceeds the bucket count and there are long chains everywhere, seriously hurting performance, so the rehash starts immediately. If the load factor is below 1, or it is between 1 and 5 while rehashing is temporarily disallowed (for example the instance is generating an RDB or rewriting the AOF), the hash table is not rehashed.
The scheduled (cron) tasks also include rehash work: they are executed at a fixed frequency (for example, every 100 ms) to keep an in-progress rehash moving.
The operational level
fork and persistence
Symptoms
Redis latency appears during RDB generation and AOF rewrite, so check whether slowdowns coincide with those windows. While fork executes, the main process must copy its page tables to the child; for a very large instance this takes a long time, and if CPU resources are also tight, the fork takes even longer, possibly reaching the level of seconds. This seriously affects Redis performance.
Locating the problem
Run the INFO command on Redis and check the latest_fork_usec field, in microseconds:

```
latest_fork_usec:59477
```
This is how long the main process took to fork a child, during which the entire instance is blocked and cannot process client requests. If it is long, think of it as the JVM's stop-the-world (STW) state: the instance is unavailable. The master node also forks a child to generate an RDB for full synchronization to a slave, so that path affects Redis performance too.
Solutions
- Persist on the slave, at night during off-peak hours; for services that can tolerate data loss (such as Redis used as a pure cache), AOF and AOF rewrite can be disabled
- Control the memory of the Redis instance: keep it within 10 GB, since fork time grows with instance size
- Lower the probability of full master-slave synchronization: increase repl-backlog-size appropriately to avoid full syncs (a sketch follows this list)
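A one-line redis.conf sketch of the backlog tuning above; the size is illustrative and should be chosen from write traffic and expected disconnect windows:

```
# A larger backlog lets a briefly disconnected replica do a partial
# resync instead of a full synchronization
repl-backlog-size 512mb
```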
AOF enabled
How AOF works
- After Redis executes a write command, it appends the command to the AOF file's page cache (write system call)
- Redis flushes the AOF page-cache data to disk according to the configured flush policy (fsync system call)
In more detail
- After the main thread updates the in-memory data, it performs the write, then decides from the configuration whether to fsync immediately or defer it
- At startup, Redis creates a dedicated BIO thread to handle AOF persistence
- If appendfsync = everysec, an asynchronous flush task is queued to the BIO thread when the second arrives
- The BIO thread polls the task pool and synchronously runs fdatasync when it picks up a task
Redis uses the appendfsync parameter to set the flush policy; it has three main options (see the sketch after this list):
- **always**:
  - Explanation: the main thread flushes to disk immediately after every write. This consumes a lot of disk IO but gives the highest data safety.
  - Problem: a command returns only after it has been written to disk, and this work is done by the main thread, which increases pressure on Redis and lengthens the request path.
- **no**:
  - Explanation: the operating system decides when in-memory data is flushed to disk. This has the least performance impact but the lowest data safety; how much data is lost when Redis goes down depends on the OS flush timing.
  - Problem: data still in memory is lost on a crash.
- **everysec**:
  - Explanation: the main thread only writes to memory on each write operation; a background thread performs the flush (fsync system call) once per second. The performance impact is relatively small, but up to 1 second of data is lost if Redis goes down.
  - Caveat: if disk IO load is high, the background thread blocks in the AOF flush (fsync system call). The main thread still receives write requests and must write the data to the file's page cache (write system call), but because fsync is stuck on the file, the main thread's write system call also blocks, and it does not return until the background fsync completes.
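A minimal redis.conf sketch of the everysec trade-off, together with the rewrite-time knob from the solutions below (values are common recommendations, not universal):

```
appendonly yes
# Flush once per second from a background thread
appendfsync everysec
# During an AOF rewrite, skip fsync so a busy disk cannot stall the
# main thread (at the cost of losing more data if Redis crashes then)
no-appendfsync-on-rewrite yes
```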
Symptoms
- Disk load is high
- A child process is doing an AOF rewrite, which consumes a lot of disk IO
Solutions
- Upgrade the hardware to SSDs
- Locate the programs that are consuming disk bandwidth
- Set no-appendfsync-on-rewrite = yes (as in the sketch above)
  - During an AOF rewrite, appendfsync behaves as no
  - During an AOF rewrite, the AOF background thread does not flush
  - In that window, appendfsync is effectively set to none temporarily
The Redis author wrote a blog post about AOF's impact on access latency: "fsync() on a different thread: apparently a useless trick". Moving fsync to a BIO thread is not a huge latency improvement: with appendfsync = everysec, fdatasync runs in the background and the write of aof_buf is small, so it almost never blocks. However, fdatasync holds the file handle, and write needs the same file handle, so the main thread's write can still block.

This is also why a RAID performance problem on some Inspur servers did not affect most applications but did affect latency-sensitive ones like Redis. Can we turn AOF off, given that enabling it adds access latency? Yes: in pure-cache scenarios, where a miss falls through to the database or the data can be quickly rebuilt from the database, disabling AOF yields the best performance. In fact, even with AOF disabled, a shard instance crash does not necessarily lose the shard's data: in real production environments each shard has a Master and a Slave kept in sync by the Redis replication mechanism. When the master crashes, automatic failover promotes the slave, preserving data reliability. To avoid both crashing at once, master and slave are placed on different physical machines and switches in production.
Swap (virtual memory)
Redis Virtual Memory first became available in a stable release with Redis 2.0; the VM implementation in the unstable Git branch was available earlier and had been tested to be stable enough.
Introduction
Redis follows the key-value model, and both keys and values normally live in memory. Sometimes that is not the best choice, so the design requires keys to stay in memory (for fast lookup) while rarely used values may be swapped out to disk. In practice, if a data set has 100,000 keys in memory and only 10% of them are used frequently, a Redis with virtual memory enabled moves the values of the rarely used keys to disk. When clients request those values, they are read back from the swap file and loaded into memory.
Explanation
The official explanation resembles Windows virtual memory: when memory is insufficient, part of the disk is used as memory. Linux (on which Android is based) uses a Swap partition for this: when memory runs low, the operating system moves temporarily unused in-memory data to the swap space on disk, freeing memory for other programs, just like the Windows page file (pagefile.sys).
Symptoms
- Requests become slow
- Response latency rises to milliseconds or even seconds
- The service is essentially unavailable
```
# Find the Redis process ID
$ ps -aux | grep redis-server

# Check Redis swap usage
$ cat /proc/$pid/smaps | egrep '^(Swap|Size)'
Size: 1256 kB
Swap: 0 kB
Size: 4 kB
Swap: 0 kB
Size: 132 kB
Swap: 0 kB
Size: 63488 kB
Swap: 0 kB
Size: 132 kB
Swap: 0 kB
Size: 65404 kB
Swap: 0 kB
Size: 1921024 kB
Swap: 0 kB
...
```

Each Size line is the size of one block of memory used by Redis, and the Swap line below it shows how much of that block has been swapped to disk. Solutions: 1. add machine memory so Redis has enough to use; 2. reorganize memory usage to free enough memory for Redis, then release Redis's swap so Redis uses RAM again.
Analysis
- With swap, in-memory data may end up backed by disk through the virtual address mapping
- Reading data from disk is slow
How to avoid
- Reserve enough memory headroom and avoid swap being used
- Monitor memory and swap usage (a monitoring sketch follows)
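A small monitoring sketch along those lines; the pgrep pattern and the reporting logic are illustrative, not a definitive check:

```bash
#!/bin/sh
# Sum the Swap fields across the Redis process's memory mappings
pid=$(pgrep -x redis-server | head -1)
swapped=$(awk '/^Swap:/ {sum += $2} END {print sum}' "/proc/$pid/smaps")
echo "redis-server (pid $pid) swapped out: ${swapped} kB"
```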
Memory fragmentation
Causes
If Redis data is frequently modified, memory may fragment, which lowers Redis's memory utilization. Run the INFO command to obtain the instance's memory fragmentation ratio. Contributing factors:
- write and update patterns
- the memory allocator
Analysis
The official formula for the Redis memory fragmentation ratio: **mem_fragmentation_ratio = used_memory_rss / used_memory**
- used_memory_rss is the Redis process's total resident memory, the RES column we see in top
- used_memory is what the Redis memory allocator (e.g., jemalloc) has allocated: Redis's own structures, buffers, data objects, and so on
A ratio > 1 indicates fragmentation; a ratio < 1 is usually attributed to swap being used, which makes Redis slow because access goes to disk. But is that really the whole story?
- A low fragmentation ratio is not only related to swap; it is generally recommended to disable swap in production environments.
- When the replication backlog buffer is configured large and the business data volume is small, the ratio can be far below 1. This is normal and needs no tuning.
- The replication backlog buffer, repl-backlog-size, is usually set to a large value in production to prevent frequent full resynchronization from the master from hurting performance.
- As the business data volume grows, the Redis memory fragmentation ratio gradually approaches 1.
Solutions
- Automatic defragmentation is disabled by default (commands below)
- Set the thresholds appropriately (see the sketch after the commands)
```
# Automatic defragmentation is off by default; query it with:
127.0.0.1:6379> config get activedefrag
1) "activedefrag"
2) "no"

# Enable automatic memory defragmentation
127.0.0.1:6379> config set activedefrag yes
OK

# Manual defragmentation
127.0.0.1:6379> memory purge
OK
```

Note that defragmentation is executed in the main thread.
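The thresholds mentioned above map to parameters like the following (a sketch; check your version's defaults):

```
# Do not defragment until at least 100 MB is wasted to fragmentation
active-defrag-ignore-bytes 100mb
# Start defragmenting above 10% fragmentation
active-defrag-threshold-lower 10
# Cap the CPU share the defragmenter may consume
active-defrag-cycle-max 25
```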
Network bandwidth
Symptoms
- Runs stably for a long time, then suddenly starts slowing down, and the slowness persists
- Network bandwidth alarms fire
How to avoid
- Troubleshoot what is consuming the bandwidth
- Scale out and migrate instances
- Set bandwidth alerts
Monitoring
Add monitoring for the various metrics of the Redis machines, and code-review monitoring scripts for bugs.
Think about it in terms of resources
- CPU: complex commands, data persistence
- Memory: bigkey allocation/free, data expiration/eviction, defragmentation, huge pages, copy-on-write
- Disk: data persistence, AOF flush policy
- Network: traffic overload, short-lived connections
- Hardware: CPU architecture (NUMA)
- Operating system: huge pages, copy-on-write, swap, CPU binding
How to get the best out of Redis
- Keep keys as short as possible to save memory
- Avoid bigkeys (keep values under 10 KB)
- Do aggregation on the client side
- For O(N) commands, keep N <= 300
- Batch commands with pipelines to reduce network round trips
- Avoid keys expiring all at once; spread out the expiration times
- Choose an appropriate eviction policy
- Keep the number of keys in a single instance below 100 million
How best to operate Redis
- Isolated deployment (per business line; master and slave separated)
- Keep a single instance under 10 GB
- Perform backups on the slave
- Pure caches can disable AOF
- Do not deploy instances on virtual machines
- Disable transparent huge pages
- Configure AOF as everysec
- Be cautious when binding CPUs
Be familiar with the monitoring principles to ensure adequate CPU, memory, disk, and network resources!
Summary
References
Why is Redis slow? A long-form guide to thoroughly troubleshooting Redis performance problems: kaito-kidd.com/2021/01/23/…
Why CPU architecture affects Redis performance: time.geekbang.org/column/arti…
Redis documentation (distributed under the Creative Commons Attribution-ShareAlike 4.0 International License): redis.io/documentati…
Redis expired keys: www.jianshu.com/p/21f648579…
What to do when the Redis memory fragmentation ratio is too low: zhuanlan.zhihu.com/p/360796352