Redis design optimization
Estimate Redis memory usage
To estimate how much memory the data in Redis occupies, you need a thorough understanding of the Redis memory model
Suppose there are 90,000 key-value pairs, each key is 12 bytes long, and each value is 12 bytes long (and neither key nor value is an integer).
Let’s estimate the space taken up by those 90,000 key-value pairs. Before estimating, first determine the encoding Redis will use for these strings: embstr (the encoding for short strings)
The memory occupied by the 90,000 key-value pairs can be divided into two parts: one is the space occupied by 90,000 dictEntry structures; the other is the bucket array space required by the hash table that holds them
The space occupied by each dictEntry consists of:
- A dictEntry structure: 24 bytes (on a 64-bit operating system a pointer is 8 bytes, and a dictEntry consists of three pointers); jemalloc allocates a 32-byte memory block for it
- The key: 12 bytes, so SDS(key) requires 12 + 4 = 16 bytes (SDS size = about 4 bytes of header + string length); jemalloc allocates a 16-byte memory block
- A redisObject: 16 bytes (type 4 bits + encoding 4 bits + lru 24 bits + refcount 4 bytes + ptr 8 bytes = 16 bytes); jemalloc allocates a 16-byte memory block
- The value: 12 bytes, so SDS(value) requires 12 + 4 = 16 bytes (SDS size = about 4 bytes of header + string length); jemalloc allocates a 16-byte memory block
To sum up, a dictEntry takes up 32 + 16 + 16 + 16 = 80 bytes
The smallest 2^n bucket array size greater than 90,000 is 131072 (2^17); each bucket element stores a pointer, which is 8 bytes on a 64-bit system
Therefore, the memory occupied by the 90,000 key-value pairs can be estimated as: 90000 × 80 + 131072 × 8 = 8248576 bytes (roughly 8 MB)
As a comparison, if the length of key and value is increased from 12 bytes to 13 bytes, the corresponding SDS becomes 17 bytes and jemalloc allocates a 32-byte block, so each dictEntry grows from 80 bytes to 112 bytes. In this case, the memory occupied by the 90,000 key-value pairs is estimated to be 90000 × 112 + 131072 × 8 = 11128576 bytes (roughly 11 MB)
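To make the arithmetic concrete, here is a minimal Java sketch of this estimate. The size-class list is a simplified assumption (real jemalloc has many more classes), and the 24/16-byte structure sizes are the ones used above:
public class RedisMemoryEstimate {
    // Simplified jemalloc size classes; real jemalloc has many more
    static final int[] SIZE_CLASSES = {8, 16, 32, 48, 64, 80, 96, 112, 128};

    // Round a request up to the next size class
    static int jemallocAlloc(int bytes) {
        for (int c : SIZE_CLASSES) if (bytes <= c) return c;
        throw new IllegalArgumentException("size too large for this sketch");
    }

    static long estimate(long pairs, int keyLen, int valueLen) {
        int dictEntry = jemallocAlloc(24);             // 3 pointers on 64-bit
        int key = jemallocAlloc(keyLen + 4);           // SDS header ~4 bytes
        int redisObject = jemallocAlloc(16);
        int value = jemallocAlloc(valueLen + 4);
        long perPair = dictEntry + key + redisObject + value;

        long buckets = 1;                              // smallest 2^n >= pairs
        while (buckets < pairs) buckets <<= 1;
        return pairs * perPair + buckets * 8;          // 8-byte bucket pointers
    }

    public static void main(String[] args) {
        System.out.println(estimate(90000, 12, 12));   // 8248576
        System.out.println(estimate(90000, 13, 13));   // 11128576
    }
}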
Optimize memory footprint
Understanding the Redis memory model helps optimize its memory footprint. The following describes several optimization scenarios and methods
- Optimize using jemalloc allocation characteristics
Since jemalloc allocates memory in discrete size classes, a one-byte change in a key or value string can cause a large change in memory usage. Take advantage of this at design time
For example, if the key is 13 bytes long, its SDS takes 17 bytes and jemalloc allocates a 32-byte block. If the key length is reduced to 12 bytes, the SDS is 16 bytes and jemalloc allocates only 16 bytes, halving the space occupied by each key
- Use integers/long integers
If a value is an integer/long integer, Redis stores it using the int encoding (8 bytes) instead of a string, which saves space. Therefore, prefer long integers over strings wherever the data allows it
- Shared objects
With shared objects, you can avoid creating duplicate objects (and their redisObject headers), saving memory. Currently, the shared objects in Redis are only the 10,000 integers 0-9999; you can increase the number of shared objects by adjusting the OBJ_SHARED_INTEGERS constant (it is defined in the Redis source, so changing it requires recompiling)
For example, if OBJ_SHARED_INTEGERS is set to 20000, integers between 0 and 19999 can be shared. Suppose a forum site stores the number of views per post in Redis, and most of those counts fall in the range 0 to 20,000: by appropriately increasing OBJ_SHARED_INTEGERS, you can save memory through shared objects
- Shorten the storage length of key-value pairs
The longer the key-value pairs, the worse the performance. In write-performance tests, with the key held constant, the larger the value, the slower the operation. One reason is that Redis uses different internal encodings for the same data type; strings, for example, have three internal encodings: int (integer encoding), embstr (compact encoding for short strings, with the object header and data allocated together), and raw (ordinary dynamic string encoding). The Redis authors balance efficiency and space through these encodings, but the larger the data, the more complex the internal encoding that must be used, and the lower that encoding's storage performance
That’s just write speed; several other problems arise when key-value pairs are large
- The larger the content, the longer persistence takes; the longer the resulting pauses, the worse Redis performance becomes
- The larger the content, the more data must be transferred over the network; this takes longer and slows overall performance
- The larger the content, the more memory it consumes and the more frequently eviction is triggered, which puts a heavier running burden on Redis
Therefore, while preserving the semantics, we should shorten the storage length of key-value pairs as much as possible; where necessary, data should be serialized and compressed before being stored. Taking Java as an example (see the sketch below), protostuff or Kryo can be used for serialization, and Snappy for compression
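A minimal sketch of that approach, assuming the Kryo and snappy-java libraries are on the classpath; the class name and buffer sizes are illustrative choices, not fixed requirements:
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.xerial.snappy.Snappy;

public class CompressedCodec {
    private final Kryo kryo = new Kryo();

    public CompressedCodec(Class<?>... types) {
        for (Class<?> t : types) kryo.register(t);   // registration keeps output small
    }

    // Serialize with Kryo, then compress with Snappy before storing in Redis
    public byte[] encode(Object value) throws java.io.IOException {
        Output out = new Output(256, -1);            // buffer grows as needed
        kryo.writeObject(out, value);
        return Snappy.compress(out.toBytes());
    }

    public <T> T decode(byte[] stored, Class<T> type) throws java.io.IOException {
        byte[] raw = Snappy.uncompress(stored);
        return kryo.readObject(new Input(raw), type);
    }
}
The resulting byte arrays can be written with the binary variants of the Redis commands (e.g. a byte[]-keyed SET), trading a little CPU for a smaller memory footprint and less network traffic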
View Redis memory statistics
used_memory:853464 # bytes allocated by the Redis allocator: the actual cached data plus the memory Redis itself needs to run (e.g. metadata, Lua); fragmentation is not counted
used_memory_rss:12247040 # memory of the Redis process as seen by the operating system, not including swapped-out virtual memory (bytes)
mem_fragmentation_ratio:15.07 # ratio of used_memory_rss to used_memory; less than 1 indicates virtual memory (swap) usage
mem_fragmentation_bytes: # memory fragmentation in bytes
mem_allocator:jemalloc-5.1.0 # allocator chosen at compile time
- used_memory
The used_memory field is the amount of memory allocated by the Redis allocator, in bytes
used_memory_human is the same value in a more human-readable form
- used_memory_rss
Records the Redis process memory allocated by the operating system, which includes the fragments in Redis memory that jemalloc can no longer allocate (in bytes)
used_memory vs. used_memory_rss
The former is from the Redis perspective, the latter from the operating system perspective. They differ partly because of memory fragmentation and partly because the Redis process itself needs memory to run, which generally makes the former smaller than the latter
Because Redis holds a large amount of data in practice, the memory the process needs just to run is much smaller than the data and the fragments, so the ratio of used_memory_rss to used_memory serves as a measure of the Redis memory fragmentation rate; this parameter is mem_fragmentation_ratio
- mem_fragmentation_ratio
The memory fragmentation ratio, i.e. the ratio of used_memory_rss to used_memory
mem_fragmentation_ratio is generally greater than 1, and a larger value indicates a larger proportion of memory fragmentation
mem_fragmentation_ratio < 1
The mem_fragmentation_ratio value is the ratio between the size of the process's memory resident set (RSS, as measured by the OS) and the total number of bytes Redis has allocated through its allocator. Now, if memory is allocated with libc (compared to jemalloc or tcmalloc), or if some other process on the system uses more memory during a benchmark, part of Redis's memory can be swapped out by the operating system. That reduces the RSS (because a portion of Redis memory is no longer in main memory), and the resulting fragmentation ratio will be less than 1. In other words, this ratio is only meaningful if you are certain the operating system is not swapping Redis memory (if it is, there will be performance problems anyway)
In general, a mem_fragmentation_ratio around 1.03 is healthy (for jemalloc). The ratio is large right after startup because no data has been stored into Redis yet, and the memory the Redis process itself needs to run makes used_memory_rss much larger than used_memory
- mem_allocator
The memory allocator used by Redis, specified at compile time; it can be libc, jemalloc, or tcmalloc. The default is jemalloc
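To watch these numbers from an application, you can parse the INFO memory section. A small sketch using Jedis, where the host and port are placeholder assumptions:
import redis.clients.jedis.Jedis;

public class MemoryStats {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            // INFO memory returns "field:value" lines like the ones above
            for (String line : jedis.info("memory").split("\r\n")) {
                if (line.startsWith("used_memory:")
                        || line.startsWith("used_memory_rss:")
                        || line.startsWith("mem_fragmentation_ratio:")) {
                    System.out.println(line);
                }
            }
        }
    }
}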
Redis performance optimization
Set the expiration time of key values
We should set a reasonable expiration time for key values according to the actual business situation. That way, Redis automatically clears expired key-value pairs for you, saving memory and avoiding an excessive accumulation of keys that frequently triggers the memory eviction policy
Redis has four different commands for setting a key's time to live (how long the key may exist) or expiration time (the moment at which the key will be removed), illustrated in the sketch after this list
- The EXPIRE command sets the time to live of a key to TTL seconds
- The PEXPIRE command sets the time to live of a key to TTL milliseconds
- The EXPIREAT <timestamp> command sets the expiration time of a key to the given timestamp, specified in seconds
- The PEXPIREAT <timestamp> command sets the expiration time of a key to the given timestamp, specified in milliseconds
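A minimal Jedis sketch of the four commands; the key name and times are arbitrary:
import redis.clients.jedis.Jedis;

public class ExpireExamples {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            jedis.set("session:42", "data");
            jedis.expire("session:42", 60);               // EXPIRE: live for 60 seconds
            jedis.pexpire("session:42", 60_000);          // PEXPIRE: live for 60,000 ms
            long at = System.currentTimeMillis() / 1000 + 3600;
            jedis.expireAt("session:42", at);             // EXPIREAT: absolute Unix time (s)
            jedis.pexpireAt("session:42", at * 1000);     // PEXPIREAT: absolute Unix time (ms)
            System.out.println(jedis.ttl("session:42"));  // remaining TTL in seconds
        }
    }
}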
Using the Lazy Free feature
The lazy free feature, a long-awaited addition in Redis 4.0, can be understood as lazy or delayed deletion: keys are deleted asynchronously, with the memory released in a BIO (Background I/O) thread, to reduce blocking of the main Redis thread
Lazy free applies to four scenarios, all of which are turned off by default
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no
The settings have the following meanings
- lazyfree-lazy-eviction: whether to enable lazy free when Redis evicts keys because its running memory exceeds maxmemory
- lazyfree-lazy-expire: whether to enable lazy free when deleting keys that have passed their expiration time
- lazyfree-lazy-server-del: some commands carry an implicit DEL of an existing key; for example, RENAME deletes the target key if it already exists, and if that target is a big key the deletion blocks. This option controls whether lazy free is used in such scenarios
- slave-lazy-flush: during full resynchronization, the slave runs FLUSHALL to clear its own data before loading the master's RDB file. This option controls whether that flush uses the lazy free mechanism
It is recommended to enable lazyfree-lazy-eviction, lazyfree-lazy-expire, and lazyfree-lazy-server-del to improve the execution efficiency of the main thread
Limit Redis memory size and set a memory eviction policy
The maximum memory can be set in several formats
maxmemory 1048576
maxmemory 1048576B
maxmemory 1000KB
maxmemory 100MB
maxmemory 1GB
If maxmemory is not specified, Redis on a 32-bit system will crash once newly added data exceeds the maximum addressable memory, so be sure to set it. A common recommendation is about 75% of physical memory, or about 60% when there are many writes
LRU principle
LRU (Least Recently Used) filters out data based on historical access records. Its core idea is that "if data has been accessed recently, the probability of it being accessed again is higher".
LFU principle
LFU (Least Frequently Used) evicts the data used least frequently over a period of time. It is a caching algorithm used to manage computer memory: the system records and tracks how often each block is used, and when the cache is full and more space is needed, it evicts the block with the lowest usage count. The simplest implementation assigns a counter to each block loaded into the cache; each time the block is referenced, the counter increases by one. When the cache reaches capacity and a new block is waiting to be inserted, the system finds the block with the lowest counter and removes it from the cache (see the sketch after the next paragraph)
LRU and LFU have different emphases: LRU reflects when an element was used, while LFU reflects how often it was used. A drawback of LFU is that a cache entry accessed very frequently over a short period is immediately promoted to hot data and then guaranteed not to be evicted, remaining in memory even though it was only briefly popular and will not be accessed again for a long time; the burst of accesses inflates its reference count. Conversely, newly added entries are easy to delete quickly because their reference counts are still low
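A minimal counter-based LFU cache sketch of the scheme just described. Note that Redis's own LFU uses a probabilistic logarithmic counter with a decay period rather than exact counts, so this is an illustration of the idea, not Redis's implementation:
import java.util.HashMap;
import java.util.Map;

public class LfuCache<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Long> counts = new HashMap<>();   // usage counter per key

    public LfuCache(int capacity) { this.capacity = capacity; }

    public V get(K key) {
        if (!values.containsKey(key)) return null;
        counts.merge(key, 1L, Long::sum);                  // each reference bumps the counter
        return values.get(key);
    }

    public void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            // Evict the key with the lowest usage count
            K victim = null;
            long min = Long.MAX_VALUE;
            for (Map.Entry<K, Long> e : counts.entrySet()) {
                if (e.getValue() < min) { min = e.getValue(); victim = e.getKey(); }
            }
            values.remove(victim);
            counts.remove(victim);
        }
        values.put(key, value);
        counts.merge(key, 1L, Long::sum);
    }
}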
Redis cache elimination strategy
When the Redis in-memory data set grows to a certain size, a data eviction policy kicks in
The policy is configured with maxmemory-policy (for example, maxmemory-policy volatile-lru). Since Redis 4.0 there are 8 eviction policies, and they can be changed at runtime (hot-configured)
- noeviction does not evict any data; new write operations return an error when memory is insufficient. This is the default policy
- allkeys-lru evicts the least recently used keys across the whole keyspace
- allkeys-random randomly evicts any key
- volatile-lru evicts the least recently used keys among those with an expiration set
- volatile-random randomly evicts keys among those with an expiration set
- volatile-ttl preferentially evicts the keys that expire soonest
Two more eviction policies were added in Redis 4.0
- volatile-lfu: evicts the least frequently used keys among those with an expiration set
- allkeys-lfu: evicts the least frequently used keys across the whole keyspace
Here allkeys-xxx means evicting from all keys, and volatile-xxx means evicting only from keys with an expiration set. Both maxmemory and the policy can also be changed at runtime, as in the sketch below
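A small Jedis sketch of changing these settings at runtime; the values are placeholder assumptions, and permanent settings belong in redis.conf:
import redis.clients.jedis.Jedis;

public class EvictionConfig {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            jedis.configSet("maxmemory", "100mb");              // cap memory usage
            jedis.configSet("maxmemory-policy", "allkeys-lru"); // pick an eviction policy
            System.out.println(jedis.configGet("maxmemory-policy"));
        }
    }
}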
Disable the query commands that take a long time
The time complexity of most Redis read and write commands ranges from O(1) to O(N)
O(1) commands are safe to use; O(N) commands should be used with care, since N is not known in advance and the larger the data, the slower the query. Because Redis uses a single thread to execute commands, long-running instructions block Redis and cause significant latency
To avoid the impact of O(N) commands on Redis, you can start from the following aspects
- Disable the KEYS command
- Instead of querying all members at once, use the SCAN command for batched, cursor-style iteration (see the sketch after this list)
- Strictly control the size of Hash, Set, and Sorted Set structures
- Perform sorting, union, intersection, and similar operations on the client side to reduce the load on the Redis server
- Deleting (DEL) a large object can take a long time, so use the asynchronous UNLINK instead; it hands the actual reclamation to a background thread without blocking the main Redis thread
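A Jedis sketch of cursor-style iteration with SCAN, assuming Jedis 3.x package layout; the match pattern and count hint are arbitrary:
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public class ScanAllKeys {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            ScanParams params = new ScanParams().match("user:*").count(100);
            String cursor = ScanParams.SCAN_POINTER_START;   // "0"
            do {
                ScanResult<String> page = jedis.scan(cursor, params);
                for (String key : page.getResult()) {
                    System.out.println(key);                 // process one batch of keys
                }
                cursor = page.getCursor();                   // advance the cursor
            } while (!"0".equals(cursor));                   // "0" means iteration is done
        }
    }
}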
Redis 6.0 introduced multithreading
Why did Redis choose the single-threaded model in the first place
- IO multiplexing
This is part of Redis's top-level design. An FD is a file descriptor, indicating whether the current file is in a readable, writable, or exceptional state; Redis uses an I/O multiplexing mechanism to listen for the readable and writable states of many file descriptors simultaneously
Once a network request is received, it is processed quickly in memory, which is very fast because most operations are pure memory operations
This means that in single-threaded mode, even with many connections to handle, I/O multiplexing keeps the network side from becoming a bottleneck relative to the high-speed in-memory processing
- High maintainability
Although a multithreaded model can perform well in some respects, it introduces uncertainty in execution order and brings a series of concurrent read/write problems. In single-threaded mode, debugging and testing are much easier
- Memory-based, so a single thread is still highly efficient
Multithreading can make fuller use of CPU resources, but for Redis, memory access is so fast that a single instance can reach on the order of 100,000 requests per second. If 100,000 per second is not enough, Redis sharding can spread the load across different Redis servers. This approach avoids introducing a large number of multithreading concerns into a single Redis service
Moreover, being memory-based, there is essentially no disk I/O involved unless AOF backup is running; reads and writes happen only in memory, so processing is very fast. A multithreaded model for handling all external requests may therefore not be a good solution
Based on memory and using multiplexing, a single thread is fast while still delivering the throughput usually associated with multithreading, so there was no need for multiple threads
Why did Redis add multithreading in 6.0 (in some cases, single threading has drawbacks that multithreading can solve)
Because read/write system calls on the network consume most of the CPU time during command execution, making network reads and writes multithreaded can greatly improve performance
Redis can delete an element with the DEL command. If the element is very large, perhaps tens or hundreds of megabytes, the deletion cannot complete in a short time, so multithreaded asynchronous support is needed: with multithreading, the deletion work can be done in the background
Summary: Redis originally chose the single-threaded model to process client requests mainly because the CPU is not the bottleneck of a Redis server; the performance gain of a multithreaded model would not offset its development and maintenance costs, since the system's performance bottleneck lies mainly in network I/O. The later introduction of multithreading in Redis is also a performance consideration: for deleting some large key-value pairs, releasing memory from other threads without blocking reduces blocking time on the main Redis thread and improves execution efficiency
Asynchronous threads (for lazy free) have existed since Redis 4.0
Use the slowlog command to optimize time-consuming operations
You can use the slowlog feature to find the most time-consuming Redis commands and optimize them to make Redis faster. Slow query has two important configuration items
- slowlog-log-slower-than: sets the threshold for what counts as a slow query; commands that exceed it are logged as slow operations. The unit is microseconds (1 second = 1,000,000 microseconds)
- slowlog-max-len: sets the maximum number of slow query log entries kept
You can configure these according to your actual workload. Slow logs are stored in the slow query log in reverse order of insertion; you can use slowlog get n to fetch the latest relevant entries and then optimize the corresponding operations, as in the sketch below
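A sketch of tuning and checking the slow log from Jedis; the threshold values are arbitrary assumptions, and the entries themselves are easiest to read with SLOWLOG GET in redis-cli:
import redis.clients.jedis.Jedis;

public class SlowlogTuning {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            // Log any command slower than 10 ms (10,000 microseconds)
            jedis.configSet("slowlog-log-slower-than", "10000");
            // Keep up to 128 slow log entries
            jedis.configSet("slowlog-max-len", "128");
            System.out.println("slow entries so far: " + jedis.slowlogLen());
            // Inspect the entries with: redis-cli SLOWLOG GET 10
        }
    }
}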
Avoid large amounts of data expiring at the same time
Redis's expired-key scanning frequency is controlled by the hz setting in redis.conf; the default is hz 10, i.e. ten scans per second. In each scan, Redis randomly selects 20 keys that carry an expiration and deletes those that have expired; if more than 25% of the sampled keys were expired, the process repeats
If, in a large system, a large number of cached keys expire at the same time, Redis will keep scanning and deleting from the expiration dictionary over and over until the expired keys remaining in it become sparse, and Redis reads and writes will stall noticeably throughout this process. Another cause of the stalls is that the memory manager must frequently reclaim memory pages, which also consumes CPU
To avoid this, we need to prevent a large number of keys from expiring simultaneously. A simple solution is to add a random offset within a specified range to the base expiration time, as in the sketch below
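A minimal Jedis sketch of TTL jitter; the key name, base TTL, and jitter range are arbitrary assumptions:
import java.util.concurrent.ThreadLocalRandom;
import redis.clients.jedis.Jedis;

public class JitteredExpiry {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            int baseTtl = 3600;                                     // 1 hour base TTL
            int jitter = ThreadLocalRandom.current().nextInt(300);  // up to 5 minutes extra
            // Keys written together now expire spread over a 5-minute window
            jedis.setex("cache:home-page", baseTtl + jitter, "rendered-html");
        }
    }
}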
Batch manipulation of data using Pipeline
A Pipeline is a batching technique provided by the client
It can execute a group of commands in a single round trip and return all the results at once, which reduces frequent request-response exchanges between client and server
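A minimal Jedis pipeline sketch; the key names and batch size are arbitrary:
import java.util.List;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class PipelineExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            Pipeline p = jedis.pipelined();
            for (int i = 0; i < 100; i++) {
                p.set("key:" + i, "value:" + i);   // queued locally, not yet sent
            }
            // One network round trip sends all commands and collects all replies
            List<Object> replies = p.syncAndReturnAll();
            System.out.println(replies.size() + " replies");
        }
    }
}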
Client usage optimization
On the client side, besides using Pipeline, we should also use a Redis connection pool wherever possible instead of frequently creating and destroying connections, to reduce the number of network round trips and unnecessary commands
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;
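The original snippet keeps only the imports; a minimal sketch of pool usage, where the host, port, and pool sizes are placeholder assumptions:
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class PoolExample {
    public static void main(String[] args) {
        JedisPoolConfig config = new JedisPoolConfig();
        config.setMaxTotal(50);   // cap on concurrent connections
        config.setMaxIdle(10);    // idle connections kept for reuse
        JedisPool pool = new JedisPool(config, "127.0.0.1", 6379);
        try (Jedis jedis = pool.getResource()) {   // borrow; auto-returned on close
            jedis.set("greeting", "hello");
        }
        pool.close();   // release all pooled connections on shutdown
    }
}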
Use a distributed architecture to increase read and write speed
The Redis distributed architecture has three important building blocks
- Master-slave synchronization
- Sentinel mode
- Redis Cluster
With master-slave synchronization, we can put writes on the master and route reads to the slaves, processing more requests per unit of time and improving the overall speed of Redis
Sentinel mode is an upgrade to the master-slave setup: when the master node crashes, it automatically restores Redis to normal operation without human intervention
Redis Cluster was officially introduced in Redis 3.0; it spreads the database across multiple nodes, balancing the load on each node
Disable the THP feature
The Transparent Huge Pages (THP) feature was added in Linux kernel 2.6.38 to support 2MB huge-page allocations, and it is enabled by default
When THP is enabled, forking is slower, and after a fork each copy-on-write memory page grows from 4KB to 2MB, greatly increasing the parent process's memory consumption during a rewrite. The unit of memory copied on each write command likewise grows 512-fold, which slows write operations and produces many slow queries from otherwise cheap writes; even the INCR command can appear in the slow log. Therefore, Redis recommends disabling this feature
echo never > /sys/kernel/mm/transparent_hugepage/enabled
To make the THP setting survive a reboot, append echo never > /sys/kernel/mm/transparent_hugepage/enabled to /etc/rc.local