Preface
Redis is a high-performance in-memory database, but it can still hit performance bottlenecks once data volumes grow large. Keeping a set of optimization rules in mind during daily development is what lets you get the most out of Redis.
This article introduces 13 performance optimization rules; follow them during development and performance will improve dramatically.
1. Avoid slow query commands
Slow query commands are commands with high execution latency. Redis provides many commands, and not all of them are slow; it depends on each command's time complexity, so you must know the complexity of the Redis commands you use.
For example, when the value is a String, GET/SET go straight through Redis's hash-table index, so their complexity is essentially constant, O(1). When the value is a Set, however, SORT has complexity O(N+M*log(M)) and SUNION/SMEMBERS have O(N), where N is the number of elements in the Set and M is the number of elements returned by SORT. That is considerably more expensive. The official Redis documentation lists the complexity of every command; look a command up whenever you need to know its cost.
If you find that Redis is slow, you can use the Redis slow log or the Latency Monitor tool to find the slow requests, then check against the official documentation whether the commands involved are high-complexity slow query commands.
If you do rely on slow query commands, the following approaches are recommended:
- Replace them with more efficient commands. For example, to return all members of a Set, iterate with SSCAN over multiple calls instead of calling SMEMBERS, so a huge single reply does not block the thread.
- When you need SORT, intersection, or union operations, perform them on the client side instead of using SORT, SUNION, and SINTER, which would slow down the Redis instance.
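As a sketch of the second point, here is a pure-Python simulation (no Redis server required; `sscan_batches` only approximates SSCAN's cursor semantics): members are pulled in small batches, and the intersection is computed on the client instead of calling SINTER:

```python
def sscan_batches(members, count=100):
    """Toy stand-in for SSCAN: yield a set's members in small batches."""
    members = sorted(members)  # deterministic order for the sketch
    for i in range(0, len(members), count):
        yield members[i:i + count]

def client_side_sinter(set_a, set_b, count=100):
    """Intersect two Redis sets on the client instead of running SINTER on the server."""
    seen = set()
    for batch in sscan_batches(set_a, count):
        seen.update(batch)
    result = set()
    for batch in sscan_batches(set_b, count):
        result.update(m for m in batch if m in seen)
    return result
```

The CPU cost of the intersection moves to the client, and Redis only serves small, constant-size replies.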
2. Disable the keys command in the production environment
KEYS is the slow query command that is easiest to overlook. Because KEYS traverses all stored key-value pairs, its latency is high and it can easily block Redis. Therefore, do not use KEYS in a production environment.
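The usual replacement is the cursor-based SCAN command, which returns a small batch per call. A pure-Python sketch of that iteration pattern (the real SCAN uses an opaque cursor with weaker ordering guarantees):

```python
def scan(store, cursor, count=10):
    """Toy SCAN: return (next_cursor, batch); a returned cursor of 0 means done."""
    keys = sorted(store)           # deterministic order for the sketch
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    return (0 if next_cursor >= len(keys) else next_cursor, batch)

def all_keys(store, count=10):
    """Iterate the whole keyspace in small batches instead of one blocking KEYS call."""
    cursor, result = 0, []
    while True:
        cursor, batch = scan(store, cursor, count)
        result.extend(batch)
        if cursor == 0:
            break
    return result
```

Each call does a bounded amount of work, so other clients are never starved between batches.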
3. Set the expiration time for keys
As an in-memory database, Redis stores all data in memory, and excessive memory usage hurts performance badly. Therefore, set an expiration time on time-limited data so that Redis can periodically delete the expired entries.
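The mechanism can be pictured with a toy cache that mimics `SET key value EX ttl` plus Redis's lazy (on-access) expiry; the class name and the injectable clock are inventions for this sketch:

```python
import time

class TTLCache:
    """Toy sketch of per-key expiration, mimicking SET key value EX ttl."""
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ex=None):
        expires = self._clock() + ex if ex is not None else None
        self._store[key] = (value, expires)

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires = item
        if expires is not None and self._clock() >= expires:
            del self._store[key]  # lazy deletion, like Redis's passive expiry
            return None
        return value
```

Real Redis combines this passive check with the active sampling cycle described in the next rule.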
4. Do not set the same expiration time for keys in batches
By default, Redis removes expired keys every 100 milliseconds, using the following algorithm:
- Sample ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP keys and delete all of the sampled keys that have expired.
- If more than 25% of the sampled keys had expired, repeat the sampling-and-deletion process until the proportion of expired keys drops below 25%.
ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP is a Redis parameter with a default value of 20, so with the cycle running every 100 milliseconds, roughly 200 expired keys are deleted per second. Deleting about 200 expired keys per second helps reclaim memory without noticeably affecting Redis.
However, if the second branch of the algorithm is triggered, Redis keeps deleting keys to free memory, and deletion blocks (Redis 4.0 can reduce the blocking with asynchronous lazy-free deletion). Once this condition is hit, the Redis main thread keeps running deletions, other key operations cannot be served promptly, their latency grows, and Redis slows down.
Frequently calling EXPIREAT with the same timestamp for many keys triggers exactly this second branch, because a large number of keys then expire within the same second.
Therefore, during development, do not give large batches of keys the same expiration time.
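A common way to follow this rule, assuming your business logic tolerates some slack in the TTL, is to add random jitter so keys written together do not expire together (the function name and the 300-second default are inventions for this sketch):

```python
import random

def ttl_with_jitter(base_ttl, max_jitter=300):
    """Base TTL plus a random offset in seconds, e.g. for EXPIRE key <ttl>.
    Spreads expirations out so batches of keys don't all expire in the same second."""
    return base_ttl + random.randint(0, max_jitter)
```

Instead of `EXPIREAT key <same timestamp>` for a whole batch, each key gets `EXPIRE key ttl_with_jitter(3600)`, so the active-expiry cycle never sees a huge spike of simultaneously expired keys.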
5. Choose data structures carefully
Redis has five commonly used data structures: String, Hash, List, Set, and Zset (sorted set). A String can solve the problem in most scenarios, but it is not necessarily the optimal choice. Here is a brief guide to their applicable scenarios:
- String: a single cached value that is unrelated to other key-value pairs.
- Hash: an object with many properties that need to be stored and updated individually. Do not use a String for this case, because it consumes more memory.
- List: an object containing many items that may repeat and must keep their order.
- Set: an object containing many items that must not repeat; ordering is not required.
- Zset: an object containing many items, each carrying a weight (score) that can be used for sorting.
Redis also provides several extension types, as follows:
- HyperLogLog: suitable for cardinality statistics such as PV/UV counting. It has a small standard error (about 0.81%), so it is not suitable for exact counts.
- Bitmap: suitable for binary-state statistics such as check-ins, where each user either checked in or did not.
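The Bitmap case can be sketched in pure Python by using an int as the bit array; the `setbit`/`getbit`/`bitcount` methods here mirror the Redis commands SETBIT/GETBIT/BITCOUNT (the class itself is an invention for this sketch):

```python
class CheckinBitmap:
    """Toy bitmap: one bit per day, mirroring SETBIT/GETBIT/BITCOUNT."""
    def __init__(self):
        self.bits = 0

    def setbit(self, offset, value):
        if value:
            self.bits |= 1 << offset        # mark day `offset` as checked in
        else:
            self.bits &= ~(1 << offset)     # clear the check-in for that day

    def getbit(self, offset):
        return (self.bits >> offset) & 1

    def bitcount(self):
        return bin(self.bits).count("1")    # total number of check-in days
```

A month of check-ins costs about 4 bytes per user, which is why bitmaps scale so well for this kind of statistic.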
6. Check the persistence policy
As of Redis 4.0, there are three persistence strategies:
- AOF log: appends every write command to a log file; three appendfsync configuration options control how synchronously the log is flushed to disk.
- RDB snapshot: writes the in-memory data at a specified point in time to disk as a binary snapshot.
- Mixed AOF and RDB: new in Redis 4.0. To combine the strengths of both, the AOF log records the operations that occur during the interval between RDB snapshots, so data written between two snapshots is not lost on a crash.
Because writing to disk is an I/O bottleneck, if Redis is not used as a database (that is, lost data can be reloaded from a backend store), consider disabling persistence or relaxing the persistence policy to improve performance.
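As an illustration only (directive names as of Redis 4.0; verify the defaults against your version), a redis.conf fragment for a pure-cache deployment might look like:

```
# Pure cache: disable persistence entirely
save ""                     # no RDB snapshots
appendonly no               # no AOF log

# Or, if some durability is needed, a common middle ground:
# appendonly yes
# appendfsync everysec      # fsync at most once per second
# aof-use-rdb-preamble yes  # Redis 4.0+ mixed RDB+AOF format
```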
7. Use high-speed solid-state drives as log writing devices
AOF log rewriting puts heavy pressure on the disk and can easily cause blocking, so if persistence is required, it is recommended to use a high-speed solid-state drive (SSD) as the log-writing device.
8. Use physical machines instead of VMs
Because VMs add a virtualization software layer, they carry a higher performance overhead than physical machines. You can run the following command to compare the baseline performance of a physical machine and a VM:
./redis-cli --intrinsic-latency 120
Test results show that the baseline performance of physical machines is significantly better than that of virtual machines.
9. Increase machine memory or use Redis cluster
Insufficient memory on the machine causes the operating system to swap.
Memory swap is an operating-system mechanism that moves memory pages back and forth between RAM and disk. Because it involves disk reads and writes, both swapping data out and swapping it back in are throttled by slow disk I/O.
Redis is an in-memory database and consumes a large amount of memory. If you do not cap its memory usage, or you run it alongside other memory-hungry applications, it may be hit by swap and slow down.
This matters especially for an in-memory database like Redis: normally a Redis operation is served directly from memory, but once swap is triggered, the request cannot complete until disk is read or written. And unlike AOF fsync, which can run in a background thread, swap hits the main Redis I/O thread, greatly increasing Redis response time.
Therefore, increasing the memory of the machine or using Redis cluster can effectively solve the Swap of operating system memory and improve performance.
10. Batch operation data using Pipeline
Pipeline is a client-side batching technique that sends multiple Redis commands in one go, reducing round trips and improving overall interaction performance.
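The benefit is easiest to see by counting network round trips. A pure-Python simulation (this is not the real redis-py API; the class and method names are inventions for the sketch):

```python
class FakeServer:
    """Counts network round trips to show why pipelining helps."""
    def __init__(self):
        self.round_trips = 0
        self.store = {}

    def set_one(self, key, value):
        self.round_trips += 1              # each command pays one round trip
        self.store[key] = value

    def pipeline(self, commands):
        self.round_trips += 1              # all commands share one round trip
        for key, value in commands:
            self.store[key] = value

naive, piped = FakeServer(), FakeServer()
for i in range(100):
    naive.set_one(f"key:{i}", i)           # 100 round trips
piped.pipeline([(f"key:{i}", i) for i in range(100)])  # 1 round trip
```

With a 1 ms network round trip, 100 individual SETs cost at least 100 ms of pure latency; the pipelined version pays it once.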
11. Optimize client usage
On the client side, besides using pipelines, also use a Redis connection pool wherever possible instead of frequently creating and destroying connections; this reduces network overhead and avoids unnecessary commands.
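The connection-pool idea reduces to "create once, reuse many times". A minimal sketch (real client libraries such as redis-py do this with more bookkeeping; the class name here is an invention):

```python
import queue

class MiniPool:
    """Pre-create connections and hand them out, instead of reconnecting per command."""
    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())      # pay the connection cost up front

    def acquire(self):
        return self._idle.get()            # blocks if the pool is exhausted

    def release(self, conn):
        self._idle.put(conn)
```

Fifty commands over a three-connection pool still open only three TCP connections, instead of fifty connect/teardown cycles.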
12. Use a distributed architecture to increase read and write speed
Redis distributed architecture has three important tools:
- Master-slave replication
- Sentinel mode
- Redis Cluster
With master-slave replication, writes go to the master and reads are offloaded to the replicas, so more requests are handled per unit of time and overall Redis throughput improves.
Sentinel mode is an upgrade on top of master/slave replication: when the master node crashes, it automatically fails over and restores normal Redis service without human intervention.
Redis Cluster, officially introduced in Redis 3.0, distributes data across multiple nodes to balance the load on each node.
Redis Cluster uses virtual hash slot partitioning. All keys are mapped to integer slots from 0 to 16,383 according to the hash function. The calculation formula is slot = CRC16(key) & 16383. This allows Redis to spread the read-write load from one server to multiple servers, resulting in a significant performance improvement.
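The slot formula can be reproduced in a few lines. Redis Cluster uses the CRC16-CCITT (XModem) variant, whose standard test vector is CRC16("123456789") = 0x31C3; note this sketch ignores hash tags (the `{...}` substring rule):

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XModem), the variant Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: bytes) -> int:
    """slot = CRC16(key) & 16383, matching the formula above."""
    return crc16(key) & 16383
```

Since 16384 is a power of two, `& 16383` is equivalent to `% 16384`, which is why every key lands in one of the 16,384 slots.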
If we were to use only one of these three, Redis Cluster is undoubtedly the preferred choice: it automatically spreads the read and write load across more servers and has built-in disaster-recovery capability.
13. Avoid memory fragmentation
Frequent inserts and modifications increase memory fragmentation, so fragmentation needs to be cleaned up in time.
Redis provides INFO Memory for viewing memory usage, as follows:
INFO memory
# Memory
used_memory:1073741736
used_memory_human:1024.00M
used_memory_rss:1997159792
used_memory_rss_human:1.86G
...
mem_fragmentation_ratio:1.86
The mem_fragmentation_ratio field is Redis's current memory fragmentation ratio. How is it calculated? It is the result of dividing the two metrics above, used_memory_rss by used_memory:
mem_fragmentation_ratio = used_memory_rss / used_memory
used_memory_rss is the physical memory the operating system has actually allocated to Redis, fragmentation included; used_memory is the memory Redis actually requested for storing data.
So, knowing this metric, how do we use it? Here, I provide some empirical thresholds:
- mem_fragmentation_ratio greater than 1 but less than 1.5: reasonable. The underlying causes are unavoidable: a memory allocator must be used and its allocation strategy cannot easily be changed, while the external factor is determined by the Redis workload. Some fragmentation is therefore normal.
- mem_fragmentation_ratio greater than 1.5: memory fragmentation has exceeded 50% of allocated memory. At this point, you generally need to take measures to bring the fragmentation ratio down.
Once the fragmentation ratio is too high, clean the fragmentation up, for example by restarting the instance or, on Redis 4.0+, enabling active defragmentation (activedefrag yes).
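The rule of thumb above translates directly into code; the thresholds here are the empirical ones from this article, and the sample numbers are the ones from the INFO memory output shown earlier:

```python
def fragmentation_ratio(used_memory_rss, used_memory):
    """mem_fragmentation_ratio = used_memory_rss / used_memory."""
    return used_memory_rss / used_memory

def assess(ratio):
    """Classify the ratio using the article's empirical thresholds."""
    if ratio <= 1.5:
        return "normal"
    return "high: reduce fragmentation (e.g. restart, or activedefrag on 4.0+)"
```

Feeding in the sample values (1997159792 / 1073741736) reproduces the 1.86 ratio from the INFO output, which this rule of thumb classifies as high.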
Conclusion
This article introduced 13 performance optimization rules. During development you still need to analyze each concrete problem on its own, but I hope this article helps you.
The article has been included at GitHub: github.com/JavaFamily