This article takes the Redis bigkey problem as its theme. By looking at why Redis is so fast, what harm a bigkey does, how bigkeys come about, four ways to find them, and a hands-on simulation, we will understand, discuss and learn Redis together.
1. Meet Redis and Bigkey
Redis – the darling of the Internet
As an excellent industrial-grade in-memory database, Redis has been a darling of the Internet since its birth. It supports the rich features and huge QPS (queries per second) of Internet services and, like NGINX, has become a synonym for high performance. For example, Redis works silently behind every trending search on Weibo. In a sense, how heavily a company uses Redis reflects how much traffic it serves.
In the Von Neumann architecture, memory plays an important role. Computation and storage trade off against each other: storage can be saved by recomputing results, and computation can be saved by storing them. Caching has therefore become an important optimization for reducing a system's computational load.
Let’s look at two comparisons.
This is a performance comparison of CPU, memory, and disk. Memory reads and writes are nearly a thousand times faster than disk. As an in-memory store, Redis brings a dramatic improvement in system performance. The economic base determines the superstructure, though, so cost always comes first.
Now let’s look at the price of the various types of storage.
A rough estimate: memory costs about 30 yuan per gigabyte and disk about 0.5 yuan per gigabyte, so memory is roughly 60 times more expensive. Set against a performance gap of nearly a thousand times, the return on the investment is still very high.
Redis storage data structure: Redis is a key/value database. Internally it indexes data with a hash table, so reading a key takes O(1) time. Keys are strings, and a value can be one of five basic data structures: String, List, Hash, Set and ZSet. All of this data lives in memory, which gives efficient reads and writes, so Redis is often used as a cache to improve system performance.
The underlying storage structure is as follows:
Another design implementation that underpins Redis’ high performance is the single-threaded task processing design.
Faced with a huge workload, people usually finish the job sooner by adding workers and splitting the task into parts that are processed in parallel. This is the simple idea behind multithreaded concurrent processing, which improves task throughput.
So why does Redis take the opposite approach and use a single thread? Single-threaded here means that Redis handles network IO and key-value reads and writes in one thread, which is also the main flow through which Redis serves key-value requests.
First of all, what problems does multithreading bring when dealing with network IO? A thread is the basic unit of CPU scheduling. The CPU uses clock interrupts to switch between time slices, which gives the appearance of parallel processing. Each switch inevitably consumes resources, and once too many threads are running, the switching itself becomes a performance bottleneck.
Another problem is resource sharing between threads, which is the core problem of concurrent programming. When threads compete for a resource, the shared resource in the critical section has to be locked, which turns concurrent processing into serial processing. Redis indexes its data with a hash table, so under multithreading contention on that structure would be unavoidable and access would degrade into a serial, synchronized process. Redis therefore rejects multithreading here.
So what are the benefits of a single thread in this scenario?
The client and the server set up a network connection through the TCP three-way handshake. Request data arrives at the network card and is written into the operating system's kernel buffer. When the user program performs a read, the data is copied from kernel space into the program's variables. If the kernel has not yet received the data, the read blocks and waits.
To solve this problem, a technique called IO multiplexing was created. Linux offers three implementations: select, poll, and epoll. In short, the mechanism lets the kernel watch many listening sockets and connected sockets at the same time, and notifies the user-space program when any of them is ready to read or write. We will not go into the details here; interested readers can look them up online.
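As a rough illustration of the idea (not Redis's actual event loop), the sketch below uses Python's standard selectors module, which picks epoll on Linux where it is available, to let one thread serve two connections. The socket pairs stand in for real client connections, and the upper-casing reply is a made-up placeholder for command processing.

```python
import selectors
import socket


def serve_ready_sockets(sel, timeout=1.0):
    """Handle every socket that is ready to read, all in the calling thread.

    Returns the number of requests handled in this pass.
    """
    handled = 0
    for key, _events in sel.select(timeout=timeout):
        conn = key.fileobj
        data = conn.recv(1024)
        if data:
            conn.sendall(data.upper())  # placeholder for real command processing
            handled += 1
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
    return handled


# Two connected socket pairs stand in for two clients.
sel = selectors.DefaultSelector()
clients = []
for _ in range(2):
    server_side, client_side = socket.socketpair()
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ)
    clients.append(client_side)

clients[0].sendall(b"ping")
clients[1].sendall(b"get k")

handled = 0
while handled < 2:
    handled += serve_ready_sockets(sel)

replies = [c.recv(1024) for c in clients]
sel.close()
```

The thread never blocks on one particular connection: it only reads from sockets the kernel has already reported as ready.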
Node.js, Nginx and other gateway-layer technologies also use a single-threaded design; for network IO, a single thread can outperform multiple threads. But a single thread has a fatal weakness: once one request takes too long to process, it blocks every request behind it, just as one broken-down car causes a traffic jam on a one-lane street. This is the main reason bigkeys are harmful, as discussed below.
What is bigkey?
From the underlying storage structure shown above, we can see that a value can be one of several data structures. Its size is therefore measured in two ways: for a string, by the length of the string; for a compound type, by the number of elements.
Bigkey is the big value problem in the Redis key/value system. According to the classification of data types, BigKey is embodied in two points:
- The value is a string and is too long.
- Value is a compound type that contains too many elements.
In Redis, a string can be up to 512MB, and a compound data structure (hash, list, set, zset) can hold about 4 billion (2^32-1) elements. These are theoretical limits; in practice, the thresholds should be set with the help of operational data. As a rule of thumb, keep strings within 10KB and keep hash, list, set, and zset values under 5000 elements.
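The rule of thumb above can be written down as a small check. This is only a sketch: the constant and function names are made up for illustration, and the limits are the ones quoted in the text, which you would tune for your own workload.

```python
# Rule-of-thumb limits quoted in the text (hypothetical names).
MAX_STRING_BYTES = 10 * 1024  # 10KB for string values
MAX_ELEMENTS = 5000           # element count for hash, list, set, zset


def is_bigkey(redis_type, size):
    """Return True if a value exceeds the rule-of-thumb limit for its type.

    `size` is the byte length for strings and the element count otherwise.
    """
    if redis_type == "string":
        return size > MAX_STRING_BYTES
    if redis_type in {"hash", "list", "set", "zset"}:
        return size > MAX_ELEMENTS
    raise ValueError(f"unsupported type: {redis_type}")
```

For example, `is_bigkey("list", 12000)` flags a list of 12000 elements, while a 32-byte string passes.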
2. What’s the harm of Bigkey? Why?
At this point, we have a basic understanding of Bigkey. Next, we will introduce the hazards and causes of Bigkey one by one.
1. Four major hazards of Bigkey
For Redis, a bigkey is like the proverbial rat dropping that spoils the whole pot. The danger shows up in four main ways:
1. Uneven memory usage. In cluster mode, a bigkey makes memory usage uneven across nodes, which hampers unified memory management for the cluster and carries a hidden risk of data loss.
2. Timeout blocking. Because Redis is single-threaded, operations on a bigkey are usually time-consuming, so Redis is more likely to block, which can cause client timeouts or trigger a failover. Bigkeys often show up in the slow query log.
3. Network congestion. Every read of a bigkey generates a lot of network traffic. Assuming a 1MB bigkey and 1000 client accesses per second, the resulting 1000MB of traffic per second is far too much for a server with a typical Gigabit NIC (128MB/s in bytes).
4. Expired deletion. If a bigkey has an expiration time and you are not using the asynchronous expired-key deletion available since Redis 4.0, deleting it on expiry may block Redis. This deletion does not appear in the slow query log, because it is triggered by an internal periodic event rather than by a client command.
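The network congestion figure is easy to check with a few lines of arithmetic. The numbers below are the ones assumed in the text; the variable names are only for illustration.

```python
value_size_mb = 1         # size of one bigkey read, in MB
reads_per_second = 1000   # client accesses per second
nic_capacity_mb_s = 128   # Gigabit NIC throughput in MB/s, as quoted in the text

# Traffic the clients demand versus what the link can carry.
traffic_mb_s = value_size_mb * reads_per_second
overload_factor = traffic_mb_s / nic_capacity_mb_s

print(f"{traffic_mb_s} MB/s demanded, {overload_factor:.1f}x the link capacity")
```

Under these assumptions the demand is several times what the link can carry, so requests queue up long before Redis itself becomes the limit.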
2. How did Bigkey come into being?
Bigkeys are mainly the result of poor program design, as in the following common business scenarios:
- Social: the follower list of a celebrity or big V account becomes a bigkey if it is not designed carefully.
- Statistics: for example, a set that stores, day by day, the users of a feature or website is bound to become a bigkey unless very few people use it.
- Caching: serializing data from the database into Redis is common practice, but two things need checking: first, whether every field really needs to be cached; second, whether associated data is being pulled in as well.
Therefore, in the program design, we should have a fundamental assessment of the growth and boundary of the data volume, and do a good job in the technical selection and technical architecture.
3. Four ways to find Bigkey
Let’s start with a thought question:
At the beginning of this year, COVID-19 outbreaks appeared one after another in Shijiazhuang. For a large city with a population of more than 10 million, epidemic prevention and control came under great pressure. Efficiently finding the infected and their contacts became the key to success. The government's work can be summarized in four points:
- Prohibit the circulation of personnel, home isolation;
- Establish risk levels;
- Grid management;
- Nucleic acid testing.
In epidemiological practice, people with symptoms must report them; this is active reporting. But COVID-19 has an incubation period and many asymptomatic carriers, so infections also have to be detected proactively through nucleic acid testing, which reflects the idea of scanning in computing.
The idea of discovering and dealing with Bigkey is similar to the approach of epidemic prevention and control. There are also four conventional approaches.
1. Redis client tools
redis-cli provides the --bigkeys option to find bigkeys. For example, the following is the result of one execution.
As can be seen from the figure above, this method reports the top 1 bigkey of each data structure, together with the number of keys and the average size for each data type. If we need more than the single biggest key per type, however, it falls short. Internally it works by running SCAN, which carries some performance cost, so to avoid affecting the business the task can be run on a slave node.
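To make the shape of that report concrete, here is a minimal sketch of the same kind of summary: per type, the key count, the average size, and the single biggest key. It is not redis-cli's implementation; the function name, the sample keys and their sizes are all invented for illustration.

```python
from collections import defaultdict


def summarize(samples):
    """Summarize (key, redis_type, size) tuples per type.

    Returns {type: {"count", "avg_size", "biggest_key", "biggest_size"}}.
    """
    stats = defaultdict(
        lambda: {"count": 0, "total": 0, "biggest_key": None, "biggest_size": -1}
    )
    for key, redis_type, size in samples:
        s = stats[redis_type]
        s["count"] += 1
        s["total"] += size
        if size > s["biggest_size"]:
            # Only the top 1 key per type is remembered.
            s["biggest_key"], s["biggest_size"] = key, size
    return {
        t: {
            "count": s["count"],
            "avg_size": s["total"] / s["count"],
            "biggest_key": s["biggest_key"],
            "biggest_size": s["biggest_size"],
        }
        for t, s in stats.items()
    }


# Invented sample data: two small strings and one long list.
report = summarize([
    ("token:1", "string", 32),
    ("token:2", "string", 48),
    ("QUEUE", "list", 12000),
])
```

Because only one key per type is kept, a second or third oversized key of the same type never appears in the summary, which is the limitation noted above.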
2. The debug object command
Redis provides the command debug object key. Suppose the requirement is to “find the keys in Redis that are larger than 10KB”. To get the result, all keys must first be scanned, and then debug object key is called in a loop to obtain the byte size of each key.
Because debug object key is slow to execute, it may block the Redis thread, so this scheme also does some harm to the business. In practice, the program can be run against a slave node.
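The scan-then-inspect loop described in this section might look like the sketch below. It assumes `client` behaves like a redis-py `redis.Redis` instance created with `decode_responses=True`, and that the raw reply contains a `serializedlength:<n>` field. The function names are made up, and `serializedlength` is the serialized size of the value, which only approximates its size in memory.

```python
import re


def serialized_length(debug_reply):
    """Extract the serializedlength field from a DEBUG OBJECT reply string."""
    match = re.search(r"serializedlength:(\d+)", debug_reply)
    if match is None:
        raise ValueError("no serializedlength field in reply")
    return int(match.group(1))


def find_large_keys(client, threshold_bytes=10 * 1024):
    """Yield (key, size) for keys whose serialized length exceeds the threshold.

    Assumption: `client` offers redis-py's scan_iter() and execute_command().
    Point it at a slave (replica) node to keep the load off the master.
    """
    for key in client.scan_iter(count=100):
        reply = client.execute_command("DEBUG", "OBJECT", key)
        size = serialized_length(reply)
        if size > threshold_bytes:
            yield key, size
```

Every key costs one extra round trip and one slow command, which is why this approach gets expensive on a large keyspace.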
3. RDB file scanning
As we know, Redis has a persistence scheme called RDB persistence, which is a disk snapshot of the data stored in Redis memory. By scanning the RDB files with RDB tools, you can find the existence of BigKey.
This option requires RDB persistence to be in place first. RDB persistence takes memory snapshots and writes them to disk at a certain frequency. Scanning these files is an attractive choice because it does not affect the running Redis host. But in scenarios with high data reliability requirements, RDB persistence is not chosen, so the approach is not universal.
4. The DataFlux Bigkey scanning design
The first few schemes are either discovered by the client or require a full-volume data scan, which is a computationally resource-intensive activity. It is like conducting nucleic acid tests on all the people in a city with a population of 10 million, which not only consumes huge material resources, financial resources and manpower, but also has low efficiency. It runs counter to the epidemic prevention and control which requires a race against time.
We analyzed the causes of Redis bigkeys above: many come from unreasonable business design and inadequate evaluation. So in the DataFlux product, the Redis collector of Datakit lets you configure potential bigkeys yourself for scanning and discovery, as either fixed key names or key patterns. For a key pattern, the matching keys are obtained with a SCAN pattern, and then the length command for each key's type (HLEN, LLEN, SCARD, ZCARD, PFCOUNT, STRLEN) is used to obtain the key's length, which is reported to the DataFlux platform for monitoring and storage.
There are two advantages to this approach:
1. Because the length is obtained only for the target keys, and the length commands for Redis's data types run in O(1) time, execution is very efficient;
2. The collected results are reported to the DataFlux storage platform, where the metric data can be displayed in all kinds of charts and used for alerting.
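The targeted scan can be sketched in a few lines. This is not Datakit's code, only an illustration of the idea. It assumes `client` behaves like a redis-py `redis.Redis` instance created with `decode_responses=True`, and the names `LENGTH_METHOD` and `collect_lengths` are made up. PFCOUNT is left out because TYPE reports a HyperLogLog as a plain string, so it cannot be told apart by type alone.

```python
# Map each type name returned by TYPE to the redis-py method
# that wraps the matching length command.
LENGTH_METHOD = {
    "string": "strlen",
    "list": "llen",
    "hash": "hlen",
    "set": "scard",
    "zset": "zcard",
}


def collect_lengths(client, pattern):
    """Return {key: length} for the keys matching `pattern`.

    Only keys that match the configured pattern are visited,
    so the rest of the keyspace is never inspected.
    """
    lengths = {}
    for key in client.scan_iter(match=pattern, count=100):
        method = LENGTH_METHOD.get(client.type(key))
        if method is not None:
            lengths[key] = getattr(client, method)(key)
    return lengths
```

A call such as `collect_lengths(client, "QUEUE*")` would return the current length of every matching key, ready to be reported as a metric.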
Next, a simple business scenario simulation is carried out to illustrate the scheme.
4. Hands-on drill
In a business system, Redis String type is used to store user authentication tokens and List data type is used to queue asynchronous messages.
Business analysis: a token is fixed-length data whose size does not change, so its key will not become a bigkey. The message queue is different: if the consumer side fails while the producer side keeps pouring in large amounts of data, the Redis key that holds the queue becomes a potential bigkey. We therefore need to monitor this key.
We assume that the key of the message queue is named QUEUE.
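Before touching Redis, the failure scenario can be reproduced with a toy model. The sketch below uses a plain Python deque in place of the Redis list, and the tick counts and rates are invented. It only shows how the length that LLEN would report grows once the consumer stops.

```python
from collections import deque


def simulate(ticks, produce_per_tick, consume_per_tick, consumer_fails_at):
    """Return the queue length after each tick.

    The consumer stops working from tick `consumer_fails_at` onward.
    """
    queue = deque()
    lengths = []
    for tick in range(ticks):
        queue.extend(range(produce_per_tick))  # producer pushes, like RPUSH
        if tick < consumer_fails_at:
            for _ in range(min(consume_per_tick, len(queue))):
                queue.popleft()                # consumer pops, like LPOP
        lengths.append(len(queue))             # what LLEN would report
    return lengths


lengths = simulate(ticks=6, produce_per_tick=1000,
                   consume_per_tick=1000, consumer_fails_at=3)
```

While the consumer keeps up the length stays at zero; after the failure it grows by the full production rate every tick, which is the growth the monitoring needs to catch.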
Let’s follow the official tutorial to install the Datakit tool.
The official tutorial: “how to install DataKit https://help.dataflux.cn/doc/…”
Once installed, go to the conf.d/db directory under the DataKit installation directory, copy redis.conf.sample and name it redis.conf. Make the following configuration:
Simulate the initial state of the queue by pushing 10 values
The data is reported to the DataFlux platform
Push a certain amount of data
Finally, you can see the following collection results on the DataFlux dashboard.
On the DataFlux platform, the monitored key can be charted, alerted on, and visualized through metrics, so as to get the most value out of the data.