I. Introduction to Redis
Redis is an open-source, high-performance, in-memory key-value database written in C. It can be used as a database, cache, message middleware, and in other scenarios. It is a NoSQL (not only SQL, i.e. non-relational) database.
II. Redis features
- Excellent performance: data is stored in memory, so reads and writes are very fast, supporting on the order of 100,000 QPS
- Commands are executed by a single-threaded process, so operations are effectively atomic; I/O multiplexing keeps that single thread efficient
- Can be used to implement distributed locks
- Supports five data types
- Supports persisting data to disk
- Can serve as message middleware, with support for message publishing and subscription
III. Data types
The five data types, their characteristics, and typical usage scenarios are summarized below:

| Type | Characteristics | Typical scenarios |
| --- | --- | --- |
| String | binary-safe string, integer, or float | caching, counters, distributed locks |
| Hash | map of fields to values | storing objects such as user profiles |
| List | ordered, duplicates allowed, push/pop at both ends | message queues, latest-N lists |
| Set | unordered, unique members, set operations | tags, mutual friends, deduplication |
| Sorted Set (ZSet) | unique members ordered by score | leaderboards, delayed queues |
IV. Caching
Data caching is one of the most important Redis scenarios; it is a natural fit for caching. In Spring Boot, there are generally two ways to use Redis as a data cache:
- Use RedisTemplate directly
- Integrate Redis with Spring Cache (i.e. drive it with annotations)
V. Problems encountered when using a cache
(1) Data consistency
In a distributed environment, consistency problems easily arise between the cache and the database. If a project has strict cache-consistency requirements, do not use a cache.
In practice we can only adopt strategies that reduce the probability of cache/database inconsistency; strong consistency between the two cannot be guaranteed. Common strategies include a sound cache-update mechanism, updating (or invalidating) the cache promptly after a database update, and adding a retry mechanism for failed cache operations.
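A minimal sketch of the "update the database, then invalidate the cache, with retries" strategy described above. Plain dicts stand in for the real database and Redis (both stand-ins are hypothetical):

```python
import time

# Dicts stand in for the real stores; in production these would be a
# SQL database and a Redis client.
db = {}
cache = {}

def update_with_cache_invalidation(key, value, max_retries=3):
    """Update the database first, then delete the cached copy,
    retrying the cache delete with a small backoff if it fails."""
    db[key] = value
    for attempt in range(max_retries):
        try:
            cache.pop(key, None)   # invalidate rather than update
            return True
        except Exception:
            time.sleep(0.01 * (attempt + 1))
    return False  # give up; a background repair job would fix it later

def read(key):
    """Cache-aside read: try the cache, fall back to the DB and refill."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value
```

Invalidation (rather than writing the new value into the cache) is chosen here because it avoids stale overwrites when two updates race.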
(2) Cache avalanche
Before discussing remedies, let's first understand what a cache avalanche is. Suppose system A handles 5,000 requests per second, but the database can handle only 4,000. One day the cache machine goes down, and all requests suddenly land on the database; the database cannot bear the load and crashes, triggering alerts. If nothing is done about the cache and the database is hastily restarted, it is overwhelmed again the moment it finishes restarting (if it manages to start at all). This is the avalanche, one of the two deadliest problems with a Redis cache (the other is penetration).
There is no need to panic about avalanches: we can think about solutions in three phases, before, during, and after the incident.
- Before the incident: make Redis highly available (master-slave + Sentinel, or a Redis Cluster) to avoid a total crash
- During the incident: reduce database pressure with a local cache (e.g. Ehcache) plus rate limiting and degradation, so the database is not crushed by the load
- After the incident: enable Redis persistence so that data can be recovered quickly from disk once Redis restarts
Let's look at the data flow after this transformation. Suppose user A sends a request: the system first checks the local Ehcache; on a miss it queries Redis; on another miss it queries the database, and the result is then written back to both Ehcache and Redis.
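The lookup chain above can be sketched like this, with dicts standing in for Ehcache, Redis, and the database (all three names are hypothetical stand-ins):

```python
local_cache = {}                    # per-instance cache, e.g. Ehcache
shared_cache = {}                   # shared cache, e.g. Redis
database = {"user:1": "Alice"}      # system of record

def lookup(key):
    """Check local cache, then shared cache, then the database,
    refilling each cache layer on the way back."""
    if key in local_cache:
        return local_cache[key]
    if key in shared_cache:
        local_cache[key] = shared_cache[key]   # warm the local layer
        return local_cache[key]
    value = database.get(key)                  # last resort: the DB
    if value is not None:
        shared_cache[key] = value              # sync back to both caches
        local_cache[key] = value
    return value
```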
The rate-limiting component lets you set how many requests per second are passed; the rest fail fast and can be degraded, e.g. returning a default value or a friendly reminder.
The benefits of this are:
- Database safety: the rate limiter guarantees that only a bounded number of requests per second reach the database, so it will not crash.
- Some requests are still served: since the database stays up, at least a portion of requests (2/5 in this example) can be processed.
- During peak periods some requests go unserved and users may have to refresh or click several times, because only 2/5 of requests get through; the rest receive the degraded response until a retry succeeds.
- Do not give all Redis keys the same expiration time; set it flexibly per function, service, or interface, e.g. setRedis(key, value, time + Math.random() * 10000);
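The rate-limiting component described above can be sketched as a token bucket: roughly a fixed number of requests per second pass, and the rest get a degraded default response. This is an illustrative in-process version, not a production limiter:

```python
import time

class TokenBucket:
    """Toy token-bucket limiter: roughly `rate` requests per second
    pass; the rest are rejected so the caller can degrade gracefully."""
    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to the time elapsed since the last call.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

def handle(request, bucket):
    """Pass the request if a token is available, otherwise degrade."""
    if bucket.allow():
        return f"processed {request}"
    return "service busy, please retry"   # degraded default response
```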
(3) Cache penetration
Cache penetration refers to requests for data that exists in neither the cache nor the database. Users (often attackers) keep sending such requests, which all fall straight through to the database; a malicious attack of this kind can bring the database down.
Because these requests bypass Redis (and any local cache) and reach the database directly, the situation is relatively easy to characterize, and the following solutions can be used:
- At the API layer, perform validation: authenticate the user, check permissions, verify parameters, and reject illegal requests immediately.
- Additionally, filter or intercept IDs by validity: non-conforming IDs can be rejected outright, or cached in Redis under a unified key (e.g. with a null placeholder value), so the next request for the same invalid ID is answered from the cache instead of the database.
- Use a Bloom filter to determine quickly, with an efficient data structure and algorithm, whether a key could exist in the database: if the filter says no, return immediately; if it says maybe, query the DB, refresh the cached KV, and return.
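In Redis, Bloom filters are available through the RedisBloom module (BF.ADD / BF.EXISTS). The toy in-process version below shows the idea: false positives are possible, but a "no" answer is definitive, so the database call can be skipped entirely:

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hashed bit positions per key.  It can give
    false positives but never false negatives, so a 'no' answer lets
    us skip both the cache and the database."""
    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        # Derive k independent positions by salting the key.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = True

    def might_contain(self, key):
        return all(self.bits[p] for p in self._positions(key))
```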
(4) Cache breakdown
Where penetration involves requests spread over a large range of (nonexistent) data, a breakdown hits a single point: one very hot key that is requested very frequently. When that key fails (expires), the concentrated burst of requests breaks through the cache and lands directly on the database, like punching a hole through a barrier.
Cache breakdown solutions in different scenarios
- Data never changes: if the hot value is never updated, you can simply set the key to never expire
- Infrequent updates: if rebuilding the cache takes little time, use a distributed mutex (via middleware such as Redis or ZooKeeper) or a local mutex so that only a few requests reach the database and refresh the cache; the other requests read the new cache once the lock is released
- Frequent updates: use a scheduled thread to proactively rebuild the cache, or to extend the expiration time before the key expires, so that all requests always hit the cache
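A sketch of the mutex approach for the "infrequent updates" case: on a miss for a hot key, only one thread queries the database and rebuilds the cache while the rest wait, then read the refreshed value. `threading.Lock` stands in for a distributed lock (e.g. Redis SET NX or a ZooKeeper lock), and the dicts are hypothetical stand-ins:

```python
import threading

cache = {}
db = {"hot:key": "value-from-db"}     # hypothetical hot record
rebuild_lock = threading.Lock()       # local mutex; a distributed lock
                                      # plays the same role across processes

def get_with_mutex(key):
    """On a miss for a hot key, only one thread queries the database
    and refreshes the cache; the others wait, then read the new value."""
    value = cache.get(key)
    if value is not None:
        return value
    with rebuild_lock:
        value = cache.get(key)    # double-check after acquiring the lock
        if value is None:
            value = db[key]       # the single database hit
            cache[key] = value
    return value
```

The double-check inside the lock matters: threads that were queued behind the rebuilder find the cache already filled and skip the database.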
VI. Why is Redis so fast
According to official figures, Redis can reach 100,000+ QPS, no worse than Memcached. Redis uses a single-process, single-threaded model and operates entirely in memory, so the CPU is not its bottleneck; memory size and network bandwidth are. Its speed comes from the following characteristics:
- Data is stored in memory, and the vast majority of requests are pure in-memory operations; the core dictionary works much like a HashMap, so queries and updates run in O(1) time
- Simple, KV-based data structures and data operations
- Single-threaded command execution avoids unnecessary context switches and race conditions: there is no CPU spent on thread switching, no locks to manage, and therefore no deadlocks
- A non-blocking, multiplexed I/O model
VII. Redis eviction policies
All eviction policies are triggered when Redis's memory usage reaches the configured maxmemory threshold:
- Policies prefixed with volatile- evict only from keys that have an expiration set; policies prefixed with allkeys- consider all keys.
- The LRU variants (volatile-lru, allkeys-lru) evict the least recently used keys.
- The LFU variants (volatile-lfu, allkeys-lfu) evict the least frequently used keys.
- There are also random variants (volatile-random, allkeys-random), volatile-ttl (evict the keys closest to expiring), and noeviction (reject further writes instead of evicting).
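Redis's LRU and LFU are approximate (they sample a few keys per eviction rather than tracking exact order), but the idea behind allkeys-lru can be sketched with an exact in-process LRU:

```python
from collections import OrderedDict

class LRUCache:
    """Exact LRU: on overflow, evict the least recently touched key."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # drop least recently used
```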
VIII. Redis persistence
There are two Redis persistence strategies:
- RDB: snapshot mode; the in-memory data set is dumped to a file (dump.rdb) periodically, according to the configured save policy.
- AOF: every write command the Redis server executes is appended to a file, forming a replayable set of commands. RDB snapshotting is Redis's default persistence mode.
If you care about your data but can still afford to lose a few minutes of it, RDB persistence alone is enough.
AOF appends every write command Redis executes to disk; under a heavy write load this can slow Redis down.
Database backup and disaster recovery: Periodically generating RDB snapshots is very convenient for database backup, and RDB can recover data sets faster than AOF.
Of course, Redis can enable RDB and AOF at the same time. After the system restarts, Redis will use AOF to recover data first to minimize data loss.
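Both strategies are configured in redis.conf. The directives below are real Redis options; the values are only illustrative:

```conf
# RDB: take a snapshot if >=1 key changed in 900 s,
# >=10 keys in 300 s, or >=10000 keys in 60 s.
save 900 1
save 300 10
save 60 10000

# AOF: log every write command; fsync once per second is the usual
# compromise between durability and throughput.
appendonly yes
appendfsync everysec
```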
IX. Redis master/slave replication
- Run slaveof [masterIP] [masterPort] on the slave node; it records the master node's information.
- A scheduled task on the slave node picks up the master node information and establishes a socket connection to the master.
- The slave node sends a Ping signal, the master node returns Pong, and the two sides can communicate with each other.
- After the connection is established, the master sends all data to the slave (data synchronization).
- Once the master has synchronized its current data set to the slave, the initial replication is complete. The master then continuously forwards subsequent write commands to the slave to keep the two consistent.
X. Redis Sentinel mode
Let’s start with the problems with master-slave replication:
- Once the master goes down, a slave must be promoted to master, the application's master address must be updated, and all other slaves must be told to replicate the new master. The whole process requires manual intervention.
- The write capability of the primary node is limited by the stand-alone node.
- The storage capacity of the primary node is limited by the single node.
- Native replication also has drawbacks that were especially visible in earlier versions: after a replication interruption, the slave initiates a psync; if partial resynchronization fails, a full resync is performed against the master, and while the master produces the full backup, delays of milliseconds or even seconds can occur.
Sentinel’s architectural pattern is as follows:
The system can perform the following four tasks:
- Monitoring: Continuously check whether the primary and secondary servers are running properly.
- Notification: when a monitored Redis server has a problem, Sentinel can notify administrators or other applications via an API or scripts.
- Automatic failover: When the primary node does not function properly, Sentinel starts an automatic failover operation. It upgrades one of the secondary nodes that has a master-slave relationship with the failed primary node to the new primary node and points the other secondary nodes to the new primary node so that manual intervention is not required.
- Configuration provider: in Redis Sentinel mode, client applications initialize with a collection of Sentinel nodes, from which they obtain the current master node's address.
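A minimal Sentinel setup lives in sentinel.conf. These directives are real, while the master name, address, and timeouts are illustrative:

```conf
# Monitor a master called "mymaster"; 2 sentinels must agree
# (quorum = 2) before the master is declared down.
sentinel monitor mymaster 127.0.0.1 6379 2
# Consider the master down after 30 s without a valid reply.
sentinel down-after-milliseconds mymaster 30000
# Abort a failover attempt that takes longer than 180 s.
sentinel failover-timeout mymaster 180000
```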