This article has been included in github/JavaMap. There are a series of articles about the advanced technology stack for Java programmers. Welcome to Star.
Redis data structures and common commands
Redis is a key-value database. The key type can only be String, but the value data type is relatively rich, mainly including five types:
- String
- Hash
- List
- Set
- Sorted Set
1. String
Syntax
SET KEY_NAME VALUE
The string type is the most basic Redis data type. It is binary safe, so a Redis string can hold any kind of data, such as a JPG image or a serialized object. A single string value can be up to 512 MB.
2. Hash
Syntax
HSET KEY_NAME FIELD VALUE
A Redis hash is a collection of field => value pairs, that is, a mapping table from string fields to string values. Hashes are especially suitable for storing objects.
3. List
Syntax
// Add string elements to the head (left) of the list
LPUSH KEY_NAME VALUE1 .. VALUEN
// Add string elements to the tail (right) of the list
RPUSH KEY_NAME VALUE1 .. VALUEN
// Remove COUNT elements equal to VALUE from the list
LREM KEY_NAME COUNT VALUE
// Return the length of the list stored at the key
LLEN KEY_NAME
A Redis list is a simple list of strings, sorted by insertion order. You can add an element to the head (left) or the tail (right) of the list.
4. Set
Syntax
SADD KEY_NAME VALUE1 .. VALUEN
A Redis set is an unordered collection of strings. Sets are implemented with hash tables, so adding, deleting, and looking up a member are all O(1).
5. Sorted Set
Syntax
ZADD KEY_NAME SCORE1 VALUE1 .. SCOREN VALUEN
A Redis sorted set (zset), like a set, is a collection of string elements that does not allow duplicate members. The difference is that each element is associated with a score of type double.
Redis uses the scores to sort the members of the collection from smallest to largest.
Members of a zset are unique, but scores can be repeated.
6. Common Redis command references
For more information about command syntax, see the manual:
www.redis.net.cn/order/
Redis transaction mechanism
1. Redis transaction lifecycle
- Start transaction: Use MULTI to start a transaction
- Command enqueue: The command for each operation is added to a queue, but the command will not be executed
- Commit a transaction: Commit a transaction using the EXEC command, starting with the sequential execution of the commands in the queue
2. Are Redis transactions atomic?
First, consider how atomicity is defined in ACID for relational databases such as MySQL:
Atomicity: All operations in a transaction either complete or do not complete; the transaction never ends somewhere in between. If a transaction fails during execution, it is rolled back to the state before the transaction began, as if it had never been executed.
The official documentation defines transactions as follows:
- A transaction is a single isolated operation: all commands in the transaction are serialized and executed sequentially. The transaction will not be interrupted by command requests from other clients during execution.
- A transaction is an atomic operation: all or none of the commands in a transaction are executed. The EXEC command is responsible for triggering and executing all commands in a transaction: if the client starts a transaction using MULTI and fails to execute EXEC because of a disconnect, none of the commands in the transaction will be executed. On the other hand, if the client successfully executes EXEC after the transaction is started, all commands in the transaction will be executed.
The official documentation thus treats a Redis transaction as atomic in the sense that the queued commands are either all executed or not executed at all. By the ACID definition of atomicity, however, Redis transactions are strictly speaking not atomic: the commands run in order, and if one of them fails during execution, Redis neither stops executing the remaining commands nor rolls back the changes already made.
3. Why does Redis not support roll back?
Redis commands may fail during a transaction, yet Redis still executes the remaining commands and performs no rollback. If you are used to transactions in a relational database such as MySQL, this may seem confusing. The official explanation is as follows:
A Redis command can fail only if it is called with a syntax error (which Redis can detect while the command is being queued) or if it operates on a key holding the wrong data type. In practice this means that commands fail only because of programming errors, which are most likely to be discovered during development and rarely occur in production. Supporting rollback would also complicate the design, which runs counter to Redis's goal of staying simple and fast.
A common objection to this justification is: what if the program has a bug? But rollback cannot fix a bug in the program. For example, if a careless programmer intends to update key A but ends up updating key B, rollback does not undo this human error. Because such errors are unlikely to reach production, Redis opts for the simpler and faster design and provides no rollback mechanism.
4. Redis transaction failure scenario
There are three types of failure scenarios:
(1) Before the transaction is committed, a command sent by the client fails to be queued, for example because of a syntax error (wrong number of arguments, unsupported command, and so on). In this case Redis returns an error response to the client, discards the queued commands, and cancels the transaction.
127.0.0.1:6379> set name xiaoming # set name before the transaction
OK
127.0.0.1:6379> multi # start the transaction
OK
127.0.0.1:6379> set name zhangsan # the command is queued
QUEUED
127.0.0.1:6379> setset name zhangsan2 # invalid command, simulating the failure
(error) ERR unknown command 'setset', with args beginning with: 'name', 'zhangsan2',
127.0.0.1:6379> exec # commit; the transaction has already been cancelled by the previous error
(error) EXECABORT Transaction discarded because of previous errors.
127.0.0.1:6379> get name # name has not been modified
"xiaoming"
(2) After the transaction is committed, the queued commands are executed in order, and some of them may fail during execution.
127.0.0.1:6379> multi # start the transaction
OK
127.0.0.1:6379> set name xiaoming # set name
QUEUED
127.0.0.1:6379> set age 18 # set age
QUEUED
127.0.0.1:6379> lpush age 20 # list operation on a string key; it is still queued
QUEUED
127.0.0.1:6379> exec # commit; the third command fails at execution time
1) OK
2) OK
3) (error) WRONGTYPE Operation against a key holding the wrong kind of value
127.0.0.1:6379> get name # the earlier commands were not rolled back
"xiaoming"
(3) The optimistic lock fails, so all queued commands are discarded when the transaction is committed. To simulate this scenario, open two Redis clients and use the watch command.
127.0.0.1:6379> set name xiaoming # client 1: set name
OK
127.0.0.1:6379> watch name # client 1: watch name
OK
127.0.0.1:6379> get name # client 2: query name
"xiaoming"
127.0.0.1:6379> set name zhangsan # client 2: modify name
OK
127.0.0.1:6379> multi # client 1: start the transaction
OK
127.0.0.1:6379> set name lisi # client 1: modify name
QUEUED
127.0.0.1:6379> exec # client 1: commit the transaction, which returns nil
(nil)
127.0.0.1:6379> get name # client 1: name was not changed to lisi
"zhangsan"
If a monitored key is changed by another client before the transaction is committed, the current client's optimistic lock fails, and all queued commands are discarded when the transaction is committed.
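The three failure scenarios can be reproduced with a small in-memory model. The sketch below is not how Redis is implemented and does not talk to a Redis server; `MiniRedis`, `Transaction`, and `WrongTypeError` are hypothetical names, and only `set` and `lpush` are modeled.

```python
class WrongTypeError(Exception):
    """Raised when a command is applied to a key holding another type."""


class MiniRedis:
    """Toy key-value store (hypothetical); each key has a modification counter."""

    def __init__(self):
        self.data = {}
        self.version = {}  # key -> number of modifications, used by watch()

    def _touch(self, key):
        self.version[key] = self.version.get(key, 0) + 1

    def set(self, key, value):
        self.data[key] = value
        self._touch(key)
        return "OK"

    def lpush(self, key, value):
        current = self.data.setdefault(key, [])
        if not isinstance(current, list):
            raise WrongTypeError("WRONGTYPE Operation against a key holding the wrong kind of value")
        current.insert(0, value)
        self._touch(key)
        return len(current)


class Transaction:
    COMMANDS = {"set", "lpush"}

    def __init__(self, server):
        self.server = server
        self.watched = {}   # key -> version seen when watch() was called
        self.queue = []
        self.dirty = False  # True once a command failed to enqueue

    def watch(self, key):
        self.watched[key] = self.server.version.get(key, 0)

    def enqueue(self, name, *args):
        if name not in self.COMMANDS:
            self.dirty = True
            return f"ERR unknown command '{name}'"
        self.queue.append((name, args))
        return "QUEUED"

    def exec(self):
        if self.dirty:  # scenario (1): nothing is executed
            return "EXECABORT Transaction discarded because of previous errors."
        if any(self.server.version.get(k, 0) != v for k, v in self.watched.items()):
            return None  # scenario (3): nil reply, a watched key changed
        results = []
        for name, args in self.queue:
            try:
                results.append(getattr(self.server, name)(*args))
            except WrongTypeError as err:
                results.append(err)  # scenario (2): record the error, no rollback
        return results


server = MiniRedis()

tx1 = Transaction(server)                    # scenario (1)
tx1.enqueue("set", "name", "zhangsan")
tx1.enqueue("setset", "name", "zhangsan2")
aborted = tx1.exec()

tx2 = Transaction(server)                    # scenario (2)
tx2.enqueue("set", "name", "xiaoming")
tx2.enqueue("set", "age", "18")
tx2.enqueue("lpush", "age", "20")
replies = tx2.exec()

tx3 = Transaction(server)                    # scenario (3)
tx3.watch("name")
server.set("name", "zhangsan")               # "another client" modifies the key
tx3.enqueue("set", "name", "lisi")
nil_reply = tx3.exec()
```

In this model, `replies` holds two `"OK"` values followed by the error object, and `name` keeps the value written by the other client after `tx3` is discarded.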
5. Redis transaction related commands
(1) WATCH
Provides check-and-set (CAS) behavior for Redis transactions. Keys passed to WATCH are monitored for changes. If at least one monitored key is modified before EXEC executes, the entire transaction is cancelled and EXEC returns a nil reply to indicate that the transaction has failed.
(2) MULTI
Starts a transaction and always returns OK. After MULTI is executed, the client can send any number of commands to the server. These commands are not executed immediately but are placed in a queue. When the EXEC command is invoked, all commands in the queue are executed.
(3) UNWATCH
Cancels the monitoring that WATCH set up for all keys. If EXEC or DISCARD is executed after WATCH, there is no need to execute UNWATCH: EXEC executes the transaction, so the WATCH has already taken effect, and DISCARD cancels the transaction together with all key monitoring.
(4) DISCARD
When the DISCARD command is executed, the transaction is abandoned, the transaction queue is emptied, and the client leaves the transaction state.
(5) EXEC
Triggers and executes all commands in the transaction:
If the client executes EXEC after successfully starting the transaction, all commands in the transaction are executed.
If the client starts a transaction using MULTI and fails to execute EXEC because of a disconnection, none of the commands in the transaction are executed. Note that even if some commands in a transaction fail, the other commands in the transaction queue continue to execute. Redis does not stop executing the transaction and does not roll back, unlike the relational databases we normally use.
Redis persistence strategy
What is persistence?
Persistence is the saving of data (such as objects in memory) to a permanent storage device (such as a disk). The main application of persistence is to store objects in memory in a database, in disk files, XML data files, and so on.
Persistence can also be understood in two simple ways:
- Application layer: if you shut down your application and restart it, the previous data is still there.
- System layer: if you shut down your system (computer) and restart it, the previous data is still there.
Why does Redis need persistence?
Redis is an in-memory database: all operations are performed in memory for the sake of efficiency. Because the data lives in memory, it is lost when the system restarts or shuts down and cannot be retrieved. To avoid this, Redis needs persistence to save the in-memory data to disk.
How does Redis persist?
Redis officially provides different levels of persistence:
- RDB persistence: The ability to take snapshots of your data at specified intervals.
- AOF persistence: records every write operation received by the server. When the server restarts, these commands are executed again to restore the original data. AOF appends each write operation to the end of the file in the Redis protocol format. Redis can also rewrite the AOF file in the background so that it does not grow too large.
- No persistence: if you only need your data to exist while the server is running, you can choose not to use persistence at all.
- Both RDB and AOF: you can also enable both persistence methods. In this case, when Redis restarts, the AOF file is loaded first to recover the original data, since the AOF file usually holds a more complete data set than the RDB file.
So which persistence method should we choose? Before deciding, we need to understand how the methods differ and the advantages and disadvantages of each.
RDB persistence
Redis Database (RDB) persistence is the process of generating snapshots of current memory data and saving them to hard disks. The RDB persistence process can be manually triggered or automatically triggered.
(1) Manual trigger
Manual triggering corresponds to the save command. It blocks the current Redis server until the RDB process completes, which can mean a long pause for instances with a lot of memory. Using this command in production environments is therefore not recommended.
(2) Automatic trigger
Automatic triggering corresponds to the bgsave command. The Redis process forks a child process, which is responsible for the RDB persistence and exits automatically when it is done. Blocking occurs only during the fork phase, which is usually very short.
In the redis.conf configuration file you can configure:
save <seconds> <changes>
With this setting, bgsave is triggered automatically when the data set is modified at least <changes> times within <seconds> seconds. To turn off automatic triggering, pass an empty string to save:
save ""
Other common situations also trigger bgsave, for example:
- When a slave node performs a full replication, the master node automatically runs bgsave to generate an RDB file and sends it to the slave node.
- By default, when the shutdown command is executed and AOF persistence is not enabled, an RDB save is executed automatically.
How bgsave works
(1) When the bgsave command is executed, the Redis parent process checks whether a child process, such as an RDB or AOF child, is already running. If so, the bgsave command returns immediately.
(2) The parent process forks to create a child process. The parent is blocked during the fork. The latest_fork_usec field of the info stats command shows how long the most recent fork took, in microseconds.
(3) After the fork completes, bgsave returns "Background saving started" and the parent process is no longer blocked, so it can continue to respond to other commands.
(4) The child process creates the RDB file: it generates a temporary snapshot file from the parent process's memory and atomically replaces the original file when it is finished. The lastsave command returns the time the last RDB file was generated, which corresponds to the rdb_last_save_time field of the info statistics.
(5) The child process signals the parent to indicate completion, and the parent updates its statistics. For details, see the rdb_* fields under info Persistence.
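Step (4), writing a temporary file and atomically replacing the original, is a general pattern that can be sketched on its own. The example below leaves out the fork entirely and uses JSON as a stand-in for the RDB binary format; `save_snapshot` and `load_snapshot` are hypothetical helper names.

```python
import json
import os
import tempfile


def save_snapshot(data, path):
    """Write `data` to a temporary file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)       # JSON is only a stand-in for the RDB format
            tmp.flush()
            os.fsync(tmp.fileno())     # make sure the bytes reach the disk first
        os.replace(tmp_path, path)     # readers see the old or the new file, never a partial one
    except BaseException:
        os.unlink(tmp_path)            # do not leave a half-written temp file behind
        raise


def load_snapshot(path):
    """Read back a snapshot written by save_snapshot."""
    with open(path) as snapshot:
        return json.load(snapshot)
```

Because the snapshot only becomes visible through the final rename, a crash in the middle of writing leaves the previous snapshot file untouched.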
AOF persistence
AOF (Append Only File) persistence: each write command is recorded in a separate log, and the commands in the AOF file are executed again on restart to recover the data. The main purpose of AOF is to make persistence close to real time, and it has become the mainstream way of persisting Redis data.
How AOF persistence works
To enable AOF, set appendonly yes in the configuration; it is disabled by default.
The AOF filename is set through the appendfilename configuration; the default is appendonly.aof. The file is saved in the same directory as the RDB file, which is specified through the dir configuration.
The AOF workflow consists of: command writing (append), file synchronization (sync), file rewriting (rewrite), and loading on restart (load).
(1) All write commands are appended to aof_buf (the AOF buffer).
(2) The AOF buffer is synchronized to disk according to the configured policy.
Why does AOF append commands to aof_buf first? Redis responds to commands on a single thread, so if every write command were appended directly to the file on disk, performance would depend entirely on the current disk load. Writing to the aof_buf buffer first has another benefit: Redis can offer several buffer synchronization policies that trade off performance against safety.
(3) With the increasing size of AOF files, AOF files need to be rewritten periodically to achieve the purpose of compression.
(4) When the Redis server restarts, AOF files can be loaded for data recovery.
The AOF rewrite mechanism
Purpose of rewriting:
- Reduce the space occupied by AOF files;
- Smaller AOF files can be loaded and recovered faster by Redis.
AOF rewriting can be divided into manual triggering and automatic triggering:
- Manual trigger: Directly invoke the bgrewriteaof command.
- Automatic trigger: Determine the automatic trigger time according to the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage parameters.
auto-aof-rewrite-min-size: the minimum size the AOF file must reach before a rewrite is run. The default is 64MB.
auto-aof-rewrite-percentage: the growth of the current AOF file size (aof_current_size) relative to the AOF file size after the last rewrite (aof_base_size), expressed as a percentage.
Automatic trigger condition
aof_current_size > auto-aof-rewrite-min-size and (aof_current_size - aof_base_size) / aof_base_size >= auto-aof-rewrite-percentage.
The values of aof_current_size and aof_base_size can be viewed in the info Persistence statistics.
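The trigger condition is plain arithmetic and can be checked with a few lines of code. This is a sketch of the formula above rather than the Redis source; `should_rewrite_aof` is a hypothetical name, the default percentage of 100 is the usual redis.conf default, and treating a zero base size as 1 is an assumption made here to avoid dividing by zero.

```python
MB = 1024 * 1024


def should_rewrite_aof(current_size, base_size, min_size=64 * MB, percentage=100):
    """Return True when both automatic-rewrite conditions hold (sizes in bytes)."""
    if current_size <= min_size:
        return False                      # the file is still too small to bother
    base = base_size if base_size else 1  # assumption: guard against a zero base
    growth = (current_size - base) * 100 / base
    return growth >= percentage


# 64 MB after the last rewrite, 128 MB now: the file has grown by 100%.
print(should_rewrite_aof(128 * MB, 64 * MB))
# 64 MB after the last rewrite, 100 MB now: only about 56% growth.
print(should_rewrite_aof(100 * MB, 64 * MB))
```

The first call prints `True` and the second prints `False`.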
Why do AOF files become smaller after being rewritten?
(1) The old AOF file contains commands that no longer have any effect on the final data, such as del key1 and hdel key2. The rewrite keeps only the write commands needed to produce the final data.
(2) Multiple commands can be merged. For example, lpush list a, lpush list b, and lpush list c can be converted directly into lpush list a b c.
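Both effects can be seen in a toy rewrite that replays the old log into a dictionary and then emits one command per surviving key. This is only an illustration of the idea, not the Redis rewrite procedure; `rewrite_aof` is a hypothetical function and only `set`, `lpush`, and `del` are modeled.

```python
def rewrite_aof(commands):
    """Replay a command log, then emit a minimal log that rebuilds the same state."""
    state = {}
    for cmd, key, *args in commands:
        if cmd == "set":
            state[key] = args[0]
        elif cmd == "lpush":
            items = state.setdefault(key, [])
            for value in args:
                items.insert(0, value)   # LPUSH adds each value at the head
        elif cmd == "del":
            state.pop(key, None)

    rewritten = []
    for key, value in state.items():
        if isinstance(value, list):
            # LPUSH pushes its arguments left to right, so reverse the list
            # to get arguments that rebuild the same order.
            rewritten.append(("lpush", key, *reversed(value)))
        else:
            rewritten.append(("set", key, value))
    return rewritten


old_log = [
    ("set", "key1", "v1"),
    ("del", "key1"),                 # key1 no longer exists in the final data
    ("lpush", "list", "a"),
    ("lpush", "list", "b"),
    ("lpush", "list", "c"),
    ("set", "name", "xiaoming"),
    ("set", "name", "zhangsan"),     # only the last value matters
]
new_log = rewrite_aof(old_log)
print(new_log)
```

Seven commands shrink to two: `('lpush', 'list', 'a', 'b', 'c')` and `('set', 'name', 'zhangsan')`.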
AOF file data recovery
Data restoration process:
(1) When AOF persistence is enabled and AOF files exist, AOF files are preferentially loaded.
(2) Load the RDB file when AOF is closed or the AOF file does not exist.
(3) After the AOF/RDB file is successfully loaded, Redis is successfully started.
(4) If there is an error in the AOF/RDB file, Redis will fail to start and print an error message.
Advantages and disadvantages of RDB and AOF
Advantages of RDB
- An RDB file is a very compact representation of the data set at a particular point in time, which makes it well suited to backups. For example, you can archive an RDB file every hour for the last 24 hours and keep one daily snapshot for the last 30 days, so that if something goes wrong you can restore whichever version of the data set you need.
- RDB is a compact single file that can be easily transferred to a remote data center, making it ideal for disaster recovery.
- When saving an RDB file, the only thing the parent process needs to do is fork a child process. The child does all the remaining work, and the parent performs no disk I/O, so RDB persistence maximizes Redis performance.
- Compared to AOF, RDB is faster for recovering large data sets.
AOF advantages
- You can choose between different fsync policies: no fsync, fsync every second, or fsync on every write. With the default policy of fsync every second, Redis still performs well (fsync is handled by a background thread while the main thread keeps serving client requests), and you lose at most about one second of data in the event of a failure.
- The AOF file is an append-only log, so writing it requires no seeks. Even if the last command was only partially written for some reason (disk full, a crash during the write, and so on), the redis-check-aof tool can repair the file.
- Redis can automatically rewrite the AOF in the background when the file becomes too large. The rewritten AOF file contains the minimum set of commands needed to restore the current data set. The whole rewrite is safe because Redis keeps appending commands to the existing AOF file while creating the new one, so the existing file is not lost even if an outage occurs during the rewrite. Once the new AOF file is ready, Redis switches from the old file to the new one and starts appending to it.
- The AOF file stores all writes to the database, in order, in the Redis protocol format, so its contents are easy to read and parse. Exporting an AOF file is also simple. For example, if you accidentally execute FLUSHALL, then as long as the AOF file has not been rewritten in the meantime, you can stop the server, remove the FLUSHALL command at the end of the AOF file, and restart Redis to restore the data set to its state before FLUSHALL.
RDB shortcomings
- Saving the entire data set is a heavy job, so you typically do a full save only every 5 minutes or more. If Redis stops unexpectedly, you may lose several minutes of data.
- RDB needs to fork a child process often in order to save the data set to disk. When the data set is large, the fork can be time-consuming and may cause Redis to stop serving client requests for some milliseconds.
AOF shortcomings
- AOF files are usually larger than RDB files for the same data set.
- AOF is slower than RDB for data recovery (loading), and RDB can generally give better guarantees about maximum latency.
Brief comparison and summary of RDB and AOF
RDB advantages:
- RDB is a compact binary file suitable for backup, full copy, and other scenarios
- RDB recovers data much faster than AOF
RDB disadvantages:
- RDB cannot provide real-time or second-level persistence;
- RDB file formats differ between Redis versions, so an older version may be unable to read RDB files written by a newer version.
AOF advantages:
- Better protection against data loss;
- The append-only mode provides high write performance;
- Suitable for emergency recovery after a catastrophic accidental deletion.
AOF disadvantages:
- For the same data set, the AOF file is larger than the RDB snapshot;
- Enabling AOF affects write QPS, which is lower than with RDB;
- Recovery is slow, and AOF is not suitable for cold backup.
Redis memory elimination strategy
What is the elimination strategy?
The Redis memory elimination (eviction) policy determines which old data is removed to make room for new data when the cache runs out of memory.
How to configure maximum memory?
Through the configuration file
Modify the redis.conf configuration file
# Set the maximum memory used by Redis to 1024 MB
maxmemory 1024mb
Note: maxmemory is set to 0 by default. On a 64-bit operating system this means no limit, so Redis can use whatever memory the operating system has left; on a 32-bit operating system, the maximum Redis memory is 3GB.
Using dynamic commands
Redis supports changing the maximum memory size at runtime with commands:
127.0.0.1:6379> config set maxmemory 200mb # set the maximum Redis memory to 200 MB
127.0.0.1:6379> config get maxmemory
1) "maxmemory"
2) "209715200"
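The value 209715200 returned by config get is simply 200 MB expressed in bytes. The helper below sketches the unit convention described in the comments at the top of redis.conf, where units are case insensitive and the "b" suffix selects powers of 1024; `parse_memory` is a hypothetical function, not part of any Redis client.

```python
import re

# Multipliers per unit suffix: "mb" is 1024 * 1024 bytes, plain "m" is 1,000,000.
UNITS = {
    "": 1,
    "k": 1000, "kb": 1024,
    "m": 1000 ** 2, "mb": 1024 ** 2,
    "g": 1000 ** 3, "gb": 1024 ** 3,
}


def parse_memory(text):
    """Convert a size such as '200mb' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([a-z]*)", text.strip().lower())
    if match is None or match.group(2) not in UNITS:
        raise ValueError(f"unrecognized memory size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


print(parse_memory("200MB"))   # 209715200
print(parse_memory("1024mb"))  # 1073741824
```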
Classification of elimination strategies
If Redis continues to add data after running out of memory, how to handle this situation? In fact, Redis has officially defined eight strategies to deal with this situation:
noeviction
Default policy: no elimination is performed, and write requests return an error.
allkeys-lru
LRU (Least Recently Used): an approximate LRU algorithm is used to eliminate keys from among all keys.
volatile-lru
LRU (Least Recently Used): an approximate LRU algorithm is used to eliminate keys from among those with an expiration time set.
allkeys-random
Keys are eliminated at random from among all keys.
volatile-random
Keys are eliminated at random from among those with an expiration time set.
volatile-ttl
TTL (Time To Live): among keys with an expiration time set, keys are eliminated according to their expiration time. The sooner a key expires, the sooner it is eliminated.
allkeys-lfu
LFU (Least Frequently Used): an approximate LFU algorithm is used to eliminate keys from among all keys. Supported from Redis 4.0 onwards.
volatile-lfu
LFU (Least Frequently Used): an approximate LFU algorithm is used to eliminate keys from among those with an expiration time set. Supported from Redis 4.0 onwards.
Note: with the volatile-lru, volatile-random, and volatile-ttl policies, if no key with an expiration time is available to eliminate, write requests return an error just as with noeviction.
LRU algorithm
LRU(Least Recently Used) is a cache replacement algorithm. When using memory as a cache, the size of the cache is usually fixed. When the cache is full and data is added to the cache, some of the old data needs to be eliminated to free up memory for new data. At this point you can use the LRU algorithm. The idea is that if a piece of data has not been used in the recent past, it is unlikely to be used in the future, so it can be eliminated.
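For reference, a strict LRU cache takes only a few lines when built on an ordered dictionary. This is the textbook version of the algorithm, shown for comparison; `LRUCache` is a hypothetical class name and says nothing about how Redis stores keys.

```python
from collections import OrderedDict


class LRUCache:
    """Strict LRU: the entry that has gone unused the longest is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest entry first, most recent last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now more recent than "b"
cache.put("c", 3)    # the cache is full, so "b" is evicted
print(list(cache.items))  # ['a', 'c']
```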
Implementation of LRU in Redis
Redis uses an approximate LRU algorithm, which differs from the regular LRU algorithm. The approximate algorithm eliminates data by random sampling: each time, it randomly samples 5 keys (by default) and eliminates the least recently used key among them.
You can change the number of samples with the maxmemory-samples parameter, for example maxmemory-samples 10.
The larger the maxmemory-samples value, the closer the result is to that of the strict LRU algorithm, but the higher the CPU cost.
To implement the approximate LRU algorithm, Redis adds an extra 24-bit field to each key that stores the time the key was last accessed.
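The sampling idea can be sketched as follows. This is a simplified model, not the Redis implementation: `ApproxLRU` is a hypothetical class, a logical counter stands in for the 24-bit access timestamp, and the candidate pool introduced in Redis 3.0 is not modeled.

```python
import random


class ApproxLRU:
    """Evicts the least recently used key among a random sample of keys."""

    def __init__(self, capacity, samples=5, seed=None):
        self.capacity = capacity
        self.samples = samples       # plays the role of maxmemory-samples
        self.rng = random.Random(seed)
        self.values = {}
        self.last_access = {}        # key -> logical time of the last access
        self.clock = 0               # logical clock, a stand-in for the timestamp field

    def _tick(self, key):
        self.clock += 1
        self.last_access[key] = self.clock

    def get(self, key):
        if key not in self.values:
            return None
        self._tick(key)
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            self._evict_one()
        self.values[key] = value
        self._tick(key)

    def _evict_one(self):
        count = min(self.samples, len(self.values))
        candidates = self.rng.sample(list(self.values), count)
        # Only the sampled keys compete; the oldest of them is evicted.
        victim = min(candidates, key=self.last_access.__getitem__)
        del self.values[victim]
        del self.last_access[victim]
        return victim


cache = ApproxLRU(capacity=3, samples=3)
for name in ("a", "b", "c"):
    cache.put(name, name.upper())
cache.get("a")
cache.put("d", "D")   # the sample covers every key here, so the true LRU key "b" goes
```

When the sample size is at least the number of keys, the result matches strict LRU; with a smaller sample the victim is only the oldest key among those sampled, which is the trade-off between accuracy and CPU cost described above.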
Redis 3.0 optimization of approximate LRU
Redis 3.0 optimizes the approximate LRU algorithm. The new algorithm maintains a candidate pool (size 16) whose entries are sorted by last access time. The first batch of randomly selected keys is added to the pool; after that, a randomly selected key is added only if its last access time is earlier than the earliest one in the pool. When the pool is full and a new key needs to be added, the key with the latest access time (the most recently accessed one) is removed from the pool.
When a key has to be eliminated, the key with the earliest last access time (the one that has gone unaccessed the longest) is taken from the pool and eliminated.
LFU algorithm
LFU (Least Frequently Used) is an elimination strategy added in Redis 4.0. Its core idea is to eliminate keys according to how frequently they have been accessed recently: keys that are rarely accessed are eliminated first, and keys that are frequently accessed are retained.
The LFU algorithm better reflects how hot a key is. With the LRU algorithm, a key that has not been accessed for a long time but happens to be accessed once is treated as hot data and is not eliminated, while keys that are likely to be accessed in the future may be eliminated instead. This does not happen with the LFU algorithm, because a single access does not make a key hot.
Redis key expiration policy
Keys in Redis are usually given an expiration time. Once that time has passed, Redis removes the keys from memory. There are three possible policies: timed cleanup, lazy cleanup, and periodic scan cleanup.
Timed cleanup (active)
Every key with an expiration time gets its own timer, and the key is cleared as soon as the timer fires.
This policy clears expired data immediately and is memory friendly, but it consumes a lot of CPU to handle expirations, which hurts cache response time and throughput.
Lazy cleanup (passive)
When a key expires, it is not immediately cleared from memory. Only when the key is accessed does the system check whether it has expired. If so, the key is cleared and null is returned.
This strategy saves the most CPU but is very memory unfriendly. In extreme cases, a large number of expired keys are never accessed again, so they are never cleared and keep occupying memory.
Periodic scan cleanup (active)
At fixed intervals, a certain number of keys in the expires dictionaries of a certain number of databases are scanned, and the expired ones are cleared. This strategy is a compromise between the first two. By adjusting the scan interval and the time limit of each scan, you can balance CPU and memory usage for different situations.
Redis uses both the lazy cleanup and the periodic scan cleanup strategies.
Periodic scan cleanup means that, by default, Redis randomly selects some keys with an expiration time every 100ms, checks whether they have expired, and deletes those that have.
Suppose Redis holds 100,000 keys, all with an expiration time. If all 100,000 keys were checked every few hundred milliseconds, Redis would basically grind to a halt, with the CPU consumed by checking for expired keys. So Redis does not go through every key with an expiration time every 100ms, which would be a performance disaster. Instead, it randomly selects some keys every 100ms to check and delete.
The problem is that periodic deletion may leave many expired keys undeleted after their time has come. This is where lazy deletion comes in: when you get a key, Redis checks whether the key has an expiration time and whether it has expired. If it has, Redis deletes the key at that point and returns nothing.
Even so, a problem remains. What happens if periodic deletion misses many expired keys and those keys are never accessed, so lazy deletion never runs either? A large number of expired keys pile up in memory and Redis may run out of memory. The answer is the memory elimination mechanism.
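The combination of lazy cleanup and periodic scan cleanup can be sketched with a small store that keeps a separate expires dictionary. This is a simplified model under stated assumptions: `ExpiringStore` is a hypothetical class, the clock is set by hand so the behavior is deterministic, and the sample size of 20 is an arbitrary choice for the example.

```python
import random


class ExpiringStore:
    """Toy store combining lazy cleanup with periodic random sampling."""

    def __init__(self, now=0.0, seed=None):
        self.now = now       # manually advanced clock, for a deterministic example
        self.values = {}
        self.expires = {}    # key -> absolute expiration time
        self.rng = random.Random(seed)

    def set(self, key, value, ttl=None):
        self.values[key] = value
        if ttl is None:
            self.expires.pop(key, None)
        else:
            self.expires[key] = self.now + ttl

    def _expired(self, key):
        return key in self.expires and self.expires[key] <= self.now

    def _delete(self, key):
        self.values.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key):
        # Lazy cleanup: expiration is checked only when the key is accessed.
        if self._expired(key):
            self._delete(key)
            return None
        return self.values.get(key)

    def periodic_scan(self, sample_size=20):
        # Periodic scan cleanup: check a random sample, never every key.
        keys = list(self.expires)
        sample = self.rng.sample(keys, min(sample_size, len(keys)))
        expired = [key for key in sample if self._expired(key)]
        for key in expired:
            self._delete(key)
        return len(expired)


store = ExpiringStore(seed=1)
for i in range(100):
    store.set(f"k{i}", i, ttl=1)
store.set("forever", "x")            # no expiration time
store.now = 2.0                      # every key with a TTL is now expired
removed = store.periodic_scan()      # removes only the sampled keys
remaining = len(store.values)
```

After one scan, 20 of the 100 expired keys are gone and the other 80 still occupy memory until they are either accessed (lazy cleanup) or picked up by a later scan, which is exactly the gap the memory elimination mechanism has to cover.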
Cache update strategy
There are three common strategies for cache updates:
- Cache aside
- Read/Write through
- Write behind caching
Cache aside
Cache aside is the most common caching strategy. The procedure for requesting data is as follows:
(1) For a read request, the application first checks whether the data exists in the cache. On a cache hit, the data is returned directly. On a cache miss, the request falls through to the database: the data is queried from the database, written back to the cache, and finally returned to the client.
(2) For a write request, update the database first, then delete the data from the cache.
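The read and write paths above can be sketched with two dictionaries standing in for the cache and the database. `CacheAside` is a hypothetical class for illustration; a real application would call a Redis client and a database driver at the marked places.

```python
class CacheAside:
    """Cache aside: the application itself manages both the cache and the database."""

    def __init__(self, database):
        self.database = database   # stand-in for a real database
        self.cache = {}            # stand-in for Redis
        self.db_reads = 0          # counts how often a read reaches the database

    def read(self, key):
        if key in self.cache:              # cache hit: return directly
            return self.cache[key]
        self.db_reads += 1                 # cache miss: query the database
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value        # write the result back to the cache
        return value

    def write(self, key, value):
        self.database[key] = value         # 1. update the database first
        self.cache.pop(key, None)          # 2. then delete the cached copy


app = CacheAside({"age": 18})
first = app.read("age")     # miss: loaded from the database and cached
second = app.read("age")    # hit: served from the cache
app.write("age", 20)        # the cached value is deleted, not updated
third = app.read("age")     # miss again: the new value is loaded
```

Here `first` and `second` are 18, `third` is 20, and the database is read only twice.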
The detailed process can be combined with the following flow chart:
If you look carefully at the process above, you will see that the usual routine for write requests is to update the database first and then delete the cache. Some readers may ask: why delete the cache instead of updating it after updating the database? And what about updating the cache first and then the database? There are a few pitfalls here, explained one by one below.
Cache aside pitfalls
If you apply the Cache aside strategy incorrectly, you may run into the following pitfalls.
Pitfall 1: update the database first, then update the cache
If a write request updates the database and then updates the cache, two concurrent write requests may leave dirty data in the cache.
Suppose request 1 (setting age to 18) updates the database first and request 2 (setting age to 20) updates it afterwards. The expected result is that the age is 20 in both the database and the cache. However, if request 1 updates the cache after request 2 does, the age in the cache ends up as 18. The database and the cache are then inconsistent, and the age in the cache is dirty data.
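The interleaving can be replayed step by step without any threads. The sketch below assumes request 1 writes age 18 and request 2 writes age 20, and applies the four steps in the problematic order; the dictionaries stand in for the real database and cache.

```python
database, cache = {}, {}

# Each write request has two steps: update the database, then update the cache.
# This schedule is the problematic interleaving of the two requests.
schedule = [
    ("request 1", "db", 18),
    ("request 2", "db", 20),
    ("request 2", "cache", 20),
    ("request 1", "cache", 18),   # request 1 reaches the cache last
]

for _request, target, age in schedule:
    store = database if target == "db" else cache
    store["age"] = age

print(database["age"], cache["age"])  # prints: 20 18
```

The database holds the value from the later request, while the cache holds the stale value from the earlier one.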
Pitfall 2: delete the cache first, then update the database
If a write request deletes the cache first and then updates the database, dirty data may appear when a read request and a write request run concurrently.
The process is as follows:
(1) The write request deletes the cached data;
(2) The read request misses the cache, queries the database, and writes the returned data back to the cache;
(3) The write request updates the database.
At the end of this process, the age in the database is 20 and the age in the cache is 18. The cache is inconsistent with the database, and the cache holds dirty data.
Best practice: update the database first, then delete the cache
This is the recommended approach for write requests in real systems, although in theory it can still go wrong.
The process is as follows:
(1) The read request queries the cache first, misses, and queries the database, which returns the data;
(2) The write request updates the database and deletes the cache;
(3) The read request writes the data back to the cache.
As a result, the age in the database is 20 and the age in the cache is 18. The database is inconsistent with the cache, and the application reads old data from the cache.
However, the probability of this happening is very low, because a database update usually takes several orders of magnitude longer than a memory operation. The read request's write-back to the cache in step (3) therefore usually completes before the write request in step (2) finishes updating the database, so the subsequent cache deletion removes the stale value. Still, to limit the impact of dirty data in this extreme case, we should set an expiration time on cached data.
Read through
In Cache Aside mode, the application code maintains two data stores, a Cache and a database. In the Read-through policy, the application does not need to manage the Cache and database. Instead, it only delegates synchronization of the database to the Cache Provider. All data interaction is done through the abstract cache layer.
Read-through reduces the load on the data source when a large number of reads are performed and is also resilient to cache service failures. If the cache service fails, the cache provider can still operate by going directly to the data source.
Read-through is suitable for workloads in which the same data is requested many times. It is very similar to cache-aside, with one key difference:
- In cache-aside, the application is responsible for fetching data from the data source and updating it to the Cache.
- In read-through, this logic is typically supported by a separate cache provider.
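The difference can be illustrated with a small sketch in which the cache object, not the application, owns the loading logic. ReadThroughCache and its loader parameter are hypothetical names for this example, and the backing dict stands in for the data source.

```python
class ReadThroughCache:
    """Hypothetical cache provider that loads missing keys from the data source itself."""

    def __init__(self, loader):
        self._loader = loader  # function that fetches a key from the data source
        self._store = {}
        self.loads = 0         # counts how often the data source was hit

    def get(self, key):
        if key not in self._store:
            # Cache miss: the provider, not the application, goes to the data source.
            self._store[key] = self._loader(key)
            self.loads += 1
        return self._store[key]


backing = {"user:1": "ray"}                    # stand-in for the database
users = ReadThroughCache(loader=backing.get)

a = users.get("user:1")  # first call loads from the data source
b = users.get("user:1")  # second call is served from the cache
```

The application only ever calls get on the cache layer; it never touches the data source directly.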
Write through
In the write-through policy, when data is updated (Write), the Cache Provider is responsible for updating the underlying data source and Cache. The cache is consistent with the data source, and writes always reach the data source through the abstract cache layer.
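A minimal sketch of this policy, under the assumption that the data source can be modeled as a dict: WriteThroughCache is a hypothetical provider class whose put method updates the data source and the cache in the same call.

```python
class WriteThroughCache:
    """Hypothetical cache provider that writes to the data source and the cache together."""

    def __init__(self, data_source):
        self._source = data_source
        self._store = {}

    def put(self, key, value):
        # Write to the underlying data source first, then to the cache,
        # so the cache never holds a value the data source does not have.
        self._source[key] = value
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)


database = {}
cache = WriteThroughCache(database)
cache.put("user:1:age", 20)
```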
Write behind
When data is updated, only the cache is updated and data is flushed to the database at regular intervals.
The advantage is that the data writing speed is very fast, which is suitable for frequent write scenarios.
The disadvantage is that the cache and the database can be inconsistent until the next flush.
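The trade-off shows up clearly in a small sketch. WriteBehindCache is a hypothetical class; in a real system flush would be called by a timer or background thread, while here it is called by hand so the behavior is easy to follow.

```python
class WriteBehindCache:
    """Hypothetical cache that buffers writes and flushes them to the data source later."""

    def __init__(self, data_source):
        self._source = data_source
        self._store = {}
        self._dirty = set()  # keys written to the cache but not yet flushed

    def put(self, key, value):
        # Only the cache is updated on the write path, which keeps writes fast.
        self._store[key] = value
        self._dirty.add(key)

    def get(self, key):
        return self._store.get(key)

    def flush(self):
        """Write all dirty keys to the data source; returns how many keys were flushed."""
        flushed = len(self._dirty)
        for key in self._dirty:
            self._source[key] = self._store[key]
        self._dirty.clear()
        return flushed


database = {}
cache = WriteBehindCache(database)
cache.put("counter", 1)
cache.put("counter", 2)           # overwrites the buffered value
seen_before_flush = dict(database)
flushed = cache.flush()
```

Two writes to the same key cost only one database write, but between put and flush the database does not yet contain the new value.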
Cache Exception Scenario
In actual production environments, exceptions such as cache penetration, cache breakdown, and cache avalanche may occur. To avoid huge losses caused by exceptions, you need to understand the causes and solutions of each exception to improve system reliability and high availability.
Cache penetration
What is cache penetration?
Cache penetration means that the data requested by the user exists neither in the cache (the lookup misses) nor in the database. As a result, every such request has to query the database, which then returns null.
If a malicious attacker keeps requesting data that does not exist in the system, a large number of requests fall on the database in a short time, putting excessive pressure on the database and possibly even bringing it down.
Common solutions to cache penetration
(1) Bloom filter (recommended)
The Bloom filter (BF), proposed by Burton Howard Bloom in 1970, is a probabilistic data structure with high space efficiency.
Bloom filters are designed to test whether a particular element is present in a set.
To judge whether an element is in a set, we would normally search the set and compare. The search efficiency of different data structures is as follows:
- Linear table storage: search time complexity is O(n)
- Balanced binary search tree (AVL tree, red-black tree) storage: search time complexity is O(log n)
- Hash table storage: taking hash collisions into account, the overall search time complexity is O(log(n/m))
When we need to determine whether an element exists in a massive data set, these structures are not only slow to search but also occupy a large amount of storage space. Let's look at how Bloom filters solve this problem.
Bloom filter design idea
A Bloom filter is a data structure consisting of a bit array of m bits and k hash functions. All bits are initialized to 0, and each hash function maps the input data to a position in the bit array as evenly as possible.
To insert an element into the Bloom filter, the element is passed through the k hash functions to produce k hash values. Each hash value is used as an index into the bit array, and all k corresponding bits are set to 1.
To query an element, it is passed through the same hash functions and the k corresponding bits are checked. If any bit is 0, the element is definitely not in the set. If all bits are 1, the element is probably in the set. Why only probably? Because different elements may produce the same hash values, collisions can cause the bits of an element that was never inserted to already be 1. This is known as a false positive. In contrast, false negatives never occur in a Bloom filter.
To summarize: if the Bloom filter says an element is not in the set, it is definitely not in the set; if the Bloom filter says an element is in the set, it may or may not be.
For example, here is a Bloom filter with 18 bits and 3 hash functions. The three elements of the set, x, y, and z, are each hashed to positions in the bit array by the three hash functions, and those bits are set to 1. When you query for element w, one of the three bits computed by the hash functions is 0, so you can be sure that w is not in the set.
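The insert and query steps above can be sketched as follows. This is an illustrative implementation only: deriving the k hash functions by salting SHA-256 with an index is an assumption made for this example, not something the article prescribes, and a real deployment would use a packed bit array rather than a Python list.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a bit array of m bits and k hash functions."""

    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = [0] * m_bits  # all bits start at 0

    def _positions(self, item):
        # Simulate k hash functions by salting one hash with the index i (assumption).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        # Set all k corresponding bits to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # Any 0 bit means "definitely absent"; all 1 bits means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m_bits=18, k_hashes=3)
for element in ("x", "y", "z"):
    bf.add(element)
```

After the three inserts, might_contain returns True for x, y, and z. For any other element it returns either False (certainly absent) or, on a collision, True (a false positive), which is exactly the asymmetry described above.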
Advantages and disadvantages of bloom filter
Advantages:
- Space saving: You do not need to store the data itself, but only the hash bits corresponding to the data
- Low time complexity: both insert and lookup take O(k) time, where k is the number of hash functions
Disadvantages:
- False positives: when the Bloom filter reports that an element exists, the element may not actually be in the set; the accuracy depends on the size of the bit array and the number of hash functions
- Elements cannot be deleted: when an element is removed from the underlying set, it cannot be removed from the Bloom filter, which is another source of false positives
Application scenario of bloom filter
- Crawler URL deduplication
- Spam filtering
- Blacklists
(2) Return an empty object
If the cache is not hit and the query persistence layer is empty, the returned empty object can be written to the cache. In this way, the next time the key is requested, the empty object will be returned directly from the cache, and the request will not fall to the persistence layer database. To avoid storing too many empty objects, it is common to set an expiration time for empty objects.
There are two problems with this approach:
- If a large number of distinct nonexistent keys are requested, the cached empty objects can take up valuable memory.
- Because an empty object stays in the cache until it expires, the cache and the persistence layer can be inconsistent during that window, for example when the key is added to the database after the empty object was cached.
Cache breakdown
What is cache breakdown?
Cache breakdown refers to a very hot key that is continuously accessed under high concurrency. At the moment the key expires, the concurrent requests pierce the cache and hit the database directly, like punching a hole in a barrier.
Cache breakdown hazard
The sudden increase in database pressure causes a large number of requests to block.
How to solve
Use mutex keys
The idea is to let one thread write back to the cache, and the other threads wait for the write back cache thread to finish, and then re-read the cache.
Only one thread reads the database and writes back to the cache at a time, and all other threads are blocked. In a high-concurrency scenario, a large number of threads blocking will inevitably reduce throughput. How can this situation be resolved? You can discuss it in the comments section.
Distributed locks are required for distributed applications.
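For a single process, the mutex idea can be sketched with a local lock and a second cache check inside the lock. The names here (get_hot, load_from_db, rebuild_lock) are hypothetical; in a distributed application the local lock would have to be replaced by a distributed lock, for example one built on the Redis SET command with the NX and EX options.

```python
import threading

cache = {}
rebuild_lock = threading.Lock()
db_reads = 0  # counts how often the database was hit


def load_from_db(key):
    """Stand-in for the expensive database query."""
    global db_reads
    db_reads += 1
    return f"value-of-{key}"


def get_hot(key):
    value = cache.get(key)
    if value is not None:
        return value
    # Only one thread at a time may rebuild the cache; the others wait here.
    with rebuild_lock:
        # Re-check: another thread may have rebuilt the entry while we waited.
        value = cache.get(key)
        if value is None:
            value = load_from_db(key)
            cache[key] = value
    return value


# Simulate 20 concurrent requests for the same expired hot key.
threads = [threading.Thread(target=get_hot, args=("hot",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because of the second check inside the lock, the database is queried once no matter how many threads arrive at the same time.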
Hotspot data never expires
"Never expires" actually has two meanings:
- Physically never expires: no expiration time is set for the hotspot key
- Logical expiration: the expiration time is stored in the value corresponding to the key. When the value is found to be about to expire, an asynchronous background thread rebuilds the cache
In practice this approach is very performance-friendly. The only downside is that while the cache is being rebuilt, the other threads (those not rebuilding the cache) may read old data, which is acceptable for systems that do not require strict consistency.
Cache avalanche
What is cache avalanche?
Cache avalanche means that a large amount of cached data reaches its expiration time at once, so a large number of queries fall directly on the database, causing excessive pressure or even database downtime. Unlike cache breakdown, where many concurrent requests query the same piece of data, cache avalanche happens when many different pieces of data expire together, so many lookups miss the cache and go to the database.
Cache Avalanche solution
Common solutions are:
- Uniform expiration
- Add a mutex
- Cache never expires
- Two-tier cache strategy
(1) Uniform expiration
Set different expiration times so that cached entries expire as evenly as possible over time. This is usually done by adding a random value to the expiration time or by planning expiration times centrally.
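Adding a random value to the expiration time takes only a few lines. The function name and the concrete numbers (a one-hour base with up to five minutes of jitter) are assumptions chosen for illustration.

```python
import random


def ttl_with_jitter(base_seconds, max_jitter_seconds, rng=random):
    """Return the base expiration time plus a random offset in [0, max_jitter_seconds]."""
    return base_seconds + rng.randint(0, max_jitter_seconds)


rng = random.Random(42)  # seeded only to make the example repeatable
ttls = [ttl_with_jitter(3600, 300, rng) for _ in range(1000)]
```

Keys written in the same batch now expire spread over a five-minute window instead of all in the same second.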
(2) Add mutex
Same as the cache breakdown solution, only one thread builds the cache at a time, and the other threads block and queue.
(3) Cache never expires
In line with the cache breakdown solution, the cache is physically never expired, and an asynchronous thread updates the cache.
(4) Two-layer cache strategy
Use the primary/secondary cache:
Primary cache: the validity period is set according to empirical values. This is the cache that is read by default, and after an entry expires the latest value is loaded from the database.
Backup cache: the cache that is read when the lock cannot be obtained. When the primary cache is updated, the backup cache must be updated at the same time.
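One way to read the two-layer strategy is sketched below. It is an interpretation under stated assumptions: both caches are plain dicts, expiry of the primary cache is simulated by deleting the key, and the function and variable names are hypothetical.

```python
import threading

primary_cache = {}
backup_cache = {}
reload_lock = threading.Lock()


def get_value(key, load_from_db):
    """Read the primary cache; on a miss, reload under a lock or fall back to the backup."""
    if key in primary_cache:
        return primary_cache[key]
    if reload_lock.acquire(blocking=False):
        try:
            value = load_from_db(key)
            # Update the primary and the backup cache together.
            primary_cache[key] = value
            backup_cache[key] = value
            return value
        finally:
            reload_lock.release()
    # Lock not obtained: someone else is reloading, so serve the backup copy.
    return backup_cache.get(key)


first = get_value("k", lambda key: "v1")          # loads v1 into both caches
del primary_cache["k"]                            # simulate expiry of the primary entry
reload_lock.acquire()                             # simulate another thread holding the lock
during_reload = get_value("k", lambda key: "v2")  # falls back to the backup cache
reload_lock.release()
after_reload = get_value("k", lambda key: "v2")   # lock is free, reloads from the database
```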
Cache warming
What is cache preheating?
Cache preheating means that relevant cache data is directly loaded into the cache system after the system goes online. In this way, users can avoid querying the database first and then writing the data back to the cache.
Without preheating, Redis starts out empty. In the early stage after the system goes online, highly concurrent traffic goes straight to the database and puts pressure on it.
The operation method of cache preheating
- When the amount of data is small, load the cache when the project starts.
- When there is a large amount of data, set a scheduled task script to refresh the cache.
- If the amount of data is too large, ensure that hotspot data is loaded to the cache in advance.
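The third option, loading only the hotspot data in advance, can be sketched as a small startup routine. The function name, the dict-based cache and database, and the list of hot keys are all assumptions for this example; how the hot keys are identified is outside the scope of the sketch.

```python
def warm_cache(cache, database, hot_keys):
    """Copy the given hot keys from the database into the cache; returns how many were loaded."""
    loaded = 0
    for key in hot_keys:
        if key in database:
            cache[key] = database[key]
            loaded += 1
    return loaded


database = {"item:1": "phone", "item:2": "laptop", "item:3": "cable"}
cache = {}
# Called once at startup, before user traffic arrives.
loaded = warm_cache(cache, database, hot_keys=["item:1", "item:2", "item:404"])
```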
Cache degradation
Cache degradation means that when the cache becomes invalid or the cache server goes down, the system does not access the database. Instead, it returns default data or reads data held in the service's own memory.
In actual project practice, it is common to cache some hot data in the memory of the service. In this way, once the cache is abnormal, the service memory data can be directly used, thus avoiding huge pressure on the database.
Degradation is generally lossy, so minimize the impact of degradation on the business.
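A minimal sketch of the fallback path: CacheUnavailable is a hypothetical exception standing in for whatever error a real cache client raises when the server is down, and local_hot_data represents the hot data kept in the service's own memory.

```python
class CacheUnavailable(Exception):
    """Hypothetical error raised when the cache server cannot be reached."""


# Hot data kept in the memory of the service itself.
local_hot_data = {"home:banner": "in-process-banner"}


def get_with_degradation(key, cache_get, default=None):
    """Read from the cache; if the cache is down, degrade instead of hitting the database."""
    try:
        return cache_get(key)
    except CacheUnavailable:
        # Degraded path: local memory first, then the default value.
        return local_hot_data.get(key, default)


def broken_cache_get(key):
    raise CacheUnavailable("cache server is down")


banner = get_with_degradation("home:banner", broken_cache_get)
footer = get_with_degradation("home:footer", broken_cache_get, default="default-footer")
```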
Highly available architecture
Replication (master/slave Replication)
What is master-slave replication?
Primary/secondary replication refers to the replication of data from one Redis server to other Redis servers. The former is called the master node and the latter is called the slave node. The replication of data is one-way and can only go from the master node to the slave node.
The role of master-slave replication
- Data redundancy: Master/slave replication implements hot backup of data and is a data redundancy method other than persistence.
- Fault recovery: When the primary node is faulty, the secondary node provides services for rapid fault recovery. It’s actually redundancy of services.
- Load balancing: On the basis of master/slave replication and read/write separation, the master node provides the write service and the slave node provides the read service to share server load. Especially in the scenario of less write and more read, the concurrency of the Redis server can be greatly increased by sharing the read load with multiple slave nodes.
- Foundation of high availability: master-slave replication is the basis on which Sentinel and Cluster are implemented, so it is the foundation of high availability for Redis.
Implementation principle of master/slave replication
The master-slave replication process can be divided into three stages: connection establishment, data synchronization, and command transmission.
Connection setup phase
The main purpose of this phase is to establish a connection between the primary and secondary nodes to prepare for data synchronization.
Step 1: Save the information about the primary node
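The slaveof command is asynchronous. When the slaveof command is run on the slave node, the slave node saves the IP address and port of the master node and immediately returns OK to the client; the actual replication starts afterwards.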
Step 2: Establish a socket connection
On the slave node, the replication timer function replicationCron() is invoked once every second. If it finds a master node to connect to, it creates a socket connection based on the master node's IP address and port.
The slave node establishes a file event handler for the socket to handle the replication work, and is responsible for the subsequent replication work, such as receiving RDB files and receiving command transmission.
After the master node receives the socket connection from the slave node (that is, after accept), it creates the corresponding client state for the socket and treats the slave node as a client connected to the master node. The following steps take the form of a command request from the slave node to the master node.
Step 3: Send the ping command
After the slave node becomes a client of the master node, its first request is the ping command. The purpose is to check whether the socket connection is available and whether the master node can currently process requests.
After the slave node sends the ping command, one of the following situations may occur:
(1) pong is returned: the socket connection is normal and the master node can currently process requests. The replication process continues.
(2) Timeout: the slave node does not receive a reply from the master node within a certain period of time, which indicates that the socket connection is unavailable. The slave node closes the socket and reconnects.
(3) A result other than pong is returned: if the master node returns something else, for example because it is busy processing a script that has timed out, the master node cannot process commands at present. The slave node closes the socket and reconnects.
Step 4: Authentication
If the masterauth option is set on the slave node, the slave node needs to authenticate to the master node; if this option is not set, no authentication is required. The slave node authenticates by sending the auth command to the master node, with the value of masterauth from the configuration file as its parameter.
If the password state on the master node is consistent with masterauth on the slave node (consistent means that both are set and the passwords are the same, or that neither is set), authentication succeeds and replication continues. Otherwise, the slave node closes the socket and reconnects.
Step 5: Send the port information of the secondary node
After authentication, the slave node sends the port number it listens on to the master node (6380 in the previous example). The master node saves this information in the slave_listening_port field of the client that corresponds to the slave node. This port information has no function other than being displayed when info replication is run on the master node.
Data synchronization phase
After the connection between the primary and secondary nodes is established, data synchronization can start. This phase can be understood as the initialization of data on the secondary node. To perform this operation, the secondary node sends the psync command to the primary node to start the synchronization.
Data synchronization is the core phase of the primary/secondary replication. Based on the status of the primary/secondary nodes, data synchronization can be divided into full replication and partial replication. The two replication modes and the execution process of the psync command will be explained later.
Command propagation phase
After the data synchronization phase is complete, the primary and secondary nodes enter the command transmission phase. In this phase, the master node sends the write command to the slave node, and the slave node receives and executes the command to ensure data consistency between the master and slave nodes.
It should be noted that command propagation is an asynchronous process, that is, the master node does not wait for a reply from the slave node after sending a write command. Therefore, it is difficult to maintain real-time consistency between master and slave nodes and delay is inevitable. The extent of data inconsistency depends on the network status between the primary and secondary nodes, the execution frequency of write commands on the primary node, and the repl-disable-tcp-nodelay configuration on the primary node.
Sentinel (Sentinel mode)
Why Sentinel mode?
In the master-slave replication mode of Redis, once the master node fails to provide services, the slave node needs to be manually promoted to the master node and the client needs to update the address of the master node, which is unacceptable to some extent.
Redis 2.8 provides the Redis Sentinel mechanism to solve this problem.
What is Sentinel mode?
Redis Sentinel is the high-availability solution for Redis. Sentinel is a tool that manages multiple Redis instances and provides monitoring, notification, and automatic failover.
The Redis Sentinel architecture diagram is as follows:
The principle of sentinel mode
The main function of sentinel mode is that it can automatically complete failure discovery and failover, and notify clients, thus achieving high availability. The Sentinel pattern usually consists of a set of Sentinel nodes and a set (or groups) of master/slave replication nodes.
Heartbeat
(1) Sentinel and Redis Node
Redis Sentinel is a special Redis node. When the Sentinel mode is created, the relationship between Sentinel and Redis Master Node needs to be specified through the configuration. Then Sentinel will obtain the information of all slave nodes from the Master Node. Then Sentinel periodically sends the info command to master and slave nodes to obtain their topology and status information.
(2) Sentinel and Sentinel
Based on the publish/subscribe feature of Redis, each Sentinel node publishes to the __sentinel__:hello channel on the master node its own judgment of the master node together with information about itself. At the same time, each Sentinel node subscribes to this channel to obtain information about the other Sentinel nodes and their judgments of the master node.
Through the above two steps, all Sentinel nodes become aware of each other and of all Redis nodes. Each Sentinel node then periodically sends a ping command to the master node, the slave nodes, and the other Sentinel nodes as a heartbeat check to verify that these nodes are reachable.
Failover
Each Sentinel performs heartbeat checks periodically. When the heartbeat check of the master node times out, that Sentinel considers the master node unavailable. This judgment is called subjective offline.
The Sentinel node then asks the other Sentinel nodes for their judgment of the master node through the sentinel is-master-down-by-addr command. When the number of Sentinel nodes that consider the master faulty reaches the quorum, the master is marked as objectively offline, that is, the node is agreed to be unavailable. This also explains why a set of Sentinel nodes is necessary: a single Sentinel node can easily misjudge the failure state.
The quorum value is specified when Sentinel mode is set up, as explained later. It is usually the total number of Sentinel nodes / 2 + 1, that is, the master can be marked objectively offline once more than half of the nodes have judged it subjectively offline.
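The arithmetic behind the usual quorum choice is easy to check. The two helper functions below are hypothetical illustrations of the rule, not part of any Redis API.

```python
def default_quorum(total_sentinels):
    """The usual choice described above: total number of Sentinel nodes / 2 + 1."""
    return total_sentinels // 2 + 1


def is_objectively_down(subjective_down_votes, quorum):
    """The master is objectively offline once enough Sentinels judge it subjectively offline."""
    return subjective_down_votes >= quorum


quorum_for_three = default_quorum(3)  # 3 // 2 + 1 = 2
quorum_for_five = default_quorum(5)   # 5 // 2 + 1 = 3
```

With three Sentinel nodes, a single node's subjective judgment is not enough, but two agreeing nodes are.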
Since only one Sentinel node is required to complete the failover, an election is made between the Sentinel nodes to select a Sentinel leader based on Raft algorithm to perform the failover.
The steps the elected Sentinel leader takes to perform the failover are as follows:
(1) Select a node from the slave node list as the new master node
- Filter out nodes that are unhealthy or do not meet the requirements;
- Select the slave node with the highest priority (the lowest non-zero slave-priority value). If there is a single such node, return it; otherwise continue.
- Select the slave node with the largest replication offset. If there is a single such node, return it; otherwise continue.
- Select the slave node with the smallest run ID.
(2) The Sentinel leader executes the slaveof no one command on the selected slave node to make it the master node.
(3) The Sentinel leader sends commands to the remaining slave nodes so that they replicate from the new master node.
(4) The Sentinel leader marks the original master node as a slave node, keeps monitoring it, and orders it to replicate from the new master node once it recovers.
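The selection rules in step (1) amount to a filter followed by a three-level comparison. The sketch below is a simplified model: the Replica class and its fields are hypothetical, "healthy" is reduced to a single flag, and a slave-priority of 0 is treated as "never promote", in line with the Redis configuration semantics where a lower non-zero value means a higher priority.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    run_id: str
    priority: int        # slave-priority: lower wins, 0 means never promote
    repl_offset: int     # replication offset: larger means more up-to-date data
    healthy: bool = True


def select_new_master(replicas):
    """Apply the rules in order: filter, best priority, largest offset, smallest run ID."""
    candidates = [r for r in replicas if r.healthy and r.priority > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica("c", 100, 500),
    Replica("b", 100, 900),
    Replica("a", 100, 900),
    Replica("z", 0, 9999),                # excluded: priority 0
    Replica("y", 50, 10, healthy=False),  # excluded: unhealthy
]
chosen = select_new_master(replicas)
```

Here the three remaining candidates share the same priority, two of them tie on the replication offset, and the run ID breaks the tie in favor of "a".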
Cluster (Cluster)
Why Cluster?
In both master/slave mode and Sentinel mode, only one master can write data. In the scenario of massive data and high concurrency, data writing on one node is prone to bottlenecks. Cluster mode enables multiple nodes to write data at the same time.
What is Cluster mode?
Redis Cluster adopts a decentralized structure. Each node stores data and is connected to every other node, so each node knows the state of the whole cluster.
As shown in the figure, Cluster mode is actually a combination of multiple master-slave replication structures. Each master-slave replication structure can be regarded as a node, so there are three nodes in the Cluster above.
Principle of Cluster mode
Redis cluster TCP port
Each Redis cluster node needs two open TCP ports. One serves normal Redis clients, usually 6379. The other is used for communication between cluster nodes and is the client port plus an offset of 10000, for example 16379.
The second port is used for the cluster bus, a node-to-node communication channel that uses a binary protocol. Nodes use the cluster bus for failure detection, configuration updates, failover authorization, and so on. Clients should never try to communicate with the cluster bus port and should always use the normal Redis command port. Make sure both ports are open in the firewall, or the Redis cluster nodes will not be able to communicate.
Redis cluster data sharding
Common Application Scenarios
todo
Practical article
Building a Redis master-slave replication cluster with Docker
0. Goal
Set up three Redis instances locally (one master and two replicas) so that data written to the master instance is replicated to the replica instances.
1. Install docker and run Docker
The Docker installation steps are omitted; you can download and install it from the official website.
Check whether docker is running successfully:
docker info
If the command prints its output normally, Docker is running and you can go to the next step.
2. Pull the Redis image file
Run the following command to pull an official Redis image whose tag is latest by default
docker pull redis
3. Prepare the redis configuration file redis.conf
Download address: raw.githubusercontent.com/antirez/red…
Make three copies of the file, named redis01.conf, redis02.conf, and redis03.conf.
Open all configuration files and modify the following configuration items:
- Comment out the option that listens only on the local address so that remote connections are allowed: # bind 127.0.0.1
- Turn off protected mode: protected-mode no
- Turn on AOF persistence: appendonly yes
4. Start the Redis instance
# Instance 1
docker run -p 6381:6379 --name redis-server-01 -v /your/path/redis/conf/redis01.conf:/etc/redis/redis.conf -v /your/path/redis/data01:/data -d redis redis-server /etc/redis/redis.conf
# Instance 2
docker run -p 6382:6379 --name redis-server-02 -v /your/path/redis/conf/redis02.conf:/etc/redis/redis.conf -v /your/path/redis/data02:/data -d redis redis-server /etc/redis/redis.conf
# Instance 3
docker run -p 6383:6379 --name redis-server-03 -v /your/path/redis/conf/redis03.conf:/etc/redis/redis.conf -v /your/path/redis/data03:/data -d redis redis-server /etc/redis/redis.conf
A brief explanation of the above commands:
- The -p 6381:6379 parameter maps ports: 6381 is the host port and 6379 is the container port, so the container port is mapped to the host port.
- The --name redis-server-01 parameter gives the container instance a name.
- The -v /your/path/redis/conf/redis01.conf:/etc/redis/redis.conf parameter mounts the configuration file: the path before the colon is the configuration file on the host, and the path after the colon is the configuration file path inside the container.
- The -v /your/path/redis/data01:/data parameter works the same way, for the data directory.
- The -d option runs the instance in the background.
- redis-server /etc/redis/redis.conf runs the redis-server command and starts the Redis instance with the configuration file /etc/redis/redis.conf. Note that the configuration file must be the first parameter of the redis-server command.
5. Configure the primary and secondary replication clusters
Check the instance health:
docker ps
If the command output lists three Redis instances, everything is normal. Next, query the internal IP address of redis-server-01:
docker inspect redis-server-01
The command output contains "IPAddress": "172.17.0.4". Instance 1 is used as the master, and the other two instances are configured as replicas. Setting the IP address and port of the master in each replica's configuration file enables master-slave replication.
Open redis02.conf and redis03.conf, find the replicaof option (slaveof before Redis 5.0), and change it to:
replicaof 172.17.0.4 6379
After modification, restart instances 2 and 3:
docker restart redis-server-02
docker restart redis-server-03
Check whether instance 1 has the master role and has the two replicas attached:
docker exec -it redis-server-01 redis-cli
127.0.0.1:6379> info
Master-slave replication is configured successfully if the following information is displayed:
# Replication
role:master
connected_slaves:2
slave0:ip=172.17.0.3,port=6379,state=online,offset=84,lag=1
slave1:ip=172.17.0.2,port=6379,state=online,offset=84,lag=1
6. Test the primary/secondary replication effect
Insert a record into Redis instance 1:
docker exec -it redis-server-01 redis-cli   # connect to instance 1
127.0.0.1:6379> set name ray
Connect to Redis instance 2 and instance 3 to check whether the replication succeeded:
docker exec -it redis-server-02 redis-cli   # connect to instance 2
127.0.0.1:6379> get name
"ray"
At this point the Redis master-slave replication instances have been built and tested. Did you follow along?
Building Redis master-slave replication plus Sentinel mode with Docker
Set up three Sentinel instances in addition to the Redis instances
0. Sentinel functions
- Monitoring: Sentinel continuously checks whether the master and slave nodes are working properly.
- Automatic failover: when the master node does not work properly, Sentinel starts an automatic failover, promoting one of the failed master's slave nodes to be the new master and making the other slave nodes replicate from the new master.
- Configuration provider: during initialization, the client connects to Sentinel to obtain the address of the current master node of the Redis service.
- Notification: Sentinel can send the result of a failover to the client.
1. Prepare the Sentinel configuration file sentinel.conf
IO /download the sentinel.conf file
Make three copies, such as sentinel01.conf, sentinel02.conf, and sentinel03.conf
Open all configuration files and modify the following configuration items:
- Turn off the protection mode protected-mode no
- Configure the master to monitor: sentinel monitor mymaster 172.17.0.4 6379 1
- Configure the log file: logfile "sentinel.log"
2. Start the Sentinel instance
# Instance 1
docker run -p 26381:26379 -v /your/path/sentinel01.conf:/etc/redis/sentinel.conf -v /your/path/sentinel-data01:/data --name sentinel-01 -d redis redis-sentinel /etc/redis/sentinel.conf
# Instance 2
docker run -p 26382:26379 -v /your/path/sentinel02.conf:/etc/redis/sentinel.conf -v /your/path/sentinel-data02:/data --name sentinel-02 -d redis redis-sentinel /etc/redis/sentinel.conf
# Instance 3
docker run -p 26383:26379 -v /your/path/sentinel03.conf:/etc/redis/sentinel.conf -v /your/path/sentinel-data03:/data --name sentinel-03 -d redis redis-sentinel /etc/redis/sentinel.conf
3. Test Sentinel mode
Connect to Sentinel instance 1 to query the current status:
docker exec -it sentinel-01 redis-cli -p 26379
127.0.0.1:26379> info sentinel
The following command output is displayed:
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.17.0.4:6379,slaves=2,sentinels=3
The command output shows that the IP address of the master redis-server instance is 172.17.0.4. If the master instance is stopped, the Sentinel cluster promotes one of the two replica instances to be the new master. Stop the master instance:
docker stop redis-server-01
Connect to Sentinel instance 1 again to check whether the IP address of the current master has changed:
docker exec -it sentinel-01 redis-cli -p 26379
127.0.0.1:26379> info sentinel
The following command output is displayed:
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.17.0.3:6379,slaves=2,sentinels=3
The command output shows that the master IP address has changed to 172.17.0.3. The Sentinel mode test is complete.
— END —
Daily request for likes: hello, fellow technologists, your likes are my motivation to keep going, and the next installment will be even better.
The blogger graduated from Huazhong University of Science and Technology with a master's degree. He is a programmer who pursues technology and has a passion for life. He has worked at several first-tier Internet companies and has years of hands-on experience.
Search for the official WeChat account "Architect who loves to laugh". I have technology and stories, waiting for you.
The articles are constantly updated. You can find my archived series of articles, including interview experience and technical material, at github/JavaMap. Stars are welcome.