Lifting the veil: seven questions in a row for Redis

Hello Redis, I have a few questions for you

Hello, Redis! We have known each other for many years, going from a vague acquaintance to working closely together, and I have always remembered your kindness. May I ask you a few more questions so that I can get to know you better?

1. What is the communication protocol of Redis?

Redis uses a text protocol: the server and client communicate through RESP (Redis Serialization Protocol). A text protocol does waste some traffic, but it is intuitive, very simple, and fast to parse. No special Redis client is needed; Telnet or any plain text stream is enough to talk to Redis.

RESP data types, each identified by its first byte:

Simple Strings start with a plus sign “+”
Errors start with a minus sign “-”
Integers start with a colon “:”
Bulk Strings start with a dollar sign “$”
Arrays start with an asterisk “*”

A simple text stream such as set hello abc is enough to act as a Redis client.
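To make the type prefixes concrete, here is a minimal Python sketch of how a command could be framed as a RESP array of bulk strings and how a simple reply could be decoded. The function names encode_command and decode_reply are hypothetical helpers for illustration, not part of any Redis client library, and nested arrays are left out for brevity.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array ("*") of bulk strings ("$")."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def decode_reply(raw: bytes):
    """Decode one non-nested RESP reply based on its first byte."""
    prefix, body = raw[:1], raw[1:]
    line, _, rest = body.partition(b"\r\n")
    if prefix == b"+":                      # simple string
        return line.decode()
    if prefix == b"-":                      # error
        raise RuntimeError(line.decode())
    if prefix == b":":                      # integer
        return int(line.decode())
    if prefix == b"$":                      # bulk string, -1 means nil
        length = int(line.decode())
        return None if length == -1 else rest[:length].decode()
    raise ValueError(f"unsupported reply type: {prefix!r}")


# The bytes a client would put on the wire for "set hello abc".
print(encode_command("set", "hello", "abc"))
print(decode_reply(b"+OK\r\n"))
```

The encoded command is plain, human-readable text apart from the CRLF separators, which is why typing into Telnet works at all.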

Simple summary

IO /topics/prot… The Redis documentation holds that being simple to implement, fast to parse, and human readable are the main reasons for using the RESP text protocol. A text protocol may waste some traffic, but it is fast and simple to work with, which is a deliberate tradeoff.

2. Does Redis have ACID transactions?

To find out whether Redis has transactions, simply check the documentation on the official Redis website. Below, each ACID property is examined in turn.

Atomicity

Atomicity means that the database executes the multiple operations in a transaction as a whole: the server either performs all of the operations in the transaction or none of them.

Transaction queue

When a client starts a transaction with the multi command, Redis creates a queue for that transaction. Each subsequent command is appended to the queue in order and is not executed immediately. Only when the exec command commits the transaction are all queued commands executed, in one go and exclusively.

When a transaction executes successfully, the commands in it are executed in queue order and exclusively. But atomicity also means all or nothing, which is what traditional databases call rollback.

When we execute a failed transaction:

Even if a command fails in the middle, the set abc x operation that already ran is not rolled back, so Redis transactions are not atomic in the strict sense.
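The queue-then-execute behavior and the lack of rollback can be sketched with a toy in-memory model. MiniRedis and its call method are hypothetical names invented for this illustration; this is not how a real client talks to Redis, and it does not model errors that Redis detects while commands are still being queued.

```python
class MiniRedis:
    """Toy model of MULTI/EXEC queuing; an illustration, not real Redis."""

    def __init__(self):
        self.data = {}
        self.queue = None  # None means no transaction is open

    def multi(self):
        self.queue = []

    def call(self, name, *args):
        # Inside a transaction, commands are only queued.
        if self.queue is not None:
            self.queue.append((name, args))
            return "QUEUED"
        return self._run(name, args)

    def exec(self):
        if self.queue is None:
            raise RuntimeError("EXEC without MULTI")
        queued, self.queue = self.queue, None
        results = []
        for name, args in queued:
            try:
                results.append(self._run(name, args))
            except ValueError as err:
                # A failing command is reported, but earlier writes stay.
                results.append(f"ERR {err}")
        return results

    def _run(self, name, args):
        if name == "set":
            key, value = args
            self.data[key] = value
            return "OK"
        if name == "incr":
            (key,) = args
            value = int(self.data.get(key, "0"))  # ValueError if not an integer
            self.data[key] = str(value + 1)
            return value + 1
        raise ValueError(f"unknown command {name}")


r = MiniRedis()
r.multi()
r.call("set", "abc", "x")
r.call("incr", "abc")      # will fail at exec time: "x" is not an integer
r.call("set", "def", "y")
print(r.exec())
print(r.data)              # abc keeps the value "x": nothing was rolled back
```

The second command fails, yet the first and third take effect, which is the all-or-nothing gap described above.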

Why does Redis not support rollback?

MySQL writes logs that record its operations before a transaction commits (the redo log and binlog, plus the undo log that InnoDB uses to roll back), and these logs are what make rollback and recovery possible.

That ability to roll back costs MySQL a great deal. Redis is mostly used in high-concurrency, high-performance scenarios, so choosing a simpler, faster way of handling transactions without rollback fits those scenarios.

Consistency

Transaction consistency means that if the database is consistent before the transaction is executed, it should be consistent after the transaction is executed, regardless of whether the transaction succeeds or not.

For Redis, consistency can be examined at two levels: whether execution errors preserve consistency, and whether Redis has a mechanism to preserve consistency when the server goes down.

Do execution errors preserve consistency?

A transaction that hits an error still runs to completion: the error is detected and handled while the transaction executes. A command that fails makes no change to the database, so these errors do not affect the consistency of the transaction.

Impact of downtime on consistency

Leaving distributed high-availability Redis setups aside, let’s first see whether recovery after downtime on a single machine satisfies data integrity constraints.

With either RDB or AOF persistence, the RDB file or AOF file can be used to restore the data to a consistent state.

Reconsidered consistency ❓❓

The view above on how execution errors and downtime affect consistency comes from Huang Jianhong’s Redis Design and Implementation. While reading that chapter I still had some doubts. After all, Redis is not a relational database. If consistency means going from state A to state B through a transaction without violating any constraints, then Redis on its own, without considering the business built on it, clearly satisfies it.

But once business logic is brought in, for example A transfers 10 yuan to B, so A decreases by 10 yuan and B increases by 10 yuan, Redis has no rollback and therefore no atomicity in the traditional sense, so it cannot offer consistency in the traditional sense either.

This is only a brief discussion of how the traditional ACID concepts map onto Redis. Perhaps I am overthinking it: auditing Redis with the ACID criteria of a traditional relational database is not very meaningful, because Redis never intended to implement ACID transactions.

Isolation

Isolation refers to the fact that multiple transactions are executed concurrently in the database without affecting each other, and that transactions executed concurrently and sequentially produce exactly the same results.

Because Redis executes commands on a single thread, it is isolated by nature. While Redis executes a transaction, the server guarantees that the transaction will not be interrupted. Redis transactions therefore always run serially, which gives them isolation.

Durability

Transaction durability means that when a transaction is completed, the results of the transaction are kept in persistent storage and will not be lost even if the server goes down after the transaction is completed.

Whether Redis is durable depends on its persistence mode:

Pure in-memory operation with no persistence: once the server goes down, all data is lost
RDB mode: bgsave runs only when the configured RDB policy is met, and it runs asynchronously, so durability is not guaranteed
AOF mode: Redis is durable only when appendfsync is set to always

(With appendfsync set to always, durability is possible in theory, but this setting is rarely used in practice.)

Simple summary

Redis has a degree of atomicity, but does not support rollback
Redis does not have the ACID concept of consistency (or at least ignores it in its design)
Redis has isolation
Redis can provide durability under certain persistence settings

This look at Redis through ACID is purely from a user’s point of view. The design of Redis pursues simplicity and high performance and is not bound by traditional ACID.

3. How is the Redis optimistic lock Watch implemented?

When we think of optimistic locking, we think of CAS (Compare And Set). A CAS operation takes three operands: the value at a memory location (V), the expected old value (A), and the new value (B). If the value at the memory location matches the expected old value, the processor atomically updates the location to the new value. Otherwise, the processor does nothing.

Watch is implemented as part of Redis transactions. Before a transaction starts, Watch begins tracking one or more keys. When the transaction is executed, that is, when the server receives the exec command and is about to run the queued commands in order, Redis checks whether any watched key has been modified since Watch was issued.

Java’s AtomicXXX optimistic locking mechanism

In Java we often use optimistic-locking classes such as AtomicXXX. How are they implemented, and does Redis use the same CAS mechanism as Java? Looking at the source of Java’s Atomic classes first, we can see that behind them is Unsafe_CompareAndSwapObject.

compareAndSwapObject is a native method. To dig further, download the source code or browse hg.openjdk.java.net/jdk8u/

cmpxchg

Tracing CAS to the end, “compare and modify” sounds like two steps, but it is completed by a single CPU instruction, CMPXCHG. Because it is one instruction rather than several, it cannot be interrupted by thread scheduling, which guarantees that the CAS operation is atomic. Of course, CAS based on CMPXCHG still has the ABA problem and may need multiple retries, which are not discussed here.

Redis watch mechanism

Does Redis Watch also use CMPXCHG? The usage is similar, but there are differences: Redis Watch has no ABA problem and no retry mechanism. The most significant difference is:

Redis transactions actually execute serially. Excerpts from the source code can be messy to follow, so it helps to summarize the data structure and the flow first; the source is much clearer afterwards.

Storage

RedisDb holds a watched_keys dict. The value for each watched key is a linked list of the Redis clients watching that key.

Process

The watched_keys structure is consulted whenever watch, multi, or exec is executed. Whenever a command touches (modifies) a watched key, the clients watching it are flagged CLIENT_DIRTY_CAS.

Because all transactions in Redis are serial, suppose clients A and B both watch the same key. When client A modifies the key, or A’s transaction finishes executing first, client A is removed from the key’s list in watched_keys and all clients remaining in the list are flagged CLIENT_DIRTY_CAS. When client B then starts executing, it finds its state is CLIENT_DIRTY_CAS and calls discardTransaction to abort the transaction.
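The A and B scenario can be sketched in Python as a rough model of the watched_keys bookkeeping. The classes Client and Db and the dirty_cas attribute are hypothetical stand-ins for illustration (dirty_cas plays the role of the CLIENT_DIRTY_CAS flag); the real implementation is C code inside the Redis server.

```python
from collections import defaultdict


class Client:
    def __init__(self, name):
        self.name = name
        self.dirty_cas = False  # stands in for the CLIENT_DIRTY_CAS flag
        self.queue = []         # queued (key, value) writes


class Db:
    def __init__(self):
        self.data = {}
        self.watched_keys = defaultdict(list)  # key -> clients watching it

    def watch(self, client, key):
        if client not in self.watched_keys[key]:
            self.watched_keys[key].append(client)

    def unwatch_all(self, client):
        for key in list(self.watched_keys):
            watchers = self.watched_keys[key]
            if client in watchers:
                watchers.remove(client)
            if not watchers:
                del self.watched_keys[key]

    def set(self, key, value):
        self.data[key] = value
        # "Touch" the key: flag every client still watching it.
        for watcher in self.watched_keys.get(key, []):
            watcher.dirty_cas = True

    def exec(self, client):
        queued, client.queue = client.queue, []
        aborted = client.dirty_cas
        self.unwatch_all(client)   # the executing client stops watching first
        client.dirty_cas = False
        if aborted:
            return None            # transaction discarded
        for key, value in queued:
            self.set(key, value)
        return ["OK"] * len(queued)


db = Db()
a, b = Client("A"), Client("B")
db.set("stock", "10")
db.watch(a, "stock")
db.watch(b, "stock")
a.queue.append(("stock", "9"))
b.queue.append(("stock", "8"))
print(db.exec(a))   # A runs first and flags B as dirty
print(db.exec(b))   # B sees its dirty flag and is discarded
```

No retry happens on B's side; the caller decides whether to watch again and resubmit, which is one way this differs from a CAS retry loop.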

Simple summary

CAS relies on the CPU: what looks like two operations is completed by the single instruction CMPXCHG, so it cannot be interrupted by other threads. The Watch mechanism of Redis instead relies on Redis’s own single-threaded execution, using the watched_keys data structure and serial processing to implement optimistic locking.

4. How does Redis persist data?

Redis has two persistence mechanisms. One is RDB, also known as snapshotting. A snapshot is a full backup that serializes all in-memory Redis data in binary form and stores it on disk. The other is the AOF log, which records the commands that modify data. The AOF log is similar to MySQL’s binlog, and it only keeps growing over time.

When Redis restores, an RDB snapshot can be loaded directly from disk, whereas AOF has to replay every recorded command, which may take a very long time.

RDB

Redis can generate an RDB snapshot in two ways. One is save: because Redis is single-process and single-threaded, save performs a large amount of file IO in the main thread and is bound to block online traffic, so save is generally not used directly; bgsave is used instead. With bgsave, Redis forks a child process to write the snapshot while the parent process continues to handle online requests.

The fork mechanism

To understand how RDB snapshots are generated, fork must be understood first. fork is the process-creation mechanism of the Linux operating system. When the parent process forks a child, the child initially shares the parent’s memory: right after it is created, it shares the code and data segments with the parent.

At first both processes map the same memory pages. During persistence the child process does not modify the in-memory data; pages of the data segment are separated by COW (copy on write). When the parent process modifies data, the shared page is copied first, and the parent then makes its change on the new copy.

Splitting

This process is also called splitting. The parent and child processes point to many of the same memory pages, but when the parent changes one of them, the page is copied and split off, and the change is made on the new copy.

Because the child’s view of memory is fixed at the moment of the fork, the data it sees does not change, so the snapshot can be generated safely without being affected by requests the parent handles. One can also imagine that if Redis does nothing during bgsave, that is, the parent receives no requests and runs no background work such as expiring keys, the parent and child keep using the same memory pages throughout.

AOF

AOF is the log of Redis commands, similar to MySQL’s binlog. If AOF has been enabled since the Redis instance was created, it records every write command ever executed, and the whole instance can be rebuilt by replaying the AOF. However, AOF has two major problems: the log grows over time, so for an instance that runs for a long time with a lot of data the AOF becomes extremely large, and replaying such a log takes correspondingly long.

AOF entries are written after Redis has executed the command, and are flushed to disk according to the configured policy. This is very different from MySQL’s redo log and binlog. Because Redis logs an operation only after executing it, this is one of the reasons Redis cannot roll back.

bgrewriteaof

After 2.4, Redis also uses bgrewriteaof to slim down the AOF. The bgrewriteaof command asynchronously rewrites the AOF file, creating a size-optimized version of the current AOF file.
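The idea behind the rewrite can be sketched as follows: many historical commands collapse into the few commands needed to recreate the final state. This is a simplified model with hypothetical names (replay, rewrite) and a made-up tuple format for log entries. For self-containment the sketch derives the final state by replaying the old log, whereas Redis itself builds the new file from the data currently in memory.

```python
def replay(log):
    """Rebuild the dataset by replaying (command, key, value) entries in order."""
    data = {}
    for command, key, value in log:
        if command == "set":
            data[key] = value
        elif command == "incr":
            data[key] = str(int(data.get(key, "0")) + 1)
        elif command == "del":
            data.pop(key, None)
    return data


def rewrite(log):
    """Produce a shorter log with the same final state: one SET per live key."""
    return [("set", key, value) for key, value in replay(log).items()]


aof = [
    ("set", "counter", "0"),
    ("incr", "counter", None),
    ("incr", "counter", None),
    ("set", "tmp", "x"),
    ("del", "tmp", None),
]
compact = rewrite(aof)
print(len(aof), "entries before,", len(compact), "after")
print(replay(compact) == replay(aof))  # same final state
```

Five entries shrink to one here because the deleted key and the intermediate counter values no longer need to be recorded.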

RDB and AOF hybrid mode

If we restore Redis from RDB, a lot of data may be lost, depending on the bgsave policy. If we use AOF and recover by replaying the log, replay takes much longer than loading an RDB.

To solve this, Redis 4.0 introduced a new persistence mode, hybrid persistence, which combines an RDB file with an incremental AOF. RDB snapshots can then be taken at long intervals, and the AOF no longer has to be a full log: it only needs to hold the commands issued since the last RDB snapshot, which is usually a small amount of log.

5. How does Redis economize on memory usage?

Redis differs from traditional databases: it is a purely in-memory database that stores a number of data structures. If memory is not kept under control, a large data set can easily bring the system down.

ziplist

127.0.0.1:6379> hset hash_test abc 1
(integer) 1
127.0.0.1:6379> object encoding hash_test
"ziplist"
127.0.0.1:6379> zadd z_test 10 key
(integer) 1
127.0.0.1:6379> object encoding z_test
"ziplist"

When I first created a hash and a zset with a small amount of data, I found that their actual encoding inside Redis was ziplist. A ziplist is a compact data structure whose elements are stored contiguously in memory. When a data structure holds only a little data, Redis uses this compact storage to save space.

In the example above, for instance, we stored data in a hash. A hash table is a two-dimensional structure that typically trades space for time. But when the amount of data is small, a two-dimensional structure wastes space without gaining much in speed, so it is better to store the data in a one-dimensional structure. Lookup is then O(n), but with so little data a full traversal is still very fast, and can even be faster than a hash lookup.

A small object stored this way is upgraded to the standard structure when the number of elements in the collection grows or when a single value becomes too large. The thresholds between the compact and the standard structure can be configured:

# A hash with more than 512 elements must be stored in the standard structure
hash-max-ziplist-entries 512
# A hash with any key or value longer than 64 bytes must be stored in the standard structure
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
set-max-intset-entries 512
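How the two hash thresholds interact can be sketched as a small decision function. The function hash_encoding and the constant names are hypothetical and only mirror the hash-max-ziplist-entries and hash-max-ziplist-value settings listed above; the sketch decides an encoding from scratch and does not model how Redis converts an existing object.

```python
HASH_MAX_ZIPLIST_ENTRIES = 512   # mirrors hash-max-ziplist-entries
HASH_MAX_ZIPLIST_VALUE = 64      # mirrors hash-max-ziplist-value (bytes)


def hash_encoding(fields: dict) -> str:
    """Return the encoding a hash with these fields would need under the limits."""
    if len(fields) > HASH_MAX_ZIPLIST_ENTRIES:
        return "hashtable"
    for key, value in fields.items():
        too_long = max(len(key.encode()), len(value.encode())) > HASH_MAX_ZIPLIST_VALUE
        if too_long:
            return "hashtable"
    return "ziplist"


print(hash_encoding({"abc": "1"}))                 # small hash stays compact
print(hash_encoding({"abc": "x" * 65}))            # one long value forces the standard structure
print(hash_encoding({f"k{i}": "1" for i in range(513)}))
```

Either condition alone is enough to leave the compact encoding: too many entries, or a single key or value that is too long.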

quicklist

127.0.0.1:6379> rpush key v1
(integer) 1
127.0.0.1:6379> object encoding key
"quicklist"

The quicklist is a doubly linked list introduced in Redis 3.2. More precisely, it is a doubly linked list of ziplists: each node of the quicklist is a ziplist, and a ziplist is itself a compact list. If a quicklist has 5 ziplist nodes and each ziplist holds 5 items, then from the outside the quicklist holds 25 items.

The structure of QuickList can be summarized as a compromise between space and time:

A doubly linked list supports push and pop at both ends, but every node stores two pointers in addition to its data, which adds memory overhead. Moreover, since each node is a separate allocation and nodes are not contiguous in memory, a large number of nodes easily causes memory fragmentation.
A ziplist is one contiguous block of memory, so storage and lookup are very efficient, but it is poorly suited to modification: every change triggers a realloc, and for a long ziplist one realloc can mean copying a large amount of data.

So the quicklist was born, combining the advantages of the ziplist and the doubly linked list.
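The shape of the structure can be sketched with a toy model in which small Python lists stand in for ziplists and a deque stands in for the doubly linked list. The class QuickList and its node_capacity parameter are hypothetical; real quicklist nodes are sized by configuration and hold packed bytes rather than Python objects.

```python
from collections import deque


class QuickList:
    """Toy quicklist: a deque of small lists, each standing in for a ziplist."""

    def __init__(self, node_capacity=5):
        self.node_capacity = node_capacity
        self.nodes = deque()

    def rpush(self, value):
        # Open a new tail node only when the current one is full.
        if not self.nodes or len(self.nodes[-1]) >= self.node_capacity:
            self.nodes.append([])
        self.nodes[-1].append(value)

    def lpush(self, value):
        if not self.nodes or len(self.nodes[0]) >= self.node_capacity:
            self.nodes.appendleft([])
        self.nodes[0].insert(0, value)

    def lpop(self):
        if not self.nodes:
            return None
        value = self.nodes[0].pop(0)
        if not self.nodes[0]:
            self.nodes.popleft()  # drop the emptied node
        return value

    def __len__(self):
        return sum(len(node) for node in self.nodes)


ql = QuickList(node_capacity=5)
for i in range(25):
    ql.rpush(i)
print(len(ql.nodes), "nodes holding", len(ql), "items")
```

Each modification only shifts data inside one small node, while the number of linked-list nodes (and pointers) stays far below the number of items.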

Object sharing

Redis builds reference counting into its object system. It lets the program track how many references each object has, free objects at the right time, and also share objects. For example, if key A is created with a string object holding the integer value 100, and key B is then created with a value that is also the integer 100, Redis does the following:

Points the value pointer of the database key at the existing value object
Increments the reference count of the shared value object

If there are not just A and B but hundreds of keys holding the integer value 100, the Redis server needs the memory of only one string object to store data that would otherwise need hundreds of string objects.
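The two steps above can be sketched with a small reference-counting model. StringObject, Keyspace, and the shared_limit parameter are hypothetical names for illustration; the sketch assumes a pool of pre-built objects for small non-negative integers and says nothing about how Redis lays out objects in memory.

```python
class StringObject:
    def __init__(self, value):
        self.value = value
        self.refcount = 0


class Keyspace:
    def __init__(self, shared_limit=10000):
        # Assumed pool of pre-built objects for integers below shared_limit.
        self.shared = {i: StringObject(i) for i in range(shared_limit)}
        self.keys = {}

    def set(self, key, value):
        self.delete(key)
        obj = self.shared.get(value) if isinstance(value, int) else None
        if obj is None:
            obj = StringObject(value)   # not shareable: allocate a new object
        obj.refcount += 1               # step 2: bump the reference count
        self.keys[key] = obj            # step 1: point the key at the object

    def delete(self, key):
        obj = self.keys.pop(key, None)
        if obj is not None:
            obj.refcount -= 1


ks = Keyspace()
ks.set("A", 100)
ks.set("B", 100)
print(ks.keys["A"] is ks.keys["B"], ks.keys["A"].refcount)
```

Both keys end up pointing at one object whose reference count is 2, so adding more keys with the same small integer costs no additional value objects.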

6. How does Redis implement master/slave replication?

Several definitions

runid: the run ID of the server
offset: the replication offset, kept by both the master and the slave
replication backlog: the replication backlog buffer on the master

Since Redis 2.8, the psync command replaces sync for replication. psync has two modes: full resynchronization and partial resynchronization.

Full resynchronization handles the initial replication. It proceeds in the same way as sync: the master creates and sends an RDB file, then sends the slave the write commands accumulated in its buffer.
Partial resynchronization handles reconnection after a disconnect: when the slave reconnects to the master, the master can send it the write commands executed while the connection was down. Once the slave receives and executes them, its database is brought up to the master’s current state.

Full resynchronization:

The slave sends psync to the master for the first time, without a runid or offset
The master receives the request and sends its runid and offset to the slave
The master generates and saves an RDB file
The master sends the RDB file to the slave
While the RDB file is being generated and sent, new write operations are copied into the replication backlog buffer and sent from the buffer to the slave
The slave loads the data from the RDB file and updates its own data

If every network jitter or brief disconnect required a full resynchronization, the overhead would be large: bgsave time, RDB transfer time, RDB load time on the slave, and an AOF rewrite if the slave has AOF enabled. This is why partial resynchronization was implemented from Redis 2.8 on.

Partial resynchronization:

A network error occurs, and the master and slave are disconnected
The master keeps writing data to the buffer
The slave reconnects to the master
The slave sends the master runid it saved and its current offset to the master
The master checks whether the offset sent by the slave still exists in its buffer. If it does, the master replies with continue. If it does not, too much data has been missed and the buffer has already been overwritten, so a full resynchronization is needed
The master sends the buffered data starting from offset to the slave
The slave receives the data and updates its own data
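The master's decision in the steps above can be sketched as follows. The Master class, its feed method, and the simplified offset convention (the slave's offset is the number of stream bytes it has already applied) are assumptions made for this illustration, not the actual wire protocol.

```python
class Master:
    """Toy master with a fixed-size replication backlog."""

    def __init__(self, runid, backlog_size):
        self.runid = runid
        self.backlog_size = backlog_size
        self.backlog = bytearray()  # most recent bytes of the replication stream
        self.offset = 0             # total bytes ever written to the stream

    def feed(self, payload: bytes):
        """Append write traffic; the oldest bytes fall out of the backlog."""
        self.backlog += payload
        self.offset += len(payload)
        overflow = len(self.backlog) - self.backlog_size
        if overflow > 0:
            del self.backlog[:overflow]

    def psync(self, runid, offset):
        first_available = self.offset - len(self.backlog)
        if (
            runid == self.runid
            and offset is not None
            and first_available <= offset <= self.offset
        ):
            # The missing range is still buffered: partial resynchronization.
            return "CONTINUE", bytes(self.backlog[offset - first_available:])
        # Unknown runid or data already overwritten: full resynchronization.
        return "FULLRESYNC", (self.runid, self.offset)


m = Master("run-1", backlog_size=10)
m.feed(b"0123456789")              # slave is in sync at offset 10, then disconnects
m.feed(b"ABCD")                    # writes that the slave missed
print(m.psync("run-1", 10))        # short outage: only the missed bytes are sent
m.feed(b"x" * 20)                  # long outage: backlog overwritten
print(m.psync("run-1", 10))        # falls back to full resynchronization
```

The size of the backlog therefore bounds how long a slave can be disconnected and still avoid a full resynchronization.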

7. How does Redis decide its expired-key deletion policy?

When a key expires, Redis does not remove it from memory immediately; instead it removes expired keys through certain mechanisms to release memory. So when does Redis delete an expired key? There are three possible moments, which correspond to three different deletion strategies.

Timed deletion: when setting the expiration time of a key, create a timer that deletes the key as soon as the expiration time arrives.
Lazy deletion: let expired keys be, but each time a key is fetched from the keyspace, check whether it has expired and delete it if so.
Periodic deletion: every so often, the program scans the database and removes expired keys. How many expired keys are deleted is decided by the algorithm.

Timed deletion

With timed deletion, a timer is created when a key’s expiration time is set, and the key is deleted as soon as it expires. This is the most memory-friendly strategy but the least CPU-friendly. When the workload is busy and many keys expire, deleting them takes up a considerable share of CPU time, and Redis is single-threaded: when memory is not tight but CPU is, spending CPU time on deleting expired keys unrelated to the current requests hurts the server’s response time and throughput. In addition, each timer needs a time event, and time events in the Redis server are kept in an unordered linked list, so finding one costs O(n). Creating timers in large numbers to implement timed deletion would have a significant performance impact, so timed deletion is not a good strategy.

Lazy deletion

In contrast to timed deletion, lazy deletion is the most CPU-friendly strategy: the program checks a key only when it is accessed, a passive process. At the same time it is the least memory-friendly: an expired key that is never accessed is never deleted and its memory is never freed. Lazy deletion on its own is clearly not a good strategy either. Redis depends heavily on memory, and if expired keys go unaccessed for a long time they pile up as garbage, which amounts to a memory leak.

Before a command reads or writes a key, Redis calls the expireIfNeeded function to check whether that key has expired. Internally, expireIfNeeded does three things:

Checks whether the key has expired
Propagates the deletion of the expired key to the slave nodes
Deletes the expired key

Periodic deletion

Both strategies above, timed deletion and lazy deletion, have obvious drawbacks when used alone: one takes too much CPU time, the other wastes too much memory. Periodic deletion is a combination of and compromise between the two.

Periodic deletion removes expired keys at intervals and limits the duration and frequency of each run to reduce the impact on CPU time
By choosing the duration and frequency of the deletion runs sensibly, expired keys can be removed effectively without wasting much memory
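Lazy and periodic deletion can be sketched together in a small Python model. The class ExpiringDict, the method active_expire_cycle, and the default sample size, 25% threshold, and time budget are assumptions chosen for illustration, loosely inspired by the sampling idea, and not a description of Redis’s exact algorithm.

```python
import random
import time


class ExpiringDict:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.data = {}
        self.expires = {}  # key -> absolute expiration time

    def set(self, key, value, ttl=None):
        self.data[key] = value
        if ttl is None:
            self.expires.pop(key, None)
        else:
            self.expires[key] = self.clock() + ttl

    def _expire_if_needed(self, key):
        deadline = self.expires.get(key)
        if deadline is not None and deadline <= self.clock():
            del self.data[key]
            del self.expires[key]
            return True
        return False

    def get(self, key):
        # Lazy deletion: check expiry only when the key is accessed.
        self._expire_if_needed(key)
        return self.data.get(key)

    def active_expire_cycle(self, sample_size=20, time_budget=0.025):
        # Periodic deletion: sample keys that have a TTL, bounded by a time budget.
        stop_at = time.monotonic() + time_budget
        removed = 0
        while self.expires:
            sample = random.sample(list(self.expires), min(sample_size, len(self.expires)))
            expired = sum(self._expire_if_needed(key) for key in sample)
            removed += expired
            # Stop when few sampled keys were expired or the budget is used up.
            if expired * 4 <= len(sample) or time.monotonic() >= stop_at:
                break
        return removed


now = [0.0]
cache = ExpiringDict(clock=lambda: now[0])   # fake clock for a repeatable demo
cache.set("a", "1", ttl=5)
cache.set("b", "2")
now[0] = 10.0
print("a" in cache.data)      # expired but still in memory until touched
print(cache.get("a"))         # lazy deletion removes it on access
```

Calling active_expire_cycle from a timer would cover the keys that are never accessed again, which is the gap lazy deletion leaves open.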

Conclusion

Redis is broad and deep. Seven questions in a row touch only the trunk of this elephant. Keep following the trunk and next you may reach an ear. As long as you are willing to keep digging to understand it, rather than just using it without thinking, one day you will grasp the whole elephant that is Redis.

[Note: Some chapters refer to and quote Huang Jianhong’s Redis Design and Implementation]
