How do the master and slave perform the first synchronization?
Replicaof command: replicaof (slaveof before Redis 5.0). For example, run the following on the instance that is to become the slave (IP: 172.16.19.5) so that it replicates the master at 172.16.19.3:
replicaof 172.16.19.3 6379
The first phase establishes the connection and negotiates synchronization between master and slave, mainly in preparation for full replication. In this step the slave establishes a connection with the master, and synchronization can begin once the master confirms with a reply.
Specifically, the slave sends the psync command to the master to request data synchronization, and the master starts replication according to the parameters of this command. The psync command carries two parameters: the runID of the master and the replication progress offset.
- runID: a random ID generated automatically when each Redis instance starts, used to uniquely identify that instance. On the first replication the slave does not yet know the runID of the master, so it sets runID to "?".
- offset: set to -1, indicating the first replication.
On receiving the psync command, the master replies with FULLRESYNC, which carries two parameters: the master runID and the master's current replication progress offset. The slave records both parameters when it receives the response.
One thing to note here is that the FULLRESYNC response means the first replication is a full replication: the master will copy all of its current data to the slave.
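The handshake described above can be sketched in a few lines of Python. This is only an illustration of the message shapes, not a replication client: the helper names build_psync and parse_fullresync are hypothetical, and the sketch assumes the reply arrives as a single text line of the form "+FULLRESYNC <runid> <offset>".

```python
from typing import NamedTuple


class FullResync(NamedTuple):
    run_id: str   # runID of the master
    offset: int   # master's current replication offset


def build_psync(run_id: str = "?", offset: int = -1) -> str:
    """Return the psync line a slave sends; the defaults model the first sync."""
    return f"PSYNC {run_id} {offset}"


def parse_fullresync(reply: str) -> FullResync:
    """Extract the two parameters from a '+FULLRESYNC <runid> <offset>' reply."""
    parts = reply.strip().lstrip("+").split()
    if len(parts) != 3 or parts[0] != "FULLRESYNC":
        raise ValueError(f"not a FULLRESYNC reply: {reply!r}")
    return FullResync(run_id=parts[1], offset=int(parts[2]))
```

On the first synchronization the slave would send build_psync() with the defaults, then keep the run_id and offset returned by parse_fullresync for later use.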
In the second phase, the master synchronizes all of its data to the slave, and the slave loads the data locally once it has received it. This process relies on the RDB file generated by a memory snapshot.
Specifically, the master executes the bgsave command to generate an RDB file and then sends the file to the slave. After receiving the RDB file, the slave first empties its current database and then loads the RDB file. This is because the slave may already hold other data from before it started synchronizing with the master via replicaof, and emptying the database keeps that earlier data from interfering.
While synchronizing data to the slave, the master is not blocked and can still serve requests normally; otherwise the Redis service would be interrupted. However, the write operations in those requests are not recorded in the RDB file that was just generated. To keep master and slave consistent, the master uses a dedicated replication buffer in memory to record all write operations received after the RDB file is generated.
Finally, in the third phase, the master sends the slave the write commands it received during the second phase. Once the master has finished sending the RDB file, it sends the changes held in the replication buffer to the slave, which re-executes them. At that point master and slave are synchronized.
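To make the second and third phases concrete, here is a small in-memory simulation. It is a toy model under stated assumptions: plain dicts stand in for the databases, a deep copy stands in for the RDB file produced by bgsave, and the class and function names (Master, Slave, full_sync) are hypothetical.

```python
import copy


class Master:
    def __init__(self):
        self.data = {}
        self.replication_buffer = []  # writes received after the snapshot started
        self.snapshotting = False

    def write(self, key, value):
        self.data[key] = value
        if self.snapshotting:
            self.replication_buffer.append((key, value))

    def start_snapshot(self):
        """Stand-in for bgsave: capture a point-in-time copy of the data."""
        self.snapshotting = True
        return copy.deepcopy(self.data)

    def drain_buffer(self):
        """Return and clear the buffered writes (third phase)."""
        pending, self.replication_buffer = self.replication_buffer, []
        self.snapshotting = False
        return pending


class Slave:
    def __init__(self, data=None):
        self.data = dict(data or {})

    def load_snapshot(self, snapshot):
        self.data.clear()            # empty the current database first
        self.data.update(snapshot)   # then load the snapshot

    def replay(self, writes):
        for key, value in writes:
            self.data[key] = value


def full_sync(master, slave, writes_during_transfer=()):
    snapshot = master.start_snapshot()           # phase 2: generate the snapshot
    for key, value in writes_during_transfer:    # master keeps serving writes
        master.write(key, value)
    slave.load_snapshot(snapshot)                # phase 2: slave loads it
    slave.replay(master.drain_buffer())          # phase 3: replay buffered writes
```

Writes that arrive while the snapshot is being transferred reach the slave only through the replication buffer, which is why the slave ends up with the same data as the master after the replay.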
Cascading master-slave mode shares the load of the master during full replication
From the first-synchronization process above, you can see that a full replication involves two time-consuming operations for the master: generating the RDB file and transferring it.
If there are many slaves and all of them need a full replication from the master, the master will be busy forking child processes to generate RDB files. The fork operation blocks the main thread, so the master responds more slowly to application requests. In addition, transferring RDB files consumes the master's network bandwidth, which puts further pressure on its resources. So is there a good way to share the burden of the master?
There is: the "master-slave-slave" mode.
In the master-slave mode described so far, every slave connects to the master and every full replication is served by the master. With the master-slave-slave mode, we can cascade the work of generating and transferring RDB files from the master down to a slave.
In simple terms, when deploying a master-slave cluster, you can manually select one slave (for example, one with a generous memory configuration) to cascade other slaves. Then, on some of the other slaves (for example, one third of them), run the following command so that they establish a master-slave relationship with the selected slave:
replicaof <IP of the selected slave> 6379
This way, these slaves know that they do not need to interact with the master when they synchronize; they only synchronize write operations with the slave they are cascaded under, which reduces the pressure on the master.
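The effect of cascading on the master can be sketched as follows. This is an illustration only: plan_cascade and full_syncs_served_by are hypothetical helpers that merely build replicaof command strings, and the IP addresses in the usage note other than 172.16.19.3 and 172.16.19.5 are made up for the example.

```python
def plan_cascade(master_ip, relay_ip, cascaded_ips, direct_ips=(), port=6379):
    """Map each slave IP to the replicaof command it should run."""
    commands = {relay_ip: f"replicaof {master_ip} {port}"}
    for ip in direct_ips:        # slaves that stay attached to the master
        commands[ip] = f"replicaof {master_ip} {port}"
    for ip in cascaded_ips:      # slaves moved under the selected slave
        commands[ip] = f"replicaof {relay_ip} {port}"
    return commands


def full_syncs_served_by(commands, ip, port=6379):
    """Count how many instances would request a full replication from `ip`."""
    target = f"replicaof {ip} {port}"
    return sum(1 for cmd in commands.values() if cmd == target)
```

For example, plan_cascade("172.16.19.3", "172.16.19.5", ["172.16.19.6", "172.16.19.7"]) leaves the master serving a single full replication (to the selected slave), while the selected slave serves the other two.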
Once master and slave have completed full replication, they keep a network connection open between them, and the master uses it to propagate the commands it subsequently receives to the slave. This process is known as command propagation over a long-lived connection, and it avoids the overhead of repeatedly establishing connections.
This may sound simple, but there are risks, the most common being network disconnection or congestion. If the network is disconnected, commands cannot be propagated from master to slave, the slave's data falls out of step with the master's, and clients may read stale data from the slave.
What if the network between master and slave goes down?
Before Redis 2.8, if the network between master and slave was interrupted during command propagation, the slave had to perform a full replication with the master again, which was very expensive.
Starting with Redis 2.8, after a network interruption the master and slave resume synchronization using incremental replication.
Full replication synchronizes all data, whereas incremental replication synchronizes only the commands that the master received while the connection was down.
So how exactly do master and slave stay in sync during incremental replication? The key is the repl_backlog_buffer. Let's look at how it is used to synchronize incremental commands.
While master and slave are disconnected, the master writes the write commands it receives to the replication buffer and also to repl_backlog_buffer.
repl_backlog_buffer is a circular buffer: the master records the position it has written up to, and the slave records the position it has read up to.
At the start, the master's write position and the slave's read position coincide; this is their common starting position. As the master receives new writes, its write position in the buffer moves away from the starting position. This distance is measured as an offset, which for the master is master_repl_offset. The more new writes the master receives, the larger this value becomes.
Similarly, as the slave replicates write commands, its read position in the buffer also moves away from the starting position, and its replicated offset, slave_repl_offset, increases accordingly. Normally the two offsets are roughly equal.
After the connection between master and slave is restored, the slave first sends the psync command to the master, passing its current slave_repl_offset. The master then compares its own master_repl_offset with that slave_repl_offset. Because the master may have received new write commands while the network was down, master_repl_offset is generally greater than slave_repl_offset. The master then only needs to send the slave the command operations that lie between slave_repl_offset and master_repl_offset. As shown in the middle of the diagram, the master and slave differ by two operations, put d e and put d f, and in incremental replication the master only needs to synchronize these two to the slave.
However, because repl_backlog_buffer is a circular buffer, the master keeps writing once the buffer is full and overwrites the oldest operations. If the slave reads slowly, operations it has not yet read may be overwritten by the master's new writes. Incremental replication then cannot close the gap, and master and slave end up inconsistent, which in practice forces another full replication.
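The offset comparison and the overwrite problem can both be seen in a toy circular backlog. This is a simplified sketch, not how Redis stores the backlog: it assumes one command per slot, whereas the real offsets count bytes, and the ReplBacklog class and its method names are hypothetical. Returning None models the case where the missing commands have been overwritten and incremental replication is no longer possible.

```python
class ReplBacklog:
    """Toy circular backlog holding one command per slot."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.master_repl_offset = 0   # total number of commands written so far

    def append(self, command):
        # Wrap around and overwrite the oldest slot once the buffer is full.
        self.slots[self.master_repl_offset % self.size] = command
        self.master_repl_offset += 1

    def commands_since(self, slave_repl_offset):
        """Return the commands the slave is missing, or None if overwritten."""
        oldest_kept = max(0, self.master_repl_offset - self.size)
        if slave_repl_offset < oldest_kept:
            return None   # gap is no longer in the backlog
        return [self.slots[i % self.size]
                for i in range(slave_repl_offset, self.master_repl_offset)]
```

With a backlog of four slots, a slave that disconnects at offset 2 and misses put d e and put d f gets exactly those two commands back from commands_since(2). If the master writes three more commands before the slave reconnects, the slot for put d e has been overwritten and the same call returns None.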
We therefore want to avoid this situation, which is usually done by adjusting the repl_backlog_size parameter.
This parameter depends on the amount of buffer space required, which is calculated as follows:
buffer space = write rate of the master * operation size - transfer rate between master and slave * operation size
In practice, to leave headroom for unexpected request pressure, this buffer space is usually doubled: repl_backlog_size = buffer space * 2. This is the final value of repl_backlog_size.
For example, if the master writes 2000 operations per second, each operation is 2KB in size, and the network can transfer 1000 operations per second, then 1000 operations need to be buffered, which requires at least 2MB of buffer space. Otherwise newly written commands will overwrite old operations. To cope with possible bursts of pressure, we would finally set repl_backlog_size to 4MB.
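The same arithmetic can be written as a small helper. This is a sketch of the sizing rule from the text; the function name backlog_size_kb and its parameter names are hypothetical, and sizes are kept in KB to match the example.

```python
def backlog_size_kb(write_ops_per_sec, transfer_ops_per_sec, op_size_kb,
                    safety_factor=2):
    """Return (buffer_space_kb, repl_backlog_size_kb) for the given rates."""
    # Operations that pile up each second because writes outpace the transfer.
    backlog_ops = max(0, write_ops_per_sec - transfer_ops_per_sec)
    buffer_space_kb = backlog_ops * op_size_kb
    return buffer_space_kb, buffer_space_kb * safety_factor
```

Calling backlog_size_kb(2000, 1000, 2) reproduces the numbers above: 2000 KB (about 2MB) of buffer space and 4000 KB (about 4MB) after doubling. Note that in redis.conf the corresponding directive is spelled repl-backlog-size.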
This reduces the risk of data inconsistency between master and slave during incremental replication. However, if the volume of concurrent requests is very large, even twice the buffer space may not hold all the new operations, and master and slave may still become inconsistent.
In that case you have two options. First, you can increase repl_backlog_size further, depending on the memory available on the server where Redis runs, for example to 4 times the buffer space. Second, you can consider using a sharded cluster to spread the request load of a single master.
Reasons for using RDB instead of AOF for master/slave full synchronization:
1. An RDB file contains compressed binary data (with encodings optimized for the different data types), so the file is very small. An AOF file records the command for every write operation: the more writes there are, the larger the file grows, and it contains many redundant operations on the same key. For full synchronization, transferring an RDB file keeps the master's network bandwidth consumption to a minimum. On the slave side, the file is small and therefore quick to read, and because it stores binary data the slave can restore the data directly by parsing it according to the RDB format, which is very fast. With AOF, by contrast, the slave would have to replay every write command in order, which involves lengthy processing logic and recovers much more slowly than RDB. RDB therefore gives the lowest cost for full synchronization.
2. Using AOF for full synchronization would mean that AOF must be enabled, and enabling AOF requires choosing a flush (fsync) policy, which can hurt Redis performance if chosen poorly. RDB, in contrast, generates a snapshot only when a scheduled backup or a full synchronization is needed. Many business scenarios that are not sensitive to data loss do not need AOF enabled at all.
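The first reason, redundant operations on the same key, can be illustrated with a toy comparison. This is not the real RDB or AOF format: snapshot_entries and command_log_entries are hypothetical helpers that only count what a state snapshot and an append-only command log (before any rewrite) would have to carry.

```python
def snapshot_entries(writes):
    """Final state only, like a snapshot: one entry per surviving key."""
    state = {}
    for key, value in writes:
        state[key] = value   # later writes replace earlier ones
    return state


def command_log_entries(writes):
    """Every write kept in order, like an append-only command log."""
    return [("SET", key, value) for key, value in writes]
```

If a key named counter is written 1000 times and a key named name once, the snapshot holds 2 entries while the command log holds 1001, and a slave loading the log would have to replay all of them.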