This post follows “Redis master-slave synchronization (1): the approach before version 2.8”. As you may remember, the previous post ended by noting that the pre-2.8 approach has defects. Today we'll look at how version 2.8 improved master-slave synchronization.
Let's first review the shortcomings of Redis master-slave synchronization before 2.8:
- Suppose the initial sync operation has completed and the master server is in the command propagation stage with the slave server. Some commands are propagated, and then the connection between the master and the slave is lost
- After the connection is re-established, the slave starts a sync operation with the master all over again
- During this sync operation, the master generates an RDB file that again contains the keys from the first synchronization and from the commands propagated after it
- The sync operation still works, but many of those keys did not need to be synchronized again
- The master has to generate an RDB file during the sync operation, which takes up a lot of its resources: CPU, memory, and IO
- Transferring the RDB file from the master to the slave consumes network bandwidth
- Loading the RDB file on the slave also consumes a lot of its resources: CPU, memory, and IO
- Unnecessary sync operations therefore waste server resources and degrade service performance
In summary, the old master-slave synchronization was very inefficient when a slave disconnected and reconnected, so starting with version 2.8 Redis uses the PSYNC command in place of the SYNC command to perform synchronization.
PSYNC command
The PSYNC command has two modes: full resynchronization and partial resynchronization.
Full resynchronization mode
- Full resynchronization mode handles the initial replication
- Its execution process is the same as that of the SYNC command. Let's review the process:
- 1. The slave server, once told to replicate with slaveof, sends the synchronization command to the master server
- 2. The master server receives the synchronization command
- 3. The master server runs the BGSAVE command
- 4. It generates an RDB file in the background
- 5. It uses a buffer to record all write commands executed from that point on
- 6. The BGSAVE command finishes on the master server
- 7. The master server sends the RDB file generated by BGSAVE to the slave server
- 8. The slave server receives the RDB file and loads it
- 9. The slave server's database state is now the master's state at the time BGSAVE ran
- 10. The master server sends all write commands recorded in the buffer to the slave server
- 11. The slave server runs these commands, which completes the synchronization
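
The steps above can be sketched as a small in-memory simulation. This is only an illustration under simplifying assumptions: a dict stands in for the dataset, a deep copy stands in for the RDB file, and the `Master` and `Slave` classes and their method names are hypothetical, not part of Redis.

```python
import copy


class Master:
    """Toy master: a dict as the dataset plus a buffer used while BGSAVE runs."""

    def __init__(self):
        self.data = {}
        self.buffer = None  # None means no BGSAVE is in progress

    def write(self, key, value):
        self.data[key] = value
        if self.buffer is not None:
            # Step 5: record write commands that arrive while the snapshot is built.
            self.buffer.append((key, value))

    def start_bgsave(self):
        # Steps 3-4: take a point-in-time snapshot (stands in for the RDB file).
        self.buffer = []
        return copy.deepcopy(self.data)

    def finish_bgsave(self):
        # Step 6: BGSAVE is done; hand over the buffered writes and stop buffering.
        buffered, self.buffer = self.buffer, None
        return buffered


class Slave:
    """Toy slave that loads a snapshot and then replays buffered writes."""

    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        # Steps 8-9: replace the local dataset with the snapshot contents.
        self.data = dict(snapshot)

    def apply(self, commands):
        # Step 11: replay the buffered write commands in order.
        for key, value in commands:
            self.data[key] = value


master, slave = Master(), Slave()
master.write("k1", "v1")
snapshot = master.start_bgsave()   # steps 3-4
master.write("k2", "v2")           # arrives while the snapshot is being produced
buffered = master.finish_bgsave()  # step 6
slave.load_snapshot(snapshot)      # steps 7-9
slave.apply(buffered)              # steps 10-11
print(slave.data == master.data)
```

The buffer is what keeps the slave from missing writes that happen between the moment the snapshot is taken and the moment it is loaded.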
Partial resynchronization mode
- Partial resynchronization mode mainly handles replication after a disconnection and reconnection
- Everything else is the same as with the SYNC command. The difference is that after reconnecting, the master only sends the slave the commands it missed while disconnected
Let's look at how the PSYNC command runs after a disconnection and reconnection:
Time | Master server | Slave server |
---|---|---|
T0 | Synchronization between master and slave is complete | Synchronization between master and slave is complete |
T1 | Executes set k1 v1 and propagates it to the slave | Executes the set k1 v1 command propagated from the master |
T2 | Executes set k2 v2 and propagates it to the slave | Executes the set k2 v2 command propagated from the master |
T3 | The master and slave are disconnected | The master and slave are disconnected |
T4 | Executes set k3 v3 | Disconnected from the master, so set k3 v3 is not received |
T5 | Executes set k4 v4 | Disconnected from the master, so set k4 v4 is not received |
T6 | The fault is fixed and the master and slave reconnect | The fault is fixed and the master and slave reconnect |
T7 | | Sends the PSYNC command to the master |
T8 | Replies to the slave, confirming a partial resynchronization | Receives the master's reply and prepares for partial resynchronization |
T9 | Sends the set k3 v3 and set k4 v4 commands to the slave | |
T10 | | Receives and executes the set k3 v3 and set k4 v4 commands sent by the master |
T11 | Partial resynchronization is complete | Partial resynchronization is complete |
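- This process relies on three parts: the replication offsets of the master and slave servers, the replication backlog of the master server, and the server run ID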
Replication offsets of the master and slave servers
- Both the master and the slave maintain a replication offset
- Each time the master propagates n bytes to a slave, it adds n to its own offset
- Each time a slave receives n bytes propagated from the master, it adds n to its own offset
- Example of master and slave offsets:
  - Master server: offset=100
  - Slave server A: offset=100
  - Slave server B: offset=100
- In this case, the master and slaves A and B are in sync
- Now suppose the master propagates 100 bytes of data while slave B is disconnected:
  - Master server: offset=200
  - Slave server A: offset=200
  - Slave server B: offset=100 (disconnected from the master)
- In this case, the master and slave A are in sync
- Slave B's offset is still 100 because it was disconnected. It no longer matches the master's offset of 200, so we can tell that B is out of sync
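
The offset bookkeeping can be sketched in a few lines. This is a toy model, not Redis code: the `Node` class, the `connected` flag, and the `propagate` and `in_sync` helpers are names made up for this illustration.

```python
class Node:
    """A server that tracks only its replication offset (hypothetical helper)."""

    def __init__(self, name, offset=0):
        self.name = name
        self.offset = offset
        self.connected = True


def propagate(master, slaves, payload):
    """The master sends len(payload) bytes; each connected slave receives them."""
    n = len(payload)
    master.offset += n            # the master adds n for every n bytes it propagates
    for slave in slaves:
        if slave.connected:
            slave.offset += n     # a slave adds n only for bytes it actually received


def in_sync(master, slave):
    """Equal offsets mean the slave has seen everything the master has sent."""
    return master.offset == slave.offset


master = Node("master", offset=100)
slave_a = Node("A", offset=100)
slave_b = Node("B", offset=100)

slave_b.connected = False                        # B loses its connection
propagate(master, [slave_a, slave_b], b"x" * 100)

print(master.offset, slave_a.offset, slave_b.offset)        # 200 200 100
print(in_sync(master, slave_a), in_sync(master, slave_b))   # True False
```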
Replication backlog of the master server
- The replication backlog is a queue maintained by the master server, with a default size of 1MB
- The queue is fixed-length and first in, first out
- When the master performs command propagation, it not only sends the write commands to all slave servers
- It also writes them into the replication backlog
- Because the replication backlog is a queue, the bytes of each command on the master are written to it one by one. Each entry consists of two parts: the offset and the command byte value
- The queue structure is as follows:
Node1:
offset=1
val=s
Node2:
offset=2
val=e
Node3:
offset=3
val=t
Node4:
offset=4
val=k
Node5:
offset=5
val=1
Node6:
offset=6
val=v
Node7:
offset=7
val=1
- The queue above shows how each byte of the master's command set k1 v1 is stored in the replication backlog together with its offset (spaces are omitted here for brevity)
- When the slave reconnects to the master, it sends its own replication offset to the master with the PSYNC command, and the master uses that offset to decide between full resynchronization and partial resynchronization. The specific steps are as follows:
Time | Master server | Slave server |
---|---|---|
T0 | The master and slave are disconnected | The master and slave are disconnected |
T1 | The fault is fixed and the master and slave reconnect | The fault is fixed and the master and slave reconnect |
T2 | | Sends the PSYNC command to the master, including its own replication offset |
T3 | Receives the replication offset sent by the slave and evaluates it | |
T4 | If the data after that offset is still in the replication backlog, performs a partial resynchronization and sends the data from the backlog | |
T5 | If the data after that offset is no longer in the backlog (the queue is first in, first out, so the oldest data is dropped), performs a full resynchronization | |
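
As a rough sketch of the decision in the table above, the backlog can be modeled with a bounded `collections.deque`. The `ReplicationBacklog` class and its method names are assumptions made for this example. Commands are stored as plain text rather than in Redis's wire encoding, and the backlog is set to 20 bytes instead of 1MB so that old data gets dropped quickly.

```python
from collections import deque


class ReplicationBacklog:
    """Fixed-length FIFO of bytes plus the master's offset (toy model)."""

    def __init__(self, size):
        self.buf = deque(maxlen=size)  # when full, the oldest bytes are dropped first
        self.master_offset = 0

    def feed(self, data):
        """Record propagated bytes and advance the master offset."""
        self.buf.extend(data)
        self.master_offset += len(data)

    def first_offset(self):
        """Offset of the oldest byte still in the backlog (offsets start at 1)."""
        return self.master_offset - len(self.buf) + 1

    def read_after(self, slave_offset):
        """Return the bytes after slave_offset, or None if a full resync is needed."""
        if slave_offset > self.master_offset:
            return None                # the slave claims more data than the master has
        if slave_offset + 1 < self.first_offset():
            return None                # the bytes the slave needs were already dropped
        skip = slave_offset + 1 - self.first_offset()
        return bytes(list(self.buf)[skip:])


backlog = ReplicationBacklog(size=20)
for command in (b"set k1 v1", b"set k2 v2", b"set k3 v3"):
    backlog.feed(command)

print(backlog.master_offset)     # 27
print(backlog.first_offset())    # 8, because only the newest 20 bytes are kept
print(backlog.read_after(18))    # b'set k3 v3' -> partial resynchronization
print(backlog.read_after(5))     # None -> full resynchronization
```

A slave at offset 18 only needs bytes 19 to 27, which are still in the backlog. A slave at offset 5 needs bytes 6 and 7, which have already been pushed out.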
Run ID of the Redis server
- Every Redis server gets its own run ID when it starts
- Run IDs are never repeated in practice: each one is a randomly generated string of 40 hexadecimal characters
- When a slave replicates from a master for the first time, the slave saves the master's run ID
- After reconnecting, the slave sends the saved run ID with the PSYNC command, and the master compares it with its own. If the IDs match, the master goes on to check the replication offset to decide between full and partial resynchronization. If they differ, meaning the slave was replicating from a different master before, a full resynchronization is performed
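
The run ID check and the offset check can be combined in one small function. This is a sketch under assumptions: `new_run_id` and `handle_psync` are hypothetical helpers, and the reply strings are modeled on the `+FULLRESYNC <runid> <offset>` and `+CONTINUE` replies that Redis sends, without the rest of the protocol. A slave with no saved state is assumed to send `?` and `-1`.

```python
import secrets


def new_run_id():
    """Generate a random run ID of 40 hexadecimal characters."""
    return secrets.token_hex(20)


def handle_psync(master_run_id, master_offset, backlog_first_offset,
                 slave_run_id, slave_offset):
    """Decide how the master answers a PSYNC carrying a run ID and an offset."""
    full = f"+FULLRESYNC {master_run_id} {master_offset}"
    if slave_run_id != master_run_id:
        return full        # the slave was following a different master (or none)
    if backlog_first_offset <= slave_offset + 1 <= master_offset + 1:
        return "+CONTINUE" # everything after slave_offset is still in the backlog
    return full            # the data the slave needs is no longer available


run_id = new_run_id()
print(len(run_id))                                    # 40
print(handle_psync(run_id, 36, 1, run_id, 18))        # +CONTINUE
print(handle_psync(run_id, 36, 1, "?", -1))           # +FULLRESYNC <run_id> 36
print(handle_psync(run_id, 36, 25, run_id, 18))       # +FULLRESYNC <run_id> 36
```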
Detailed PSYNC command execution process
Time | Master server | Slave server |
---|---|---|
T0 | The master starts and generates a unique run ID (simplified to 1 for this example) | The slave starts and generates a unique run ID (simplified to 2 for this example) |
T1 | Master-slave replication is complete (assuming no commands have been executed yet), the master's offset is 0, and the replication backlog is empty | Master-slave replication is complete (assuming no commands have been executed yet), and the slave's offset is 0 |
T2 | Executes set k1 v1, propagates it, and writes it to the replication backlog; master offset=9 | Executes the set k1 v1 command propagated from the master; offset=9 |
T3 | Executes set k2 v2, propagates it, and writes it to the replication backlog; master offset=18 | Executes the set k2 v2 command propagated from the master; offset=18 |
T4 | The master and slave are disconnected | The master and slave are disconnected |
T5 | Executes set k3 v3 and writes it to the replication backlog; master offset=27 | Disconnected from the master, so set k3 v3 is not received |
T6 | Executes set k4 v4 and writes it to the replication backlog; master offset=36 | Disconnected from the master, so set k4 v4 is not received |
T7 | The fault is fixed and the master and slave reconnect | The fault is fixed and the master and slave reconnect |
T8 | | Sends the PSYNC command to the master with the saved master run ID=1 and its own offset=18 |
T9 | Receives the PSYNC command and finds that the run ID is its own, so this is the same master-slave pair as before the disconnection | |
T10 | Checks the offset sent by the slave, offset=18, and determines whether the data after offset=18 still exists in the replication backlog | |
T11 | Replies to the slave, confirming a partial resynchronization | Receives the master's reply and prepares for partial resynchronization |
T12 | Takes the two commands after offset=18, set k3 v3 and set k4 v4, from the replication backlog and sends them to the slave | |
T13 | | Receives the set k3 v3 and set k4 v4 commands sent by the master, executes them, and updates its offset to offset=36 |
T14 | Partial resynchronization is complete; both offsets are offset=36 | Partial resynchronization is complete; both offsets are offset=36 |
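
The timeline above can be replayed with a short simulation. It is a sketch, not the Redis implementation: the `Master` and `Slave` classes and the `run` helper are invented for this example, and commands travel as plain text so that each one is 9 bytes long, matching the offsets in the table. The slave keeps the raw replication stream instead of parsing it.

```python
class Master:
    """Toy master with a run ID, an offset, and a byte backlog."""

    def __init__(self, run_id, backlog_size=1024 * 1024):
        self.run_id = run_id
        self.offset = 0
        self.backlog = bytearray()
        self.backlog_size = backlog_size

    def execute(self, command):
        """Run a write command, record its bytes in the backlog, and return them."""
        data = command.encode()
        self.offset += len(data)
        self.backlog += data
        del self.backlog[:-self.backlog_size]  # keep only the newest bytes
        return data

    def psync(self, run_id, offset):
        """Answer a PSYNC request with a reply and, if possible, the missing bytes."""
        first = self.offset - len(self.backlog) + 1  # oldest offset still stored
        if run_id == self.run_id and first <= offset + 1 <= self.offset + 1:
            return "+CONTINUE", bytes(self.backlog[offset + 1 - first:])
        return "+FULLRESYNC", None


class Slave:
    """Toy slave that remembers its master's run ID and its own offset."""

    def __init__(self, master_run_id, offset):
        self.master_run_id = master_run_id
        self.offset = offset
        self.connected = True
        self.stream = bytearray()  # raw replication stream received so far

    def receive(self, data):
        self.stream += data
        self.offset += len(data)


def run(master, slave, command):
    """Execute on the master and propagate only while the link is up."""
    data = master.execute(command)
    if slave.connected:
        slave.receive(data)


master = Master(run_id="1")
slave = Slave(master_run_id="1", offset=0)  # T0-T1: initial replication done

run(master, slave, "set k1 v1")             # T2: both offsets are 9
run(master, slave, "set k2 v2")             # T3: both offsets are 18
slave.connected = False                     # T4: the link goes down
run(master, slave, "set k3 v3")             # T5: master offset is 27
run(master, slave, "set k4 v4")             # T6: master offset is 36
slave.connected = True                      # T7: the link is back

reply, missing = master.psync(slave.master_run_id, slave.offset)  # T8-T11
if reply == "+CONTINUE":
    slave.receive(missing)                  # T12-T13

print(reply, missing, master.offset, slave.offset)
```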
That is how Redis master-slave synchronization works from version 2.8 on. Comments are welcome, and please point out any mistakes in the article so I can deepen my understanding. Wishing you no bugs, and thank you!