In a real production environment, Redis is not deployed on a single machine, because that would be a single point of failure; multiple instances are deployed together. There are three main ways to run Redis across multiple machines: master-slave replication, Sentinel, and Cluster. This article focuses on how master-slave replication is implemented.
Most of the content is drawn from the book Redis Design and Implementation.
How to use master-slave replication
In Redis, you can make one server replicate another by running the SLAVEOF command or by setting the SLAVEOF option. The server being replicated is called the master server, and the server that replicates it is called the slave server.
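As a rough illustration of what a client puts on the wire when it issues SLAVEOF, the sketch below encodes the command in the Redis serialization protocol (RESP). The helper name encode_command and the address 127.0.0.1:6379 are assumptions chosen for this example; they do not come from the book.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


# Ask the receiving server to become a slave of 127.0.0.1:6379 (example address).
slaveof_request = encode_command("SLAVEOF", "127.0.0.1", "6379")

# SLAVEOF NO ONE turns a slave server back into a master server.
promote_request = encode_command("SLAVEOF", "NO", "ONE")
```

Sending slaveof_request to a running server over a TCP socket would have the same effect as typing the command in a client.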
Implementation of the old replication feature
Prior to version 2.8, Redis replication consisted of two operations, synchronization (sync) and command propagation (command propagate):
- Synchronization: brings the database state of the slave server up to the current database state of the master server.
- Command propagation: the write commands executed on the master server are sent to the slave server for execution, keeping the master and slave databases consistent.
Synchronization
When the slave server runs the SLAVEOF command, the first step is to bring the slave server's database state up to the master server's current database state.
The slave server does this by sending the SYNC command to the master server. The SYNC command is executed as follows:
- The slave server sends the SYNC command to the master server.
- On receiving SYNC, the master server executes BGSAVE to generate an RDB file in the background. While the RDB file is being generated, the master server records every write command it executes in a buffer.
- The master server sends the RDB file to the slave server, and the slave server loads it.
- The master server sends the write commands buffered while the RDB file was being generated to the slave server, which replays them. The slave server's state now matches the master server's, and synchronization is complete.
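The four steps above can be sketched as a toy in-memory simulation. This is only an illustration under simplifying assumptions: a dict copy stands in for the RDB file, only SET is modeled, and the class and method names (Master, Slave, handle_sync, and so on) are hypothetical rather than Redis internals.

```python
class Slave:
    def __init__(self):
        self.db = {}

    def load_snapshot(self, snapshot):
        # Loading the "RDB file" replaces the slave's whole data set.
        self.db = dict(snapshot)

    def apply(self, command):
        op, key, value = command
        if op == "SET":
            self.db[key] = value


class Master:
    def __init__(self):
        self.db = {}
        self.slaves = []

    def set(self, key, value):
        """Execute a write, then propagate it to already-synchronized slaves."""
        self.db[key] = value
        for slave in self.slaves:
            slave.apply(("SET", key, value))

    def handle_sync(self, slave, writes_during_bgsave=()):
        # Step 2: BGSAVE takes a point-in-time snapshot (stand-in for the RDB file).
        snapshot = dict(self.db)
        # Writes executed while the snapshot is being produced are buffered.
        buffered = []
        for key, value in writes_during_bgsave:
            self.set(key, value)
            buffered.append(("SET", key, value))
        # Step 3: send the snapshot; the slave loads it.
        slave.load_snapshot(snapshot)
        # Step 4: send the buffered writes; the slave replays them.
        for command in buffered:
            slave.apply(command)
        # From now on the slave is kept up to date by command propagation.
        self.slaves.append(slave)


master = Master()
master.set("a", "1")

slave = Slave()
master.handle_sync(slave, writes_during_bgsave=[("b", "2")])  # step 1: SYNC arrives
master.set("c", "3")  # reaches the slave through command propagation
```

After the last line, slave.db and master.db hold the same three keys.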
Command propagation
After synchronization is complete, every subsequent write command executed by the master server is sent to the slave server through command propagation, which keeps the master and slave databases consistent.
Defects of the old replication feature
For the initial replication, the old replication feature works well. The problem is replication after a disconnection: when the master and slave servers are briefly disconnected by a network problem and then reconnect, the slave server sends SYNC again. This does bring the two databases back into a consistent state, but it is very inefficient. The master server generates a complete RDB file again and the slave server loads it again, even though the slave server is typically missing only the few write commands executed during the disconnection.
Implementation of the new replication feature
To fix the inefficiency of the old replication feature when a slave server reconnects after a disconnection, Redis has used the PSYNC command instead of SYNC to perform synchronization since version 2.8.
The PSYNC command supports full resynchronization and partial resynchronization modes.
- Full resynchronization: handles the initial replication. It works in essentially the same way as the old SYNC command: the master server generates and sends an RDB file, then sends the write commands buffered in the meantime, and the slave server uses both to complete synchronization.
- Partial resynchronization: handles replication after a disconnection. When the slave server reconnects to the master server after being disconnected, the master server can, if conditions permit, send only the write commands it executed during the disconnection, which is enough to bring the slave database back in line with the master.
Implementation of partial resynchronization
The partial resynchronization function consists of the following three parts:
- The replication offset of the master server and the replication offset of the slave server
- The replication backlog of the master server
- The server run ID
Replication offset
The master and slave servers each maintain a replication offset:
- Each time the master propagates N bytes of data to the slave, it increments its own replication offset by N.
- The slave server also increments its own replication offset by N each time it receives N bytes of data propagated from the master.
If the master and slave servers are in a consistent state, their replication offsets are always equal; conversely, if the offsets differ, the two servers are not in a consistent state. After a disconnection, the offsets will differ whenever the master server propagated write commands that the slave server did not receive.
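A minimal sketch of this bookkeeping follows. The Node class, the propagate helper, and the delivered flag (which simulates a dropped connection) are assumptions made for the example.

```python
class Node:
    def __init__(self):
        self.repl_offset = 0


master, slave = Node(), Node()


def propagate(payload: bytes, delivered: bool = True) -> None:
    """The master always counts the bytes; the slave counts only what it receives."""
    master.repl_offset += len(payload)
    if delivered:
        slave.repl_offset += len(payload)


propagate(b"x" * 33)                    # normal propagation
in_sync = master.repl_offset == slave.repl_offset

propagate(b"y" * 20, delivered=False)   # link is down: the slave misses 20 bytes
lag = master.repl_offset - slave.repl_offset
```

Here in_sync is true after the first call, and lag is the number of bytes the slave still has to receive before the two servers are consistent again.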
Replication backlog
The replication backlog is a fixed-length first-in, first-out (FIFO) queue maintained by the master server, with a default size of 1 MB. When the master server propagates a command, it sends the write command to all slave servers and also appends it to the replication backlog. The backlog therefore holds the most recently propagated write commands, and it records the replication offset of every byte it contains.
When a slave server reconnects, it sends its replication offset to the master server through the PSYNC command, and the master server uses this offset to decide which kind of synchronization to perform:
- If the data after the offset is still in the replication backlog, partial resynchronization is used.
- If the data after the offset is no longer in the replication backlog, full resynchronization is used.
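The decision above can be illustrated with a small fixed-size backlog. This is a sketch under stated assumptions, not Redis's actual data structure: the class name ReplicationBacklog, its methods, and the tiny 10-byte capacity are all invented for the example, and offsets are taken to start at 1 for the first propagated byte.

```python
class ReplicationBacklog:
    """Fixed-size FIFO holding the most recently propagated bytes."""

    def __init__(self, size: int):
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0  # offset of the last byte ever written

    def feed(self, data: bytes) -> None:
        self.buf += data
        self.master_offset += len(data)
        # Drop the oldest bytes once the fixed capacity is exceeded.
        if len(self.buf) > self.size:
            del self.buf[: len(self.buf) - self.size]

    def first_offset(self) -> int:
        """Offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buf) + 1

    def missing_data(self, slave_offset: int):
        """Return the bytes after slave_offset, or None if a full resync is needed."""
        if slave_offset > self.master_offset:
            return None  # the slave claims more data than the master ever sent
        if slave_offset + 1 < self.first_offset():
            return None  # the bytes the slave needs were already evicted
        start = slave_offset + 1 - self.first_offset()
        return bytes(self.buf[start:])


backlog = ReplicationBacklog(size=10)
backlog.feed(b"abcdefgh")   # offsets 1..8
backlog.feed(b"ijkl")       # offsets 9..12; "a" and "b" are evicted

partial = backlog.missing_data(8)   # slave has 1..8, needs 9..12
too_old = backlog.missing_data(1)   # slave needs offset 2, which is gone
```

In this run, partial holds the four missing bytes, so a partial resynchronization is possible, while too_old is None, which corresponds to falling back to full resynchronization.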
Server run ID
Every Redis server, master or slave, has its own run ID. The run ID is generated automatically when the server starts and consists of 40 random hexadecimal characters. During the initial replication, the master server sends its run ID to the slave server, which saves it. When the slave server reconnects after a disconnection, it sends the saved run ID to the master server it is now connected to. If the IDs match, this is the same master server as before the disconnection, and partial resynchronization can be attempted; if they differ, full resynchronization is performed.
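To make the run ID check concrete, here is a minimal sketch. It uses Python's secrets module as a stand-in for Redis's own random generator, and the function names generate_run_id and same_master are hypothetical.

```python
import secrets


def generate_run_id() -> str:
    """Produce a run ID: 20 random bytes rendered as 40 hexadecimal characters."""
    return secrets.token_hex(20)


def same_master(saved_run_id: str, current_run_id: str) -> bool:
    """True if the slave reconnected to the master it replicated before."""
    return saved_run_id == current_run_id


master_run_id = generate_run_id()
saved_by_slave = master_run_id          # saved during the initial replication

restarted_master_run_id = generate_run_id()  # a restart yields a new run ID
```

Comparing saved_by_slave with master_run_id allows a partial resynchronization attempt, whereas comparing it with restarted_master_run_id does not.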
Implementation of the PSYNC command
The PSYNC command can be invoked in either of the following ways:
- If the slave server has not replicated any master server before, it sends PSYNC ? -1 when it starts a new replication, actively requesting full resynchronization.
- If the slave server has already replicated a master server, it sends PSYNC <runid> <offset> when it starts a new replication, where runid is the run ID of the master server it last replicated and offset is the slave server's current replication offset. The master server uses these two parameters to decide which synchronization mode to use.
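The slave-side choice between the two forms can be sketched as follows. The function name build_psync and the use of None to mean "never replicated" are assumptions for illustration only.

```python
from typing import Optional


def build_psync(saved_run_id: Optional[str], saved_offset: Optional[int]) -> str:
    """Return the PSYNC command line a slave would send to its master."""
    if saved_run_id is None or saved_offset is None:
        # No replication history: explicitly request full resynchronization.
        return "PSYNC ? -1"
    # Replication history exists: offer it so the master can try a partial resync.
    return f"PSYNC {saved_run_id} {saved_offset}"


first_time = build_psync(None, None)
# The 40-character run ID below is a made-up placeholder value.
reconnect = build_psync("a" * 40, 10086)
```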
The master server that receives the PSYNC command returns one of three replies to the slave server:
- +FULLRESYNC <runid> <offset>: the master server will perform full resynchronization. runid is the master server's run ID, which the slave server saves; offset is the master server's current replication offset, which the slave server uses as its initial offset.
- +CONTINUE: the master server will perform partial resynchronization and send the missing data to the slave server.
- -ERR: replication error. The master server's version is earlier than 2.8, so it does not recognize the PSYNC command.
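Putting the three parts together, the exchange might look like the sketch below: the master picks a reply from the run ID and the offset, and the slave interprets the reply. The function names, the dict returned by the parser, and the "fallback_sync" label are assumptions; treating -ERR as a cue to retry with SYNC follows the book's description of masters older than 2.8.

```python
def answer_psync(master_run_id: str, master_offset: int, backlog_first_offset: int,
                 req_run_id: str, req_offset: int) -> str:
    """Master side: choose the reply for PSYNC <req_run_id> <req_offset>."""
    same_master = req_run_id == master_run_id
    # The next byte the slave needs must still be inside the backlog.
    in_backlog = backlog_first_offset <= req_offset + 1 <= master_offset + 1
    if same_master and in_backlog:
        return "+CONTINUE"
    return f"+FULLRESYNC {master_run_id} {master_offset}"


def parse_psync_reply(line: str) -> dict:
    """Slave side: turn the master's reply into the action to take."""
    if line.startswith("+FULLRESYNC"):
        _, run_id, offset = line.split()
        return {"mode": "full", "run_id": run_id, "offset": int(offset)}
    if line.startswith("+CONTINUE"):
        return {"mode": "partial"}
    if line.startswith("-ERR"):
        # Master predates 2.8: fall back to the old SYNC command.
        return {"mode": "fallback_sync"}
    raise ValueError(f"unexpected reply: {line!r}")


# Example values: master "abc" is at offset 100 and its backlog starts at offset 50.
partial_reply = answer_psync("abc", 100, 50, "abc", 80)
first_reply = answer_psync("abc", 100, 50, "?", -1)
stale_reply = answer_psync("abc", 100, 50, "abc", 10)
action = parse_psync_reply(first_reply)
```

A slave that is only 20 bytes behind gets +CONTINUE, while a new slave and a slave whose offset has fallen out of the backlog both get +FULLRESYNC.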
Deficiencies of master/slave replication
Master-slave replication solves data backup and performance issues (via read/write separation), but it still has some drawbacks:
- Synchronization takes time when RDB files are too large.
- With one master and one slave, or one master and several slaves, the master server is still a single point of failure: if it goes down, the service becomes unavailable. Manually promoting a slave server to master each time is time-consuming and labor-intensive, and the service stays unavailable for some period while it happens.