Redis primary/secondary replication

What is master-slave replication

Primary/secondary replication lets users set up multiple servers, one of which acts as the primary (master) server and handles writes. The remaining servers act as secondary (slave) servers and handle reads. Whenever the primary server receives a write request, it also sends the data to the secondary servers so that the primary and secondary servers stay consistent. With this mechanism, it is possible to build high-availability, high-concurrency clusters from inexpensive servers. Primary/secondary replication is an essential building block of an HA cluster.

How does Redis implement master/slave replication

2.1 slaveof

In Redis, you can use the slaveof command to make one Redis instance replicate the content of another Redis instance. Note that after instance A runs this command to replicate instance B, the contents of instance A are overwritten by the contents of instance B. The slave server is set to read-only by default, so write commands sent to it are rejected. (You can also set slaveof in redis.conf to start master/slave synchronization at startup.)
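The two effects described above (the local data set is overwritten, and the slave becomes read-only) can be illustrated with a toy model. This is only a sketch: `ToyRedis` is a hypothetical in-memory stand-in invented for illustration, not a real Redis client or server.

```python
class ToyRedis:
    """Hypothetical in-memory stand-in for a Redis instance."""

    def __init__(self, data=None):
        self.data = dict(data or {})
        self.master = None  # None means this instance is itself a master

    def slaveof(self, master):
        # Replicating from a master discards the local data set
        # and replaces it with a copy of the master's data.
        self.master = master
        self.data = dict(master.data)

    def set(self, key, value):
        # A slave is read-only by default and rejects writes.
        if self.master is not None:
            raise PermissionError("READONLY: cannot write to a read-only slave")
        self.data[key] = value


a = ToyRedis({"only_on_a": "1"})
b = ToyRedis({"from_b": "2"})
a.slaveof(b)  # a now holds b's data; "only_on_a" is gone
```

After `a.slaveof(b)`, calling `a.set(...)` raises an error in this model, while `b.set(...)` still works.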

2.2 Principle of Master/Slave Replication V1

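After the slave server issues the slaveof command, the master and slave servers communicate through the following steps: the slave sends a sync command to the master; the master runs bgsave to generate an RDB file and buffers the write commands it receives in the meantime; the master sends the RDB file to the slave, which loads it; finally, the master sends the buffered write commands to the slave.

At this point the first full master-slave synchronization is complete. A TCP connection is then maintained between the master and slave, and each time the master receives a new write command, it forwards it to the slave server.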

If the connection is broken, the above steps are repeated when the secondary server reconnects to the master server. This is inefficient: the primary server only needs to send the write commands it received during the disconnection, and should not have to regenerate the RDB file. Generating the RDB file is a time-consuming operation that involves disk reads and writes.

Note: The slave server is blocked while loading the RDB file and cannot process client requests.
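The V1 full synchronization can be sketched as a small simulation. It assumes the usual sync flow: the master takes a snapshot (standing in for the RDB file), buffers writes that arrive while the snapshot is in flight, and replays them to the slave afterwards. The `Master`, `Slave`, and `full_sync` names are hypothetical and exist only for this illustration.

```python
import copy


class Master:
    def __init__(self):
        self.data = {}
        self.buffer = []           # writes received while a snapshot is in flight
        self.snapshotting = False

    def write(self, key, value):
        self.data[key] = value
        if self.snapshotting:
            self.buffer.append((key, value))

    def begin_snapshot(self):
        # Stand-in for generating the RDB file.
        self.snapshotting = True
        self.buffer = []
        return copy.deepcopy(self.data)

    def end_snapshot(self):
        # Hand over the buffered writes and stop buffering.
        self.snapshotting = False
        pending, self.buffer = self.buffer, []
        return pending


class Slave:
    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        # Loading replaces whatever the slave held before.
        self.data = dict(snapshot)

    def apply(self, commands):
        for key, value in commands:
            self.data[key] = value


def full_sync(master, slave, writes_during_transfer=()):
    snapshot = master.begin_snapshot()
    for key, value in writes_during_transfer:
        master.write(key, value)         # arrives while the snapshot is being sent
    slave.load_snapshot(snapshot)
    slave.apply(master.end_snapshot())   # replay the buffered writes


master = Master()
master.write("a", "1")
slave = Slave()
full_sync(master, slave, writes_during_transfer=[("b", "2")])
```

In this model every reconnect has to call `full_sync` again, which is exactly the cost the section above describes.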

2.3 Principle of Primary/Secondary Replication V2

To address these problems, especially for short disconnections, Redis introduced a new synchronization command, psync.

Psync divides the synchronization process into two modes: 1. Full synchronization; 2. Partial synchronization.

Full synchronization is used for the initial synchronization between master and slave. The steps are the same as in V1.

Partial synchronization is used when the master and slave reconnect after a disconnection. The master sends only the write commands it received during the disconnection, without transferring the entire RDB file, which saves a great deal of resources. When the secondary server reconnects to the primary server, it sends the psync command; the primary server replies with +CONTINUE and then sends the missing write commands to the secondary server.

2.3.1 Principle of Partial synchronization

Redis relies on the following parts to complete partial synchronization:

1. The replication offset of the master server

2. The replication offset of the slave server

3. The command cache, also known as the replication backlog (a FIFO queue, default size 1MB)

4. The server run ID

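Each time the master sends N bytes to the slave, it adds N to its own replication offset; the slave does the same each time it receives N bytes. The master also appends every propagated command to the command cache. When the slave server reconnects, it sends its own offset to the master in the psync command. If the data following that offset is still in the command cache, the master performs a partial synchronization and sends only the missing bytes; otherwise it performs a full synchronization.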
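The interplay between the offsets and the command cache can be sketched as follows. This is a simplified model, not the Redis implementation: `ReplicationBacklog` and its methods are hypothetical names, and the offset is treated as a plain count of bytes propagated so far.

```python
class ReplicationBacklog:
    """Fixed-size FIFO of the most recently propagated bytes."""

    def __init__(self, capacity=1024 * 1024):  # 1MB, matching the default above
        self.capacity = capacity
        self.buf = bytearray()
        self.master_offset = 0  # total number of bytes ever propagated

    def feed(self, payload):
        # Append the propagated bytes and advance the master offset by N.
        self.buf += payload
        self.master_offset += len(payload)
        overflow = len(self.buf) - self.capacity
        if overflow > 0:
            del self.buf[:overflow]  # the oldest bytes fall out of the queue

    def first_offset(self):
        # Offset of the oldest byte still held in the backlog.
        return self.master_offset - len(self.buf)

    def since(self, slave_offset):
        """Bytes the slave is missing, or None if they are no longer cached."""
        if slave_offset < self.first_offset() or slave_offset > self.master_offset:
            return None
        return bytes(self.buf[slave_offset - self.first_offset():])


backlog = ReplicationBacklog(capacity=8)
for chunk in (b"AAAA", b"BBBB", b"CCCC"):
    backlog.feed(chunk)

missing = backlog.since(8)   # slave had received 8 bytes: partial sync possible
too_old = backlog.since(2)   # bytes 2..3 were evicted: full sync needed
```

A small capacity is used here only to make eviction visible; a larger backlog tolerates longer disconnections before a full synchronization becomes necessary.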

Each Redis server has its own unique run ID, which is generated automatically at startup and consists of 40 random hexadecimal characters. During the first master/slave synchronization, the master sends its run ID to the slave server, which saves it. After a disconnection and reconnection, the secondary server sends this ID to the primary server when requesting synchronization. The primary server checks whether the ID matches its own. If it does, the primary continues with the remaining steps of partial synchronization; otherwise, a full synchronization is performed.
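The decision the master makes on a psync request can be sketched as below. This is a hedged illustration: `new_run_id` and `handle_psync` are hypothetical helpers, and the returned strings are simplified labels rather than the exact wire replies.

```python
import secrets


def new_run_id():
    # 20 random bytes rendered as 40 hexadecimal characters.
    return secrets.token_hex(20)


def handle_psync(master_run_id, backlog_start, master_offset,
                 slave_run_id, slave_offset):
    """Decide between partial and full synchronization (simplified)."""
    if slave_run_id != master_run_id:
        # The slave was replicating a different master (or this one restarted).
        return "FULLRESYNC"
    if not (backlog_start <= slave_offset <= master_offset):
        # The missing data is no longer in the command cache.
        return "FULLRESYNC"
    return "CONTINUE"


run_id = new_run_id()
same_master = handle_psync(run_id, 100, 500, run_id, 300)
other_master = handle_psync(run_id, 100, 500, new_run_id(), 300)
stale_offset = handle_psync(run_id, 100, 500, run_id, 50)
```

Checking the run ID first matters because offsets from a different master are meaningless, even if they happen to fall inside the current backlog range.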

2.4 Heartbeat Detection

By default, once the connection between the master and slave servers is established, the slave server sends a REPLCONF ACK command, carrying its current replication offset, to the master server every second to report its status.
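For reference, this is roughly what the heartbeat looks like on the wire, assuming the standard Redis protocol encoding of a command as an array of bulk strings. `encode_command` is a hypothetical helper written for this sketch, and the offset value 1024 is an arbitrary example.

```python
def encode_command(*args):
    """Encode a command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


# The slave reports that it has processed 1024 bytes of the replication stream.
heartbeat = encode_command("REPLCONF", "ACK", 1024)
```

The offset travels as an ordinary string argument, which is what lets the master compare it with its own offset on every heartbeat.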

The master server can detect several problems from this command:

1. Network connection status between master and slave

If the primary server does not receive a heartbeat from a secondary server within a specified period of time, it knows there is a problem with the connection between them. Suppose the primary server is configured with:

min-slaves-to-write 3
min-slaves-max-lag 10

With these settings, the master server rejects write commands when fewer than 3 slave servers are connected with a heartbeat lag of at most 10 seconds.
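The write-protection rule behind these two options can be sketched as a small function. It assumes the master counts a slave as healthy when the time since its last heartbeat is at most the configured maximum lag; `master_accepts_writes` and its parameters are hypothetical names for illustration.

```python
def master_accepts_writes(slave_lags, min_slaves_to_write=3, min_slaves_max_lag=10):
    """slave_lags: seconds since the last heartbeat from each connected slave."""
    healthy = sum(1 for lag in slave_lags if lag <= min_slaves_max_lag)
    return healthy >= min_slaves_to_write


enough_healthy = master_accepts_writes([1, 2, 3])
too_few_slaves = master_accepts_writes([1, 2])
one_lagging = master_accepts_writes([1, 2, 11])
```

Only the first call returns True: the second has too few slaves, and in the third one of the three slaves is lagging beyond the limit.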

2. Check whether the new write command is lost

Each time the master server receives the offset carried in a heartbeat from a slave server, it compares it with its own offset. If the slave's offset is smaller, the master proactively resends the missing commands to the slave server, provided they are still in the command cache. (If they are not, the master presumably initiates a full synchronization, but this has not been verified.)

3. Supporting the min-slaves options

2.5 Notes

After the primary and secondary servers establish a socket connection, the secondary server first sends a PING command to check that the socket can be read and written normally. Receiving a PONG reply from the master server confirms that it can. The slave then determines whether the master server requires authentication and, if so, sends the password. Only then does the replication process begin.
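The handshake order described above can be sketched with a toy exchange. This is a simplified model under stated assumptions: `FakeMaster` and `slave_handshake` are hypothetical, the replies are plain strings rather than real protocol replies, and the toy master answers PING even before authentication.

```python
class FakeMaster:
    """Hypothetical master that optionally requires a password."""

    def __init__(self, requirepass=None):
        self.requirepass = requirepass
        self.authed = requirepass is None

    def handle(self, *cmd):
        name = cmd[0].upper()
        if name == "PING":
            return "PONG"
        if name == "AUTH":
            self.authed = cmd[1] == self.requirepass
            return "OK" if self.authed else "ERR"
        if name == "PSYNC":
            return "FULLRESYNC" if self.authed else "NOAUTH"
        return "ERR"


def slave_handshake(master, masterauth=None):
    # Step 1: check that the socket can be read and written.
    if master.handle("PING") != "PONG":
        return "connection check failed"
    # Step 2: authenticate if a password is configured on the slave.
    if masterauth is not None and master.handle("AUTH", masterauth) != "OK":
        return "authentication failed"
    # Step 3: start replication (first synchronization, so no run ID or offset yet).
    if master.handle("PSYNC", "?", "-1") == "NOAUTH":
        return "authentication required"
    return "replicating"


open_master = slave_handshake(FakeMaster())
with_password = slave_handshake(FakeMaster("secret"), masterauth="secret")
wrong_password = slave_handshake(FakeMaster("secret"), masterauth="oops")
no_password = slave_handshake(FakeMaster("secret"))
```

The four calls cover a master without a password, a matching password, a wrong password, and a slave that has no password configured for a protected master.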