The article was first published on the public account “Mushroom Can’t Sleep”

Preface

Redis master-slave replication is similar to MySQL’s: its main roles are data backup and read/write separation. Because replication is so central to Redis, it is worth understanding its underlying implementation, both for interviews and for day-to-day work. This article looks at how Redis implements master-slave replication.

What is Redis master-slave replication?

In Redis, you can use the SLAVEOF command or the slaveof configuration option to have one server replicate another. The server being replicated is called the “master server”, the server that performs the replication is called the “slave server”, and the arrangement formed by the two is called “master-slave replication”.

Redis master-slave replication has the following features:

  • Redis uses asynchronous replication: the slave asynchronously acknowledges to the master the amount of data it has processed.
  • A master can have more than one slave.
  • The slave can accept connections from other slaves. In addition to multiple slaves being connected to the same master, slaves can also be connected to other slaves in a cascading structure. As of Redis 4.0, all sub-slaves will receive exactly the same replication stream from the master.
  • Redis replication is non-blocking on the master side. This means that the master can continue to process query requests during the initial or partial resynchronization of one or more slaves.
  • Replication on the slave side is also mostly non-blocking, and this is configurable in redis.conf. If the slave is configured to serve stale data, it handles query requests using the old data set while the initial synchronization is in progress. Otherwise, it returns an error to the client.

How to implement master-slave replication?

Suppose you have two Redis servers with addresses 127.0.0.1:6379 and 127.0.0.1:12345. Run the following command on server 127.0.0.1:12345:

127.0.0.1:12345> SLAVEOF 127.0.0.1 6379
OK

Server 127.0.0.1:12345 is now a slave of 127.0.0.1:6379. For example, store some data on the master server:

127.0.0.1:6379> set msg "hello world"
OK

The data can then be read directly from the slave server:

127.0.0.1:12345> get msg
"hello world"

Deleting data works the same way: a key deleted on the master is also deleted on the slave.

Principle of master-slave replication

Redis replication consists of two operations, synchronization (sync) and command propagation (command propagate):

  • Synchronization brings the database state of the slave server up to the current state of the master server.
  • Command propagation takes over afterwards: whenever the database state of the master server changes and the two servers become inconsistent, it brings the master and slave back to the same state.

Let’s look at these two operations in more detail.

Synchronization

Text explanation:

  1. The client sends the SLAVEOF command to the slave server, which checks whether this is its first replication. The first replication normally happens when the master/slave relationship is created.
  2. If it is the first replication, the slave sends PSYNC ? -1 to the master, asking it to perform a full resynchronization.
  3. After receiving the full resynchronization request, the master runs BGSAVE to generate an RDB file in the background, and uses a buffer to record every write command executed from that point on.
  4. The master replies to the slave with +FULLRESYNC [master server ID] [replication offset] (the same offset as in the figure) and, once BGSAVE has finished, sends the RDB file followed by the write commands recorded in the buffer.
  5. The slave loads the RDB file it received and then executes the buffered write commands sent by the master, which brings its data to the same state as the master’s.
  6. If it is not the first replication, the slave was probably disconnected and its data is now out of date with respect to the master, so it has to catch up. The slave requests a partial resynchronization through the following steps.
  7. The slave sends PSYNC [master server ID] [replication offset] to the master (both values were sent by the master during the first replication). The master server ID identifies the master the slave was replicating before the disconnection, so the master can tell whether the slave was synchronized with it. The replication offset is the position reached by the last synchronization and tells the master where to resume.
  8. When the master receives the command and finds the requested position in its replication backlog buffer, it replies +CONTINUE to indicate a partial resynchronization, and then sends the slave all data stored in the replication backlog buffer after that offset. If the data after the offset can no longer be found, a full resynchronization is performed instead, so that the slave still reaches the same state as the master.
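
The decision the master makes in the steps above can be sketched in a few lines of Python. This is a simplified model rather than Redis source code: the Master class, its fields, and handle_psync are hypothetical names chosen for illustration, the backlog is modelled as a plain bytes value, and offsets are counted as "bytes already received by the slave".

```python
from dataclasses import dataclass


@dataclass
class Master:
    """Hypothetical, simplified view of the master's replication state."""
    run_id: str      # server ID of this master
    offset: int      # replication offset: total bytes propagated so far
    backlog: bytes   # most recent propagated bytes still kept in the backlog

    @property
    def backlog_start(self) -> int:
        """Offset of the oldest byte that is still available in the backlog."""
        return self.offset - len(self.backlog)


def handle_psync(master: Master, run_id: str, offset: int):
    """Decide between partial and full resynchronization.

    Returns (reply, payload). For a partial resync the payload holds the
    missing backlog bytes; for a full resync it is None, because the slave
    has to wait for an RDB file instead.
    """
    same_master = run_id == master.run_id
    in_backlog = master.backlog_start <= offset <= master.offset
    if same_master and in_backlog:
        missing = master.backlog[offset - master.backlog_start:]
        return "+CONTINUE", missing
    # First replication (PSYNC ? -1), unknown master ID, or data already gone.
    return f"+FULLRESYNC {master.run_id} {master.offset}", None


master = Master(run_id="a" * 40, offset=20, backlog=b"0123456789")
first_reply, _ = handle_psync(master, "?", -1)               # first replication
partial_reply, missing = handle_psync(master, "a" * 40, 15)  # reconnecting slave
stale_reply, _ = handle_psync(master, "a" * 40, 5)           # offset fell out of the backlog
```

In this sketch a first-time slave and a slave whose offset is older than the backlog both fall back to a full resynchronization, while a slave that reconnects quickly only receives the bytes it missed.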

Command propagation

After synchronization, the master server may execute further write commands, which makes the data on the master and slave servers inconsistent again.

To deal with this, the master server sends every write command it executes to the slave server. Once the slave has executed these commands, the data on the master and slave servers is consistent again.
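
As a rough illustration of this idea, the toy model below replays every write executed on the master against each attached slave. The Server and Master classes and the tuple-based command format are assumptions made for this sketch, and commands are delivered synchronously here, whereas real Redis propagates them asynchronously over the network.

```python
class Server:
    """Toy in-memory key-value store that understands SET and DEL."""

    def __init__(self) -> None:
        self.data = {}

    def apply(self, command: tuple) -> None:
        name, *args = command
        if name == "SET":
            key, value = args
            self.data[key] = value
        elif name == "DEL":
            for key in args:
                self.data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {name}")


class Master(Server):
    """A server that forwards each write command to its slaves."""

    def __init__(self) -> None:
        super().__init__()
        self.slaves = []

    def execute(self, command: tuple) -> None:
        self.apply(command)            # change the master's own state first
        for slave in self.slaves:      # then propagate the same command
            slave.apply(command)


master = Master()
slave = Server()
master.slaves.append(slave)

master.execute(("SET", "msg", "hello world"))
state_after_set = dict(slave.data)

master.execute(("DEL", "msg"))
state_after_del = dict(slave.data)
```

Because the slave executes exactly the commands the master executed, both the SET and the later DEL end up reflected on the slave.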

During command propagation, the slave server sends the REPLCONF ACK <replication_offset> command to the master server once per second. This command serves three purposes:

  • Checking the network connection between the master and slave servers.
  • Helping to implement the min-slaves options.
  • Detecting lost commands.
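
The sketch below shows, under simplifying assumptions, how a master could use the acknowledgements for these three purposes. SlaveState and the helper functions are hypothetical names for this example. The two parameters are modelled on the min-slaves-to-write and min-slaves-max-lag options, and times are plain numbers in seconds.

```python
from dataclasses import dataclass


@dataclass
class SlaveState:
    """What the master remembers about one slave (hypothetical structure)."""
    ack_offset: int   # replication offset reported in the last REPLCONF ACK
    ack_time: float   # time at which that ACK arrived, in seconds


def lag(slave: SlaveState, now: float) -> float:
    """Seconds since the last ACK; a large value hints at a network problem."""
    return now - slave.ack_time


def write_allowed(slaves, now: float, min_slaves_to_write: int,
                  min_slaves_max_lag: float) -> bool:
    """Accept writes only if enough slaves have acknowledged recently."""
    good = sum(1 for s in slaves if lag(s, now) <= min_slaves_max_lag)
    return good >= min_slaves_to_write


def bytes_to_resend(master_offset: int, slave: SlaveState) -> int:
    """Bytes the slave has not acknowledged; non-zero suggests lost commands."""
    return max(0, master_offset - slave.ack_offset)


now = 10.0
slaves = [SlaveState(ack_offset=100, ack_time=9.5),
          SlaveState(ack_offset=90, ack_time=2.0)]

strict = write_allowed(slaves, now, min_slaves_to_write=2, min_slaves_max_lag=5)
relaxed = write_allowed(slaves, now, min_slaves_to_write=2, min_slaves_max_lag=10)
missing = bytes_to_resend(100, slaves[1])
```

Here the second slave last acknowledged 8 seconds ago at offset 90, so it counts as lagging under the stricter setting and is 10 bytes behind a master whose offset is 100.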

Key terms

  1. Master server ID: identifies a server.
  • Each server, whether master or slave, has its own unique server ID.
  • The ID is generated when the server starts and consists of 40 random hexadecimal characters.
  2. Replication backlog buffer: a fixed-length, first-in, first-out (FIFO) queue maintained by the master server, with a default size of 1MB.
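
Both terms can be illustrated with a short Python sketch. It is a minimal model under stated assumptions, not the actual Redis data structure: generate_run_id and ReplicationBacklog are hypothetical names, and the buffer is a simple bytearray that drops its oldest bytes instead of a true circular buffer.

```python
import secrets


def generate_run_id() -> str:
    """Return 40 random hexadecimal characters, in the style of a server ID."""
    return secrets.token_hex(20)  # 20 random bytes -> 40 hex characters


class ReplicationBacklog:
    """Fixed-length FIFO byte buffer that tracks the offset of its oldest byte."""

    def __init__(self, size: int = 1024 * 1024) -> None:
        self.size = size           # capacity in bytes (1MB by default)
        self.data = bytearray()
        self.master_offset = 0     # total bytes ever fed into the backlog

    def feed(self, payload: bytes) -> None:
        """Append propagated bytes, discarding the oldest ones when full."""
        self.data += payload
        self.master_offset += len(payload)
        overflow = len(self.data) - self.size
        if overflow > 0:
            del self.data[:overflow]

    @property
    def first_offset(self) -> int:
        """Offset of the oldest byte still kept in the buffer."""
        return self.master_offset - len(self.data)

    def read_from(self, offset: int):
        """Return the bytes after `offset`, or None if they are unavailable."""
        if offset < self.first_offset or offset > self.master_offset:
            return None
        return bytes(self.data[offset - self.first_offset:])


backlog = ReplicationBacklog(size=10)
backlog.feed(b"abcdefgh")
recent = backlog.read_from(5)     # still in the buffer
backlog.feed(b"ijkl")             # pushes out the two oldest bytes
too_old = backlog.read_from(1)    # already discarded
```

When read_from returns None in this model, the requested data has been pushed out of the queue, which corresponds to the case where a partial resynchronization is no longer possible.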

Conclusion

Redis master-slave replication is implemented through the PSYNC command and comes in two forms, partial resynchronization and full resynchronization. Partial resynchronization relies on the replication offset, the replication backlog buffer, and the server ID. Full resynchronization relies on the RDB file and the buffer of write commands. Master-slave replication mainly addresses data backup and read/write separation.
