Redis persistence provides a degree of data safety: even if the server goes down, very little data is lost. To avoid a single point of failure, data is usually replicated to several servers, and the servers that hold these copies can also handle client read requests, which improves overall performance.

This article introduces the configuration and implementation principles of Redis master-slave replication. Later articles will cover the Redis high availability solutions: Sentinel and Cluster.

What is master-slave replication

You can use the slaveof command, or set the slaveof option in the configuration file, to make one server replicate the contents of another. The server being replicated is called the master server, and the server doing the replicating is called the slave server.

The master server can handle both reads and writes. When data on the master changes, the master sends a stream of commands to keep the slave up to date. The slave server is usually read-only (controlled by the slave-read-only option), so data cannot be written to the master through a slave.

A master can have multiple slaves. Slaves can also accept connections from other slaves, forming a cascading structure, and since Redis 4.0 all sub-slaves receive exactly the same replication stream from the master.

Benefits of master-slave replication:

  • Data redundancy: implements hot backup of data
  • Fault recovery to prevent service unavailability caused by a single point of failure
  • Read/write separation and load balancing: the master node handles reads and writes, the slave nodes handle reads, which increases the number of concurrent requests the service can handle
  • Foundation for high availability: replication is the basis on which the Sentinel mechanism and Cluster are built

Configuring master-slave replication

Using and configuring master-slave replication is easy. Either set the slaveof option in the slave's configuration file or run the slaveof <masterip> <masterport> command directly.

The IP address of the master server is 192.168.249.20. The IP addresses of the two slave servers are 192.168.249.21 and 192.168.249.22 respectively. All three use port 6379.

The master server does not need any replication-specific configuration. First, here are the settings that need to be changed on all three servers:

# run in the background
daemonize yes
# pid file; if you run a master and slaves on the same machine, give each instance its own file
pidfile /var/run/redis_6379.pid
# bind address; commented out here so that connections from other hosts are accepted
# bind 127.0.0.1
# log file
logfile "6379.log"

On the slave servers, add the slaveof <masterip> <masterport> option. Since Redis 5.0, replicaof is used in place of slaveof (github.com/antirez/red…); slaveof still works, but replicaof is recommended. If you set up replication with the command instead of the configuration file, the setting is lost after a restart.

# replicaof <masterip> <masterport>
replicaof 192.168.249.20 6379

After redis.conf is configured, we start the three servers and run the info replication command to view the replication information

192.168.249.20:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.249.22,port=6379,state=online,offset=700,lag=0
slave1:ip=192.168.249.21,port=6379,state=online,offset=700,lag=0
master_replid:b80a4720c0001efb62940f5ad6abaf9cdaf7a813
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:700
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:700

192.168.249.21:6379> info replication
# Replication
role:slave
master_host:192.168.249.20
master_port:6379
master_link_status:up
master_last_io_seconds_ago:3
master_sync_in_progress:0
slave_repl_offset:854
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:b80a4720c0001efb62940f5ad6abaf9cdaf7a813
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:854
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:57
repl_backlog_histlen:798

192.168.249.22:6379> info replication
# Replication
role:slave
master_host:192.168.249.20
master_port:6379
master_link_status:up
master_last_io_seconds_ago:6
master_sync_in_progress:0
slave_repl_offset:854
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:b80a4720c0001efb62940f5ad6abaf9cdaf7a813
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:854
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:854
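If you want to check this output from a script, for example to see how far each slave is behind the master, the text is easy to parse. The following is only a minimal sketch: parse_info_replication is a hypothetical helper written for this article, not part of Redis, and SAMPLE is a shortened copy of the master's output shown above.

```python
def parse_info_replication(text):
    """Parse the text of `info replication` into a dict.

    Blank lines and lines starting with '#' (section headers) are skipped.
    Entries such as slave0 hold comma-separated key=value pairs and are
    expanded into a nested dict.
    """
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        if "=" in value:
            value = dict(pair.split("=", 1) for pair in value.split(","))
        info[key] = value
    return info


# Shortened copy of the master's output (assumption: same format as above)
SAMPLE = """\
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.249.22,port=6379,state=online,offset=700,lag=0
slave1:ip=192.168.249.21,port=6379,state=online,offset=700,lag=0
master_repl_offset:700
"""

info = parse_info_replication(SAMPLE)
master_offset = int(info["master_repl_offset"])
for name in ("slave0", "slave1"):
    slave = info[name]
    behind = master_offset - int(slave["offset"])
    print(f"{name} {slave['ip']} state={slave['state']} behind={behind} bytes")
```

A slave whose offset equals master_repl_offset has received everything the master has sent so far.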

We can then write data to the primary server and then read data from other servers

192.168.249.20:6379> set test 'Hello World'
OK
192.168.249.21:6379> get test
"Hello World"
192.168.249.22:6379> get test
"Hello World"

When we try to write data on a slave server, we get an error saying that the slave is read-only and cannot be written to

192.168.249.21:6379> set test2 hello
(error) READONLY You can't write against a read only replica.

If the master should require slaves to authenticate before they can replicate, set the requirepass <password> option on the master

To make a slave server use this password, run config set masterauth <password> on the slave, or set masterauth in its configuration file.

Principle of master-slave replication

The configuration of master-slave replication is relatively simple. The implementation principle of master-slave replication is described below

The Redis master-slave replication process is generally divided into three stages: connection establishment, data synchronization, and command propagation

Establish a connection

This phase covers how the slave server, after executing the slaveof command, establishes a connection with the master server in preparation for data synchronization.

1) After the slaveof command is executed, the slave server creates a socket connection to the master server using the configured master IP address and port. Once the connection succeeds, the slave server attaches a dedicated handler to the socket to take care of the subsequent replication work

2) After the connection is established, the slave server sends the PING command to the master server to check whether the master is reachable and currently able to process commands. If the slave receives a PONG reply, the master is available; otherwise there may be a network timeout or the master may be blocked, and the slave server disconnects and tries to reconnect

3) Authentication. If the master server has the requirepass option set, the slave server must have the masterauth option set to the same password in order to pass authentication

4) After authentication, the slave server sends its listening port to the master server, which saves it. The port then appears in the master's info replication output

192.168.249.20:6379> info replication
...
slave0:ip=192.168.249.22,port=6379,state=online,offset=700,lag=0
slave1:ip=192.168.249.21,port=6379,state=online,offset=700,lag=0
...
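To make the handshake more concrete, the sketch below lists the commands a slave would send in steps 2 to 4 and encodes them in the Redis wire format (RESP arrays of bulk strings). It is an illustration only: handshake_commands is a hypothetical helper, the password "secret" is a placeholder, and a real slave exchanges further options that are left out here.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = str(arg).encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def handshake_commands(listening_port, masterauth=None):
    """Return the commands a slave sends, in order, before synchronizing."""
    commands = [("PING",)]                           # step 2: is the master alive?
    if masterauth is not None:
        commands.append(("AUTH", masterauth))        # step 3: authentication
    commands.append(("REPLCONF", "listening-port", listening_port))  # step 4
    return commands


# "secret" is a placeholder password used only for this example
for cmd in handshake_commands(6379, masterauth="secret"):
    print(cmd, encode_command(*cmd))
```

Each command is answered by the master before the slave moves on to the next step; if any reply is an error or times out, the slave drops the connection and starts over.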

Data synchronization

After the primary and secondary servers establish a connection to confirm their identities, data synchronization begins. The secondary server sends the PSYNC command to the primary server to perform the synchronization and update its database status to the primary server’s database status

Master/slave synchronization in Redis can be divided into full resynchronization and partial resynchronization.

Complete resynchronization

There are two cases in which a complete resynchronization happens. The first is when a slave connects to the master and replicates from it for the first time. The second is when the master and slave have been disconnected: on reconnecting, a complete resynchronization may be needed, as discussed later

Here are the steps of a complete resynchronization

  • The slave server connects to the master server and sends the SYNC command
  • After receiving the SYNC command, the master server starts executing bgsave to generate an RDB file, and uses a buffer to record all write commands executed from then on
  • After bgsave completes, the master server sends the snapshot file to all slave servers, and continues recording executed write commands while the file is being sent
  • After receiving the snapshot file, the slave server discards all of its old data and loads the received snapshot
  • After the snapshot has been sent, the master server sends the write commands in the buffer to the slave server
  • After loading the snapshot, the slave server starts accepting command requests and executes the write commands from the master server's buffer
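The steps above can be sketched as a small in-memory simulation. This is a toy model, not Redis code: the Master and Slave classes are hypothetical, a dict copy stands in for the RDB file produced by bgsave, and each write command is reduced to a (key, value) pair.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.buffer = None  # write commands recorded while a snapshot is in flight

    def write(self, key, value):
        self.data[key] = value
        if self.buffer is not None:
            self.buffer.append((key, value))

    def start_sync(self):
        """Handle SYNC: take a snapshot (stands in for bgsave) and start buffering."""
        self.buffer = []
        return dict(self.data)

    def finish_sync(self):
        """Return the buffered write commands and stop buffering."""
        buffered, self.buffer = self.buffer, None
        return buffered


class Slave:
    def __init__(self):
        self.data = {"stale": "old value"}

    def load_snapshot(self, snapshot):
        self.data = dict(snapshot)  # discard old data, load the snapshot

    def apply(self, commands):
        for key, value in commands:
            self.data[key] = value


master, slave = Master(), Slave()
master.write("test", "Hello World")

snapshot = master.start_sync()       # master receives SYNC and takes a snapshot
master.write("test2", "hello")       # a write arrives while the snapshot is being sent
slave.load_snapshot(snapshot)        # slave drops its old data and loads the snapshot
slave.apply(master.finish_sync())    # master sends the buffered write commands

print(slave.data == master.data)
```

The buffer is what keeps the slave from missing writes that arrive between the moment the snapshot is taken and the moment it has been loaded.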

Partial resynchronization

Partial resynchronization handles the case where replication resumes after a disconnect. First, here are the components that partial resynchronization relies on

  • runid (replication ID): the master server's run ID. When a Redis instance starts, it randomly generates a unique 40-character string that identifies the current node
  • offset: the replication offset. The master and slave servers each maintain a replication offset that records the number of bytes transferred. When the master sends N bytes of data to a slave, the master's offset increases by N, and when the slave receives N bytes of data from the master, the slave's offset increases by N
  • replication backlog buffer: a fixed-length FIFO queue whose size is set by the repl-backlog-size configuration parameter and defaults to 1MB. Note that there is only one such buffer; it is maintained by the master and shared by all slaves. It holds the most recent data the master has sent to the slaves

When the slave reconnects to the master, it executes PSYNC <runid> <offset> to send the runid (replication ID) of the old master and its own offset, so that the master can send only the incremental part the slave is missing. However, if the master no longer has enough command history in the replication backlog buffer, or the slave sends a replication ID the master does not recognize, a complete resynchronization is performed, that is, the slave receives a full copy of the data set
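The decision the master makes when it receives PSYNC can be sketched as follows. This is a simplified model under stated assumptions: ReplicationBacklog and handle_psync are hypothetical names, offsets count from 0, and the offset passed in is the number of bytes the slave has already received. The reply names FULLRESYNC and CONTINUE mirror the replies Redis uses, but the return values here are plain tuples for illustration.

```python
class ReplicationBacklog:
    """Fixed-size buffer holding the most recent bytes of the replication stream."""

    def __init__(self, size):
        self.size = size        # assumed to be greater than 0
        self.buf = b""
        self.master_offset = 0  # total bytes ever written to the stream

    def feed(self, data):
        self.master_offset += len(data)
        self.buf = (self.buf + data)[-self.size:]  # oldest bytes fall out first

    @property
    def first_offset(self):
        """Offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buf)

    def read_from(self, offset):
        return self.buf[offset - self.first_offset:]


def handle_psync(master_replid, backlog, replid, offset):
    """Decide between partial and complete resynchronization."""
    if replid != master_replid:
        return ("FULLRESYNC", None)   # slave was following a different master
    if not backlog.first_offset <= offset <= backlog.master_offset:
        return ("FULLRESYNC", None)   # missing data is no longer in the backlog
    return ("CONTINUE", backlog.read_from(offset))


backlog = ReplicationBacklog(size=10)
backlog.feed(b"0123456789ABCDEF")  # 16 bytes; only the last 10 are kept

print(handle_psync("id-1", backlog, "id-1", 12))  # small gap: partial resync
print(handle_psync("id-1", backlog, "id-1", 2))   # gap fell out of the backlog
print(handle_psync("id-1", backlog, "id-2", 12))  # wrong replication ID
```

This also shows why repl-backlog-size matters: the longer a slave may stay disconnected and the higher the write rate, the larger the backlog must be for a partial resynchronization to remain possible.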

The PSYNC command is used to perform complete and partial resynchronization

Command propagation

After data synchronization is complete, the data on the primary and secondary servers is temporarily consistent. After the primary server executes the write command from the client, the data on the primary and secondary servers is no longer consistent. To ensure data consistency between the primary and secondary servers, the primary server performs command propagation operations on the secondary server, that is, every time a write command is executed, the primary server sends the same write command to the secondary server

During the command propagation phase, the slave server sends a heartbeat command to the master server, once per second by default

REPLCONF ACK <replication_offset>

Here, replication_offset is the slave server's current replication offset. The command serves three purposes

  • Check the state of the network connection between the master and slave servers
  • Help implement the min-slaves options
  • Detect lost commands
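The three uses of the heartbeat can be illustrated with a small model of the bookkeeping the master might keep per slave. This is a sketch under assumptions, not Redis internals: SlaveState, missing_bytes and writes_allowed are hypothetical names, and the two parameters of writes_allowed are modelled on the min-slaves-to-write and min-slaves-max-lag options.

```python
class SlaveState:
    """What the master remembers about one slave's heartbeats."""

    def __init__(self):
        self.ack_offset = 0
        self.last_ack_time = None

    def on_replconf_ack(self, offset, now):
        self.ack_offset = offset
        self.last_ack_time = now

    def lag(self, now):
        """Seconds since the last REPLCONF ACK, or None if none was received."""
        return None if self.last_ack_time is None else now - self.last_ack_time


def missing_bytes(master_offset, slave):
    """Bytes the master has sent that the slave has not acknowledged yet."""
    return max(0, master_offset - slave.ack_offset)


def writes_allowed(slaves, now, min_slaves_to_write, min_slaves_max_lag):
    """Refuse writes unless enough slaves have acknowledged recently."""
    good = sum(
        1 for s in slaves
        if s.lag(now) is not None and s.lag(now) <= min_slaves_max_lag
    )
    return good >= min_slaves_to_write


a, b = SlaveState(), SlaveState()
a.on_replconf_ack(offset=854, now=100.0)
b.on_replconf_ack(offset=700, now=85.0)   # b has not been heard from for a while

print(missing_bytes(854, b))
print(writes_allowed([a, b], now=101.0, min_slaves_to_write=2, min_slaves_max_lag=10))
print(writes_allowed([a, b], now=101.0, min_slaves_to_write=1, min_slaves_max_lag=10))
```

A large lag indicates a connection problem, the count of recently heard slaves feeds the min-slaves check, and a slave offset smaller than the master's tells the master which data has to be sent again.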

Security of replication when persistence is turned off

On the safety of replication when persistence is turned off, the following short description is taken from the official website: www.redis.cn/topics/repl…

In setups that use Redis replication, it is strongly recommended to enable persistence on the master and on the slaves. When this is not possible, for example because of latency concerns caused by very slow disks, the instances should be configured so that they do not restart automatically after a reboot.

To better understand why a master with persistence turned off and automatic restart configured is dangerous, check the following failure modes in which data is deleted from the master and all slaves:

  1. We set node A to master and turn off its persistence, and nodes B and C copy data from node A.
  2. Node A crashed, but it had some auto-restart system that could restart the process. However, since persistence is turned off, the data set is empty when the node is restarted.
  3. Nodes B and C replicate from node A, but node A's data set is now empty, so as a result of the replication they destroy their own copies of the data.

When Redis Sentinel is used for high availability, turning persistence off on the master while allowing the process to restart automatically is also dangerous. For example, the master can restart fast enough that Sentinel does not detect a failure, so the failure mode described above still occurs.

Data security is important at all times, so if the master is using replication and persistence is not configured, then automatic process restart should be disabled

Reference: Redis Design and Implementation, Redis Replication