Redis Deep Adventure is divided into two parts: stand-alone Redis and distributed Redis.

More articles can be found on my blog: github.com/farmerjohng…

This article is the first in the distributed Redis in-depth adventure series, focusing on Redis replication capabilities.

Like most distributed storage systems, Redis provides replication to support a master-slave design. The benefits of a master-slave design are as follows:

  • Read/write separation improves read/write performance
  • Data backup reduces the risk of data loss
  • High availability, avoiding single point of failure

The old replication implementation

Redis replication is divided into two steps: synchronization and command propagation:

Synchronization can be understood as a full copy: at a certain point in time, all data on the master server is synchronized to the slave server.

Command propagation can be understood as incremental: when data on the master server is modified, the master sends the corresponding write command to the slave server.

Synchronization

The synchronization is divided into the following steps:

1. The slave server sends the SYNC command to the master server (SYNC is also executed as the first step of the SLAVEOF command)

2. When the master server receives the command from the slave, it performs BGSAVE, that is, it forks a child process to save the in-memory data to an RDB file. Write commands executed from this point on are recorded in a memory buffer, whose purpose is to capture the increments produced while the RDB file is being generated.

3. The master sends the RDB file to the slave server

4. The master sends the write commands in the buffer to the slave server
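As an illustration only (a toy Python model, not Redis source code), the four steps above can be sketched with a dict standing in for both the dataset and the RDB file:

```python
# Toy model of full synchronization: snapshot the dataset (the stand-in
# for the RDB file), buffer writes that arrive while the snapshot is
# being produced, then ship the snapshot plus the buffered increments.

def full_sync(master_data, writes_during_bgsave):
    snapshot = dict(master_data)             # step 2: BGSAVE takes a point-in-time copy
    buffer = []                              # step 2: memory buffer for increments
    for key, value in writes_during_bgsave:  # writes executed during BGSAVE
        master_data[key] = value
        buffer.append((key, value))
    slave = dict(snapshot)                   # step 3: slave loads the RDB file
    for key, value in buffer:                # step 4: replay buffered write commands
        slave[key] = value
    return slave

master = {"a": 1}
slave = full_sync(master, [("b", 2)])
print(slave == master)  # True: slave matches master after steps 3 and 4
```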

Synchronization happens in two situations: when the slave server connects to the master for the first time, and when the slave reconnects and resynchronizes after the network connection between slave and master drops.

Command propagation

The logic of command propagation is simpler: whenever the master server executes a write command, it sends that command to its slave servers to keep their data consistent with its own. Once a slave receives and executes the command, its data matches the master's (with some delay, of course). Note that slave servers are read-only to clients, so all data on a slave comes from the master, either through synchronization or through command propagation.
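A minimal sketch of this idea (illustrative Python, not Redis's implementation): the master applies each write locally, then forwards the same command to every connected slave.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.slaves = []

    def attach(self, slave):
        self.slaves.append(slave)

    def set(self, key, value):
        self.data[key] = value      # execute the write command locally
        for s in self.slaves:       # then propagate the command to all slaves
            s.apply("SET", key, value)

class Slave:
    def __init__(self):
        self.data = {}              # read-only to clients; fed only by the master

    def apply(self, cmd, key, value):
        if cmd == "SET":
            self.data[key] = value

m, s = Master(), Slave()
m.attach(s)
m.set("k1", "v1")
print(s.data["k1"])  # prints v1: the slave converges to the master's state
```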

Problems with the old replication

Let's see what goes wrong with the above replication approach, given that the network between Redis master and slave servers is not always reliable. Assume there is a master server A and a slave server B, and the master has 10,000 entries numbered 1 to 10,000.

1. On the initial connection, the slave server performs a full synchronization from the master. Afterwards, the slave also has the 10,000 entries numbered 1 to 10,000.

2. Entries 10001 and 10002 are added on the master

3. Through command propagation, the slave also adds entries 10001 and 10002

4. The network between the master and the slave is disconnected

5. Entry 10003 is added on the master. Because the network is down, the slave cannot see this change

6. After the network recovers, the slave reconnects to the master and sends the SYNC command to synchronize data

7. The master sends all data (entries 1 to 10003) to the slave

As the steps above show, whenever the slave reconnects to the master, a full synchronization is performed again, causing a large amount of unnecessary I/O overhead: the master has to dump its in-memory data to disk and then send it all to the slave. When the network is unstable and reconnections are frequent, this is especially wasteful.

The new replication implementation

To solve the problems of the old replication, the replication function was optimized in Redis 2.8. The implementation is as follows:

1. The master server maintains an offset that is increased by N each time N bytes of data are propagated to the slave servers. For example, it starts at 0, and after propagating a SET key1 value1 it might be 13 (the real offset may not be 13; this is just an example and would need to be verified against the code).

2. The slave server also maintains an offset. When the slave receives N bytes of data from the master, its offset is increased by N.

3. The master server maintains a fixed-size buffer and appends every write command it receives from clients to this buffer. When the written content exceeds the buffer's size, the oldest data is overwritten.

4. The master server has a unique ID

5. When connecting to the master server, the slave sends the ID of the master it last replicated from together with its own offset. There are several cases:

  1. If the slave does not send an ID, or the ID does not match the current master's, the master sends the full data set

  2. If the slave's offset cannot be found in the buffer (the slave is so far behind that its data has been overwritten by newer data), a full synchronization is performed as well

  3. If the offset can be found in the buffer, the master starts at that offset and sends the buffer's data to the slave in order. (Is there a pipeline optimization here?)

That is the general idea of the new replication. Note that the size of the master's buffer is critical: if it is set too large, space is wasted; if it is too small, a poor network environment will make replication degrade to the old full-sync behavior.
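The fixed-size buffer from step 3 behaves like a circular replication backlog. The following Python sketch (an illustration of the mechanism described above, not Redis source) shows both the backlog and the partial-versus-full resync decision:

```python
class ReplBacklog:
    """Toy circular buffer holding the most recent bytes of the command stream."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray(size)
        self.master_offset = 0          # total bytes ever propagated

    def feed(self, data):
        for b in data:                  # oldest bytes get overwritten
            self.buf[self.master_offset % self.size] = b
            self.master_offset += 1

    def can_partial_resync(self, slave_offset):
        # partial resync works only if the slave's offset is still in the buffer
        return (slave_offset <= self.master_offset
                and self.master_offset - slave_offset <= self.size)

    def read_from(self, slave_offset):
        # bytes the slave is missing, from its offset up to the master's
        return bytes(self.buf[i % self.size]
                     for i in range(slave_offset, self.master_offset))

backlog = ReplBacklog(8)                # deliberately tiny buffer
backlog.feed(b"SET a 1\n")              # 8 bytes: exactly fills it
print(backlog.can_partial_resync(0))    # True: nothing overwritten yet
backlog.feed(b"x")                      # a 9th byte overwrites offset 0
print(backlog.can_partial_resync(0))    # False: full sync required
print(backlog.read_from(1))             # partial resync sends bytes 1..8
```

The same structure explains the degradation: once `master_offset - slave_offset` exceeds the buffer size, the only safe option is a full resynchronization.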

I have stepped into this pit before: in the cloud, our Redis cluster spanned two different machine rooms, and the network between master and slave was not stable. On top of that, the values stored in Redis were relatively large, so the buffer filled up easily, causing a full synchronization on every reconnect and forming a vicious circle: the slave lagged so far behind that it was unreadable, and the master could not accept writes (when a slave lags too much, the master rejects writes; this is configurable, as described below).

Therefore, it is recommended to set the buffer size to: average reconnection interval × data written per second × 2.
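As a worked example of this sizing rule (the traffic figures below are assumptions for illustration; in redis.conf the resulting value would go into the repl-backlog-size setting):

```python
# Sizing the replication backlog: it must hold all writes produced during
# a typical disconnect, with a safety factor of 2.

avg_reconnect_interval_s = 5                # assumed: links recover in ~5 s
write_rate_bytes_per_s = 1 * 1024 * 1024    # assumed: ~1 MB/s of write traffic

backlog_size = avg_reconnect_interval_s * write_rate_bytes_per_s * 2
print(backlog_size)  # 10485760 bytes, i.e. a 10 MB backlog
```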

Master/slave heartbeat mechanism

By default, the slave server sends a heartbeat to the master once per second: REPLCONF ACK <replication_offset>, where replication_offset is the slave's current replication offset.

The heartbeat does three things:

1. Check the network connection between the master and slave servers

2. Implement the min-slaves feature

3. Detect command loss

Checking the network connection between master and slave

The master server records when each slave last sent a heartbeat. Based on this time, it can tell whether the connection between the master and the slave has failed.

Implementing the min-slaves feature

To ensure data safety, Redis can be configured so that the master rejects writes when fewer than min-slaves-to-write slaves are connected, or when the replication lag of those slaves is greater than or equal to min-slaves-max-lag.
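In redis.conf these limits correspond to two settings (shown here with illustrative values; Redis 5 and later spell them min-replicas-to-write / min-replicas-max-lag):

```
# Stop accepting writes unless at least 3 slaves are connected,
# each with a replication lag of no more than 10 seconds.
min-slaves-to-write 3
min-slaves-max-lag 10
```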

Detecting command loss

Replication between master and slave is actually implemented with the master server acting as a client of the slave server (in Redis, all data transfer between servers works this way). Suppose the master sends a write command to the slave, but because of a network problem the slave never receives it; the two data sets then become inconsistent. You might think the master could simply resend the command whenever sending fails, but consider the case where the slave executes the command successfully and only its reply to the master is lost: if the master resends, the slave applies the command twice, which is also wrong. So instead, the master decides what data to send based on the offset carried in the heartbeat. Here's an example:

Initially, both the master and the slave have an offset of 100.

The master receives a write command from a client and its offset becomes 110. At the same time, it sends the write command to the slave, but due to network problems the slave does not receive it, so the slave's offset is still 100. From the heartbeat, the master sees that the slave's offset is 100, so it resends the data between offsets 100 and 110.
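A sketch of this resend logic (illustrative Python, not Redis source): the master keeps the recent command stream indexed by absolute offset and, on each heartbeat, sends everything beyond the offset the slave acknowledged.

```python
def bytes_to_resend(stream, master_offset, acked_offset):
    # stream holds the command bytes at absolute offsets 0..master_offset-1;
    # the slave's heartbeat (REPLCONF ACK) reports acked_offset
    return stream[acked_offset:master_offset]

stream = b"old-data..SET k v\n"      # offsets 0..17 of the command stream
master_offset = len(stream)          # master is at offset 18
acked_offset = 10                    # heartbeat says the slave has bytes 0..9

print(bytes_to_resend(stream, master_offset, acked_offset))  # b'SET k v\n'
```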

Seeing this, you might wonder whether the scheme is really correct: suppose the slave, just before receiving the 100-110 data, sends a heartbeat telling the master that its offset is 100, and then receives the 100-110 data. Before the next heartbeat goes out, the master believes the slave is behind and sends the 100-110 data again, so the slave applies it twice, corrupting the data.

If you thought of this question, you have really been paying attention.

No such thing can happen, because Redis is single-threaded! Keep the words "single-threaded" in mind and re-read the scenario above.

Reference

Redis Design and Implementation, by Huang Jianhong, highly recommended!