Redis Master/Slave Architecture Notes

Redis is commonly used as a cache. When a system has to sustain 100,000+ concurrent requests, a standalone Redis instance becomes a performance bottleneck. For this kind of workload, where reads far outnumber writes, Redis is usually deployed in a master/slave architecture with read/write separation: the master service handles writes and the slave services handle highly concurrent reads. As read concurrency grows, more slave services can be added to scale horizontally.

Because a slave service only accepts read commands and all of its data comes from the master service, Redis relies on its replication feature to keep the two consistent. To make one Redis service replicate another, use the slaveof command or set the slaveof configuration option. Assume two Redis services, A at 127.0.0.1:6379 and B at 127.0.0.2:6379. Sending slaveof 127.0.0.1 6379 to service B makes service A the master node and service B a slave node of A. From then on, service B keeps its data consistent with service A by replicating it.

Old-version replication

In earlier versions of Redis, replication consists of two operations: synchronization (sync) and command propagation (propagate).

Synchronization: brings the database state of the slave service up to the current database state of the master service.
Command propagation: after the database state of the master service changes, the master propagates the commands that caused the change so that the slave service stays consistent.

Synchronization

After slaveof is executed, the master and slave services first synchronize their data so that both hold the same state.

Process:

The slave sends the SYNC command to the master.
After receiving SYNC, the master runs BGSAVE to generate an RDB file from the state of its database at that moment. While BGSAVE is running, the master records every write command it executes in a buffer.
The master sends the RDB file to the slave, and the slave loads it.
Once the RDB file has been loaded, the master sends the buffered write commands to the slave, which executes them.
The data on the master and the slave is now consistent.
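
The steps above can be sketched as a small in-memory simulation. This is only a toy model, not how Redis implements it: the Master and Slave classes, the dict standing in for the database, and the begin_snapshot/end_snapshot helpers (standing in for BGSAVE and the RDB transfer) are all hypothetical names chosen for illustration.

```python
import copy


class Master:
    """Toy master: a dict as the database plus a buffer used during a snapshot."""

    def __init__(self):
        self.db = {}
        self.snapshot_in_progress = False
        self.buffered_writes = []

    def set(self, key, value):
        self.db[key] = value
        if self.snapshot_in_progress:
            # Writes arriving while the snapshot is being produced are buffered.
            self.buffered_writes.append((key, value))

    def begin_snapshot(self):
        # Stands in for BGSAVE: capture the database state at this moment.
        self.snapshot_in_progress = True
        self.buffered_writes = []
        return copy.deepcopy(self.db)

    def end_snapshot(self):
        # Hand over the buffered writes and stop buffering.
        self.snapshot_in_progress = False
        writes, self.buffered_writes = self.buffered_writes, []
        return writes


class Slave:
    """Toy slave: loads a snapshot, then replays buffered writes."""

    def __init__(self):
        self.db = {}

    def load_snapshot(self, snapshot):
        self.db = dict(snapshot)

    def apply(self, writes):
        for key, value in writes:
            self.db[key] = value


master, slave = Master(), Slave()
master.set("a", "1")
snapshot = master.begin_snapshot()   # master starts the snapshot
master.set("b", "2")                 # write arriving during the snapshot
slave.load_snapshot(snapshot)        # slave loads the snapshot contents
slave.apply(master.end_snapshot())   # buffered writes are replayed
print(slave.db == master.db)         # True
```

Without the replay of buffered writes, the slave would miss key "b", which is why the master has to record writes made during BGSAVE.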

Command propagation

Once synchronization is complete, the master and slave hold the same data. That only covers the state at that moment, however: whenever the master executes further write commands, it propagates them to the slave so that the two stay consistent.

Defects

The replication scheme described so far keeps the master and slave consistent. However, when a slave goes offline and later reconnects to the master, the whole synchronization has to be repeated. Synchronization is a very expensive operation, and the slave may already hold most of the data and have no need for a full RDB file. Repeating the full synchronization just to fill in a small amount of missing data is inefficient.

New-version replication

To fix the inefficiency of old-version replication when a slave reconnects after a disconnection, Redis 2.8 and later use the PSYNC command instead of SYNC for synchronization. PSYNC supports two modes: full resynchronization and partial resynchronization.

Full resynchronization: used when the slave synchronizes with the master for the first time. The steps are basically the same as for SYNC.
Partial resynchronization: used when the slave reconnects to the master after a disconnection. The master sends only the write commands executed during the disconnection, and the slave applies them to recover the missing data.

Partial resynchronization

Partial resynchronization solves the problem of old-version replication. Its implementation relies on three things:

The replication offsets of the master and slave services
The replication backlog buffer of the master service
The run ID of the service

Replication offset

Once replication is running, the master and the slave each maintain their own replication offset. Every time the master propagates N bytes of commands, it increases its offset by N. Every time the slave receives N bytes of commands, it increases its offset by N. If the two offsets are equal, the master and slave data are consistent.
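
As a rough illustration of how the offset counts bytes, the sketch below encodes one write command in the Redis protocol (RESP) format and advances both offsets by its length. The encode_resp helper and the two offset variables are hypothetical names for this example only.

```python
def encode_resp(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode() if isinstance(arg, str) else arg
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


master_offset = 0
slave_offset = 0

payload = encode_resp("SET", "k", "v")
master_offset += len(payload)   # master propagated N bytes
slave_offset += len(payload)    # slave received the same N bytes

print(len(payload), master_offset == slave_offset)  # 27 True
```

If the connection dropped before the slave received the payload, slave_offset would stay behind master_offset, and that difference is what signals missing data.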

Replication backlog buffer

While propagating commands to slaves, the master maintains a fixed-size FIFO queue (1 MB by default) called the replication backlog buffer.

Every propagated command is both sent to the slave service and written to the backlog buffer.
The queue records the replication offset of every byte it holds.
When the queue is full, the oldest bytes are evicted to make room for new ones.

When a slave reconnects after a disconnection, the master first looks up the slave's offset in the queue. If the data following that offset is still in the queue, the master sends it to the slave and a partial resynchronization takes place. Otherwise, a full resynchronization is performed.
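
The lookup described above can be sketched with a small class. This is a toy model under stated assumptions, not the Redis implementation: ReplicationBacklog, feed, and read_from are hypothetical names, and the offset here simply means "number of bytes the slave has already received".

```python
class ReplicationBacklog:
    """Fixed-size FIFO holding the most recently propagated bytes (toy model)."""

    def __init__(self, size):
        self.size = size
        self.buffer = bytearray()
        self.master_offset = 0  # total bytes propagated so far

    def feed(self, data):
        """Record propagated bytes, evicting the oldest ones when full."""
        self.buffer += data
        self.master_offset += len(data)
        overflow = len(self.buffer) - self.size
        if overflow > 0:
            del self.buffer[:overflow]

    def read_from(self, slave_offset):
        """Return the bytes after slave_offset, or None if a full resync is needed."""
        first_offset = self.master_offset - len(self.buffer)
        if slave_offset < first_offset or slave_offset > self.master_offset:
            return None
        return bytes(self.buffer[slave_offset - first_offset:])


backlog = ReplicationBacklog(size=10)
backlog.feed(b"abcdefgh")      # master offset is now 8
backlog.feed(b"ijkl")          # master offset is now 12, "ab" has been evicted

print(backlog.read_from(8))    # b'ijkl' -> partial resynchronization
print(backlog.read_from(1))    # None    -> full resynchronization
```

A slave that disconnected at offset 8 can catch up from the backlog, while one that stopped at offset 1 cannot, because the bytes it needs have already been evicted.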

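If you customize the size of the backlog buffer, it should be neither too large nor too small. You can estimate it from the average time a slave takes to reconnect after a disconnection, S (in seconds), and the average volume of write commands the master produces per second, writeSize (the total length of the write commands in protocol format). The referenced book suggests 2 × S × writeSize, which leaves headroom over the bare minimum of S × writeSize.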

Configuration:

# Default: 1mb
# repl-backlog-size 1mb
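
As a worked example of sizing the buffer, the sketch below applies the rule of thumb 2 × S × writeSize suggested in the referenced book. The function name, its parameters, and the 1 MB floor are assumptions made for this illustration, not Redis settings.

```python
def backlog_size(reconnect_seconds, write_bytes_per_second, minimum=1024 * 1024):
    """Estimate repl-backlog-size in bytes as 2 * S * writeSize.

    The `minimum` floor (1 MB, matching the default) is an assumption of
    this sketch so the estimate never drops below the default size.
    """
    estimate = 2 * reconnect_seconds * write_bytes_per_second
    return max(estimate, minimum)


# A slave that needs about 5 seconds to reconnect, with a master that
# produces about 1 MB of write commands per second:
print(backlog_size(5, 1024 * 1024))  # 10485760 bytes, i.e. 10 MB
```

With these inputs the setting would be roughly repl-backlog-size 10mb.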

Run ID

The run ID is generated automatically when the server starts and consists of 40 random hexadecimal characters. When a slave replicates from a master for the first time, the master sends its run ID to the slave, and the slave saves it.

When a slave reconnects after a disconnection, it sends the saved run ID to the master it is connecting to, and that master compares the run ID with its own.

If the run IDs match, the master is the same one the slave was replicating before the disconnection, and it attempts a partial resynchronization.
If the run IDs differ, the master has been restarted or is not the master the slave was replicating before the disconnection, and a full resynchronization is performed.

psync

When a slave replicates for the first time, it sends PSYNC ? -1, which signals a first connection, and a full resynchronization is performed.
When a slave reconnects after a disconnection, it sends PSYNC <runid> <offset>:
1. If the run ID the master receives matches its own and the offset can be found in the backlog buffer, a partial resynchronization is performed.
2. If the run ID differs or the offset is no longer in the backlog buffer, a full resynchronization is performed.
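
The decision the master makes for an incoming PSYNC can be sketched as a single function. This is a simplified model: handle_psync and its parameters are hypothetical, and backlog_start is assumed to be the smallest slave offset that the backlog buffer can still serve. The return values mirror the +FULLRESYNC and +CONTINUE replies that a master sends.

```python
def handle_psync(master_run_id, backlog_start, master_offset, req_run_id, req_offset):
    """Decide between full and partial resynchronization (toy model)."""
    # First-time replication: the slave sends "PSYNC ? -1".
    if req_run_id == "?" and req_offset == -1:
        return "FULLRESYNC"
    # The slave was replicating a different master (or this one restarted).
    if req_run_id != master_run_id:
        return "FULLRESYNC"
    # The data after the slave's offset is no longer in the backlog buffer.
    if not (backlog_start <= req_offset <= master_offset):
        return "FULLRESYNC"
    return "CONTINUE"


RUN_ID = "a" * 40  # placeholder 40-character run ID

print(handle_psync(RUN_ID, 100, 500, "?", -1))       # FULLRESYNC
print(handle_psync(RUN_ID, 100, 500, RUN_ID, 300))   # CONTINUE
print(handle_psync(RUN_ID, 100, 500, RUN_ID, 50))    # FULLRESYNC
print(handle_psync(RUN_ID, 100, 500, "b" * 40, 300)) # FULLRESYNC
```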

Authentication

Authentication between the master and slave services can be configured so that it takes place before synchronization.

Slave service

# masterauth <master-password>
masterauth GGuoLiang

Master service

# requirepass foobared
requirepass GGuoLiang

When masterauth is set on the slave, authentication is performed. It succeeds if the slave's masterauth value is the same as the master's requirepass value.
When masterauth is not set on the slave, the slave does not attempt authentication.

min-slaves configuration

The Redis options min-slaves-to-write and min-slaves-max-lag prevent the master server from executing write commands in unsafe situations.

# min-slaves-to-write 3
# min-slaves-max-lag 10

With these values, the master refuses to execute write commands when fewer than three slave servers are connected, or when the lag values of the three slave servers are all greater than or equal to 10 seconds.
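
The rule can be sketched as a simple check over the lag values of the connected slaves. This is an illustrative model only: can_accept_writes is a hypothetical name, and the boundary (a lag of exactly 10 seconds counts as too slow) follows the description above rather than the Redis source.

```python
def can_accept_writes(slave_lags, min_slaves_to_write=3, min_slaves_max_lag=10):
    """Return True if enough slaves are keeping up for writes to be allowed.

    slave_lags holds, for each connected slave, the number of seconds since
    the master last heard from it. The strict "<" comparison mirrors the
    text above and is an assumption of this sketch.
    """
    healthy = sum(1 for lag in slave_lags if lag < min_slaves_max_lag)
    return healthy >= min_slaves_to_write


print(can_accept_writes([1, 2, 0]))      # True:  three healthy slaves
print(can_accept_writes([1, 2]))         # False: fewer than three slaves
print(can_accept_writes([10, 12, 15]))   # False: every slave lags 10 s or more
```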

Replication implementation

Process:

The slave receives the slaveof host port command and saves the IP address and port number of the master service.
The slave establishes a socket connection to the master.
The slave sends PING, and the master replies with PONG.
The master and slave authenticate each other, and authentication succeeds.
The slave sends its listening port number, and the master saves it.
The synchronization operation brings the slave's data up to the master's current state.
Command propagation keeps the data consistent from then on.

Heartbeat detection

During command propagation, the slave server sends the heartbeat command REPLCONF ACK <offset> to the master once per second by default, where <offset> is the slave's current replication offset. The heartbeat serves three main purposes:

Checking the network connection between the master and slave services.
Supporting the min-slaves options.
Detecting lost commands: the master compares the offsets and resends any commands the slave is missing.
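
A minimal sketch of the bookkeeping a master could do for each heartbeat is shown below. The SlaveState class and the helper functions are hypothetical names for illustration, and timestamps are passed in explicitly so the example stays deterministic.

```python
class SlaveState:
    """What the master remembers about one slave (toy model)."""

    def __init__(self):
        self.ack_offset = 0       # offset reported in the last REPLCONF ACK
        self.last_ack_time = 0.0  # when that heartbeat arrived, in seconds


def on_replconf_ack(state, offset, now):
    """Record a heartbeat carrying the slave's current replication offset."""
    state.ack_offset = offset
    state.last_ack_time = now


def lag_seconds(state, now):
    """Seconds since the last heartbeat; large values suggest a network problem."""
    return now - state.last_ack_time


def missing_bytes(master_offset, state):
    """Bytes the slave has not acknowledged yet and may need to be resent."""
    return master_offset - state.ack_offset


state = SlaveState()
on_replconf_ack(state, offset=200, now=100.0)

print(lag_seconds(state, now=101.0))   # 1.0 -> heartbeat arrived a second ago
print(missing_bytes(233, state))       # 33  -> slave is 33 bytes behind
```

The lag value is the kind of figure the min-slaves options compare against, and a non-zero byte difference is what tells the master that commands were lost in transit.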

Reference: Redis Design and Implementation
