Preface
Only a bald head can be strong
OK, today we move on to the platinum stage. If you have not been through the bronze, silver, and gold stages yet, you can read those first and then come back:
- Learning Redis from Zero, Solo Queue [Bronze]
- Learning Redis from Zero, Solo Queue [Silver]
- Learning Redis from Zero, Solo Queue [Gold]
This article focuses on Redis master-slave replication. Because Redis clustering involves quite a few topics, the platinum stage is split across several articles.
I have tried to explain each point as simply as I can, and I hope you get something out of reading it.
1. Master-slave architecture
1.1 Why use a master-slave architecture
Like a relational database (MySQL), Redis cannot hold up if it receives too many requests.
If Redis runs on only one server, then as more and more requests come in:
- A single Redis server has limited memory and may not be able to hold all the data
- The concurrency that a single Redis server can support is also limited
- If that Redis server fails, all requests fall through to the relational database, which is even worse
Clearly, these problems arise because one Redis server is not enough, so the fix is to set up several Redis servers.
To make the service highly available, these Redis servers can be organized as a master server and slave servers.
Note: the Redis author has since renamed the Master/Slave terminology to Master/Replica.
1.2 Characteristics of master-slave architecture
Here are the characteristics of the Redis master-slave architecture:
- The master server handles write requests
- The slave servers handle read requests
- Data on the master server is copied to the slave servers, so the data on the master and slave servers stays consistent
Benefits of the master-slave architecture:
- Read/write separation (the master writes, the slaves read)
- High availability (if one slave server goes down, the other slave servers keep serving requests, so the service is not affected)
- More concurrency (every slave server can serve read requests, so read QPS goes up)
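To make read/write separation concrete, here is a minimal sketch of a client-side router. It is only an illustration: the class name, the server addresses, and the small set of write commands are assumptions made for this example, and real client libraries handle this in their own ways.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    # Illustrative subset only; real Redis has many more write commands.
    WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE"}

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        # Round-robin over the slaves; None when there are no slaves.
        self._next_slave = itertools.cycle(self.slaves) if self.slaves else None

    def route(self, command):
        """Return the address of the server that should handle the command."""
        name = command.split()[0].upper()
        if name in self.WRITE_COMMANDS or self._next_slave is None:
            return self.master
        return next(self._next_slave)


# Hypothetical addresses, used only for the demonstration.
router = ReadWriteRouter("master:6379", ["slave-a:6379", "slave-b:6379"])
print(router.route("SET user:1 tom"))  # master:6379
print(router.route("GET user:1"))      # slave-a:6379
print(router.route("GET user:1"))      # slave-b:6379
```

Adding another slave to the list raises read capacity without touching the write path, which is the point of the third benefit above.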
Besides the form above, the master-slave architecture can also take the following form (though it is used less often):
2. The replication function
One characteristic of the master-slave architecture is that the data on the master and slave servers is consistent.
Since the master server accepts write requests, what does it do after processing a write to keep the master and slave data consistent? And what happens if the master and a slave are disconnected and then reconnect after a while? The details follow.
In Redis, you can make one server replicate another by executing the SLAVEOF command or by setting the slaveof option. The server being replicated is called the master server, and the server doing the replicating is called the slave server.
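For reference, SLAVEOF takes the master's host and port as arguments. The sketch below shows how such a command could be encoded in the Redis wire protocol (RESP) before it is written to a socket. The helper name and the address 127.0.0.1:6379 are assumptions for this example, and in practice a client library does this encoding for you.

```python
def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]  # array header: number of elements
    for part in parts:
        data = str(part).encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length, then payload
    return b"".join(out)


# Hypothetical master address.
wire = encode_command("SLAVEOF", "127.0.0.1", 6379)
print(wire)
# b'*3\r\n$7\r\nSLAVEOF\r\n$9\r\n127.0.0.1\r\n$4\r\n6379\r\n'
```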
2.1 Implementation of the replication function
The replication function consists of two operations:
- Synchronization (sync): brings the database state of the slave server up to the current database state of the master server
- Command propagation (command propagate): when the database state of the master server changes and the master and slave are no longer consistent, brings them back to a consistent state
Synchronization between a slave server and the master server falls into two cases:
- Initial synchronization: the slave server has never replicated any master server before, or the master server it is about to replicate is different from the one it replicated last time.
- Synchronization after disconnection: replication between the master and slave is interrupted by a network problem, and the slave server automatically reconnects to the master server and continues replicating it.
Before Redis 2.8, a slave that reconnected after a disconnection was missing only part of the data, yet the master and slave still had to re-execute SYNC, which was very inefficient. (The SYNC command resynchronizes all the data, not just the missing part.)
Let’s take a closer look at how replication is implemented from Redis 2.8 onward:
2.1.1 Preparation before replication
First, let’s look at the preparatory steps:
- The slave server records the IP address and port of the master server
- The slave server establishes a socket connection to the master server
- The slave server sends a PING command (to check that the socket can be read and written and that the master server can process commands)
- Authentication (performed if the corresponding authentication option is configured)
- The slave server sends its port information to the master server, and the master server records the slave’s listening port
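The steps above can be sketched as the ordered list of commands a slave sends once the socket is connected. This is a simplified illustration: the function name and the sample values are assumptions, and a real slave also waits for and checks the master's reply after each command.

```python
def handshake_commands(listening_port, password=None):
    """Return, in order, the commands a slave sends after connecting."""
    commands = [("PING",)]  # check that the connection works
    if password is not None:
        commands.append(("AUTH", password))  # only when authentication is configured
    # Tell the master which port this slave listens on.
    commands.append(("REPLCONF", "listening-port", str(listening_port)))
    return commands


# Hypothetical port and password.
for command in handshake_commands(6380, password="secret"):
    print(" ".join(command))
# PING
# AUTH secret
# REPLCONF listening-port 6380
```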
As mentioned earlier, before Redis 2.8 the slave re-executed SYNC after a disconnection, which was very inefficient. Let’s see how synchronization works from Redis 2.8 onward.
Starting with version 2.8, Redis uses the PSYNC command instead of the SYNC command to perform synchronization during replication.
The PSYNC command has two modes: full resynchronization and partial resynchronization.
2.1.2 Full resynchronization
Here is how full resynchronization works:
- The slave server sends the PSYNC command to the master server
- On receiving PSYNC, the master server executes the BGSAVE command to generate an RDB file in the background, and uses a buffer to record all write commands executed from that point on.
- When BGSAVE finishes, the master server sends the generated RDB file to the slave server. The slave server receives and loads the RDB file, which brings its database to the state the master server was in when it executed BGSAVE.
- The master server sends all the write commands in the buffer to the slave server, and the slave server executes them so that the data on both sides ends up consistent.
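The four steps above can be simulated with a toy model. This is only a sketch under simplifying assumptions: a dict copy stands in for the RDB file produced by BGSAVE, writes are plain key/value pairs, and the class and method names are invented for this example.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.buffer = None  # write commands recorded while a snapshot is in flight

    def write(self, key, value):
        self.data[key] = value
        if self.buffer is not None:
            self.buffer.append((key, value))

    def start_full_resync(self):
        """Stand-in for BGSAVE: snapshot the data and start buffering writes."""
        self.buffer = []
        return dict(self.data)

    def finish_full_resync(self):
        """Hand over the writes that arrived after the snapshot was taken."""
        buffered, self.buffer = self.buffer, None
        return buffered


class Slave:
    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        self.data = dict(snapshot)

    def apply(self, commands):
        for key, value in commands:
            self.data[key] = value


master, slave = Master(), Slave()
master.write("a", 1)
snapshot = master.start_full_resync()     # the snapshot contains only "a"
master.write("b", 2)                      # arrives while the snapshot is being sent
slave.load_snapshot(snapshot)
slave.apply(master.finish_full_resync())  # replay the buffered writes
print(slave.data == master.data)          # True
```

Without the buffer, the write to "b" would be lost on the slave, which is why the master records writes while the RDB file is being generated and sent.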
2.1.3 Partial resynchronization
Now let’s look at partial resynchronization. It lets a slave that reconnects after a disconnection synchronize only the data it missed (instead of all the data, as before Redis 2.8), which makes sense!
Partial resynchronization relies on the following parts:
- The replication offsets of the master and slave servers
- The replication backlog buffer of the master server
- The server run ID
First, let’s explain these terms:
Replication offset: Both parties performing replication maintain a replication offset
- The master server adds N to its own replication offset each time it propagates N bytes
- Each time the slave server receives N bytes from the master, it adds N to its own replication offset
By comparing the offsets of the master/slave replication, it is easy to know whether the data on the master/slave server is in a consistent state!
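A minimal sketch of this bookkeeping is below. The payloads are plain command strings for readability (the real replication stream is protocol-encoded, so the byte counts would differ), and the class name and the dropped message are assumptions for this example.

```python
class OffsetTracker:
    """Tracks how many bytes of the replication stream a server has handled."""

    def __init__(self):
        self.offset = 0

    def advance(self, payload):
        # Add N to the offset for every N bytes propagated or received.
        self.offset += len(payload)


master, slave_a, slave_b = OffsetTracker(), OffsetTracker(), OffsetTracker()

stream = [b"SET k1 v1", b"SET k2 v2", b"DEL k1"]
for i, payload in enumerate(stream):
    master.advance(payload)
    slave_a.advance(payload)
    if i < 2:  # slave_b misses the last payload (hypothetical network drop)
        slave_b.advance(payload)

print(master.offset == slave_a.offset)  # True: consistent
print(master.offset == slave_b.offset)  # False: slave_b is behind
```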
Suppose a slave server sends the PSYNC command to the master server and reports that its offset is 36. Should the master server perform a full or a partial resynchronization for that slave? That is decided with the help of the replication backlog buffer.
When the master server propagates commands, it not only sends the write commands to all slave servers but also appends them to the replication backlog buffer, a fixed-size queue whose size can be adjusted. If the data after the slave’s offset still exists in the replication backlog buffer, a partial resynchronization is performed; otherwise a full resynchronization is performed.
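Here is a rough sketch of such a backlog, assuming a simple fixed-size byte buffer that drops its oldest bytes when full. The class and method names are made up for illustration; in Redis itself the size is controlled by the repl-backlog-size configuration option.

```python
class ReplicationBacklog:
    """Fixed-size buffer that remembers the most recently propagated bytes."""

    def __init__(self, size):
        self.size = size
        self.data = bytearray()
        self.master_offset = 0  # total bytes propagated so far

    def feed(self, payload):
        """Record propagated bytes, evicting the oldest ones when full."""
        self.data.extend(payload)
        self.master_offset += len(payload)
        if len(self.data) > self.size:
            del self.data[:len(self.data) - self.size]

    def missing_since(self, slave_offset):
        """Return the bytes after slave_offset, or None if they were evicted."""
        first_kept = self.master_offset - len(self.data)
        if slave_offset < first_kept or slave_offset > self.master_offset:
            return None  # cannot be served from the backlog: full resync needed
        return bytes(self.data[slave_offset - first_kept:])


backlog = ReplicationBacklog(size=10)
backlog.feed(b"0123456789")
backlog.feed(b"abcdef")          # the oldest 6 bytes are evicted
print(backlog.missing_since(6))  # b'6789abcdef' -> partial resync is possible
print(backlog.missing_since(5))  # None -> full resync is required
```

The trade-off is visible in the size parameter: a larger backlog tolerates longer disconnections before a full resynchronization becomes necessary, at the cost of more memory on the master.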
The server run ID is used to check whether the slave is reconnecting to the same master. The slave saves the master’s run ID and sends it when it reconnects. If the IDs differ, the master server that the slave replicated before the disconnection is not the master server it is connected to now, so a full resynchronization is performed.
So the process goes something like this:
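As a rough sketch of that flow, the function below combines the run ID check and the offset check and returns the reply the master would send: +CONTINUE for a partial resynchronization, or +FULLRESYNC with the master's run ID and offset for a full one. The function name, the parameters, and the sample values are assumptions for this example, and the offset handling is simplified compared with the real server.

```python
def handle_psync(master_run_id, master_offset, backlog_start,
                 slave_run_id, slave_offset):
    """Decide between partial and full resynchronization.

    backlog_start is the lowest slave offset the backlog can still serve.
    """
    same_master = slave_run_id == master_run_id
    in_backlog = backlog_start <= slave_offset <= master_offset
    if same_master and in_backlog:
        return "+CONTINUE"
    return "+FULLRESYNC %s %d" % (master_run_id, master_offset)


# Hypothetical run IDs and offsets.
print(handle_psync("abc", 100, 50, "abc", 80))  # +CONTINUE
print(handle_psync("abc", 100, 50, "?", -1))    # +FULLRESYNC abc 100 (first sync)
print(handle_psync("abc", 100, 50, "xyz", 80))  # +FULLRESYNC abc 100 (different master)
print(handle_psync("abc", 100, 50, "abc", 10))  # +FULLRESYNC abc 100 (data evicted)
```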
2.1.4 Command propagation
When synchronization is complete, the master and slave servers enter the command propagation phase. From then on, the master server only needs to send the write commands it executes to the slave server, and the slave server receives and executes them, so that the master and slave servers keep their data consistent!
During command propagation, the slave server sends the command REPLCONF ACK <replication_offset> to the master server once per second, where replication_offset is the slave server’s current replication offset.
Sending this command serves three main purposes:
- Checking the network connection between the master and slave servers
- Helping to implement the min-slaves options
- Detecting lost commands
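The last two purposes can be sketched in a few lines. The example assumes the master records, for each slave, the number of seconds since its last REPLCONF ACK and the offset that ACK reported. The function names are invented here, while min-slaves-to-write and min-slaves-max-lag are the Redis options this idea corresponds to.

```python
def can_accept_writes(lags, min_slaves_to_write, min_slaves_max_lag):
    """Accept writes only if enough slaves have acknowledged recently.

    lags holds, per slave, the seconds elapsed since its last REPLCONF ACK.
    """
    healthy = sum(1 for lag in lags if lag <= min_slaves_max_lag)
    return healthy >= min_slaves_to_write


def bytes_to_resend(master_offset, acked_offset):
    """Bytes the slave is missing, judged from the offset in its last ACK."""
    return max(0, master_offset - acked_offset)


# Hypothetical values: three slaves, one of which last acknowledged 15 seconds ago.
print(can_accept_writes([1, 2, 15], min_slaves_to_write=3, min_slaves_max_lag=10))  # False
print(can_accept_writes([1, 2, 15], min_slaves_to_write=2, min_slaves_max_lag=10))  # True
print(bytes_to_resend(master_offset=200, acked_offset=197))                          # 3
```

When the acknowledged offset is lower than the master's own offset, the master knows which bytes were lost in transit and can send them again from the replication backlog buffer.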
3. Final words
After spending a long time drawing the diagrams, I am finally done.
Here is a question to think about: if a slave server goes down, it does not matter much, because we usually have several slave servers and requests can be sent to the ones that are still up. But what if the master server goes down? Write requests are handled by the master server, and there is only one master server, so does that mean write requests can no longer be processed?
This problem will be addressed in the next article.
References:
- Redis Design and Implementation
- Redis In Action
If you found this article helpful, here is where to find more:
- Java3y, a technical public account committed to original content.
- The table of contents for this series (detailed mind maps plus a large collection of video resources): github.com/ZhongFuChen…