
MySQL master-slave replication relies on three threads: one running on the master node (the binlog dump thread) and two running on the slave node (the I/O thread and the SQL thread), as shown in the following figure:

1. How to achieve master-slave consistency

(1) Binary log dump thread

When a slave node connects to the master node, the master creates a binlog dump thread to send the contents of the binlog. While reading an operation from the binlog, the thread locks the binlog on the master; as soon as the read is complete, the lock is released, even before the operation is sent to the slave.

(2) I/O thread on the slave node

After the START SLAVE command is executed on the slave node, the slave creates an I/O thread that connects to the master and requests the updates recorded in the master's binlog. When the I/O thread receives the updates sent by the binlog dump thread on the master, it saves them in the local relay log.

(3) SQL thread on the slave node

The SQL thread is responsible for reading the contents in the relay log, parsing them into specific operations, and executing them to ensure the consistency between master and slave data.
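The cooperation of the three threads can be sketched as a small producer/consumer pipeline. This is only a toy model under loud assumptions: the "network" and the "relay log" are in-memory queues, events are hypothetical `(op, key, value)` tuples, and the names `dump_thread`, `io_thread`, `sql_thread`, and `replicate` are invented for illustration. It is not how MySQL implements replication internally.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of the toy binlog stream


def dump_thread(binlog, network):
    """Master side: read each binlog event and send it to the slave."""
    for event in binlog:
        network.put(event)
    network.put(STOP)


def io_thread(network, relay_log):
    """Slave side: receive events and append them to the relay log."""
    while True:
        event = network.get()
        relay_log.put(event)
        if event is STOP:
            break


def sql_thread(relay_log, table):
    """Slave side: read the relay log and apply each event locally."""
    while True:
        event = relay_log.get()
        if event is STOP:
            break
        op, key, value = event
        if op == "set":
            table[key] = value
        elif op == "delete":
            table.pop(key, None)


def replicate(binlog):
    """Run the three toy threads to completion and return the slave's data."""
    network, relay_log, table = queue.Queue(), queue.Queue(), {}
    workers = [
        threading.Thread(target=dump_thread, args=(binlog, network)),
        threading.Thread(target=io_thread, args=(network, relay_log)),
        threading.Thread(target=sql_thread, args=(relay_log, table)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return table
```

Calling `replicate([("set", "a", 1), ("set", "b", 2), ("delete", "a", None)])` leaves the toy slave with the same final state the master would have, because events are applied in binlog order.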

2. How does synchronization work with one master and multiple slaves?

Each master-slave connection requires three threads. If the master has multiple slaves, it creates one binlog dump thread for each slave that is currently connected, and each slave has its own I/O thread and SQL thread.

The slave uses two threads to separate fetching updates from the master from executing them, so that the fetching task is not slowed down by the execution task. For example, if the slave has not been running for a while, its I/O thread can quickly fetch all pending updates from the master when the slave starts, even if the SQL thread lags far behind.

If the slave service stops before the SQL thread has executed everything, the I/O thread has at least pulled the latest changes from the master and saved them in the local relay log. When the service comes up again, the SQL thread can finish applying them and the data is synchronized.
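The benefit of decoupling fetch from execution can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `LaggingSlave` is a hypothetical class, the relay log is a Python list standing in for the on-disk relay log, and each event is a simple `(key, value)` pair.

```python
class LaggingSlave:
    """Toy slave whose fetch step and apply step advance independently."""

    def __init__(self):
        self.relay_log = []  # stands in for the on-disk relay log
        self.applied = 0     # number of relay-log events already executed
        self.data = {}

    def io_fetch(self, master_binlog):
        """I/O thread role: copy any events not yet in the relay log."""
        self.relay_log.extend(master_binlog[len(self.relay_log):])

    def sql_apply(self, max_events=None):
        """SQL thread role: execute pending events, optionally capped."""
        pending = self.relay_log[self.applied:]
        if max_events is not None:
            pending = pending[:max_events]
        for key, value in pending:
            self.data[key] = value
            self.applied += 1

    def lag(self):
        """Events fetched but not yet executed."""
        return len(self.relay_log) - self.applied


def demo():
    master_binlog = [("a", 1), ("b", 2), ("c", 3)]
    slave = LaggingSlave()
    slave.io_fetch(master_binlog)   # fetching is cheap: all events are local
    slave.sql_apply(max_events=1)   # the SQL thread falls behind
    before_restart = (dict(slave.data), slave.lag())
    # The slave "stops" here. The unapplied events are already in the relay
    # log, so after a restart it can catch up without asking the master again.
    slave.sql_apply()
    return before_restart, slave.data, slave.lag()
```

In this sketch, `demo()` shows a lag of two events before the simulated restart and a lag of zero afterwards, with no second fetch from the master.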

3. The basic process of master/slave replication

(1) The I/O thread on the slave connects to the master and requests the log content starting from a specified position in a specified log file (or from the very beginning of the log);

(2) After the master receives the request from the slave's I/O thread, the binlog dump thread responsible for replication reads the log content after the specified position and returns it to the slave. In addition to the log content itself, the response includes the name of the binlog file and the binlog position that the returned content reaches;

(3) After receiving the log content, the slave's I/O thread appends it to the local relay log and saves the binlog file name and position to the master-info file, so that on the next read it can tell the master exactly: "Send me the log content starting from this position in this binlog file";

(4) When the slave's SQL thread detects new content in the relay log, it parses that content into the operations that were actually performed on the master and executes them against the local database.
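The four steps above can be sketched as follows. This is a simplified illustration, not MySQL's wire protocol: positions here are list indexes (real binlog positions are byte offsets), the file name `binlog.000001` is a made-up example, and `master_dump` and `PositionSlave` are hypothetical names.

```python
# Toy master: binlog file name -> list of events (hypothetical contents).
MASTER_BINLOGS = {
    "binlog.000001": ["INSERT 1", "INSERT 2", "UPDATE 1"],
}


def master_dump(log_file, position):
    """Step 2: return the events after `position` plus the coordinates reached."""
    events = MASTER_BINLOGS[log_file][position:]
    return events, log_file, position + len(events)


class PositionSlave:
    def __init__(self, log_file, position=0):
        # Stands in for the master-info file: where to resume reading.
        self.master_info = {"file": log_file, "pos": position}
        self.relay_log = []
        self.relay_pos = 0   # how far the SQL thread has read the relay log
        self.executed = []

    def io_step(self):
        """Steps 1 and 3: request from the saved coordinates, then save new ones."""
        events, log_file, next_pos = master_dump(
            self.master_info["file"], self.master_info["pos"]
        )
        self.relay_log.extend(events)
        self.master_info = {"file": log_file, "pos": next_pos}

    def sql_step(self):
        """Step 4: execute whatever is new in the relay log."""
        new_events = self.relay_log[self.relay_pos:]
        self.executed.extend(new_events)  # "executing" is just recording here
        self.relay_pos = len(self.relay_log)
```

Because the slave stores the coordinates it reached, a second `io_step()` after the master appends a new event fetches only that event rather than the whole log again.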

4. MySQL master-slave replication modes

MySQL master-slave replication is asynchronous by default. All inserts, deletes, and updates in MySQL are recorded in the binlog. When a slave connects to the master, it obtains the latest binlog content from the master and replays the SQL recorded in the binlog.

(1) Asynchronous mode (MySQL async-mode)

Principle: after the client submits a COMMIT, the master does not wait for any response from the slaves; it returns the result to the client directly. The advantage is that write efficiency on the master is not affected. The risk is that the master may go down before the binlog has been synchronized to the slaves, in which case the data on the master and the slaves is inconsistent.

If a slave is then promoted to be the new master, it may be missing transactions that were committed on the original master. Data consistency is therefore weakest in this replication mode.
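A minimal sketch of how a committed transaction can be lost in asynchronous mode. The class `AsyncMaster`, its `ship` method, and the transaction labels are hypothetical names used only to illustrate the timing problem described above.

```python
class AsyncMaster:
    """Toy master that acknowledges commits without waiting for any slave."""

    def __init__(self):
        self.binlog = []
        self.shipped = 0  # number of binlog events already sent to the slave

    def commit(self, txn):
        self.binlog.append(txn)
        return "OK"  # returned immediately; replication happens later

    def ship(self, slave_relay_log):
        """Background replication: send whatever the slave has not seen yet."""
        slave_relay_log.extend(self.binlog[self.shipped:])
        self.shipped = len(self.binlog)


def demo_failover():
    master, slave_relay_log = AsyncMaster(), []
    master.commit("txn-1")
    master.ship(slave_relay_log)   # txn-1 reaches the slave
    reply = master.commit("txn-2") # client is told "OK" ...
    # ... and the master crashes here, before ship() runs again.
    lost = [txn for txn in master.binlog if txn not in slave_relay_log]
    return reply, slave_relay_log, lost
```

In this sketch the client received "OK" for `txn-2`, yet a slave promoted at this point would hold only `txn-1`.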

(2) Semi-synchronous mode (MySQL semi-sync)

Principle: after the client submits a COMMIT, the result is not returned to the client immediately. Instead, the master waits until at least one slave has received the binlog and written it to its relay log, and only then returns the result to the client.

This improves data consistency, at the cost of at least one additional network round trip compared with asynchronous replication, which reduces write efficiency on the master. MySQL 5.7 supports configuring the number of slaves that must acknowledge, so that the result is returned only after N slaves have received the binlog.

Semi-synchronous mode is not built into the MySQL core. It has been available as a plugin since MySQL 5.5, and the corresponding plugins must be installed on both the master and the slaves to enable it.
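The acknowledgement rule can be sketched as below. This is an illustration under assumptions, not the plugin's implementation: `AckingSlave`, `semi_sync_commit`, and the `required_acks` parameter are hypothetical names, and the `"NOT_ACKED"` result simply marks the case where too few slaves responded (how a real server reacts to that case is outside this sketch). Note that a slave acknowledges after writing to its relay log, not after executing the transaction.

```python
class AckingSlave:
    """Toy slave that acknowledges once an event is in its relay log."""

    def __init__(self, reachable=True):
        self.reachable = reachable
        self.relay_log = []

    def receive(self, txn):
        if not self.reachable:
            return False
        self.relay_log.append(txn)  # the ACK follows the relay-log write
        return True


def semi_sync_commit(txn, master_binlog, slaves, required_acks=1):
    """Reply "OK" only if at least `required_acks` slaves acknowledged."""
    master_binlog.append(txn)
    acks = sum(1 for slave in slaves if slave.receive(txn))
    return "OK" if acks >= required_acks else "NOT_ACKED"
```

With one unreachable and one reachable slave, `semi_sync_commit("txn-1", [], slaves)` returns `"OK"` for the default of one acknowledgement, while requiring two acknowledgements returns `"NOT_ACKED"`.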

(3) Full synchronization mode

In fully synchronous mode, a success message is returned to the client only after the master and all slaves have committed the transaction and confirmed the commit.
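For contrast with the other two modes, here is a rough sketch of the fully synchronous rule. It is deliberately naive: slaves are plain dictionaries with hypothetical `up` and `data` fields, and the function checks availability up front instead of running a real distributed commit protocol.

```python
def full_sync_commit(txn, master_data, slaves):
    """Reply "OK" only when the master and every slave have committed `txn`."""
    if not all(slave["up"] for slave in slaves):
        return "BLOCKED"  # a single unavailable slave prevents the commit
    master_data.append(txn)
    for slave in slaves:
        slave["data"].append(txn)  # every slave commits before the reply
    return "OK"
```

The sketch makes the trade-off visible: consistency is strongest because no node is ever behind, but one unavailable slave is enough to block writes.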

– END –

Author: The Road to Architecture Improvement. Ten years in research and development, architect at a large tech company, and CSDN blog expert. Focused on learning and sharing architecture techniques, as well as career and cognitive growth, and committed to sharing practical articles.

Thanks for reading!