MySQL master-slave replication is the basis for building high availability MySQL. Replication is to synchronize the data of one server with other servers. A primary database can be synchronized to multiple secondary databases, and the secondary database can also serve as the primary database of another server. There are many different combinations of primary and standby libraries.
A master-slave replication
1) The master library records the binary log, every time before the completion of the database update, the binary log is first recorded, after recording the binary log, the master library will tell the storage engine can commit the object
2) The secondary database copies the binary log of the primary database to the local relay log. First, the secondary database starts a worker process, called THE IO worker thread, which is responsible for establishing a common client connection with the primary database. If the process catches up with the main library, it will go to sleep and will not wake up until the main library notifies it of new events and logs the received events to the relay log.
When the SQL thread catches up with the IO thread, the relay log is usually recorded in the system cache, so the overhead of the relay log is very low. SQL threads can also decide whether to write to their own binary logs based on configuration options.
Semi-synchronous replication
How do I resolve the data loss caused by the breakdown of the primary MySQL database?
Use semi-synchronous replication. Before the master commits, the binlog must be synchronized to the slave. The master can set the expiration time of the binlog synchronization. After the binlog is copied to the slave, the slave will replay the relay logs. However, this also increases client latency. In addition, you need to install the MySQL plug-in.
Semisync_xx. so is a semi-synchronous plugin for MySQL
For details, see my previous blog: MySQL replication details
Copy the way
Based on GTID and logging
- Log: Traditional way, default way. Depends on the binary log, according to the log offset. The offset of the binary log changes as transactions are committed. The slave library needs to tell the master library exactly where it copied to the offset.
- GTID: global transaction ID. A GTID is unique within a cluster. GTID= source_id:transcation_id, source_id is the GTID on that machine.
Log-based replication | Based on the GTID |
---|---|
Good compatibility | Incompatible with older versions |
Supports MMM and MHA architectures | Only the MHA architecture is supported |
It is difficult to find a new sync point after preparing to switch | Based on transaction ID replication, it is easy to find outstanding transaction ids |
Replication can be easily skipped | You can only skip operations by placing empty transactions, which is a little more complicated |
You are advised to use THE GTID mode to perform failover more securely.
Primary/secondary replication delay
What is the cause of the delay?
- The master node can have a significant impact on master-slave latency if it executes a large transaction (updating tens of millions of lines, or a long transaction)
- The network is delayed, the logs are large, and the number of slaves is excessive.
- Multithreaded writes on the master node, single-threaded recovery on the slave node
Treatment methods:
- Large transactions: Break large transactions into smaller transactions and update data in batches.
- Reduce the number of slaves, not more than 5, to reduce the size of a single transaction.
- After MySQL 5.7, you can use multi-threaded replication, using the MGR replication architecture
reference
- MySQL > create MySQL