background
Some time ago, I encountered an online problem. After a long time of investigation, I found that it was caused by the delay of master and slave synchronization. Therefore, I will write an article today to summarize this problem, hoping it will be useful for you. If you like it, add a thumbs-up
Mind mapping
Mind mapping
Common master-slave architectures
With the increasing number of visitors, the capacity of a single database has been stretched. Hence the master-slave architecture of separating read and write data from the master library and write data from the master library was derived.
A master from
From more than a master
One master, one slave and one master, many slave are the most common master-slave architectures, which are simple and effective to implement. They can not only achieve high availability, but also achieve read and write separation, thus improving the concurrency of the cluster.
More from Lord
Double master replicates
Cascade copy
Master slave synchronization principle
To understand how master-slave synchronization works, there are two important log files to keep in mind
-
Binary log file
-
Relay log file
Master/slave synchronization process
-
Generate two threads from the library, one I/O thread and one SQL thread;
-
The I/O thread requests the master binlog and writes the resulting binlog to the relay log file.
-
The primary library generates a log dump thread that passes binlogs to the secondary LIBRARY I/O thread
-
The SQL thread will read the logs in the relay log file and parse them into specific operations to achieve the consistency of the master and slave operations and the consistency of the final data.
How can I determine whether the master/slave is delayed
You can view the value of Seconds_Behind_Master displayed in the show slave status command.
-
NULL: indicates that io_thread or SQL_thread is faulty.
-
0: the value is zero, indicating that the master/slave replication is good.
-
A positive value indicates that the master/slave delay has occurred. A larger value indicates that the slave library delay is more serious.
Primary/secondary delay cause
Random playback
MySQL master-slave replication is a single-thread operation. The master database writes all DDL and DML logs to the binlog. Since the binlog is sequential, it is very efficient. The Slave SQL Thread replays the DDL and DML operation events of the master library in the Slave. DML and DDL IO operations are random, not sequential, and cost a lot more. So SQL Thread threads can’t keep up with the master library’s binlog learning speed, resulting in master-slave latency
Lock wait
On the other hand, since SQL threads are also single-threaded, when the concurrency of the master library is high, the number of DML generated exceeds the speed that the SLAVE SQL threads can process, or when a large query statement in the slave generates a lock wait, the delay occurs.
Master-slave delay solutions
The parallel copying
Since the speed of a single thread of SQL replay is limited, can we use a multi-threaded way to replay? MySQL 5.6 provides a parallel replication approach by converting SQL threads into multiple work threads for replay, which solves the master-slave delay problem
Reduce the concurrent
If you understand how random replay can cause master-slave delays, then it is easier to understand. Controlling the write speed of the master library makes master-slave delays less likely to occur.
Read the main library
If you are in a very real-time business like payment, the most direct way is to read the main library directly.
A few words about
Hi, I’m Xiao Fan, a back-end engineer. If you feel that the article is a little help to you, welcome to share with your friends, but also to small fan point a like, interested in my public account – programmer small fan. Thank you and see you next time!