- Getting started with MySQL (1) : Internal implementation of queries and updates
- MySQL introduction (2) : indexes
- MySQL Introduction (3) : Transaction isolation
- MySQL > lock MySQL > lock
- MySQL Introduction (5) : Replication
Abstract
In this article, I will start with why MySQL needs master-slave replication and then cover the premise of MySQL replication, bin log.
The advantages and disadvantages of each of the three formats of bin log will be explained here.
I’ll talk about master-slave delays later, and I’ll offer some ideas on how to cause delays from several angles.
1 Why is replication needed
MySQL’s built-in replication capabilities are the foundation for building large, high-performance applications. With the increase of concurrency, the standalone MySQL server gradually cannot handle these requests, so the MySQL server also needs to be expanded.
MySQL replication can not only improve availability, but also be used for disaster recovery, data warehouse, etc.
2 How To Replicate data
When it comes to replication, the key is how long and accurately data can be copied from the master to the slave.
For MySQL, replication uses bin log.
You’re familiar with bin logs. We talked about this when we talked about MySQL’s “two-phase commit”.
In other words, MySQL sends the bin log recorded by the master database to the slave database, and then the slave database “replay” the operations performed by the master database according to the bin log content to achieve the purpose of synchronization.
Let’s first elaborate on what the bin log records.
2.1 Statement-based Replication (SBR)
In this mode, the bin log records all SQL statements executed. In other words, if the statement bin log is used, the SQL statement executed by the master library will be executed in its entirety in the slave library.
However, such an approach is likely to lead to inconsistency between master and slave.
For example, the following statement:
delete from t where a >= 1 and b => 2 limit 1;
Copy the code
Such statements do not necessarily have the same effect in the slave library as in the master library. Since we can’t be sure whether the same index is followed in the slave database and the first data is searched is the same as that in the master database, the deleted data may not be the same row.
Or the SQL statement executed by the master library may contain some lock-related statements, which may cause master/slave inconsistency.
Note, however, that the NOW() function can be executed correctly because the timestamp is recorded in the bin log statement.
In other words, based on the statement pattern, it is possible to cause data inconsistency in different contexts.
As for other unsafe situations, you can refer to the official documentation, which is not described here.
2.2 ROW-based Replication (RBR)
Since data inconsistencies occur in statement schemas, is there a schema that is context-free?
Hence the row pattern.
In this mode, the bin log records only the changes to the rows being operated on, down to a single row.
For example, if you update A row, it will be recorded in the bin log that you change the value of A field from A to B in the row where id is equal to something, where field is equal to something.
Even A delete operation records the deletion of A row of data equal to id, field A, and field B.
You may think this is very convenient, and very precise, and the master will never be inconsistent again. Even delete library do not need to run, just need to view the bin log can restore the corresponding data.
But there are also problems with the row pattern. For example, if you execute a delete from T where ID < 10000 SQL statement in the master database, the bin log will record only 10000 SQL statements. If you use row mode, the bin log will record 10000 SQL statements. It takes up a lot of space.
2.3 MBR (MIXD-Based Replication)
Hence the Mixd pattern.
By combining the advantages of the two modes, MySQL uses statement when there is no ambiguity and row when there is.
3 Specific process of replication
In this chapter, we will talk about the entire replication process.
Let’s take a picture from High Performance MySQL to illustrate:
There are three threads involved:
-
Binlog Dump thread This thread reads the contents of the bin log in the main MySQL library and pushes the contents to the SLAVE I/O process.
-
I/O Thread This thread is responsible for setting up a long connection with the master database and storing the bin log data to the slave database relay log.
-
SQL Thread This thread is also in the slave library of MySQL. It is responsible for reading the contents of the relay log and then executing these statements to apply changes to the data in the slave library.
In simple terms, the master library saves the changes in the bin log after two phases of commit, and then sends the bin log to the slave library, so that the slave library also executes once for the purpose of synchronization.
The reason for using relay logs here is that the speed at which bin logs are consumed from the library is inconsistent with the speed at which bin logs are produced by the master library, so a relay log is needed as a buffer.
4 Possible replication problems
A common problem with MySQL replication is latency.
-
Suppose we set the point at which the bin log falls to T1 after a transaction is committed
-
Set the time node t2 for the bin log of new transaction writes from the slave to the master and write to the relay log
-
Set the time node to t3 when the new transaction completes from the library
Then the delay for executing a transaction from the library can be considered as (T3-T1).
That is, if we need to analyze the cause of master-slave latency, we should consider two aspects: the transfer process, and the speed at which the slave consumes the relay log.
4.1 Network Faults
For example, the bandwidth of the master library or the slave library is full, or the bin log of the master library is set to row format, resulting in a large amount of data to be transmitted. As a result, the bin log of the master library is not synchronized to the slave library in time, resulting in the master/slave delay.
4.2 Machine Performance
But in addition to the network, more from the library consumption speed, can not keep up with the production speed of the main library.
There are a number of reasons for this, such as the machine configuration of the slave may be lower than that of the master, because someone might think that since it is the standby, there is no request, so the standby is configured on a poor machine.
Or in the background data analysis, the CPU full.
In summary, if you need a large number of query analysis operations in slave libraries, you need to consider multiple slave libraries.
4.3 large transaction
If the master library executes a long transaction, then the transaction is sent to the slave library and may take the same amount of time to execute. At this point, there is no way for the slave to continue consuming new relay logs. This creates master-slave delays.
4.4 the lock
As we mentioned earlier, not only write data can be locked, but using current read can be locked as well.
So, if something like SELECT is performed on the slave library… For UPDATE, or some DDL statements, may also cause slave libraries to lock, causing master-slave delays.
4.5 concurrent
In our introduction above, SQL threads are single-threaded, so if SQL threads can be consumed concurrently, master/slave latency can be significantly reduced.
The concurrent replication strategy for MySQL has been officially supported since MySQL5.6. This article will not explain it in detail.
Write in the last
First of all, thank you for being here.
This article, in fact, is not much content, most of the content is some theoretical nature, the purpose is to have some general understanding of MySQL master-slave replication, and know which direction to consider the delay aspect of the problem.
More specific operations, how to tune, and deeper principles will be covered in future installments.
And that’s the end of the MySQL Primer series. Hopefully these five articles will help you understand MySQL a little better.
Of course, IN the process of learning MySQL, I may have some wrong understanding, if there is something wrong, I hope you can point out, thank you!
PS: If you have other questions, you can also find me in the public account, welcome to find me to play ~