24 | MySQL is how to ensure the main case of consistent?

In the previous article, we mentioned that binlog can be used for archiving as well as master standby synchronization, but what does it look like? Why does the secondary database perform binlog to be consistent with the primary database?

I’ll introduce it to you formally today. It is no exaggeration to say that Binlog is responsible for making MySQL the most popular open source database available today.

In the beginning, MySQL was favored by developers as a highly available architecture that was easy to learn and convenient. Almost all of its highly available architectures rely directly on Binlog. While these high-availability architectures have become increasingly complex, they have evolved from the most basic master/standby.

In this article, I will introduce you to the basic principles of master/slave. Now that you understand the design principles behind it, you can also use them from a business development perspective.

Basic principles of MySQL active/standby

The following figure shows the basic active/standby switchover process.

In state 1, the client accesses node A directly, and node B is the standby database of NODE A. All updates from node A are synchronized and executed locally. This keeps the data of nodes B and A the same.

When you need to switch, you cut to state 2. In this case, the client reads and writes data from node B, and node A is the standby database of node B. In state 1, ALTHOUGH node B is not directly accessed, I still recommend that you set node B (the standby repository) to read-only mode. In doing so, there are several considerations:

  1. Sometimes the query statements of some operation classes will be put into the standby database to check, and set to read-only to prevent misoperations.
  2. Prevent the switchover logic from bugs. For example, the switchover logic may be double-write, which may cause inconsistency between the active and standby nodes.
  3. You can use the readonly state to determine the role of a node.

How can I keep up to date with the primary library when I set the standby library to read-only? Don’t worry. Because the readOnly setting is not valid for super users, the thread used to synchronize updates has super privileges. What is the internal flow of the line from nodes A to B? This is A complete flow diagram of an UPDATE statement being executed on node A and then synchronized to node B.

As you can see, when the master library receives the update request from the client, it performs the internal transaction update logic and writes the binlog.

Standby database B maintains A long connection with primary database A. Primary library A has an internal thread dedicated to servicing this long connection for standby library B. The complete process of a transaction log synchronization looks like this:

  1. Run the change master command on standby library B to set the IP, port, username, password of primary library A, and the location from which the binlog request should start. This location contains the file name and log offset.
  2. Run the start slave command on slave library B. In this case, the slave library starts two threads, namely io_thread and SQL_thread, as shown in the figure. Io_thread is responsible for establishing connections with the primary library.
  3. After verifying the user name and password, primary database A reads the binlog from the local database to the location sent by secondary database B and sends the log to secondary database B.
  4. Standby database B obtains the binlog and writes it to a local file, which is called the relay log.
  5. Sql_thread reads transfer logs, parses commands in logs, and executes them.

Later, due to the introduction of multi-threaded replication scheme, SQL_thread evolved into multiple threads.

After analyzing the logic of this long connection, let’s take a look at the question: what is the contents of the binlog, and why can be directly executed by the backup database.

Comparison of the three binlog formats

A binlog has two formats: statement and row. You may see a third format in other sources called mixed, which is actually a mix of the first two. To describe the differences between the three formats of binlog, I create a table and initialize a few rows of data.

mysql> CREATE TABLE `t` ( `id` int(11) NOTNULL, `a` int(11) DEFAULTNULL, `t_modified` timestamp NOTNULL DEFAULTCURRENT_TIMESTAMP, PRIMARY KEY (`id`), KEY `a` (`a`), KEY `t_modified`(`t_modified`) ) ENGINE=InnoDB; Insert into t values (1, 1, '2018-11-13'); Insert into t values (2, 2, '2018-11-12'); Insert into t values (3, 3, '2018-11-11'); Insert into t values (4, 4, '2018-11-10'); Insert into t values (5, 5, '2018-11-09');Copy the code

If we want to delete a row from a table, let’s look at the binlog of the delete statement. Note that the following statement contains comments. If you are using the MySQL client to do this, please remember to add -c, otherwise the client will remove comments automatically.

mysql> delete from t /*comment*/ where a>=4 and t_modified<='2018-11-10' limit 1;

If binlog_format is set to statement, the original SQL statements are recorded in binlog. Mysql > show binlog events in ‘master.000001’

Now, let’s look at the output below.

  • SET @@session.GTID_NEXT=’ANONYMOUS ‘SET @@session.GTID_NEXT=’ANONYMOUS’
  • The second line is a BEGIN, which corresponds to the fourth line commit, indicating a transaction in the middle;
  • The third line is the actual statement. As you can see, there is also a “use ‘test” command before the actual delete command is executed. This command is not executed by us, but is added by MySQL based on the database where the current table resides. This ensures that the log will be correctly updated to table T of the test library, regardless of which library the current worker thread is in, when it is sent to the standby library for execution. The delete statement following the use ‘test ‘command is the original SQL we entered. As you can see, the binlog faithfully records the SQL commands, even the comments.
  • The last line is a COMMIT. You can see it says xID =61. This XID was mentioned in article 15.

To illustrate the difference between statement and row, let’s look at the delete command:

The delete command generates a warning because the current binlog is in statement format and has limits. Therefore, the command is unsafe. Why do you say that? This is ** because delete with limit, it is possible to have data inconsistency between master and slave. ** For example:

  1. If the delete statement uses index A, the first row that meets the condition will be found based on index A.
  2. But if the index T_modified is used, then the line t_modified=’2018-11-09 ‘(a=5) is deleted.

In the statement format, the original statement is recorded in the binlog. Therefore, it is possible to execute the statement in the primary database using index A. The index T_modified is used when the standby database executes this SQL statement. Therefore, MySQL considers it risky to write this way.

So, if I change the format of binlog to binlog_format= ‘row’, does this problem not exist? Let’s take a look at what’s in Binog at this point.

Compared with statement binlog, BEGIN and COMMIT are the same. However, the row binlog replaces the SQL statement with two events: Table_map and Delete_rows.

  1. Table_map event, used to indicate that the next table to be manipulated is table T of the test library;
  2. Delete_rows event, used to define the deletion behavior.

In fact, we can not see the details from the graph, but also use the mysqlBinlog tool, using the following command to parse and view the contents of the binlog. Because the information in the diagram shows that the transaction’s binlog starts at position 8900, you can use the start-position parameter to specify that the logs at that position should be pared.

mysqlbinlog -vvdata/master.000001 --start-position=8900;

From this figure, we can see the following information:

  • Server ID 1, which indicates that the transaction is executed on the library server_id=1.
  • Each event has a CRC32 value because I set the binlog_checksum parameter to CRC32.
  • The Table_map event, as seen in the figure, shows the table to be opened next, mapped to the number 226. Now our SQL statement only operates on one table. What if we want to operate on multiple tables? Each table has a corresponding Table_map event that maps to a single number that distinguishes operations on different tables.
  • In mysqlbinlog, we use the -vv parameter to parse out the contents, so we can see the values of each field in the result (for example, @1=4, @2=4).
  • Binlog_row_image is set to FULL by default, so Delete_event contains all field values for the deleted row. If binlog_ROW_image is set to MINIMAL, only necessary information is recorded, in this case, only id=4.
  • The final Xid event is used to indicate that the transaction was committed correctly.

As you can see, when binlog_format uses row format, the primary key ID of the actual deleted row is recorded in the binlog, so that when the binlog is sent to the standby database, it will delete the row whose ID is 4.

Why mixed binlog?

Based on the above information, let’s discuss a question: why does the mixed binlog format exist? The reasoning goes like this: Some statement binlogs may cause master/slave inconsistency, so use row. The downside of the ROW format, however, is that it takes up a lot of space. For example, if you use a DELETE statement to delete 100,000 rows of data, the statement is an SQL statement stored in a binlog that takes up tens of bytes of space. But if you use a row binlog, you write all 100,000 records to a binlog. This will not only take up more space, but also consume I/O resources to write binlog, affecting the execution speed.

So, MySQL has taken a compromise, which is to have mixed binlogs. The mixed format means that MySQL itself determines whether the SQL statement is likely to cause a master/slave inconsistency. If so, use the row format; otherwise, use the statement format. That is, mixed formats can take advantage of the benefits of Statment without the risk of inconsistent data.

In our example, if we set it to mixed, it will be recorded as row; If a statement is executed without limit 1, it is recorded as a statement. Of course, I should say that more and more scenarios now require MySQL to set the binlog format to row. There is an immediate benefit to doing this: data recovery.

summary

There are three main types of mysql replication: SQL statement-based replication (SBR), Row-based replication (RBR), and mixed-mode replication (MBR). A binlog can be in three formats: STATEMENT, ROW, and MIXED.

① STATEMENT Mode (SBR)

Each SQL statement that modifies data is logged in the binlog. The advantages are that each SQL statement and data changes in each row do not need to be recorded, reducing the amount of binlog logs, saving I/O, and improving performance. Disadvantages are inconsistencies in master-slave in some cases (sleep(), last_insert_id(), user-defined functions(UDF), etc.)

② ROW mode (RBR)

Instead of recording the context information for each SQL statement, only which data was modified and to what extent. And there are no problems with stored procedures, or functions, or triggers being called and fired incorrectly in certain cases. The disadvantage is that a large number of logs are generated, especially when alter table is used.

③ MIXED mode (MBR)

If the above two modes are used together, the STATEMENT mode is used to store binlogs. If the STATEMENT mode cannot be used, the ROW mode is used to store binlogs. The MySQL database selects a log storage mode based on the SQL STATEMENT.

Cyclic replication problem

From the basic understanding of the binlog in MySQL above, you can now see that the binlog feature ensures that the same binlog can be executed in the standby library and get the same state as the primary library.

Therefore, we can assume that the master and slave data are consistent under normal circumstances. In other words, the contents of nodes A and B in Figure 1 are consistent. In fact, the m-S structure I drew in Figure 1 is more commonly used in actual production is the double M structure, that is, the master/standby switchover process as shown in the following figure.

In fact, the difference between the double M structure and the M-S structure is only one more line, that is, nodes A and B are always mutually primary and standby. In this way, you do not need to change the master/slave relationship during the switchover.

However, the double M structure still has a problem to solve. The service logic updates A statement on node A and then sends the generated binlog to node B. Node B also generates A binlog after executing the update statement.

If node A is also the standby database of node B, it executes the newly generated binlog of node B again, and then repeats the update statement between node A and node B, i.e. cyclic replication. So how do we solve this?

As you can see in Figure 6 above, MySQL records in binlog the server ID of the instance where this command was executed for the first time. Therefore, we can use the following logic to solve the problem of cyclic replication between two nodes:

  1. Specify that the server ids of two libraries must be different. If they are the same, the relationship between them cannot be set as master/slave.
  2. A standby database receives a binlog and, during playback, generates a new binlog with the same server ID as the original one.
  3. After receiving the log from the primary library, each library determines the server ID. If the log is the same as its own, it indicates that the log is generated by itself and directly discards the log.

Following this logic, if we set up the double M structure, the log execution flow would look like this:

  1. For transactions updated from node A, the binlog contains the server ID of node A.
  2. After the binlog is sent to node B, the server ID generated by node B is also the server ID of node A.
  3. Node A determines that the server ID is the same as its own and does not process the log. So, the loop is broken here.