Binlog is responsible for making MySQL the most popular open source database today.
1 Basic principles of MySQL active/standby
In state 1, the client accesses node A directly, and node B is the standby database of NODE A. All updates from node A are synchronized and executed locally. This keeps the data of nodes B and A the same. When you need to switch, you cut to state 2.
In state 1, although node B is not directly accessed, it is recommended to set node B (standby repository) to readonly mode for the following considerations:
- Sometimes the query statements of some operation classes will be put into the standby database to check, and set to read-only to prevent misoperations.
- Prevent the switchover logic from bugs. For example, the switchover logic may be double-write, which may cause inconsistency between the active and standby nodes.
- You can use the readonly state to determine the role of a node.
Because the readOnly setting is not valid for super users, the thread used to synchronize updates has super privileges and can keep updates synchronized with the master library
2 internal flow of the line from node A to node B
Figure 2 is A complete flowchart of an UPDATE statement executed on node A and then synchronized to node B.
After receiving the update request from the client, the master library performs the update logic of the internal transaction and writes the binlog at the same time.
Standby database B maintains A long connection with primary database A. Primary library A has an internal thread dedicated to servicing this long connection for standby library B.
The complete process of a transaction log synchronization is as follows:
- Pass through standby library B
change master
Command to set the IP, port, username, password of primary library A, and the location from which to start requesting binlog, which contains the file name and log offset. - Run this command on standby database B
start slave
Command, the standby library will start two threads, as shown in the figureio_thread
andsql_thread
. Among themio_thread
Responsible for establishing connections with the master library. - After verifying the user name and password, primary database A reads the binlog from the local database to the location sent by secondary database B and sends the log to secondary database B.
- Standby database B obtains the binlog and writes it to a local file, which is called the relay log.
sql_thread
Read the relay log, parse out the command in the log, and execute it.
3 Comparison of the three binlog formats
CREATE TABLE `t` (
`id` int(11) NOT NULL,
`a` int(11) DEFAULT NULL,
`t_modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP.PRIMARY KEY (`id`),
KEY `a` (`a`),
KEY `t_modified`(`t_modified`)
) ENGINE=InnoDB;
insert into t values(1.1.'2018-11-13');
insert into t values(2.2.'2018-11-12');
insert into t values(3.3.'2018-11-11');
insert into t values(4.4.'2018-11-10');
insert into t values(5.5.'2018-11-09');
Copy the code
Delete a row of data using the following statement:
delete from t /*comment*/ where a> =4 and t_modified< ='2018-11-10' limit 1;
Copy the code
3.1 the statement
If binlog_format is set to statement, the original SQL statements are recorded in binlog.
show binlog events in 'master.000001';
Copy the code
View the contents of binlog.
- The first line
SET @@SESSION.GTID_NEXT='ANONYMOUS'
This parameter is related to active/standby switchover and is ignored. - The second line is a BEGIN, which corresponds to the fourth line commit, indicating a transaction in the middle;
- The third line is the actual executed statement. Before the delete command is actually executed, there is one more
use 'test'
Command. This command is added by MySQL itself based on the database where the table to be operated resides. This ensures that when the log is sent to the standby repository for execution, it will be correctly updated to whatever repository the current worker thread is intest
Table T of the library. use 'test'
The delete statement following the command is the original SQL statement. Binlog even records the comments.- The last line is a COMMIT. Xid = 61.
Running the delete command produces a warning because the current binlog is in statement format and has limits, so the command is likely to be unsafe.
- If the delete statement uses index A, the first row that meets the condition (a=4) will be found based on index A.
- If the index T_modified is used, then it is deleted
t_modified='2018-11-09'
That’s the row a is equal to 5.
In the statement format, the original statement is recorded in the binlog. Therefore, it is possible to execute the statement in the primary database using index A. The index T_modified is used when the standby database executes this SQL statement. Therefore, MySQL considers it risky to write this way.
3.2 the row
binlog_format='row'
Copy the code
Compared with statement binlog, BEGIN and COMMIT are the same. The row format binlog replaces the SQL statement with two events:
- Table_map event: used to indicate that the next table to be manipulated is table T of the test library;
- Delete_rows event: Used to define the deletion action.
You can use the mysqlBinlog tool to parse and view the details of the binlog with this command. Because the information in Figure 5 shows that the transaction’s binlog starts at position 8900, you can use the start-position parameter to specify that the logs at that position should be parsed.
mysqlbinlog -vv data/master.000001 --start-position=8900;
Copy the code
- Server ID 1, which indicates that the transaction is in
server_id=1
Executed on this library. - Each event has a CRC32 value, as the parameter
binlog_checksum
Set to CRC32. - The Table_map event is the same as you see in Figure 5, showing the table to open, mapped to the number 226. If multiple tables are to be operated on, each table has a corresponding Table_map event that maps to a single number used to separate operations on different tables.
- In the mysqlbinlog command, the -vv parameter is used to parse out the contents, so you can see the values of each field in the result (e.g., @1=4, @2=4).
- The final Xid event is used to indicate that the transaction was committed correctly.
When the format of binlog_format is row, the primary key ID of the deleted row is recorded in the binlog. In this way, when the binlog is sent to the standby database, the primary key ID of the deleted row will be deleted.
3.3 mixed
Some statement binlogs may cause inconsistency between the master and slave files. Therefore, use row.
The downside of the ROW format is that it takes up a lot of space. A delete statement deletes 100,000 rows of data. In a statement, an SQL statement is recorded in a binlog. But if you use row, you write all 100,000 records to a binlog. This will not only take up more space, but also consume I/O resources to write binlog, affecting the execution speed.
So MySQL chose a compromise: mixed binlog. MySQL determines whether the SQL statement may cause an inconsistency between the master and the slave. If so, use row; otherwise, use statement.
In the above example, setting it to mixed will record it as row; If a statement is executed without limit 1, it is recorded as a statement.
More and more scenarios now require MySQL to set the binlog format to row for a number of reasons, such as data recovery.
Delete perspective: As you can see from Figure 6, even when a DELETE statement is executed, the row format binlog stores the entire row of the deleted row. If you find that the wrong data is deleted, you can directly convert the delete statement recorded in the binlog to insert, and then restore the data.
In insert format, the binlog of the INSERT statement records all field information that can be used to locate exactly the row that was just inserted. In this case, we simply change the INSERT statement to a DELETE statement and delete the row that was inserted by mistake.
Update: The binlog records the entire row before modification and the entire row after modification. If an update statement is mistakenly executed, simply swap the two lines before and after the event and execute it in the database to restore the update.
insert into t values(10.10, now());
Copy the code
This statement will record it in row format under the mixed setting.
SET TIMESTAMP=1546103491 Because NOW() is present in the binlog, the original event is recorded with an additional command: SET TIMESTAMP=1546103491. It uses the SET TIMESTAMP command to specify the return time of the following now() function.
Therefore, the value of the row inserted by the INSERT statement is the same whether the binlog is executed by the standby library one minute later or used to restore the library’s backup three days later. In other words, with the SET TIMESTAMP command, MySQL ensures the consistency of primary and secondary data.
When replaying binlog data: use mysqlbinlog to parse the log, and then copy the statement directly from the log. This approach is risky. Because the execution result of some statements is dependent on the context command, the result of direct execution is likely to be wrong.
So, the standard way to recover data using a binlog is to use the mysqlBinlog tool to parse it and send the whole result to MySQL for execution.
Mysqlbinlog master. 000001 - start - position = 2738 - stop - position = 2973 | mysql - h127.0.0.1 - P13000 - u $user - p $PWD;Copy the code
MySQL > select * from master.000001 where 2738th to 2973 bytes are parsed.
4 Problem of cyclic replication
The binlog feature ensures that when the same binlog is executed in the standby library, the same state is obtained as in the primary library. Therefore, we can assume that the master and slave data are consistent under normal circumstances. The contents of nodes A and B in Figure 1 are consistent. In fact, m-S structure is shown in Figure 1, but double M structure is more commonly used in actual production, that is, the master/standby switchover process shown in Figure 9.
There is another problem with the double-M structure: the business logic updates A statement on node A and then sends the generated binlog to node B, which also generates A binlog after executing the update statement. (If log_slave_updates is set to ON, the standby database generates a binlog after executing the relay log.) If node A is also the standby database of node B, it executes the newly generated binlog of node B again, and then repeats the update statement between node A and node B, i.e. cyclic replication.
As you can see in Figure 6 above, MySQL records in binlog the server ID of the instance where this command was executed for the first time. Therefore, the following logic can be used to solve the problem of cyclic replication between two nodes:
- Specify that the server ids of two libraries must be different. If they are the same, the relationship between them cannot be set as master/slave.
- A standby database receives a binlog and, during playback, generates a new binlog with the same server ID as the original one.
- After receiving the log from the primary library, each library determines the server ID. If the log is the same as its own, it indicates that the log is generated by itself and directly discards the log.
Following this logic, the log execution flow would look like this:
- For transactions updated from node A, the binlog contains the server ID of node A.
- After the binlog is sent to node B, the server ID generated by node B is also the server ID of node A.
- Node A determines that the server ID is the same as its own and does not process the log. So, the loop is broken here.
Further discussion of circular replication
Mysql breaks the dead loop by determining the server ID. This mechanism is not complete. In some scenarios, dead loops may occur.
In one scenario, a command is used after a master library updates a transactionset global server_id=x
The server_id is changed. When the log is returned, the server_id is different from your own server_id. Another scenario is that with three nodes, trx1 is executed on node B, so the server_id on binlog is B, the binlog is passed to node A, and then A and A ‘build A double M structure, and circular replication occurs.
This three-node replication scenario occurs during database migration. If circular replication occurs, you can run the following command on either A or A ‘:
Stop slave; CHANGE MASTERTO IGNORE_SERVER_IDS=(server_id_of_B);
start slave;
Copy the code
Then the node will not execute after receiving the log. After some time, execute the following command to change the value back.
Stop slave; CHANGE MASTERTO IGNORE_SERVER_IDS=(a);start slave;
Copy the code