1. Principle of MySQL master-slave replication

2. How Canal works

  1. Canal simulates the MySQL slave interaction protocol: it presents itself as a MySQL slave and sends a dump request to the MySQL master.
  2. The MySQL master receives the dump request and starts pushing its binary log to the slave (that is, to Canal).
  3. Canal parses the binary log objects (originally a byte stream).
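
To make step 3 concrete, here is a minimal Python sketch of decoding the fixed 19-byte header that starts every event in a version 4 binlog. It is not Canal's actual parser (Canal is written in Java); the names `EventHeader` and `parse_header` and the sample bytes are hypothetical and exist only for illustration.

```python
import struct
from typing import NamedTuple

# Common header of a binlog event (binlog version 4): 19 bytes, little-endian.
HEADER = struct.Struct("<IBIIIH")


class EventHeader(NamedTuple):
    timestamp: int      # seconds since the epoch when the event was written
    type_code: int      # event type, e.g. 2 = QUERY_EVENT
    server_id: int      # server that originally wrote the event
    event_length: int   # total length of the event, header included
    next_position: int  # offset of the next event in the binlog file
    flags: int


def parse_header(buf: bytes) -> EventHeader:
    """Decode the fixed-size header at the start of a raw event."""
    if len(buf) < HEADER.size:
        raise ValueError("need at least 19 bytes for an event header")
    return EventHeader(*HEADER.unpack_from(buf))


# Hypothetical sample bytes, built here only so the sketch is runnable.
sample = HEADER.pack(1_700_000_000, 2, 1, 19, 123, 0)
header = parse_header(sample)
print(header.type_code, header.event_length, header.next_position)
```

A real client would read `event_length` bytes per event and then decode the event body according to `type_code`; that part is omitted here.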

3. MySQL binary log

MySQL's binary log is arguably the most important log in MySQL. It records all DDL and DML statements (data query statements excluded) in the form of events, along with the time each statement took to execute.

In general, enabling the binlog costs about 1% in performance. The binlog has two main uses:

  1. Replication: the binlog is enabled on the master, and the master transfers its binary log to the slaves to keep master and slave data consistent.
  2. Data recovery: the mysqlbinlog tool is used to replay the binlog and restore data.

4. Binary log format

There are three binlog formats: statement, row, and mixed.

4.1 Statement

Statement level

The binlog records the statement of each write operation. Note that it is the statement itself that is recorded, not its result. The slave re-executes each recorded write statement to stay consistent with the master.

Advantages: saves space.

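Disadvantages: it may cause data inconsistency when a statement is nondeterministic. For example, a statement such as update tt set create_date = sysdate(), or one that calls uuid(), can produce different data when it is re-executed on the slave or when the binlog is replayed for recovery, because the result depends on when and where it runs. (now() itself replays consistently, because the binlog stores the timestamp of each statement.)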

4.2 Row

Row level

The binlog records the change made to each row by every operation.

Advantages: absolute data consistency. No matter what the SQL is or which functions it calls, only the effect after execution is recorded.

Disadvantages: takes up a lot of space. If a single statement changes many rows, many records are generated.
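
The trade-off between the two formats can be sketched with a toy in-memory model. This is an illustration only, not how MySQL stores or applies the binlog: tables are plain dicts, the hypothetical column `token` is filled by a nondeterministic function (Python's `uuid.uuid4()` standing in for MySQL's `UUID()`), and the "logs" are ordinary lists.

```python
import copy
import uuid


def update_all(table: dict) -> None:
    """Toy equivalent of: UPDATE tt SET token = UUID()"""
    for row in table.values():
        row["token"] = str(uuid.uuid4())


def new_table() -> dict:
    # Two rows keyed by primary key; `token` is a hypothetical column.
    return {1: {"token": ""}, 2: {"token": ""}}


master = new_table()
stmt_replica = new_table()
row_replica = new_table()

# The master executes the statement once.
update_all(master)

# Statement format: one log entry, the statement itself.
statement_log = [update_all]
# Row format: one log entry per changed row, holding the values after the change.
row_log = [(pk, copy.deepcopy(row)) for pk, row in master.items()]

# Statement replay re-executes the function, so UUID() is evaluated again.
for stmt in statement_log:
    stmt(stmt_replica)

# Row replay copies the recorded values and never re-evaluates anything.
for pk, after_image in row_log:
    row_replica[pk] = copy.deepcopy(after_image)

print("statement log entries:", len(statement_log))
print("row log entries:", len(row_log))
print("statement replica matches master:", stmt_replica == master)
print("row replica matches master:", row_replica == master)
```

The statement log stays at one entry however many rows change, while the row log grows with the number of affected rows; in exchange, only the row replica is guaranteed to end up identical to the master.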

4.3 Mixed

An upgraded version of the statement format that, to some extent, solves the inconsistency problems statement mode causes in some cases.

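By default statements are logged as statements, but in the following cases the change is logged in row format instead:

  1. When the statement contains UUID()
  2. When a table with an AUTO_INCREMENT column is updated and a trigger or stored function is invoked
  3. When an INSERT DELAYED statement is executed
  4. When a UDF is called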

Advantages: saves space while still providing a reasonable degree of consistency.

Disadvantages: in rare cases inconsistencies can still occur, and both the statement and mixed formats are inconvenient when the binlog needs to be monitored.

Because Canal works by monitoring data changes, the binlog format must be set to row.
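
As a rough way to check this requirement before pointing Canal at a server, the sketch below inspects a simple my.cnf fragment with Python's standard `configparser`. The helper name `binlog_problems` and the sample values (`mysql-bin`, `server-id=1`) are assumptions made for illustration. It only looks at explicit settings in a `[mysqld]` section and does not handle directives such as `!includedir`.

```python
import configparser

# Example my.cnf fragment; the values are illustrative assumptions.
MY_CNF = """
[mysqld]
log-bin=mysql-bin
binlog-format=ROW
server-id=1
"""


def binlog_problems(cnf_text: str) -> list[str]:
    """Return a list of reasons why this config is not ready for row-based monitoring."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.cnf allows bare options without a value
        strict=False,         # tolerate repeated options
        interpolation=None,   # treat '%' literally
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return ["no [mysqld] section"]

    # MySQL accepts '-' and '_' interchangeably in option names.
    opts = {key.replace("_", "-"): value for key, value in parser.items("mysqld")}

    problems = []
    # Conservative check: MySQL 8.0 enables the binlog by default,
    # but older versions need log-bin to be set explicitly.
    if "log-bin" not in opts:
        problems.append("binary logging is not enabled (log-bin missing)")
    if (opts.get("binlog-format") or "").upper() != "ROW":
        problems.append("binlog-format is not ROW")
    return problems


print(binlog_problems(MY_CNF))
print(binlog_problems("[mysqld]\nbinlog_format=MIXED\n"))
```

On a running server the effective value can also be read with `SHOW VARIABLES LIKE 'binlog_format'`, which reflects defaults that a config file check cannot see.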