A detailed guide to using the Canal middleware

What is Canal

First of all, Canal is an open-source project from Alibaba, written in pure Java. Its main purpose is to provide incremental data subscription and consumption by parsing the incremental logs (binary logs) of a MySQL database. Here is how the official website introduces it:

Canal disguises itself as a MySQL slave and sends the dump protocol to the MySQL master. The MySQL master receives Canal's dump request and begins pushing binary logs to Canal. Canal then parses the binary log and sends the changes to storage destinations such as MySQL, Redis, Kafka, Elasticsearch, and so on.

Working principle

Principle of MySQL master/slave replication

  • The MySQL master writes data changes to its binary log (as binary log events)
  • The MySQL slave copies the master's binary log events to its relay log
  • The MySQL slave replays the events in the relay log to apply the data changes to its own data

Working principle of Canal

  • Canal emulates the interaction protocol of a MySQL slave, disguises itself as a MySQL slave, and sends the dump protocol to the MySQL master
  • The MySQL master receives the dump request and starts pushing the binary log to the slave (that is, to Canal)
  • Canal parses the binary log objects (originally byte streams)
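The three steps above can be sketched as a toy simulation. This is not MySQL's real protocol or binlog format: the ToyMaster class, the JSON encoding of each record, and the RowChange type are all assumptions made up for illustration. It only shows the shape of the flow: the master appends byte records, a subscriber asks for a dump from a position, and each byte record is parsed into an object.

```python
import json
from dataclasses import dataclass


@dataclass
class RowChange:
    position: int   # offset of the event in the toy log
    table: str
    action: str     # "INSERT", "UPDATE" or "DELETE"
    row: dict


class ToyMaster:
    """Hypothetical stand-in for the MySQL master and its binary log."""

    def __init__(self):
        self.binlog = []  # list of raw byte records

    def write(self, table, action, row):
        # Assumption: records are JSON bytes. The real binlog is a binary format.
        record = json.dumps({"table": table, "action": action, "row": row})
        self.binlog.append(record.encode("utf-8"))

    def dump(self, from_position):
        """Answer a dump request: yield (position, raw bytes) from an offset."""
        for position in range(from_position, len(self.binlog)):
            yield position, self.binlog[position]


def parse(position, raw):
    """Turn one byte record into a structured object, as the subscriber would."""
    fields = json.loads(raw.decode("utf-8"))
    return RowChange(position, fields["table"], fields["action"], fields["row"])


master = ToyMaster()
master.write("user", "INSERT", {"id": 1, "name": "alice"})
master.write("user", "UPDATE", {"id": 1, "name": "bob"})
master.write("user", "DELETE", {"id": 1})

# The subscriber has already consumed position 0, so it asks for 1 onwards.
events = [parse(pos, raw) for pos, raw in master.dump(from_position=1)]
for event in events:
    print(event.position, event.table, event.action, event.row)
```

Because the subscriber remembers its position, it receives only the changes it has not seen yet, which is what makes the subscription incremental.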

What can Canal do

Rather than asking what Canal can do, ask what data synchronization can do. Note, however, that Canal's data synchronization is incremental, not full. Based on incremental subscription to and consumption of the binary log, Canal can support:

  • Database mirroring
  • Real-time database backup
  • Index building and real-time maintenance (split heterogeneous index, inverted index, etc.)
  • Service cache refresh
  • Incremental data processing with business logic

Canal currently supports the source MySQL versions 5.1.x, 5.5.x, 5.6.x, 5.7.x, and 8.0.x.

How to set up Canal

Windows is used as the example here.

MySQL installation is not demonstrated here; it is relatively simple, and there are many tutorials online.

1. Canal depends on the MySQL binlog. MySQL versions before 8.0 do not enable binlog by default, so you may need to modify the MySQL configuration file manually to enable it.
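As a rough sketch of what the change looks like, the snippet below prints a [mysqld] fragment that could be merged into the MySQL configuration file (my.ini on Windows). The option names are standard MySQL options; the values mysql-bin and 1 are assumptions, and the ROW format is included because Canal parses row-based events. It is written as a few lines of Python only so that the output can be checked.

```python
import configparser
import io

# Assumed values: "mysql-bin" as the log file prefix and 1 as the server id.
settings = configparser.ConfigParser()
settings["mysqld"] = {
    "log-bin": "mysql-bin",   # turn on the binary log
    "binlog-format": "ROW",   # row-based events, which Canal parses
    "server-id": "1",         # should differ from the slave id Canal uses
}

buffer = io.StringIO()
settings.write(buffer, space_around_delimiters=False)
fragment = buffer.getvalue()
print(fragment)
```

MySQL has to be restarted after the file is changed for the settings to take effect.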

2. Check whether binlog is enabled. Log in with the MySQL client and view the log_bin variable, for example with SHOW VARIABLES LIKE 'log_bin'; a value of ON means binlog is enabled.

3. Add a user for Canal to MySQL and grant it the permissions a replication slave needs
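A minimal sketch of the statements involved, assuming an account named canal with password canal that may connect from any host (all three are placeholders to replace). The helper canal_account_sql is hypothetical; it only builds the SQL text, which you would then run in the MySQL client.

```python
def canal_account_sql(user="canal", password="canal", host="%"):
    """Return the SQL statements that create a replication-capable account."""
    account = f"'{user}'@'{host}'"
    return [
        f"CREATE USER {account} IDENTIFIED BY '{password}';",
        # SELECT to read the tables, the REPLICATION privileges to act as a slave.
        f"GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO {account};",
        "FLUSH PRIVILEGES;",
    ]


statements = canal_account_sql()
print("\n".join(statements))
```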

4. Download the Canal server package and decompress it into a directory. The extracted directory includes the bin, conf, and plugin folders used in the following steps.

5. Modify the configuration file conf/example/instance.properties so that it points at your MySQL instance and the user created for Canal
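The sketch below shows the kind of values that typically need changing. The key names are the ones I believe the instance file uses (address, user name, password, and table filter), but treat them as assumptions and check them against the file shipped with your Canal version. The sample file content and the apply_overrides helper are hypothetical.

```python
def apply_overrides(properties_text, overrides):
    """Return properties_text with the given keys set, keeping other lines as-is."""
    remaining = dict(overrides)
    result = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        key = stripped.split("=", 1)[0].strip()
        is_setting = stripped and not stripped.startswith("#") and "=" in stripped
        if is_setting and key in remaining:
            result.append(f"{key}={remaining.pop(key)}")
        else:
            result.append(line)
    # Keys that were not present in the file are appended at the end.
    result.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(result) + "\n"


# Hypothetical excerpt; the real file contains many more settings.
original = """\
# position info
canal.instance.master.address=
canal.instance.dbUsername=root
canal.instance.dbPassword=root
"""

overrides = {
    "canal.instance.master.address": "127.0.0.1:3306",  # assumed local MySQL
    "canal.instance.dbUsername": "canal",               # the account created for Canal
    "canal.instance.dbPassword": "canal",
    "canal.instance.filter.regex": r".*\\..*",          # every schema and table
}
updated = apply_overrides(original, overrides)
print(updated)
```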

6. Start the Canal service by running startup.bat in the bin directory

If the service starts without errors, congratulations: the Canal service is running. The incremental data can then be processed according to your business scenario.

Application Scenarios

In daily development, we often use Redis, Elasticsearch (ES), and similar systems to reduce the load on the database and improve the overall performance of the program. This raises a problem: how do we keep the data in these systems timely, consistent, and accurate with respect to the database?

In general, we can use the Canal middleware to synchronize MySQL data to those other systems in real time while keeping it consistent and accurate. After Canal receives the changed data, it can push the up-to-date data directly to the destination.
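To make the idea concrete, here is a minimal sketch of applying a stream of row changes to a cache. A plain dict stands in for Redis, and the shape of each change (table, action, row, with id as the primary key) is an assumption for illustration, not Canal's actual message format.

```python
def apply_change(cache, change):
    """Apply one row change to a cache keyed by (table, primary key)."""
    key = (change["table"], change["row"]["id"])
    if change["action"] == "DELETE":
        cache.pop(key, None)        # drop the stale entry
    else:
        cache[key] = change["row"]  # INSERT and UPDATE both store the new row


cache = {}
changes = [
    {"table": "user", "action": "INSERT", "row": {"id": 1, "name": "alice"}},
    {"table": "user", "action": "INSERT", "row": {"id": 2, "name": "carol"}},
    {"table": "user", "action": "UPDATE", "row": {"id": 1, "name": "bob"}},
    {"table": "user", "action": "DELETE", "row": {"id": 2}},
]
for change in changes:
    apply_change(cache, change)

print(cache)
```

Applying the changes in log order is what keeps the cache consistent with the database: the final state holds only user 1, with its latest name.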

If there are multiple service scenarios, connecting one slave per service causes extra management overhead for the master, and its network card traffic grows with every additional slave. In that case, the data can instead be pushed once to middleware such as RabbitMQ, and each service consumes it from there.

1. Direct push

Introduce the core development package into the program. C# is used as the example here; other languages have their own packages, listed on the official website.

The main program retrieves and parses the changed data from the database.

Run the program. If it prints the changed rows, the data has been captured successfully, and you can then process the captured data however you need.
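The general shape of such a main program is a fetch, process, acknowledge loop. The sketch below is in Python rather than C#, and it does not use a real Canal client package: FakeConnector is a hypothetical in-memory stand-in whose method names are loosely modeled on the get-without-ack, ack, and rollback operations that Canal clients expose.

```python
class FakeConnector:
    """Hypothetical in-memory stand-in for a Canal client connection."""

    def __init__(self, entries):
        self._entries = list(entries)
        self._acked = 0     # number of entries confirmed so far
        self._fetched = 0   # number of entries handed out so far
        self._batch_id = 0

    def get_without_ack(self, batch_size):
        """Hand out the next batch; return (-1, []) when nothing is left."""
        batch = self._entries[self._fetched:self._fetched + batch_size]
        if not batch:
            return -1, []
        self._fetched += len(batch)
        self._batch_id += 1
        return self._batch_id, batch

    def ack(self, batch_id):
        """Confirm everything handed out so far."""
        self._acked = self._fetched

    def rollback(self):
        """Forget unconfirmed batches so that they are delivered again."""
        self._fetched = self._acked


def consume(connector, handle, batch_size=2, max_failures=3):
    """Process batches until the connector is drained; return entries confirmed."""
    processed, failures = 0, 0
    while True:
        batch_id, entries = connector.get_without_ack(batch_size)
        if batch_id == -1:
            return processed  # a long-running client would sleep and poll again
        try:
            for entry in entries:
                handle(entry)
        except Exception:
            failures += 1
            if failures >= max_failures:
                raise
            connector.rollback()  # the same batch is fetched again next time
            continue
        connector.ack(batch_id)
        processed += len(entries)


seen = []
failed_once = set()


def handle(entry):
    # Simulate one transient failure while processing entry "b".
    if entry == "b" and entry not in failed_once:
        failed_once.add(entry)
        raise RuntimeError("simulated processing failure")
    seen.append(entry)


processed = consume(FakeConnector(["a", "b", "c"]), handle)
print(processed, seen)
```

Note that entry "a" is handled twice, because the whole batch is redelivered after the rollback. With this pattern the handler should therefore be idempotent.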

2. Push through MQ middleware

Canal itself integrates the Kafka, RocketMQ, and RabbitMQ middleware, under the plugin folder, so pushing data to an MQ is very convenient.

1. Modify the canal.properties file

2. Modify the instance.properties file
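For RabbitMQ, the two files need roughly the settings printed below. The key names follow what I understand recent Canal releases (1.1.5-style) to use, and older versions name them differently, so treat them as assumptions to verify against your own files. Every value, including the exchange name canal.exchange and the routing key canal.routing.key, is a placeholder.

```python
# Assumed key names; every value is a placeholder to replace.
canal_properties = {
    "canal.serverMode": "rabbitMQ",          # push to RabbitMQ instead of serving TCP clients
    "rabbitmq.host": "127.0.0.1",
    "rabbitmq.virtual.host": "/",
    "rabbitmq.exchange": "canal.exchange",
    "rabbitmq.username": "guest",
    "rabbitmq.password": "guest",
}
instance_properties = {
    "canal.mq.topic": "canal.routing.key",   # assumed to be used as the routing key
}


def render(properties):
    """Format a dict as key = value lines for a .properties file."""
    return "\n".join(f"{key} = {value}" for key, value in properties.items())


print("# conf/canal.properties")
print(render(canal_properties))
print("# conf/example/instance.properties")
print(render(instance_properties))
```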

Then configure your MQ:

Create a new exchange

Create a new queue

Bind the queue to the exchange

Note that the routing key of the binding must be the same as the one in the Canal configuration file.
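Why the keys must match can be seen in a toy model of exact-match routing. This sketch assumes the exchange is a direct exchange; the DirectExchange class is a hypothetical simplification, not the RabbitMQ client API, and the routing key is a placeholder.

```python
from collections import defaultdict


class DirectExchange:
    """Toy direct exchange: delivers only on an exact routing key match."""

    def __init__(self):
        self._bindings = defaultdict(list)  # routing key -> bound queues

    def bind(self, queue, routing_key):
        self._bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        queues = self._bindings.get(routing_key, [])
        for queue in queues:
            queue.append(message)
        return len(queues)  # 0 means no queue received the message


canal_queue = []
exchange = DirectExchange()
exchange.bind(canal_queue, "canal.routing.key")

delivered = exchange.publish("canal.routing.key", "row changed")
dropped = exchange.publish("canal.routing.kye", "row changed")  # typo in the key
print(delivered, dropped, canal_queue)
```

A message published with a key that no binding uses reaches no queue, which is what a mismatch between the Canal configuration and the binding looks like from the consumer's side: nothing arrives.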

After that, whenever your MySQL data changes, a corresponding message appears in the MQ queue.
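A consumer then only has to decode each message body. The sketch below assumes Canal is configured to emit flat JSON messages and that the body carries fields such as database, table, type, isDdl, pkNames, data, and old. That is how I understand the format, but verify it against a real message from your queue. The sample body and the summarize helper are made up for illustration.

```python
import json

# Hypothetical sample body for an UPDATE of one row.
body = json.dumps({
    "database": "test",
    "table": "user",
    "type": "UPDATE",
    "isDdl": False,
    "pkNames": ["id"],
    "data": [{"id": "1", "name": "bob"}],
    "old": [{"name": "alice"}],
})


def summarize(message_body):
    """Return (type, table, primary key, changed columns) for each row."""
    message = json.loads(message_body)
    if message.get("isDdl"):
        return []  # schema changes carry no row data
    rows = message.get("data") or []
    old_rows = message.get("old") or [{}] * len(rows)
    table = f'{message["database"]}.{message["table"]}'
    summary = []
    for row, old in zip(rows, old_rows):
        key = {name: row[name] for name in message.get("pkNames") or []}
        # "old" holds only the columns that changed, with their previous values.
        changed = {column: (old[column], row[column]) for column in old}
        summary.append((message["type"], table, key, changed))
    return summary


result = summarize(body)
print(result)
```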