
If the standby database applies logs consistently more slowly than the primary generates them, the replication delay can grow to hours. For a primary under consistently heavy write pressure, the standby may never catch up.

This is why the standby needs parallel replication capability.


1 Parallel Replication

When discussing the parallel replication capability of a primary/standby pair, focus on the circled parts of the diagram. One arrow represents clients writing to the primary; the other represents sql_thread executing the relay log on the standby. If parallelism is represented by arrow thickness, the first arrow is clearly thicker than the second.

On the primary, concurrency is bounded by locks. Because InnoDB supports row-level locking, concurrency stays high except in extreme scenarios such as all updates hitting a single “hot row”.

Log execution on the standby is the logic by which sql_thread applies the relay log to update data. If it runs in a single thread, the standby cannot apply logs fast enough, causing replication delay.

Prior to MySQL 5.6, MySQL supported only single-threaded replication, which caused severe replication delay when the primary ran at high concurrency and high QPS.

Evolving from single-threaded replication to the multi-threaded replication of current versions took several releases. All multi-threaded replication mechanisms share one idea: split the single sql_thread shown in Figure 1 into multiple threads, following this multi-threaded model:


The coordinator is the original sql_thread, but it now only reads the relay log and distributes transactions; the actual log application is done by the worker threads. The slave_parallel_workers parameter determines the number of worker threads.
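For example, the worker count can be adjusted on the standby roughly like this (a sketch; the value 4 is purely illustrative, and the SQL thread must be restarted for the change to take effect):

STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_workers = 4;  -- number of parallel worker threads
START SLAVE SQL_THREAD;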

When distributing transactions, the coordinator must satisfy two basic requirements:

  1. It cannot cause update overwrites: two transactions that update the same row must be distributed to the same worker.
  2. A transaction cannot be split: it must be placed entirely in a single worker.

2 Interviewer: How would you design a parallel replication strategy?

2.1 Distribution by table

If two transactions update different tables, they can run in parallel, since this guarantees that two workers never update the same row.

If a transaction spans multiple tables, those tables must still be considered together.


Each worker thread has a corresponding hash table holding the tables involved in the transactions currently in that worker’s execution queue. The hash table’s key is “database name + table name”, and the value is a count of how many transactions in the queue modify that table.

When a transaction is assigned to a worker, the tables it involves are added to that worker’s hash table; when the worker finishes executing the transaction, they are removed.

In the figure, hash_table_1 shows that worker_1’s queue currently holds four transactions involving table db1.t1 and one involving db2.t2; hash_table_2 shows that worker_2 currently has one transaction that updates table db1.t3.

Suppose that, in the situation shown in the figure, the coordinator reads a new transaction T from the relay log, and the rows it modifies involve tables t1 and t3.

Transaction T modifies table t1, and worker_1’s queue already contains transactions modifying t1; since they would modify data in the same table, transaction T conflicts with worker_1.

Checking transaction T against each worker queue in turn by this logic shows that it also conflicts with worker_2.

Since transaction T conflicts with more than one worker, the coordinator thread enters a wait state.

Meanwhile, each worker keeps executing and updating its hash table. Assuming the transaction in hash_table_2 that modifies t3 finishes first, db1.t3 is removed from hash_table_2.

The coordinator then finds that only worker_1 still conflicts with transaction T, so it assigns T to worker_1.


When each transaction is distributed, it involves the following three situations:

  1. If it conflicts with no worker, the coordinator thread assigns the transaction to the most idle worker.
  2. If it conflicts with more than one worker, the coordinator thread enters the waiting state until only one worker is left in conflict with the transaction.
  3. If it is in conflict with only one worker, the coordinator thread will assign the transaction to the conflicting worker.

This distribute-by-table scheme works well when load is spread evenly across multiple tables. However, with a hot table, for example when all update transactions touch the same table, every transaction is assigned to the same worker and replication degenerates to single-threaded.

2.2 Distribution by row

The core idea of row-based replication: if two transactions do not update the same rows, they can execute in parallel on the standby.

Here, the rule for judging whether transaction T conflicts with a worker is no longer “modifies the same table” but “modifies the same row”.

The data structure for row-based distribution is similar to that for table-based distribution: each worker is also assigned a hash table. To distribute by row, however, the key must be “database name + table name + unique key value”.

However, the primary key alone is not a sufficient “unique key”. Consider table t1 below, which has a unique index a in addition to the primary key id:

CREATE TABLE `t1` (
  `id` int(11) NOT NULL,
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `a` (`a`)
) ENGINE=InnoDB;

insert into t1 values(1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5);

Suppose you want to execute these two transactions on the primary:
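A sketch of what the two sessions might look like (the exact statements are assumptions consistent with the description that follows: session A frees the unique value a=1, and session B then reuses it):

-- session A, executed first on the primary
update t1 set a=6 where id=1;

-- session B, executed afterwards on the primary
update t1 set a=1 where id=5;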


These two transactions update rows with different primary keys, but if they are assigned to different workers on the standby, session B’s statement might execute first. At that point the row with id=1 still has a=1, so session B’s update reports a unique key conflict.

Therefore, the row-based strategy’s transaction hash table must also take unique keys into account: the key should also cover “database name + table name + index a’s name + value of a”.

Take the statement update t1 set a=1 where id=2 as an example (the row’s value of a before the update is 2).

When the coordinator parses this statement’s binlog, the transaction’s hash table has three items:

  1. key = hash_func(db1 + t1 + “PRIMARY” + 2), value = 2; the value is 2 because the id of the affected row is the same before and after the update, so it appears twice.
  2. key = hash_func(db1 + t1 + “a” + 2), value = 1, for the affected row where a=2 before the update.
  3. key = hash_func(db1 + t1 + “a” + 1), value = 1, for the affected row where a=1 after the update.

Compared to the table parallel distribution strategy, the row parallel strategy consumes more computing resources when determining the thread distribution. Both schemes have some constraints:

  1. The table name, primary key value, and unique index values must be parseable from the binlog, so the primary’s binlog format must be ROW (see the check below);
  2. Tables must have primary keys;
  3. No foreign keys: if a table has foreign keys, cascade-updated rows are not recorded in the binlog, so conflict detection would be inaccurate.
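For example, the first constraint can be checked and enforced on the primary (a sketch; ROW is already the default in recent MySQL versions):

SHOW VARIABLES LIKE 'binlog_format';  -- must be ROW for these strategies
SET GLOBAL binlog_format = 'ROW';     -- affects sessions created after this change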

The row-based distribution strategy achieves higher parallelism, but it has two problems with large transactions that touch many rows:

  1. Memory consumption: if a statement deletes 1 million rows, the hash table must record 1 million items.
  2. CPU consumption: parsing the binlog and computing a hash value for every row.

Therefore, implementations of this strategy set a threshold: if a single transaction exceeds the configured number of rows (say, more than 100,000 updated rows), replication temporarily degrades to single-threaded mode. The degradation logic is as follows:

  1. The coordinator temporarily holds the transaction;
  2. waits for all workers to finish executing, leaving every queue empty;
  3. executes the transaction directly itself;
  4. restores parallel mode.

3 Parallel replication strategy of MySQL 5.6

MySQL 5.6 supports parallel replication, but only at per-database granularity: in the hash table used for the distribution decision, the key is the database name. This strategy works well when the primary has multiple databases and the load is evenly spread across them.

This strategy has two advantages over distribution by table or by row:

  1. Computing the hash is fast, since only the database name is needed; and an instance rarely has so many databases that millions of hash items must be built.
  2. There is no requirement on binlog format, because even statement-format binlogs readily yield the database name.

However, if all tables on the primary are in the same database, this strategy does not help; nor does it help when load is skewed across databases, e.g. a busy business database alongside a mostly idle system configuration database.

4 Parallel replication strategy of MariaDB

MariaDB’s parallel replication strategy takes advantage of the redo log group commit optimization:

  1. Transactions that commit in the same group cannot possibly modify the same rows;
  2. transactions that can execute in parallel on the primary can certainly execute in parallel on the standby.

The concrete implementation:

  1. Transactions committed together in one group share the same commit_id, and the next group gets commit_id+1; the commit_id is written directly into the binlog.
  2. When the log reaches the standby, transactions with the same commit_id are distributed to multiple workers for execution.
  3. Only after the whole group has finished does the coordinator fetch the next batch.

The goal of MariaDB’s strategy is to “emulate the parallelism of the primary”, but it does not fully achieve that: on the primary, while one group of transactions is committing, the next group is already in the “executing” state.
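On the standby side, this strategy is enabled in MariaDB 10.x roughly as follows (a sketch; the parameter names are MariaDB’s, and the thread count is illustrative):

STOP SLAVE;
SET GLOBAL slave_parallel_threads = 4;            -- number of parallel applier threads
SET GLOBAL slave_parallel_mode = 'conservative';  -- MariaDB 10.1+: the group-commit-based mode
START SLAVE;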


Suppose three groups of transactions execute on the primary, with trx4, trx5, and trx6 executing while trx1, trx2, and trx3 commit. Shortly after the first group commits, the next group of transactions enters the commit state.

According to MariaDB’s parallel replication strategy, the execution effect on the standby database is shown below:


On the standby, the second group of transactions cannot start until the first group has completely finished, so system throughput suffers.

This scheme is also easily stalled by large transactions. If trx2 is a very large transaction, then once trx1 and trx3 have finished, only trx2 is still executing on the standby; until it completes, only one worker thread is working, which wastes resources.

5 Parallel replication strategy of MySQL 5.7

MySQL 5.7 provides similar functionality. The slave_parallel_type parameter controls the parallel replication strategy:

  • DATABASE: use the per-database parallel strategy of MySQL 5.6.
  • LOGICAL_CLOCK: use a MariaDB-like strategy (see the sketch below).
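A minimal configuration sketch on the standby (the worker count is illustrative; the SQL thread must be stopped to change these):

STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 8;
START SLAVE SQL_THREAD;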

Can all transactions that are simultaneously in the “executing” state be parallelized?

No.

Some of those transactions may be in a lock-wait state due to lock conflicts. If such mutually conflicting transactions were assigned to different workers on the standby, the standby’s result could differ from the primary’s, causing primary/standby inconsistency.

The core of MariaDB’s strategy is that all transactions “in the commit state” can be parallel, because a transaction that has reached the commit state has already passed the lock conflict checks.


MySQL 5.7’s insight: there is no need to wait until the commit stage; once a transaction has reached the redo log prepare stage, it has already passed the lock conflict checks.

MySQL 5.7’s parallel replication strategy therefore allows:

  1. Transactions that are in the prepare state at the same time to execute in parallel on the standby;
  2. transactions in the prepare state to execute in parallel with transactions in the commit state on the standby.

Binlog group commit is governed by two parameters (see the sketch after this list):

  1. binlog_group_commit_sync_delay: how many microseconds to wait before calling fsync;
  2. binlog_group_commit_sync_no_delay_count: how many transactions to accumulate before calling fsync.
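A sketch of tuning these on the primary (the values are arbitrary illustrations, not recommendations):

SET GLOBAL binlog_group_commit_sync_delay = 500;          -- wait up to 500 microseconds before fsync
SET GLOBAL binlog_group_commit_sync_no_delay_count = 10;  -- or fsync once 10 commits have accumulated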

These two parameters deliberately lengthen the interval between binlog write and fsync to reduce disk flushes. They can also be used to create more transactions that are “simultaneously in prepare”, which increases the parallelism available to standby replication.

In effect, these two parameters intentionally make the primary commit more slowly so that the standby can apply logs faster, which is interesting.

6 Parallel replication strategy of MySQL 5.7.22

MySQL 5.7.22 added a new parallel replication strategy: parallel replication based on WRITESET.

A new parameter, binlog_transaction_dependency_tracking, controls whether this new strategy is enabled. Its possible values are:

  1. COMMIT_ORDER: the strategy described above, which decides whether transactions can be parallel based on their entering prepare and commit at the same time.

  2. WRITESET: a hash is computed for each row a transaction updates, and together these form the transaction’s writeset. If two transactions do not touch the same rows, i.e. their writesets do not intersect, they can be parallel (see the sketch after this list).

  3. WRITESET_SESSION: adds one constraint on top of WRITESET, namely that two transactions executed by the same thread on the primary must keep the same order when executed on the standby.
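A minimal configuration sketch on the primary (WRITESET requires write sets to be extracted; XXHASH64 is the usual algorithm):

SET GLOBAL transaction_write_set_extraction = 'XXHASH64';        -- compute per-row write sets
SET GLOBAL binlog_transaction_dependency_tracking = 'WRITESET';  -- track dependencies via write sets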

As before, for unique identification the hash is computed from “database name + table name + index name + value”. If a table has unique indexes besides the primary key, the writeset of an insert statement gains one extra hash value for each unique index.

This implementation has significant advantages over the previous row-by-row distribution strategy:

  1. The writeset is computed on the primary and written directly into the binlog. When the standby executes, it does not need to parse the binlog contents (the row data in each event), which saves a great deal of computation.

  2. There is no need to scan the entire transaction’s binlog to decide which worker to distribute it to, which saves memory;

  3. Since the distribution policy of the standby database does not depend on the contents of the binlog, the binlog in statement format is acceptable.

For tables without a primary key and for scenarios with foreign key constraints, the WRITESET strategy also cannot achieve parallelism, and it temporarily falls back to single-threaded mode.