26 | Why can the standby database be delayed by several hours?
In my last article, I talked about several possible causes of standby delay. You’ll notice that in those scenarios, whether the cause is occasional query pressure or a backup, the impact on standby latency is usually on the order of minutes, and the standby catches up once it returns to normal.
But if the standby database continuously applies logs more slowly than the primary generates them, the delay can grow to hours. And for a primary under consistently high pressure, the standby may never catch up at all. This brings us to the topic I want to introduce today: the standby database’s parallel replication capability.
Let’s start from the master/standby flowchart in article 24 (“How does MySQL guarantee primary/secondary consistency?”).
When it comes to master/standby parallel replication capability, the two black arrows in that figure are the ones to focus on. One arrow represents clients writing to the primary; the other represents sql_thread executing the relay log on the standby.
If parallelism is represented by the thickness of the arrows, the real situation is as shown in the figure: the first arrow is significantly thicker than the second.
On the primary, concurrency is mainly constrained by locks. Since InnoDB supports row locks, it is friendly to concurrent business traffic except in the extreme scenario where all concurrent transactions update the same row (a hot row). That is why, when you run performance tests, the overall throughput of 32 concurrent threads is higher than that of a single thread. Log execution on the standby is the “sql_thread updates data” step in the figure. If logs are applied by a single thread, the standby cannot apply them fast enough, producing master/standby delay.
Before the official 5.6 release, MySQL supported only single-threaded replication, which caused serious master/standby latency when the primary had high concurrency and high TPS. From single-threaded replication to the latest multi-threaded replication, every evolution follows the same pattern: the single sql_thread from the earlier figure is split into multiple threads, as shown in the multi-threaded model below.
The coordinator in the figure is the original sql_thread, but it no longer updates data itself; it only reads the relay log and distributes transactions. The actual log application is done by the worker threads, whose number is controlled by the slave_parallel_workers parameter. In my experience, setting it between 8 and 16 works best on a 32-core physical machine, since the standby usually also serves read queries and should not eat up all the CPUs.
Next, we need to consider a question: can transactions be distributed to workers round-robin, i.e. the first transaction to worker_1, the second to worker_2, and so on?
Actually, no. After transactions are distributed, different workers execute them independently. Due to CPU scheduling, the second transaction is likely to finish before the first. If those two transactions update the same row, the updates on that row would be applied in the opposite order on the standby, leading to master/standby inconsistency.
Next, consider another question: can multiple update statements within the same transaction be distributed to different workers?
That doesn’t work either. Suppose a transaction updates one row each in tables t1 and t2. If the two update statements are distributed to different workers, the final result is still consistent between master and standby, but if the t1 update completes first and a read query hits the standby at that moment, it will see a transaction that has been applied only “halfway”, breaking transaction isolation.
Coordinators must meet the following two basic requirements:
- Cannot cause update overwrite. This requires that two transactions updating the same row must be distributed to the same worker.
- The same transaction cannot be split; it must be placed entirely in one worker.

All versions of multi-threaded replication follow these two basic principles. With that in mind, let’s look at each version’s parallel replication strategy.
Parallel replication strategy for MySQL 5.5
MySQL 5.5 does not support parallel replication at all; its standby has only single-threaded replication. Back then, to speed up standby replication, I wrote two versions of a parallel strategy myself: a table-distribution strategy and a row-distribution strategy.
Distributing by table
The basic idea of distributing transactions by table is simple: if two transactions update different tables, they can run in parallel. Since data is stored in tables, distributing by table guarantees that two workers never update the same row.
Each worker thread corresponds to a hash table that records the tables involved in the transactions currently in that worker’s execution queue. The hash key is “database name + table name”, and the value is a number indicating how many transactions in the queue modify that table.
When a transaction is assigned to a worker, the tables it involves are added to that worker’s hash table; when the worker finishes executing the transaction, they are removed.
When each transaction is distributed, its conflict relationship with the workers falls into one of three cases. (By “conflict” I mean, for example: a transaction in worker_1’s queue is modifying table t1, and the transaction T being distributed also needs to modify t1; in that case we say T conflicts with worker_1.)
- If it conflicts with no worker, the coordinator assigns the transaction to the most idle worker.
- If it conflicts with more than one worker, the coordinator thread enters the waiting state until only one worker is left in conflict with the transaction.
- If it is in conflict with only one worker, the coordinator thread will assign the transaction to the conflicting worker.
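The three dispatch cases above can be sketched in Python. This is a simplified model rather than MySQL code; the `Worker` class, the queue bookkeeping, and the tie-breaking rule for “most idle” are illustrative assumptions.

```python
# A minimal sketch of the per-table dispatch logic. Worker, dispatch,
# and the "most idle" tie-breaking are illustrative, not MySQL internals.

class Worker:
    def __init__(self, wid):
        self.wid = wid
        self.table_counts = {}   # (db, table) -> queued txns touching it
        self.queue_len = 0       # total transactions currently queued

    def assign(self, txn_tables):
        for t in txn_tables:
            self.table_counts[t] = self.table_counts.get(t, 0) + 1
        self.queue_len += 1

def dispatch(workers, txn_tables):
    """Pick a worker for a transaction touching txn_tables, or return None
    when the coordinator must wait (conflicts with more than one worker)."""
    conflicting = [w for w in workers
                   if any(w.table_counts.get(t, 0) > 0 for t in txn_tables)]
    if len(conflicting) > 1:
        return None                                       # case 2: wait
    if len(conflicting) == 1:
        target = conflicting[0]                           # case 3: follow the conflict
    else:
        target = min(workers, key=lambda w: w.queue_len)  # case 1: most idle
    target.assign(txn_tables)
    return target

workers = [Worker(i) for i in range(4)]
first = dispatch(workers, {("db1", "t1")})    # no conflict: most idle worker
second = dispatch(workers, {("db1", "t1")})   # conflicts only with `first`
assert first is second
```

Note that a transaction conflicting with two workers cannot be safely queued anywhere, which is exactly why the coordinator blocks in that case.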
This distribution-by-table scheme works well when load is spread evenly across multiple tables. But if there is a hot table, for example every update transaction touches one particular table, then all transactions end up assigned to the same worker, and replication degenerates to single-threaded.
Distributing by row
To solve parallel replication for hot tables, a row-based parallel scheme is needed. Its core idea: if two transactions do not update the same row, they can execute in parallel on the standby. Obviously, this requires the binlog format to be ROW.
At this point, the rule for judging whether a transaction T conflicts with a worker is no longer “modifies the same table” but “modifies the same row”. The data structure is similar to the per-table scheme: each worker still gets a hash table, but to distribute by row, the key must be “database name + table name + unique key value”.
However, having only the primary key in this “unique key” is not enough. Consider the following table t1, which has a primary key id and, in addition, a unique index on column a:
```sql
CREATE TABLE `t1` (
  `id` int(11) NOT NULL,
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `a` (`a`)
) ENGINE=InnoDB;

insert into t1 values(1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5);
```
Suppose the primary executes two transactions along these lines: session A runs update t1 set a=5 where id=1, while session B runs update t1 set a=1 where id=5.
As you can see, the primary key values of the rows updated by the two transactions are different. But if they are assigned to different workers, session B’s statement may execute first; at that moment the row with id=1 still has a=1, so session B’s update reports a unique key conflict. (Note that a unique key is not the same as a primary key: both forbid duplicate values, but a table can have only one primary key while it may have several unique keys.)
Therefore, in the row-based strategy, the transaction hash table must also take unique keys into account; the key should be “database name + table name + index name + value on that index”. For example, suppose we execute update t1 set a=1 where id=2 on the table t1 above: before the update the row is (2,2,2), and afterwards it is (2,1,2).
So when the coordinator parses this statement’s binlog, the transaction’s hash table gets three items:
- key=hash_func(db1+t1+“PRIMARY”+2), value=2; the value is 2 because the row’s id is the same before and after the modification, so it is counted twice.
- key=hash_func(db1+t1+“a”+2), value=1, because the row with a=2 is affected (the old value).
- key=hash_func(db1+t1+“a”+1), value=1, because a row with a=1 will exist after the update (the new value).
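These three items can be derived mechanically from the row’s before and after images. Below is a sketch of that derivation; `hash_func`, the key layout, and the row-image dicts are simplified stand-ins for the real implementation.

```python
# Sketch of how one update contributes hash-table items, following the t1
# example (primary key id, unique index a) and the statement
# "update t1 set a=1 where id=2".

def hash_func(*parts):
    return hash(parts)

def row_keys(db, table, indexes, before, after):
    """indexes: {index_name: column}; before/after: row images as dicts.
    Returns {key: count} items this row change contributes."""
    items = {}
    for idx_name, col in indexes.items():
        for image in (before, after):   # both old and new values matter
            key = hash_func(db, table, idx_name, image[col])
            items[key] = items.get(key, 0) + 1
    return items

indexes = {"PRIMARY": "id", "a": "a"}
before = {"id": 2, "a": 2, "b": 2}      # row (2,2,2) before the update
after  = {"id": 2, "a": 1, "b": 2}      # row (2,1,2) after the update

items = row_keys("db1", "t1", indexes, before, after)
assert len(items) == 3                                   # three distinct keys
assert items[hash_func("db1", "t1", "PRIMARY", 2)] == 2  # id unchanged: counted twice
assert items[hash_func("db1", "t1", "a", 2)] == 1        # old value a=2
assert items[hash_func("db1", "t1", "a", 1)] == 1        # new value a=1
```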
As you can see, the row-by-row strategy consumes more computing resources when determining thread distribution than the table-by-table parallel distribution strategy.
Both schemes have some constraints:
- It must be possible to parse table names, primary key values, and unique index values from the binlog; that is, the primary’s binlog format must be ROW.
- Tables must have primary keys;
- Tables must not have foreign keys. If a table has foreign keys, rows updated by cascades are not recorded in the binlog, so conflict detection would be inaccurate.
Comparing the two, the row-based strategy achieves a higher degree of parallelism. However, for large transactions that touch many rows, it has two problems:
- Memory consumption. For example, if a statement deletes 1 million rows, the hash table must record 1 million items.
- CPU consumption. Parsing the binlog and computing hash values is expensive for large transactions.
So when I implemented this strategy, I set a threshold: if a single transaction exceeds it (say, it updates more than 100,000 rows), replication temporarily degenerates to single-threaded mode. The degradation logic goes roughly like this:
- Coordinator holds the transaction for the time being;
- Wait until all workers have finished executing and their queues are empty;
- The coordinator executes the transaction itself;
- Restore parallel mode.
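The four steps above can be modeled with a toy coordinator. Everything here (list-based queues, the `coordinate` function) is illustrative; the point is only the ordering barrier around an oversized transaction.

```python
# A toy simulation of the degradation steps: oversized transactions are
# applied by the coordinator itself after all worker queues drain.

def coordinate(txns, workers, threshold=100_000):
    """txns: list of (row_count, txn_name); workers: list of lists used as
    queues. Returns the order in which transactions are (conceptually) applied."""
    order = []

    def drain():
        for q in workers:            # wait until every worker queue is empty
            order.extend(q)
            q.clear()

    for rows, name in txns:
        if rows > threshold:         # step 1: hold the oversized transaction
            drain()                  # step 2: let all workers finish first
            order.append(name)       # step 3: coordinator applies it alone
        else:                        # step 4: parallel mode resumes for small txns
            min(workers, key=len).append(name)
    drain()
    return order

order = coordinate([(10, "trx1"), (200_000, "big"), (10, "trx2")], [[], []])
# "big" is applied only after trx1 drains, and before trx2.
assert order == ["trx1", "big", "trx2"]
```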
Reading this, you may wonder: since these two strategies were never merged into official MySQL, why introduce them at all? The answer is that they make it easier to understand the official and community strategies introduced next.
MySQL 5.6 parallel replication strategy
The official MySQL 5.6 release began to support parallel replication, but only at per-database granularity. The logic is similar to the table and row distribution strategies above; in the hash table used for distribution decisions, the key is simply the database name.
The parallel effect of this strategy depends on the pressure model: it works well if the primary has multiple databases and the pressure is evenly distributed across them. Compared with distribution by table or by row, this strategy has two advantages:
- Constructing the hash is fast: only the database name is needed, and an instance rarely has so many databases that the hash needs a million items.
- It imposes no requirement on the binlog format, since even statement-format binlogs readily yield the database name.
However, if all the tables on your primary are in the same database, this strategy provides no parallelism. And if hot spots are skewed across databases, say one holds the business logic tables and another only system configuration, it won’t help either.
MariaDB’s parallel replication strategy
In article 23, we introduced the redo log group commit optimization; MariaDB’s parallel replication strategy takes advantage of this property:
- Transactions that can be committed in the same group must not be modifying the same row;
- Transactions that executed in parallel on the primary can certainly also be executed in parallel on the standby.
In implementation, MariaDB does this:
- Transactions committed together in one group share the same commit_id; the next group’s commit_id is that value plus 1.
- The commit_id is written directly into the binlog.
- When the logs reach the standby for application, transactions with the same commit_id are distributed to multiple workers and executed in parallel.
- After the whole group finishes, the coordinator fetches the next batch.
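The steps above can be modeled as batching the binlog by commit_id. The sketch below is a toy model; the tuple layout and function names are assumptions, not MariaDB internals.

```python
# Toy model of commit_id scheduling: the standby applies one commit group
# at a time, with every transaction in a group running in parallel.

from itertools import groupby

def schedule(binlog):
    """binlog: list of (commit_id, txn_name) in commit order.
    Returns the batches the coordinator hands out, one group per batch."""
    return [[name for _, name in group]
            for _, group in groupby(binlog, key=lambda entry: entry[0])]

binlog = [(1, "trx1"), (1, "trx2"), (1, "trx3"),
          (2, "trx4"), (2, "trx5"), (2, "trx6")]
# The second group starts only after the first has fully finished.
assert schedule(binlog) == [["trx1", "trx2", "trx3"], ["trx4", "trx5", "trx6"]]
```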
When it came out, this strategy was quite stunning, because the prevailing thinking in the industry had been “parse the binlog and split transactions among workers”, whereas MariaDB’s goal is to “simulate the primary’s parallelism”.
However, this strategy has a problem: it does not truly achieve the goal of “simulating the primary’s concurrency”. On the primary, while one group of transactions is committing, the next group is simultaneously in the “executing” state.
As shown in the figure, assume three groups of transactions execute on the primary. You can see that trx4, trx5, and trx6 are executing while trx1, trx2, and trx3 commit; so shortly after the first group commits, the next group quickly enters the commit state.
The first figure shows the transactions executing in parallel on the primary; the second shows the resulting parallel effect on the standby under MariaDB’s replication strategy.
As you can see, when executing on the standby database, the second set of transactions cannot start executing until the first set of transactions has been fully executed, so the throughput of the system is insufficient.
In addition, this scheme is easily held back by large transactions. Suppose trx2 is a very large transaction: after trx1 and trx3 finish, the next group cannot start until trx2 completes, and during that time only one worker thread is busy, wasting resources.
Even so, this strategy is a beautiful innovation: it requires very few changes to the existing system, and the implementation is elegant.
MySQL 5.7 parallel replication strategy
Following MariaDB’s parallel replication implementation, the official MySQL 5.7 release provided similar functionality, controlled by the slave-parallel-type parameter:
- Setting it to DATABASE uses MySQL 5.6’s per-database parallel strategy;
- Setting it to LOGICAL_CLOCK uses a strategy similar to MariaDB’s.
But MySQL 5.7 optimizes parallelism further, and the idea is interesting. Start from this question: can every transaction that is in the “executing” state at the same moment be parallelized? The answer is no.
This is because some of those transactions may be in a lock-wait state due to lock conflicts. If such lock-waiting transactions were assigned to different workers on the standby, the standby could end up inconsistent with the primary.
The core of the MariaDB strategy above is that “all transactions in the commit state” can be parallelized: a transaction that has reached the commit state has already passed the lock-conflict checks. At this point, it helps to review the two-phase commit flowchart again.
In fact, a transaction does not need to wait until the commit state: once it reaches the redo log prepare stage, it has already passed the lock-conflict checks.
Therefore, the idea of MySQL 5.7 parallel replication strategy is:
- Transactions that are in the prepare state at the same time can be executed in parallel on the standby;
- A transaction in the prepare state can also be executed on the standby in parallel with transactions in the commit state.
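MySQL 5.7 records this on disk via the sequence_number and last_committed fields in each binlog transaction header (visible with mysqlbinlog). The sketch below models the resulting parallel-safety rule in simplified form; the exact field semantics here are an approximation, not the server’s internal code.

```python
# Sketch of the LOGICAL_CLOCK idea. Roughly, a transaction's last_committed
# is the sequence_number of the last transaction that had committed when
# this one entered the prepare stage.

def can_run_in_parallel(earlier, later):
    """earlier/later: (last_committed, sequence_number); earlier committed
    first. They overlapped in prepare/commit on the primary, and so are safe
    to replay in parallel, iff the later one began before the earlier committed."""
    later_last_committed = later[0]
    earlier_sequence_number = earlier[1]
    return later_last_committed < earlier_sequence_number

trx_a = (0, 1)          # prepared and committed first
trx_b = (0, 2)          # entered prepare while trx_a was still uncommitted
trx_c = (1, 3)          # entered prepare only after trx_a committed

assert can_run_in_parallel(trx_a, trx_b)        # parallel on the standby
assert not can_run_in_parallel(trx_a, trx_c)    # must wait for trx_a
```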
Article 23 also mentioned two parameters related to binlog group commit:
- binlog_group_commit_sync_delay, the number of microseconds to wait before calling fsync;
- binlog_group_commit_sync_no_delay_count, the number of transactions to accumulate before calling fsync.
In MySQL 5.7’s parallel replication strategy, these two parameters can be used to create more “transactions that are in the prepare state simultaneously”, which increases the standby’s replication parallelism.
In other words, these two parameters both “deliberately” slow the primary’s commits and speed up the standby’s execution. When dealing with standby latency on MySQL 5.7, you can try tuning them to increase the standby’s replication concurrency.