MySQL: How to Solve the out-of-date read problem with read/write separation

directory

Two kinds of architecture

Two architectural features

Forcibly remove the master library scheme

Sleep solutions

Determine the no-delay scheme in active/standby mode

Cooperate with semi – sync

Wait for the main repository point scheme

GTID scheme

Two kinds of architecture

How to solve the problem of read/write separation caused by the active/standby delay based on the read/write separation with one primary and multiple secondary nodes?

The main objective of read/write separation is to divide the pressure of stall master database.

There are two types of architecture:

1. The client takes the initiative to perform load balancing and puts the database connection information in the client connection layer. The client selects the back-end database for query.

2. Add an intermediate proxy layer between MySQL and the client. The client only connects to the proxy, and the proxy determines the distribution route of the request according to the request type and context

Two architectural features

1. Client direct connection scheme, less a layer of proxy forwarding, better query performance, simple structure.

Since you need to understand the details of back-end deployment, the client will sense the operation such as active/standby switchover and library migration and adjust the database connection information. The client reassigns a component to manage the back end, leaving the business side to focus only on business logic development

2, with proxy architecture, so that the client does not need to pay attention to back-end details.

The connection maintenance and back-end information maintenance are all done by proxy. But the architecture as a whole is relatively complex

Both architectures suffer from the “expired read” problem:

Because of the possible master-slave delay, the client can initiate a query immediately after executing an update transaction. If the query selects the slave library, it is possible to read the state before the update transaction.

The client expects the result of querying data from the slave library to be the same as that of querying data from the master library.

Here’s the solution:

Forced out of the main library scheme; Sleep; Determine the master-standby no-delay scheme; Cooperate with semi_SYNC program; Equal main repository point scheme; GTID scheme;Copy the code

In practice, these schemes can be mixed.

For example, the client now classifies requests into those that can accept expired reads and those that cannot accept expired reads at all. Then, for statements that cannot accept expired reads, the equal GTID or equal locus scheme is used,

Overdue reads are essentially caused by multiple reads on one write. To avoid overdue reads, there are only two options:

1. Give up over time

2. Go to the master database for query

Forcibly remove the master library scheme

Categorize query requests:

1. Force requests that require the latest results to be sent to the main repository.

For requests that can read old data, send it to the slave.

This scheme is tricky but useful.

The biggest problem of this scheme is that when the business needs high real-time performance, the read and write separation should be abandoned. All the read and write pressure is in the master library, which is equivalent to giving up the scalability.

Sleep solutions

After the master library updates, sleep before reading the slave library. For example, run a select sleep(1) command.

Assumption of the scheme: In most cases, the master/slave delay is within 1s, and doing a sleep can have a high probability to get the latest data

The scheme is not precise:

Sleep (1) will also wait 1s if the query request gets the correct result from the library in 0.1s

2. If the delay exceeds 1s, an expired read will still occur

Determine the no-delay scheme in active/standby mode

There are three ways to ensure there is no delay in the standby:

The seconds_behind_master parameter in the show slave status result can be used to measure the length of the master/slave delay.

The first way

Check whether seconds_behind_master is 0 each time before executing a query request from the library. If seconds_behind_master is not 0, wait until this parameter is 0 before executing the query operation.

The second way

Compare sites to ensure there is no delay between master and slave.

Master_Log_File and Read_Master_Log_Pos, which represent the most recent site read from the master library;
Relay_Master_Log_File and Exec_Master_Log_Pos represent the latest point of execution for the standby library;

If the values of Master_Log_File and Relay_Master_Log_File are the same, and the values of Read_Master_Log_Pos and Exec_Master_Log_Pos are the same, the received logs have been synchronized.

The third way

Compare the GTID collection to ensure that there is no delay between master and slave

Auto_Position = 1Indicates that the GTID protocol Retrieved_Gtid_Set is used for the primary/secondary relationship. It is the GTID set Executed_Gtid_Set of all logs received by the secondary library. It is the GTID set of all executed logs of the secondary libraryCopy the code

If the two collections are the same, the logs received by the standby database are synchronized.

State of a transaction’s binlog between primary and secondary databases:

1. After the execution of the master library is complete, write binlog and feed back to the client;

2. The binlog is sent from the primary database to the standby database. The standby database receives the binlog

3. Run the binlog command in the standby database.

The logic we used above to determine that there is no delay in active/standby is “all logs received by the standby database have been executed”, but some logs will be in the state that the client has received the commit confirmation, but the standby database has not received the log.

Three transactions trx1, TRX2, and TRx3 have been executed on the master library. The first two have been passed to the slave library and completed. Trx3 is finished in the master library and has been replied to the client, but has not been sent to the slave library. If the query request is executed on the slave library B, according to the logic of the above three methods, the slave library will think that there is no synchronization delay, but trx3 will still not be found.

Cooperate with semi – sync

To solve the above problem, introduce semi-synchronous replication, known as semi-sync replication

Semi-sync works like this:

1. When a transaction commits, the master sends the binlog to the slave.

2. After receiving a binlog from the host, send an ACK to the host

3. After the master library receives this ACK, it can return a “transaction completed” acknowledgement to the client

That is, if semi-sync is enabled, all standby transactions that have sent acknowledgments to the client have received the log.

The semi-sync+ site judgment scheme is valid in the one-master, one-standby scenario. In the one-master, multi-slave scenario, the master library only needs to wait for the ACK of a slave library and then start to send confirmation to the client.

But there are problems:

1, the query falls on the slave library that did not receive the latest log, resulting in an expired read.

2. In the peak period of business update, the loci or GTID set of the master library is updated rapidly, and the equivalence judgment of the two loci is always untenable. It is likely that the query request cannot be responded to from the library

Wait for the main repository point scheme

select master_pos_wait(file, pos[, timeout]);
Copy the code

The logic of this command is as follows:

1. Execute from the slave library

The file and pos arguments refer to the file name and location of the main library

3. Timeout Optional. If this parameter is set to a positive integer N, the function waits at most N seconds

Return a positive integer M, representing the number of transactions executed from the start of the command to the binlog position where file and pos are applied

The return value also has the following exception result:

1. NULL. During execution, an exception occurs on the standby database synchronization thread.

2. -1: The waiting time exceeds N seconds

3, 0, this position has already been executed

Using this method steps:

1, after the transaction trx1 is updated, execute show master status immediately to obtain the File and Position executed by the current master database

2. Select a slave library to execute the query

Select master_pos_wait(File,Position,1)

4, If the return value is a positive integer >=0, execute the query in the slave library

5. Otherwise, execute the query statement in the primary database.

Assuming that each select waits at most 1s on the slave library, if master_POS_wait returns an integer >=0 within 1s, then the result of the query executed from the slave library must contain TRX1 data.

If each slave library is delayed more than 1s, the query pressure is shifted to the primary library.

However, in order not to allow expired read, there are only two methods: 1, timeout discard 2, go to the primary database query

GTID scheme

 select wait_for_executed_gtid_set(gtid_set, 1);
Copy the code

The command logic is:

1. Wait until the transaction executed by the library contains the gtid_set passed in, returning 0

2. Return 1 after timeout

The execution process of GTID is as follows:

1. After the transaction trx1 is updated, the transaction GTID is directly obtained from the return package, denoted as gTId1

2. Select a slave library to execute the query

Select wait_for_executed_gtid_set(gtid1,1) from wait_for_executed_gtid_set(gtid1,1)

4. If the return value is 0, execute the query from the slave library

5. Otherwise, execute the query statement in the primary database

MySQL > return package with GTID after transaction

(1. Set session_track_gTIds to OWN_GTID 2. Mysql_session_track_get_first = mysql_session_track_get_first

MySQL: How to Solve the out-of-date read problem with read/write separation

directory

Two kinds of architecture

Forcibly remove the master library scheme

Determine the no-delay scheme in active/standby mode

Wait for the main repository point scheme

Related Posts