How does MySQL implement repeatable reads?

Photo by picography.co

MySQL Practice 45 Notes

Simple comprehension can be repeated

Repeatable read means that the data seen during the execution of a transaction is always the same as the data seen at the start of the transaction.

At repeatable read isolation, a transaction takes a snapshot when it is started. Note that this snapshot is based on the entire library.

At this point, you might be thinking, if a library is 100 GIGABytes, then I start a transaction and MySQL has to copy 100 gigabytes of data out, how slow this process is. But my usual business takes place quickly.

In fact, we don’t need to copy 100 gigabytes of data. Let’s take a look at how snapshot is implemented.

Take a snapshot

Each transaction in InnoDB has a unique transaction ID called transaction ID. It applies to InnoDB’s transaction system at the start of a transaction and is applied in a strictly incremental order.

Each record is updated with an undo log, which records the transaction ID of the current transaction (row trx_id). The latest value on the record can be rolled back to the value of the previous state.

As shown in the figure below, the latest value of a row is K =22 after it is updated by multiple transactions. Assume that transaction A is started after the transaction trx_id=15 is committed. When transaction A wants to read the row, undo log is used to calculate the value of the row at the moment the transaction is started as k=10.

At the repeatable read isolation level, InnoDB constructs an array of transaction ids that are currently “active” at the time a transaction is started. Active means it’s started but not yet committed.

The minimum transaction ID in the array is the low watermark, and the maximum transaction ID that has been created in the current system is the high watermark.

This array of views, together with the high water level, constitutes a read-view of the current transaction.

This view array divides all row trx_id into several different cases.

If trx_id is less than the low watermark, the version was committed before the transaction started.
If trx_id is greater than high water, this version is generated after the transaction is started and is not visible.
If trx_id is greater than the low water level and less than the high water level, there are two cases:
1. If trx_id is in the array, this version has not been committed at the time of transaction startup and is not visible.
2. If trx_id is not in the array, this version is already committed when the transaction is started.

InnoDB uses the combination of undo log and trx_ID to create snapshots in seconds at the start of a transaction.

Take a chestnut

Initialization statement

CREATE TABLE `t` (
  `id` int(11) NOT NULL.  `k` int(11) DEFAULT NULL.  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
insert into t(id, k) values(1.1), (2.2); Copy the code

The following table shows the execution flow of transactions A, B and C

A transaction	Transaction B	Transaction C
START TRANSACTION WITH CONSISTENT SNAPSHOT;
	START TRANSACTION WITH CONSISTENT SNAPSHOT;
		UPDATE t SET k=k+1 WHERE id=1;
	UPDATE t SET k=k+1 WHERE id=1;
	SELECT k FROM t WHERE id=1;
SELECT k FROM t WHERE id=1;
COMMIT;
	COMMIT;

We assume that the TRx_id of transactions A, B, and C is 100, 101, and 102, respectively. Only 99 transaction ids were active before transaction A started, and trx_id=90 for the row ID =1.

According to the hypothesis, we get the view array of transaction start moment: transaction A: [99, 100], transaction B: [99, 100, 101], transaction C: [99, 100, 101, 102].

Trx_id =102; trx_id=102;
Trx_id =101; trx_id=101; trx_id=101;
Transaction B finds that the latest record is 3 and trx_id=101 through the query statement, which meets the isolation conditions and returns k=3.
Transaction A uses A query statement:
1. The latest record is 3, trx_id=101, which is higher than the high watermark and invisible.
2. Using the undo log command, you can find the previous historical version, trx_id=102, which is larger than the high watermark and cannot be seen.
3. Trx_id =90, lower than low water level, visible.

Question: why can we see changes in transaction C when transaction B is updated?

If we assume that transaction B is updating and cannot see transaction C’s changes, what happens?

Transaction B detects that the latest record is 2, trx_id=102, and the watermark is larger than the high watermark.
Using the undo log command, you can find the previous version, trx_id=90, which is smaller than the low watermark.
Return k=1, execute k=k+1, update k to 2, trx_id=101.

If this is the case, transaction C may be confused: “why did MY update get lost?” Transaction B overrides the update for transaction C.

Therefore, InnoDB uses a rule when updating data: data is read before it is written, and this read can only read the current value, called “current read”.

Therefore, transaction B takes the latest data when it updates and makes updates based on it. (trx_id=101); (trx_id=101);

Let’s imagine another scenario:

After transaction B is updated, transaction C is immediately updated, transaction B is rolled back, and transaction C commits successfully.

Transaction B	Transaction C
START TRANSACTION WITH CONSISTENT SNAPSHOT;
	START TRANSACTION WITH CONSISTENT SNAPSHOT;
UPDATE t SET k=k+1 WHERE id=1;
	UPDATE t SET k=k+1 WHERE id=1;
	SELECT k FROM t WHERE id=1;
ROLLBACK;
	COMMIT;

If the current read definition is followed, the following accident occurs, assuming the current K=1:

Transaction B updates k to 2;
Transaction C reads the current latest value, k=2, updated to 3;
Transaction B rollback;
Transaction C commits.

At this point, transaction C finds that it wants to perform a +1 operation, which turns out to be a “+2” operation.

InnoDB does not allow this to happen. When transaction B executes an update statement, it will lock the row and release the lock until transaction B ends.

summary

InnoDB row data has multiple versions, each with row trx_id.
The transaction builds a consistent view that meets the current isolation level based on undo log and TRx_ID.
The core of repeatable reads is consistent reads, and when a transaction updates data, it can only use the current read. If the row lock of the current record is occupied by another transaction, it needs to enter the lock wait.

reference

03 | transaction isolation: why you changed I still can’t see? – Geek time

08 | transaction is isolated or not? – Geek time

This article was first published on my personal blog Chaohang.top

The author is Zhang Xiaochao

Public account [Super Super can’t fly]

Please indicate the source of reprint

Please follow my wechat official account [Chaochao Won’t fly] to get the latest updates.

This article is formatted using MDNICE

How does MySQL implement repeatable reads?

Simple comprehension can be repeated

Take a snapshot

Take a chestnut

Question: why can we see changes in transaction C when transaction B is updated?

summary

reference

Related Posts

JDBC — Use JDBC to connect to the MySQL database

New Year’s Eve Countdown posters in Python: Are you ready for the New Year?

After more than a month’s effort, the frequency of FGC was optimized by 400 times