Innodb transaction isolation implementation principle, do you understand? But many students are still confused after watching it. Recently, I made a study plan with a small team. I chose the way to answer the questions and found a topic for us to study together. The effect was very good.

By sharing this question, you can also check to see if you understand isolation. Through this question, you can master three parts:

  • The function of Undo log

  • Snapshot read and current read

  • Visibility rule

The default isolation level for MySQL is repeatable reads.

The title

Suppose we have a table mvcccase with two rows of (2,2)(3,3)

CREATE TABLE `mvcccase` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT 'on the ID'.`value` bigint unsigned NOT NULL DEFAULT '0' COMMENT 'value',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB

Copy the code

Start two transactions and do the following:

SessionA SessionB
start transaction; start transaction;
select * from mvcccase; (1) select * from mvcccase; (2)
update mvcccase set id = 4 where id =2; (3)
select * from mvcccase; (4)
select * from mvcccase; (5)
commit; (6)
select * from mvcccase; (7)
update mvcccase set id=5 where id = 4; (8)
select * from mvcccase; (9)
commit; (10)
select * from mvcccase; (11)

There are 11 operations in total. If you can answer the result of each operation correctly, you have mastered it well.

Basic knowledge of

To answer these questions, recall some key points.

Undo log

Undo Log was mentioned in how InnoDB Redo, undo, and binlog work together. The Undo Log is recorded before each row is operated on, so that the data can be rolled back.

The same transaction changes the data several times or multiple transactions change the same data. These changes are linked in chronological order. Therefore, undo log can be used to discover the history of data changes.

For a row of data, there is a hidden DB_TRX_ID field that records which transaction changed the row.

Snapshot read and current read

With Undo Log, we know that a row of data may have multiple versions.

Select lock in share mode, select for update; Update, INSERT, and delete are all current reads.

Snapshot reads may not necessarily read the latest version of data, but may read the previous historical version. Select is a snapshot read.

Visibility rule

A snapshot of the current database system is generated at the moment that a snapshot of the transaction is read.

The algorithm for generating a snapshot is relatively simple. When generating a snapshot, three data are generated: the ID array of the active transaction, the low water level, and the high water level. The low water level is derived from the minimum value of the array, and the high water level is calculated from the maximum transaction ID+1.

If the active id is 9,12,14, and the current maximum transaction id is 16, then the low water level is 8, the high water level is 17, and the active transaction array is {9,12,14}, as shown below:

  1. If it falls in green, the version is a committed transaction and the data is visible;

  2. If the transaction ID is its own value, the data is visible;

  3. If it falls in red, it indicates that the version was generated by a future transaction and is definitely not visible.

  4. If it’s in yellow, there are two things

    A. If row trx_id is in an array, this version was generated by a transaction that has not yet committed.

    B. If row trx_id is not in the array, this version is generated by a committed transaction.

If the current version is not visible, it looks for the previous version based on the linked list and judges the visibility rule again until it finds visible data.

The problem solving

Assume that the transaction ids of sessionA and sessionB are 9,12, (2,2), (3,3) and the values are executed by transaction 6.

SessionA-9 SessionB-12
start transaction; start transaction;
select * from mvcccase; (1)

(2,2)(3,3)
select * from mvcccase; (2)

(2,2)(3,3)
update mvcccase set id = 4 where id =2; (3)

Rows matched: 1 Changed: 1 Warnings: 0
select * from mvcccase; (4)

(3, 3) (4, 2)
select * from mvcccase; (5)

(2,2)(3,3)
commit; (6)
select * from mvcccase; (7)

(2,2)(3,3)
update mvcccase set id=5 where id = 4; (8)

Rows matched: 1 Changed: 1 Warnings: 0
select * from mvcccase; (9)

(2) (3, 3) (5, 2)
commit; (10)
select * from mvcccase; (11)

(5, 2) (3, 3)

1, 2,

(1)(2) Is a snapshot read, and the data in the DB is:

According to the visibility rules, transactions 9 and 12 can see the data from transaction 6, so the results are (2,2)(3,3).

3

The update operation for transaction 12 does two things

  • Delete the row where ID=2 as deleted

  • Create a new ID=4

The data in the DB is in the following format:

4

After the select command is executed, (3,3)(4,2) is displayed. According to the visibility rule, rows with ID=2,3, and 4 are visible to transaction 12, but ID=2 has been removed, so only (3,3)(4,2) is displayed.

5

Select (2,2)(3,3)

  • If ID=2, the latest record is changed by transaction 12, which is not visible to transaction 9. Therefore, it can only be visible after backtracking to find the version changed by transaction 6

  • The value ID=3 is always the version changed by transaction 6

  • If ID=4, only the version changed by transaction 12 is not visible

6

Only 12 submitted after the transaction, the other affairs can update ID = 2, 4, 12 to write locks, because the transaction other affairs to operation, will be blocked. As for the lock part, we will find examples to explain it later.

7

(2,2)(3,3) is still shown, although transaction 12 committed, the data in the table has not changed. The analysis process is the same as operation 5.

8

Although ID=4 is not shown in the transaction 7 snapshot, the update succeeds. Why?

  • Write lock on ID=4 has been released because SessionB committed.

  • Update =4; update =4;

The update operation for transaction 7 does two things:

  • Delete the row where ID=4 as deleted

  • Create a new ID=5

After the update, the data in the DB is in the following format:

9

Why does (2,2)(3,3)(5,2) produce such a strange result?

There are now four pieces of data in the table according to the visibility rule:

  • ID=2, the latest record is changed by transaction 12, not visible to transaction 9, so we backtrack to find the version changed by transaction 6, and the value is (2,2).

  • The value ID=3 is always the version changed by transaction 6

  • If ID=4, the latest version of transaction 9 is visible, but it has been deleted, so it is not displayed

  • If ID=5, the latest version of transaction 9 is visible

10

After sessionA commits, the Undo Log is also destroyed, and the final DB data is:

11

Select (3,3), (5,2), (3,3), (5,2);

conclusion

Hopefully, this topic will help you understand the isolation of transactions. For this kind of problem, mainly rely on the three axe: undo log, snapshot read and current read, visibility rules, once master the principle, you can solve the problem.

The final question is, if instead of changing the ID, you change the value in the table, what will happen at each step? Welcome to discuss.

The last

If you like my article, you can follow my public account (Programmer Malatang)

My personal blog is shidawuhen.github. IO /

Review of previous articles:

  1. Design patterns

  2. recruitment

  3. thinking

  4. storage

  5. The algorithm series

  6. Reading notes

  7. Small tools

  8. architecture

  9. network

  10. The Go