This is the sixth day of my participation in the August More text Challenge. For details, see: August More Text Challenge
I wrote a long time ago about atomicity, consistency and persistence of transactions. This article will talk about the implementation of Innodb transaction isolation. The default isolation level is repeatable read. Mainly talk about the overall context, after the context is clear, details can be supplemented by themselves.
1. Knowledge
To understand how transaction isolation works, there are a few things you need to know.
1.1 the snapshot read
Snapshot reads: Also known as consistent non-locking reads or consistent reads, that is, non-blocking reads without locking
Prerequisite: Snapshot reads are performed only when the isolation level is not serial. Snapshot reads at serial level degrade to the current read
Advantages: Based on the consideration of improving concurrency performance, the lock operation is avoided and the overhead is reduced
Implementation: The implementation of snapshot read is based on multi-version concurrency control, namely MVCC
Effect: Since it is based on multiple versions, the snapshot read may not be the latest version of the data, but may be a previous historical version
Syntax: General select is snapshot read
1.2 the current reading
Effect: The latest version of the record is read
Implementation: when reading, it is also guaranteed that other concurrent transactions cannot modify the current record. The record read will be locked
Syntax: select lock in share mode, select for update; Update, INSERT, DELETE (exclusive lock). They all need to be locked.
1.3 transaction ID
Each transaction in InnoDB has a unique transaction ID called transaction ID. It is applied to InnoDB’s transaction system at the beginning of the transaction, and it is applied in strict ascending order.
1.4 the Undo Log
Undo Log is mentioned in how InnoDB Redo, Undo, and Binlog work together. The Undo Log is recorded before each row of data is operated on, so that the data can be rolled back.
If the same transaction modifies the data multiple times or multiple transactions modify the same data, these changes will be linked in chronological order, so the undo log can discover the history of data modification.
Undo logs don’t always exist. There are two types of Undo logs:
Insert undo log: Represents the undo log generated when a transaction inserts a new record. It is required only when the transaction is rolled back and can be discarded immediately after the transaction is committed
Update undo log: Undo log generated when a transaction is updating or deleting. Not only for transaction rollback, but also for snapshot read; Therefore, you cannot delete logs at will. The purge thread deletes logs only when the snapshot read or transaction rollback is not involved
1.5 line structure
For each row of the table, in addition to our custom fields, there are implicitly defined DB_TRX_ID,DB_ROLL_PTR,DB_ROW_ID and other fields.
DB_TRX_ID: 6byte, transaction ID last modified (modified/inserted) Record ID of the transaction that created the record/last modified the record
DB_ROLL_PTR: 7byte, a pointer to the previous version of the record (stored in the ROLLBACK segment), essentially utilizing undo log capabilities
DB_ROW_ID: 6 bytes. If the table has no primary key, InnoDB will automatically generate a cluster index using DB_ROW_ID
1.6 Consistency View (read-view)
1.6.1 definition
At the moment a snapshot is read from the transaction, a snapshot of the current database system is generated. Note that this snapshot is based on the entire library. The ids of the current active transactions in the system are recorded and maintained. The minimum transaction ID is denoted as the low watermark, and the maximum transaction ID that has been created in the current system plus 1 is denoted as the high watermark. They form a read-view of the current transaction.
1.6.2 Generation time
The begin/start transaction commands are not the starting point of a transaction. The transaction starts only after the first statement that operates the InnoDB table. If you want to start a transaction immediately, use the start transaction with consistent snapshot command.
In the first boot mode, the consistency view is created when the first snapshot read statement, such as select, is executed.
In the second startup mode, the consistency view is created when start Transaction with consistent snapshot is executed.
1.7 MVCC
MVCC is called Multi-Version Concurrency Control. MVCC is a kind of concurrency control method, generally in the database management system, to achieve concurrent access to the database, in the programming language to achieve transaction memory.
2. Data visibility rules
Take a look at the data visibility rules in terms of the Undo Log and consistency views mentioned above.
-
If it falls in the green area, the version is a committed transaction and the data is visible.
-
If the transaction ID is its own value, the data is visible.
-
If it falls in red, the version is generated by a future transaction and is definitely not visible.
-
If it falls in yellow, there are two cases
A. If row trx_id is in an array, this version is generated by an uncommitted transaction and is not visible.
B. If row trx_id is not in the array, this version is generated by a committed transaction.
If the current version is not visible, the previous version is looked for against the linked list, and the visibility rules are judged again until visible data is found.
By following this process, it is clear which value the transaction will read and how transaction isolation is implemented. You can also see that the snapshot read is not a copy of all the data, but is restricted by visibility rules.
On the visibility rule, let’s look at this example. Select (id:1,k:1); select (id:1,k:1); select (id:1,k:1);
3. The row locks
Transaction B finds 3 and transaction A finds 1.
It is easy to understand if transaction A looks up 1. According to the visibility rule, transaction B and transaction C’ are both unstarted transactions for transaction A, so when transaction A looks up, it will keep backtracking and find (1, 1).
Transaction B is interesting. Shouldn’t it also be 1? Why 3? There are two details in this.
-
Select lock in share mode, select for update; Update, INSERT, and DELETE (exclusive locks) are all current reads that lock the row that has been modified. Transaction B’s update cannot continue until transaction C’ commits, otherwise transaction B blocks.
-
DB_TRX_ID specifies the transaction ID of transaction B, and the value of DB_TRX_ID is changed to 3 according to the visible rule.
So two examples can be inferred:
-
Select * from t where id=1; select * from t where id=1; This often leads people to think that snapshot reads fail, but this is in accordance with data visibility rules.
-
Select k from T where id=1 lock in share mode select k from T where id=1 lock in share mode If transaction B commits, lock in share mode will be blocked.
4. To summarize
The core of repeatable reads is consistent read; When a transaction updates data, only the current read can be used. If the current record row lock is occupied by another transaction, a lock wait is required.
Transaction isolation is concatenated by consistent reads, current reads, and row locks.
data
-
MVCC multi-version concurrency control
-
Mysql-innodb transaction-consistent read (snapshot read)
-
Mysql Consistency View (MVCC)
-
MySQL45 speak
The last
If you like my article, you can follow my public number (programmer Malatang)
My personal blog is shidawuhen.github. IO /
Review of previous articles:
-
Design patterns
-
recruitment
-
thinking
-
storage
-
The algorithm series
-
Reading notes
-
Small tools
-
architecture
-
network
-
The Go