Transaction characteristics
-
Atomicity
A transaction must be treated as an indivisible, minimal unit of work in which all operations are either committed successfully or rolled back on failure.
-
Consistency
A database always transitions from one consistent state to another, and a transaction must be in a consistent state both before and after execution.
-
Isolation
Typically, changes made by one transaction are not visible to other transactions until they are finally committed. Databases provide multiple levels of isolation for transactions.
-
They offer “Durability”
Once a transaction commits, its changes are permanently stored in the database. Even if the database system encounters a failure, it will not be lost.
Problems with concurrent transactions
- Dirty read
A transaction is modifying a record whose data is in an inconsistent state before the transaction completes and commits. At this point, another transaction reads the same record, and if left unchecked, the second transaction reads the “dirty” data and performs further processing accordingly, resulting in uncommitted data dependencies.
time | A transaction | Transaction B |
---|---|---|
T1 | Open the transaction | Open the transaction |
T2 | The query account balance is 1000 | |
T3 | Top up 500 and change the balance to 1500 | |
T4 | Query balance is1500 | |
T5 | Undo the transaction and change the balance back1000 | |
T6 | Remit 500 and change the balance to2000 | |
T7 | Commit the transaction |
- Unrepeatable read
At some point after reading some data, a transaction reads previously read data again, only to discover that the read data has changed or that some records have been deleted.
time | A transaction | Transaction B |
---|---|---|
T1 | Open the transaction | Open the transaction |
T2 | Select * from user where user_id=100 Assume xiaoming’s user information | |
T3 | Example Change the age of the user whose user_id is 100 to18 | |
T4 | Commit the transaction | |
T5 | The user is found againAge changethe | |
T6 | . | |
T7 | Commit the transaction |
- Phantom read
A transaction re-reads previously retrieved data under the same query criteria, only to find that another transaction inserts new data that meets its query criteria.
time | A transaction | Transaction B |
---|---|---|
T1 | Open the transaction | Open the transaction |
T2 | Select * from user where age=18 | |
T3 | Insert a new record into the user table with age=18 | |
T4 | Commit the transaction | |
T5 | A second query yields three records | |
T6 | . | |
T7 | Commit the transaction |
The difference between illusory and unrepeatable reads
- The point of non-repeatable reads is modification: in the same transaction, the same condition, the first and second read data is inconsistent (other transactions have committed changes in the middle).
- The focus of illusionary reading is to add or delete records: in the same transaction, the number of records read on the first and second time is different under the same condition (other transactions are committed to add or delete records in the middle).
Transaction isolation level
The SQL standard defines four classes of isolation levels, each of which specifies what changes are made within a transaction, which are visible within and between transactions, and which are not.
- Read Uncommited
All transactions can see the execution results of other uncommitted transactions, and this isolation level is generally not used.
- Read Committed (RC)
A transaction can only see changes made by committed transactions.
- Repeatable Read (RR)
Ensure that multiple instances of the same transaction see the same rows when reading data concurrently.
- Serializable
Fully serialized reads, each read requires a table-level shared lock, and reads and writes block each other.
Isolation level | Dirty read | Unrepeatable read | Phantom read |
---|---|---|---|
Read Uncommited | Yes | Yes | Yes |
Read Committed | No | Yes | Yes |
Repeatable Read | No | No | Yes |
Serializable | No | No | No |
Concurrent transaction solution
Dirty read, unrepeatable read and phantom read are all database read consistency problems, which need to be solved by the transaction isolation mechanism provided by the database.
(1) Lock mechanism
Resolve writ-write conflicts. Before reading data, lock it to prevent other transactions from modifying the data.
-
Pessimistic locking
Often rely on the locking mechanism provided by the database.
-
Optimistic locking
Most of them are based on the data version recording mechanism.
(2) MVCC multi-version concurrency control
Resolve read-write conflicts. A mechanism to generate a snapshot of consistent data at the point in time of the data request without locking, and use this snapshot to provide consistent reads at a certain level (statement level or transaction level). In this way, the read operation does not need to block the write operation, and the write operation does not need to block the read operation.
MVCC multi-version concurrency control
Most of Mysql’s transactional storage engine implementations are not simple row-level locking, but generally implement MVCC multi-version concurrency control based on concurrency performance considerations. MVCC is implemented by keeping a snapshot of the data at a point in time. The transaction sees the same data regardless of how long the transaction executes.
A read operation
There are two types of read operations: snapshot read and current read.
Snapshot read: A simple select operation is snapshot read and does not lock.
- select * from table where ? ;
Current read: special read operations, such as insert/update/delete operations, belong to current read and need to be locked.
- select * from table where ? lock in share mode ;
- select * from table where ? for update ;
- update table set ? where ? ;
- delete from table where ? ;
Data is stored
Each row of innoDB storage engine contains hidden fields: DB_ROW_ID, DB_TRX_ID, DB_ROLL_PTR, and DELETE_BIT.
- DB_TRX_ID: specifies the id of the transaction that last modified the row record. Internally, the delete operation is an update operation that updates the delete identifier bit DELELE_BIT in the row.
- DB_ROLL_PTR: Indicates the undo log that points to the current data. This pointer is used to retrieve the data before it was updated.
- DB_ROW_ID: contains a row ID that increases monotonically as new rows are inserted. When innoDB automatically generates a clustered index, the clustered index includes the value of this row ID. Otherwise, this row ID will not appear in any index.
- DELELE_BIT: indicates whether the record is deleted.
Data manipulation
-
insert
Create a record DB_TRX_ID for the current transaction ID and DB_ROLL_PTR for NULL.
-
delete
Set the current line DB_TRX_ID to the current transaction ID and DELELE_BIT to 1.
-
Update a new row DB_TRX_ID is the current transaction ID, DB_ROLL_PTR refers to the previous version of the record, and DB_ROLL_PTR is set to NULL after the transaction is committed.
-
select
1. Only look for records that were created before the current transaction ID to ensure that the current transaction reads rows that were created or modified by the current transaction;
2. When the DELETE BIT of the row is 1, find records that are deleted later than the current transaction ID to ensure that the row is not deleted before the current transaction starts.
Read consistency
Consistent reads in Mysql are implemented through the Read View structure. Read View is primarily used for visibility purposes. It maintains other currently active transactions that are not visible to this transaction. The earliest transaction ID is up_limit_id, and the latest transaction ID is low_limit_id.
trx_id_t low_limit_id;
/ *! < The read should not see any transaction with trx id >= this value. In other words, this is the "high water mark". */
trx_id_t up_limit_id;
/ *! < The read should see all trx ids which are strictly smaller (<) than this value. In other words, this is the "low water mark". */
Copy the code
How to understand low_limit_id
Can refer to zhihu this answer to understand. Low_limit_id should be the next transaction ID that has not yet been assigned by the current system (easier to understand from this semantics), which is the maximum transaction ID that has occurred so far +1.
How does MySQL achieve read non-blocking at RC isolation level? Ha ha a smile 100 mei sheng’s answer
Visibility judgment
Assuming that the last committed transaction ID of the row to be read (i.e. the stable transaction ID of the current data row) is trx_id, the visibility comparison process is as follows:
- Trx_id < up_limit_id => The last modification of this record before the read View is created jumps to Step 5;
- Trx_id > low_limit_id => The last modification of this record after the read View is created, jump to Step 4;
- Up_limit_id <= trx_id <= low_limit_id => Traverse from up_limit_id to low_limit_id. If trx_id is equal to one of the transaction ids, the last modification of this record has not been saved. Go to Step 4. Otherwise, go to Step 5.
- Assign the undo log DB_TRX_ID to the undo log to which the DB_ROLL_PTR pointer is pointing (the last modification of this record). Go to Step 1 and restart the calculation of visibility.
- If DELELE_BIT of this record is false, it indicates that the record has not been deleted. Otherwise, this record is not returned.
RR and RC isolation levels
Repeatable Read and Read Committed isolation levels are implemented based on Read View. The differences are as follows:
-
Repeatable Read
The read View is created immediately after the first SELECT statement in the transaction is executed, and all subsequent selectsreuse this object, so consistency is guaranteed with each read. (Repeatable semantics)
-
Read Committed
Each SELECT statement in a transaction creates a Read View so that it can read what other transactions have committed.
For InnoDB, Repeatable Read has a higher isolation level than Read Committed, but the overhead is relatively low.
The resources
Database transactions and MySQL transactions summary
Mysql-innodb-mvcc Multiversion concurrency control
MySQL InnoDB MVCC in-depth analysis
What is the difference between Optimistic Locking and MVCC?
How does MySQL achieve read non-blocking at RC isolation level?