preface
MySQL transaction isolation level: MySQL transaction isolation level: MySQL transaction isolation level: MySQL transaction isolation level: MySQL transaction isolation level
The transaction
What is a transaction?
Transactions, consisting of a limited sequence of database operations that are either all or none executed, are an indivisible unit of work.
If A transfers 100 yuan to B, first deduct 100 yuan from A’s account and then add 100 yuan to B’s account. If after deducting 100 yuan from A, there is no time to add it to B, the banking system will be abnormal, and finally the balance of A will decrease, but the balance of B will not increase. So you need A transaction to roll back A’s money, that’s it.
Four characteristics of transactions
- Atomicity: The transaction is executed as a whole, and all or none of the operations on the database contained within it are executed.
- Consistency: data will not be destroyed before and after the transaction. If account A transfers 10 yuan to account B, the total amount of account A and B will remain the same regardless of success or failure.
- Isolation: When multiple transactions are accessed concurrently, the transactions are isolated from each other. One transaction should not be disturbed by other transactions, and multiple concurrent transactions should be isolated from each other.
- Persistence: Indicates that operational changes made by the transaction to the database will be persisted in the database after the transaction is committed.
Problems with transaction concurrency
What is the problem with concurrent transaction execution, in other words, how can one transaction interfere with another? Let’s take an example
Suppose we now have tables:
CREATE TABLE `account` (
`id` int(11) NOT NULL,
`name` varchar(255) DEFAULT NULL,
`balance` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `un_name_idx` (`name`) USING BTREE ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Copy the codeCopy the code
Table has data:
Dirty read
Suppose we now have two transactions A and B:
- Assuming that A’s balance is now 100, transaction A is preparing to query Jay’s balance
- At this point, transaction B deducts Jay’s balance by 10
- Finally, A reads the balance after deduction
As can be seen from the figure above, transaction A and B are executed alternately, and transaction A is interfered by transaction B, because transaction A reads the uncommitted data of transaction B, which is called dirty read.
Unrepeatable read
Suppose we now have two transactions A and B:
- Transaction A queries Jay’s balance and finds 100
- At this time, transaction B deducts Jay’s account balance, deducts 10, and submits the transaction
- Transaction A then queries Jay’s account balance and finds that it’s 90
Transaction A is interfering with transaction B again! In transaction A, two identical queries that read the same record but return different data are called non-repeatable reads.
Phantom read
Suppose we now have two transactions A and B:
- Transaction A first queries the account whose ID is greater than 2, and then obtains two records with id=2 and ID =3
- At this point transaction B opens, inserts a record with ID =4, and commits
- Transaction A executes the same query again and returns three records with id=2,3,4.
Transaction A queries the result set of A range, another concurrent transaction B inserts/deletes data into the range and commits it silently, then transaction A queries the same range again and gets A different result set from the two reads. This is A phantom read.
Four isolation levels for transactions
Since there are problems with concurrent transactions such as dirty reads, non-repeatable, phantom reads, etc., what kind of transaction isolation levels does InnoDB implement?
- Read Uncommitted
- Read Committed
- Repeatable Read
- Serializable
Read Uncommitted
The best way to learn a knowledge point is to practice it. Let’s go to the database and give it a read uncommitted isolation level
Set transaction isolation level to Read uncommitted, start transaction A, and query data whose ID is 1
set session transaction isolation level read uncommitted;
begin;
select * from account where id =1; Copy the codeCopy the code
The results are as follows:
Set transaction isolation level to Read Uncommitted, start transaction B, and perform the update
set session transaction isolation level read uncommitted;
begin;
update account set balance=balance+20 where id =1; Copy the codeCopy the code
Select * from transaction A where id=1; select * from transaction A where id=1;
You can see that at the Read Uncommitted isolation level, a transaction can Read data that another transaction has not committed, which is a dirty Read problem. Transaction B has not committed to the database yet, and transaction A has read it. In fact, read uncommitted is the lowest level of isolation.
READ COMMITTED
To avoid dirty reads, the database has a higher isolation level than read uncommitted, that is, committed reads.
Set the current transaction isolation level to READ COMMITTED, start transaction A, and query the data whose ID is 1 in the account
set session transaction isolation level read committed;
begin;
select * from account where id =1; Copy the codeCopy the code
Open another window to open mysql, set transaction isolation level to Read COMMITTED, start transaction B, and do the following
set session transaction isolation level read committed;
begin;
update account set balance=balance+20 where id =1; Copy the codeCopy the code
Select * from transaction A; select * from transaction A; select * from transaction A;
Let’s go to transaction B’s window and commit:
commit; Copy the codeCopy the code
Finally return to transaction A window query, found that the data changed:
From this, it can be concluded that when the isolation level is set to READ COMMITTED, dirty reads are no longer a problem and the current transaction can only READ data COMMITTED by other transactions. But, if you think about it from transaction A’s point of view, are there other problems?
What is the problem with the isolation level of the commit read?
In the same transaction A, the same query SQL, read the same record (id=1), the read result is different, that is, cannot repeat read. As a result, when the isolation level is set to Read COMMITTED, there is a concurrency problem of unrepeatable reads.
Repeatable Read
What if your boss requires that the query results be consistent in the same transaction, that is, your boss asks you to solve the problem of unrepeatable concurrency? Can’t do it, boss? Practice the isolation level of Repeatable Read
Repeatable Read (repeatable read) : repeatable read (repeatable read) : repeatable read (repeatable read) : repeatable read (repeatable read) : repeatable read (repeatable read) : repeatable read (repeatable read
Does the RR level solve the illusionary problem?
Let’s take a look at a hot issue on the Internet, there is about RR level, whether to solve the illusion problem? Let’s put it into practice:
The query result set of step 2 and Step 6 does not change. It seems that the illusion problem has been solved at the RR level.
Update account set balance=200 where id=5; Select * from account where id>2; select * from account where id>2; SQL > update transaction; SQL > update transaction; SQL > update transaction
This question, my dear friend, do you think it is an illusory problem?
Serializable
Each of the previous three database isolation levels has some concurrency problems. Now practice the SERIALIZABLE isolation level.
Set transaction isolation level to Serializable, enable transaction A, and query account table data
set session transaction isolation level serializable;
select @@tx_isolation;
begin;
select * fromaccount; Copy the codeCopy the code
Set transaction isolation level to Serializable, open transaction B, insert a data:
set session transaction isolation level serializable;
select @@tx_isolation;
begin;
insert into account(id,name,balance) value(6.'Li'.100); Copy the codeCopy the code
The result is as follows:
As can be seen from the figure, when the database isolation level is set to Serializable, transaction B writes to the table while transaction A reads the table. In fact, this is the strictest isolation level, and concurrent reads and writes are not allowed. It guarantees the best security, but performance is a problem
MySql isolation level implementation principle
There are two main ways to implement an isolation mechanism:
- Read-write lock
- Consistency snapshot read, namely, MVCC
MySql uses different Locking strategies /MVCC to achieve four different isolation levels. The implementation principles of RR and RC are related to MVCC, RU and Serializable are related to locking.
Read Uncommitted
Official:
SELECT statements are performed in a nonlocking fashion, but a possible earlier version of a row might be used. Thus, using this isolation level, such reads are not consistent.
The read is not committed, the read is not locked principle.
- Transaction reads are unlocked and do not block reads and writes of other transactions
- Transaction writes block other transaction writes, but do not block other transaction reads;
Serializable
The official line:
InnoDB implicitly converts all plain SELECT statements to SELECT … FOR SHARE if autocommit is disabled. If autocommit is enabled, the SELECT is its own transaction. It therefore is known to be read only and can be serialized if performed as a consistent (nonlocking) read and need not block for other transactions. (To force a plain SELECT to block if other transactions have modified the selected rows, disable autocommit.)
- All SELECT statements are implicitly converted to SELECT… FOR SHARE, that is, add the SHARE lock.
- Read and shared lock, write and exclusive lock, read and write mutually exclusive lock. If rows are being modified by uncommitted transactions, all statements that select those rows will block.
Realization principle of MVCC
MVCC (Multi-version concurrency control) is a mechanism to reduce concurrent transaction conflicts and improve concurrency performance by reading historical version data. Its implementation relies on implicit fields, undo logging, snapshot Read & current Read, and Read View, so let’s get started.
Implicit field
For the InnoDB storage engine, each row has two hidden columns DB_TRX_ID and DB_ROLL_PTR, and a third hidden primary key column DB_ROW_ID if there are no primary keys or non-null unique keys in the table.
- DB_TRX_ID specifies the transaction ID of each row that was last modified (modified/updated). The size is 6 bytes.
- DB_ROLL_PTR. This hidden column is a pointer to the undo log of the rollback segment. The size is 7 bytes.
- DB_ROW_ID, monotonically increasing row ID, 6 bytes in size;
The undo log
When the transaction is not committed, the mirror of the modified data (the old version before the modification) is saved to the Undo log. So that when the transaction is rolled back, the old version data can be restored and the impact of uncommitted transaction data on the database can be cancelled. Undo logs are logical logs. You can assume that when a record is deleted, the Undo log records a corresponding INSERT record, and when an update record is updated, it records a corresponding reverse update record. The rollback section is where undo logs are stored.
When multiple transactions operate on a row in parallel, changes made to the row by different transactions generate multiple versions, and then an Undo log chain is linked by the rollback pointer (DB_ROLL_PTR).
Let’s look at ~ by example
mysql> select * from account ;
+----+------+---------+
| id | name | balance |
+----+------+---------+
| 1 | Jay | 100 |
+----+------+---------+
1 row in set (0.00SEC) copies the codeCopy the code
- Suppose the table Accout now has only one record, and the transaction Id of the record inserted is 100
- If transaction B (transaction Id 200) updates the row with Id =1 and changes the balance value to 90
After transaction B is modified, the Undo Log chain is as follows:
Snapshot read & Current read
The snapshot to read:
Select (); select (); select ();
select * from account where id>2; Copy the codeCopy the code
The current reading:
Read is the latest version of the record data, showing that the lock is the current read
select * from account where id>2 lock in share mode;
select * from account where id>2 forupdate; Copy the codeCopy the code
Read View
- Read View is the Read View generated when a transaction performs a snapshot Read.
- Transaction snapshot reads generate a snapshot of the current database system, record the active read and write transactions in the current system, and put them in a list.
- The Read View is used to determine which version of the data is visible in the current transaction
To facilitate the discussion of Read View visibility rules, let’s define a few variables
M_ids: the ids of active read and write transactions in the current system. The data structure is a List. Min_limit_id: m_IDS Specifies the minimum transaction ID in the transaction list. Max_limit_id: specifies the maximum transaction ID in the m_IDS transaction list
- If DB_TRX_ID < min_limit_id, this indicates that the transaction that generated this version was committed before the ReadView was generated (because the transaction ID is incremented), so this version can be accessed by the current transaction.
- If DB_TRX_ID > m_IDS is the largest transaction ID in the list, the transaction that generated this version was generated after ReadView was generated, so this version is not accessible by the current transaction.
- If min_limit_id =
Attention!! The biggest difference between RR and RC isolation levels is that RC generates a ReadView each time it reads data, while RR generates a ReadView only on the first time it reads data.
READ COMMITTED An analysis process in which unrepeatable reads exist
In my opinion, the best way to understand a new knowledge point is to analyze the context of the existing problem/phenomenon. RC is also related to MVCC, RC is a repeat read concurrency problem, so let’s analyze a wave of RC, first look at the implementation process
Suppose there are two transactions executing in the system, A and B, with transaction IDS 100 and 200 respectively, and suppose there are old data, insert transaction ID is 50 ha ~
Transaction A performs query 1 first
Transaction ID = Transaction ID100begin ; The query1: select *from account WHERE id = 1; Copy the codeCopy the code
Transaction B performs the update operation with id =1. The following undo log chain is recorded
begin;
update account set balance =balance+20 where id =1; Copy the codeCopy the code
Go back to transaction A and perform query 2
begin ; The query1: select *from account WHERE id = 1; The query2: select *from account WHERE id = 1; Copy the codeCopy the code
Query 2 Perform analysis:
- Transaction A regenerates A ReadView when executing the SELECT statement. Transaction B (200) is active, so ReadView’s m_IDS list is [200].
- As can be seen from the above undo log chain, the latest version of balance is 1000, and its transaction ID is 200, which is in the active transaction list, so the current transaction (transaction A) is not visible.
- Balance =100; transaction Id = 50; transaction Id = 50; transaction Id = 200
We go back to transaction B and commit with the undo log chain unchanged
begin;
update account set balance =balance+20 where id =1; Commit Copy codeCopy the code
Go back to transaction A again and perform query 3
begin ; The query1: select *from account WHERE id = 1; The query2: select *from account WHERE id = 1; The query3: select *from account WHERE id = 1; Copy the codeCopy the code
Query 3 Perform analysis:
- Transaction A regenerates A ReadView at the time of the SELECT statement. Since transaction B (200) has committed and is not active, ReadView’s m_IDS list contents are empty.
- So transaction A directly reads the latest record and reads the balance =120 version of the data.
So, this is how RC has the problem of unrepeatable reading. If you don’t understand something, you can read it several times
Repeatable Read An analysis to solve the Repeatable Read problem
How does RR isolation solve the problem of unrepeatable read concurrency
You might think two concurrent transaction examples are too easy, ok! Let’s get a little exciting and start three transactions
Suppose there are two transactions A, B, and C executing in the system with transaction ids 100, 200, and 300 respectively, and the transaction IDS of the stock data inserted are 50~
Transaction ID = Transaction ID100
begin ;
UPDATE account SET balance = 1000 WHERE id = 1; Copy the codeCopy the code
Transaction ID = Transaction ID200
begin ; // open a transaction, occupy the pit firstCopy the codeCopy the code
In this case, the following undo log chain is recorded in the account table with id =1:
Transaction ID = Transaction ID300
begin ;
// query 1: select * from account WHERE id = 1;Copy the codeCopy the code
Query 1 Execution process analysis:
- Transaction C creates a ReadView when executing a SELECT statement. Because transactions A (100) and B (200) are active, the ReadView m_IDS list is [100, 200].
- As can be seen from the undo log chain above, the latest version of balance is 1000, and its transaction ID is 100, which is in the active transaction list, so the current transaction (transaction C) is not visible.
- Query 1 returns balance=100 ~~ (balance=100, transaction Id = 50)
Next, we commit transaction A:
Transaction ID = Transaction ID100
begin ;
UPDATE account SET balance = 1000 WHERE id = 1; commit; Copy the codeCopy the code
In transaction B, perform the update operation and change the record balance of id=1 to 2000. After the update, undo log chain is as follows:
Transaction ID = Transaction ID200
begin ; // open a transaction, occupy the pit first
UPDATE account SET balance = 2000 WHERE id = 1; Copy the codeCopy the code
Go back to transaction C and execute query 2
Transaction ID = Transaction ID300
begin ;
// query 1: select * from account WHERE id = 1;
// query 2: select * from account WHERE id = 1;Copy the codeCopy the code
Query 2: Perform analysis:
- For query 2 at RR level, the ReadView is already generated, so the active transaction list is [100,200].
- As can be seen from the above undo log chain, the latest version of Balance is 2000, and its transaction ID is 200, which is in the active transaction list, so the current transaction (transaction C) is not visible.
- Moving on to the next version, the balance line is 1000 and the transaction Id is 100, which is also in the active transaction list, so the current transaction (transaction C) is not visible.
- Balance =100; balance=100; transaction Id = 50; transaction Id = 50; active transaction Id =100
Lock related concepts supplement (Attached) :
Shared locks and exclusive locks
InnoDB implements standard row-level locks, including two types: shared locks (s-locks) and exclusive locks (X-locks).
- Shared lock (S lock) : Allows a locking transaction to read a row.
- Exclusive lock (X lock) : Allows a locked transaction to update or delete a row.
If transaction T1 holds the s lock on row R, then another transaction T2 requests the lock on row R and does the following:
- T2’s request for the S lock is immediately granted, resulting in both T1 and T2 holding the S lock on row R
- T2 requesting x lock was not allowed immediately
If T1 holds the x lock on R, then T2 cannot request the X and S locks on r immediately. T2 must wait for T1 to release the X lock, because the X lock is incompatible with any locks.
Record Locks
- A record lock is the simplest row lock, holding only one row. SELECT c1 FROM t WHERE c1 = 10 FOR UPDATE
- A record lock is always placed on an index. Even if a table does not have an index, InnoDB implicitly creates an index and uses that index to enforce a record lock.
- Blocks other transactions from inserting, updating, or deleting it
Lock_mode X locks REc but not GAP
RECORD LOCKS space id 58 page no 3 n bits 72 index `PRIMARY` of table `test`.`t`
trx id 10078 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 00000000274f; asc 'O;; 2: len 7; hex b60000019d0110; asc ;; Copy the codeCopy the code
Gap Locks
- A gap lock is a lock placed between two indexes, either before the first index or after the last index.
- Using gap locks locks an interval, not just each piece of data in that interval.
- Gap locks only prevent other transactions from inserting into the gap, they do not prevent other transactions from acquiring a gap lock on the same gap, so gap X lock and GAP S lock have the same effect.
Next-Key Locks
- A next-key lock is a combination of a record lock and a gap lock. It refers to a lock placed on a record and the gap in front of that record.
Phantom read analysis exists at RC level
Since RC has illusory problems, we first cut to the RC isolation level and analyze a wave of ~
Suppose the account table has four entries.
- Start transaction A, perform current read, and query all records whose ID is >2.
- Then open transaction B and insert data with id=5.
- After transaction B successfully inserts data, modify the record whose id=3
- Go back to transaction A and execute the current read query with ID >2 again
- Transaction B can insert data with id=5, but cannot update data with ID =3, and is blocked. Prove the transaction A at the time of execution of the current read in id = 3 and id = 4 added A lock on this two records, but there is no to id > 2 locking ~ this range
- After transaction B is blocked, A deadlock occurs while cutting back to transaction A to perform the current read operation. Because the transaction at the time of the insert, B will be on the new record (id = 5) lock, so the transaction execution of the current read again, want to get id > 3 records, you need the id = three, four, five lock on this article 3 records, but the id = 5 this record has been transaction B locked, so the transaction by transaction B block, Transaction B is also waiting for transaction A to release the lock on id = 3, resulting in A deadlock.
As a result, we can see that in RC isolation level, select, UPDATE, delete, etc., the lock is using record lock, other transaction inserts can still be performed, therefore phantom read! MySQL interview 答 案 : MySQL interview 答 案 : MySQL interview
RR level solves illusionary analysis
Because RR is to solve the illusory problem, how to solve the problem, analyze a wave ~
Assume that the account table has four entries, RR.
- Start transaction A, perform current read, and query all records whose ID is >2.
- Then open transaction B and insert data with id=5. Transaction B blocks the insert operation because transaction A is performing select… Select * from share mode where id = 3 and id = 4; select * from share mode where id > 2;
For RR isolation, select, UPDATE, delete and other locked statements use a gap lock and a temporary key lock to lock ranges between index records and avoid inserting records between ranges to avoid phantom row records.
Original link: juejin.cn/post/684490…