Transactions, isolation, and MySQL locks cover most concurrency control scenarios. So let’s take a closer look at MySQL concurrency control.
First, let’s review transactions, isolation levels, and locks.
1. Review transactions, isolation levels, and locks
1.1. Transactions and transaction characteristics
A database transaction is a sequence of data operations that access and may operate on various data items, all of which may or may not be performed, and is an indivisible unit of work. A transaction consists of all database operations performed at the beginning of a transaction and before the end of a transaction.
Transaction characteristics:
- Atomicity refers to the fact that a transaction is an indivisible unit of work in which all or none of the operations occur.
- Consistency, the transaction must change the database from one consistent state to another;
- Isolation: the execution of a transaction cannot be disturbed by other transactions, that is, the operations and data used within a transaction are isolated from other concurrent transactions, and the concurrently executed transactions cannot interfere with each other.
- Durability, changes to data in the database are permanent and cannot be rolled back once transactions are committed.
1.2. Isolation level
MySQL implements standard isolation levels:
- Read uncommitted (read_uncommited), which allows other transactions to read uncommitted modification data of the current transaction;
- Read Committed (read_commited), allowing other transactions to read the current transaction committed changes;
- Repeatable_read (REPEATable_read) ensures that a transaction can read the same value from a field many times, that is, other transactions are not allowed to modify the value before the current transaction is committed.
- Serializable ensures that a transaction can read the same data from the table multiple times, and that no other transaction can modify (add, modify, or delete) the table during the execution of the current transaction.
Problems with database concurrency:
- Dirty reads are changes in a transaction that are visible to other transactions even if they are not committed.
- Non-repeatable Read (NON-repeatable Read), the transaction can Read the changes that have been committed, which will cause the difference between the previous and previous queries if there are multiple queries in the transaction.
- Phantom Read. When a transaction reads a record in a range, another transaction inserts a new record in that range. When a transaction reads a record in that range again, a Phantom Row is generated.
Serialization solves all problems with concurrent operations, but only one update operation (add, modify, delete) can be performed at a time.
Repeatable reads solve the problem of non-repeatable reads (oddly, HHH). Read commits solve the problem of dirty reads, read commits solve nothing.
1.3. MVCC
MVCC, multi-version concurrency control, is a key technology to solve the problem of unrepeatable reads. When a transaction is started, the transaction ID is recorded. Then, if the data is updated by other transactions, the current data can be used to push back the data when the transaction is started through undo log. This is called “consistent read”.
“Consistent reads” apply to query statements (things in which no data is updated).
On the contrary, there is “current read”, for the scenario where there is an update statement in the transaction, before the update statement, the query (read the query of the committed type), and then the update operation. The main purpose of this operation is to avoid the data modified by other committed transactions being overwritten.
1.4. Table and row locks
Table and row locks are the two most basic types of locks in MySQL (note that row locks are implemented by the storage engine itself, some storage engines do not support them, such as MyISAM). Table locks and row locks are easy to understand by definition. Based on read/write features, there are four types of locks:
Table locks | Row locks | |
---|---|---|
Read lock (shared lock) | Share – Table lock | Share – Row lock |
Write lock (exclusive lock) | Exclusive – table locks | Exclusive – Row lock |
Exclusive locks exclude any type of lock, so shared locks can only be shared to shared locks. .
Read lock (shared lock) | Write lock (exclusive lock) | |
---|---|---|
Read lock (shared lock) | Square root | x |
Write lock (exclusive lock) | x | x |
1.5 a deadlock
A deadlock is a vicious cycle in which two or more transactions occupy each other’s resources and request to lock each other’s resources.
Deadlock scenarios typically involve holding on to resources while simultaneously robbing each other for resources.
2. Phantom reads
Now that you have reviewed the general content of the previous three chapters, you should remember them all (I also have more than 1600 words). So let’s get to the main topic of this time, phantom reading, gap locking and next-key Lock. First let’s look at illusory reading.
The concept of phantom reading has already been explained. If you don’t understand it, let’s look at an example. Suppose we have the following table and insert the following data:
CREATE TABLE `test` (
`a` int NOT NULL,
`b` int DEFAULT NULL,
`c` int DEFAULT NULL.PRIMARY KEY (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `MySQL_45`.`test` (`a`, `b`, `c`) VALUES (1.1.1);
INSERT INTO `MySQL_45`.`test` (`a`, `b`, `c`) VALUES (3.3.3);
INSERT INTO `MySQL_45`.`test` (`a`, `b`, `c`) VALUES (5.5.5);
INSERT INTO `MySQL_45`.`test` (`a`, `b`, `c`) VALUES (7.7.7);
INSERT INTO `MySQL_45`.`test` (`a`, `b`, `c`) VALUES (9.9.9);
Copy the code
Now we enable two transactions A and B, with the following sequence (assuming only table and row locks are currently available) :
- T1 time, start A, B two transactions;
- At T2 time, transaction A query 3 <= C <= 5;
- At T3, transaction B inserts a data of A, B, and C are all 6;
- T4 time, commit transaction B;
- T5 time, transaction A query 3 <= c <= 5;
- T6 time, commit transaction A.
Earlier in MVCC and Repeatable Reads, we said that at the isolation level of repeatable reads, MySQL creates snapshots and queries data according to consistent read rules when using a query statement, and modifies data according to the current read rules when using an UPDATE statement.
Therefore, in transaction A, T5 time, the query results are 2, T6 time, the query results are 3, T6 time, the query results are more than expected, this is the phantom read.
As for actual execution, you can try it out.
In transaction A, we expect the data to be consistent without the update statement, but the illusion breaks the data consistency scenario, so how do we solve this problem?
Let’s take A look at this problem, which is essentially caused by the concurrent execution of transaction A and transaction B. For the concurrency problem, the easiest thing we can think of is to add A lock, but how to add this lock?
- Select * from c in (3, 5); select * from c in (3, 5); select * from c in (3, 5);
- Adding a table lock can solve this problem, but at the cost (from Gul ‘dan’s interrogation), concurrency performance is greatly reduced.
How does MySQL solve this problem?
3. Gap Lock and next-key Lock
Clearance lock
InnoDB has introduced a new type of lock called Gap Locks. Here’s how it’s defined:
A gap lock is a lock on a gap between index records, Or a lock on the gap before the first or after the last index record.
Simply put, this is a gap lock on an index record, or a lock on a gap before the first or after the last index record. In other words, gap lock, the lock is a space, gap lock is a front open after open interval.
Next-Key Lock
Next-key Locks, which consist of row Locks plus gap Locks, are a range of open front and closed back.
A next-key lock is a combination of a record lock on the index record and a gap lock on the gap before the index — MySQL 5.7 official Manual
Case 4.
Range queries
No index case
In this case, transaction B, either modified or inserted, cannot be executed even if it is not in the range of 3 <= C <= 5.
In the case of no index, add full table gap lock. This is also confirmed in the official manual:
If ID is not indexed or has a nonunique index, the statement does lock the preceding gap.
One point to mention here is that at the RC isolation level, locks are performed using non-index fields, such as:
select * from test where b = 5 for update;
update test set b = 6 where b = 5;
Copy the code
In this case, the entire table is scanned and each row is locked. If the row does not meet the conditions, the lock is released in time. However, at the RR isolation level, to ensure the order of records in the binlog, this condition locks the entire table and will not be released until the transaction ends.
Plain index (not unique)
Alter table test;
alter table `test` add index `c`(`c`);
Copy the code
In this example, transaction B can execute normally at time T3 and T4, and is blocked at time T5 and T6 (when tested, these two processes are unexecuted). Let’s draw the lock scope:
In fact, the Lock range is (1, 7), indicating a next-key Lock.
Primary key and unique index
Let’s go ahead and modify the table:
alter table `test` drop index `c`, add unique index `c`(`c`) using btree;
Copy the code
Continuing with the test content in the normal index, you can see that the results are the same, the locking range is (1,7), and the same open before closed interval.
Equivalent query
The above tests are based on range queries, but what about equivalent queries? Let’s do the next test, because the case of no index is full table scan lock, so in the process of equivalent query, we skip the case of no index.
Plain index (not unique)
Restore the data to its original condition and add a normal index to C.
The execution down is as follows:
It can be executed normally at T3, indicating that the locking range is (3, 5).
T4 is blocked, indicating the existence of clearance lock (3,5] and [5,7);
Under normal indexes, two gap locks (3,5) and [5,7) are locked, plus the row lock for a = 5.
Primary key and unique index
Next we upgrade C to a unique index. InnoDB’s Next Key Lock is changed from (3,7) to (c = 5). InnoDB’s Next Key Lock is changed from (3,7) to (c = 5).
Missed data
All of the above examples are hits, but what about misses? What if we lock the data for c = 4?
From the previous example, it is easy to imagine that in the case of a normal index, a gap lock (3,5) is added.
What about in the case of a unique index?
According to the previous example, if a hit, it will degenerate into a row lock, so there is no hit data here, how to lock here?
In effect, this too degenerates into a gap lock (3,5). You can try it by changing the type of C to double.
To sum up:
- Unique index, hit data, next-key Lock degenerates to row Lock;
- Unique index, missed data, next-key Lock degenerates into a gap Lock.
A gap lock conflict
Teacher Lin Xiaobin mentioned a sentence in “MySQL Actual Combat 45” :
A conflict with gap locking is the operation “insert a record into the gap”. There is no conflict between gap locks. — Lin Xiaobin, Talk 45 of MySQL In Action
This conclusion is also mentioned in the MySQL 5.7 Official Manual:
Gap locks in InnoDB are “purely inhibitive”, which means that their only purpose is to prevent other transactions from inserting to the gap. Gap locks can co-exist. A gap lock taken by one transaction does not prevent another transaction from taking a gap lock on the same gap. There is no difference between shared and exclusive gap locks. They do not conflict with each other, And they perform the same function.
To put it simply, gap locks can coexist, and there is no conflict between gap locks. What will conflict with gap locks is the insertion of data into the gap.
5. A little advice
I’ve already mentioned two optimizations for deadlocks:
- Logical optimization, put the statements that may generate locks as far as possible in the latter part of the transaction, reduce the lock holding time;
- MySQL > alter table lock; alter table lock; alter table lock;
Let me talk about recommendations for isolation levels and where conditional statements.
What isolation level is selected?
If you have a DBA, it’s up to the DBA. If not, it’s up to the company.
If you are a large company with high concurrency, such as Alibaba and Tencent, you will usually select RC and set binlog_format=row, as long as the business code is guaranteed to avoid unrepeatable reading.
For small – and medium-sized factories with a small number of concurrent services, the RR does not affect performance but has higher reliability.
Before a small and medium-sized enterprises, a to plan discussion, everyone will shout high concurrency, the performance of the distributed and data security issues, said, our architecture, let’s now is about 2 w deal every day, even on 8 PM and 9 PM this one hour, per second on average less than 6 pens, concurrent can afford it? Even if the volume doubles in two years, a dozen deals is not a problem?
How do I write where conditions
Only concurrent locking is considered here, and suggestions on query speed will be provided in the contents of the index.
The answer is obvious: indexes must be used in the condition after WHERE. The optimal choice is primary key indexes and unique indexes, whether RR or RC isolation level, which can be completely avoided in the business code (I have not yet worked on a project where I cannot use indexes).