Look at the following SQL statement:
table T (id int, name varchar(20))
Delete from T where id = 10;
MySQL > select * from ‘lock’; Look at the following statement:
Select * from T where id = 10;
What about this statement? There’s too much knowledge involved. To answer these two questions, you need to know something first. InnoDB is the default storage engine for MySQL. The Concurrency Control protocol for InnoDB is MULTI-version Concurrency Control (MVCC). The opposite of MVVC is the Lock-based Concurrency Control. The biggest advantage of MVCC is: read without lock, read and write without conflict. In OLTP applications with more read and less write, read and write conflict is very important, which greatly improves the concurrency performance of the system. At present, almost all RDBMS support MVCC. In fact, MVCC is summed up in one sentence: a way to temporarily store multiple versions of the same data, thereby achieving concurrency control. Current read and Snapshot read In MVCC concurrency control, read operations can be divided into two types: snapshot read and current read. Snapshot read (simple SELECT operation) : Reads the visible version (possibly historical version) of the record without locking. There you have the answer to the second question. Current read (special SELECT operations, INSERT, DELETE, and UPDATE) : The latest version of the record is read, and any record returned by the current read is locked to ensure that no other transaction can modify the record concurrently. A clustered index is also called a clustered index. In InnoDB, data is organized in a clustered index: complete records are stored in a primary key index, through which all columns in the record can be retrieved. The left-most prefix rule applies to composite indexes and prefix indexes. Select * from (a, b, c, d) where (a, b, c, d) = 1 and b = 2 and C > 3 and d = 4; select * from (a, b, C, d) where (a, b, C, d) = 4; D (a, b, d, c); d (a, B, d, c); (a = 1 and b = 2 and c = 3); (a, b, c); (a, b, c); A principle of traditional RDMS Locking is 2PL(two-phase Locking). That is to say, the lock operation is divided into two stages: the lock stage and unlock stage, and ensure that the lock stage and unlock stage do not want to cross. In other words, no matter how many entries are added, deleted or changed in a transaction, the locks are added in the lock phase. After the commit, all entries will be unlocked in the unlock phase. Isolation Levels In MySQL/InnoDB, four isolation levels are defined: Read Uncommitted: Uncommitted records can be Read. This isolation level is not used. Read Committed (RC) : For current reads, the RC isolation level guarantees that Read records are locked (record locks), and phantom reads occur. Repeatable Read (RR) : For the current Read, the RR isolation level ensures that the Read record is locked (record lock) and the Read range is locked. New records that meet the query conditions cannot be inserted (gap lock) and there is no phantom Read. Serializable: Degenerates from MVCC concurrency control to lock-based concurrency control. There is no difference between snapshot read and current read. All read operations are current read, read plus read lock (S lock), and write plus write lock (X lock). At this isolation level, read and write conflicts occur, so concurrency performance deteriorates dramatically and is not recommended in MySQL/InnoDB. A full row Lock in InnoDB consists of three parts: Record Lock: A Record Lock locks a Record in an index. Gap Lock: A Gap Lock either locks the value in the middle of an index record, the value before the first index record, or the value after the last index record. Next-key Lock: A combination of a record Lock on an index record and a gap Lock that precedes the record. With that in mind, let’s move on to the first question. When you look at this question, you probably don’t hesitate to say, add locks. It’s both wrong and right, because we know so little. So what are the prerequisites that we need to know?
Premise 1: Is the ID column a primary key?
Premise two: What is the current system isolation level?
If the id column is not a primary key, do you have an index on the ID column?
Premise 4: If there is a secondary index on the ID column, is it unique?
Premise five: What is the SQL execution plan? Index scan? Full table scan again
According to the above prerequisites, there are nine possible combinations, but I haven’t listed them all yet.
The ID column is the primary key, RC isolation level
The ID column is a secondary unique index, RC isolation level
The ID column is a secondary non-unique index, RC isolation level
No index on id column, RC isolation level
The ID column is the primary key, RR isolation level
The ID column is a secondary unique index, RR isolation level
The ID column is a secondary non-unique index, RR isolation level
No index on id column, RR isolation level
Combination 1: id primary key + RC this is the simplest analysis, to execute the statement, only need to the primary key id = 10 records X lock. The following figure
Select * from primary key where id = 10; select * from primary key where id = 10;
In Repeatable Read (Repeatable Read), Repeatable Read (Repeatable Read) does not commit (Repeatable Read). How does MySQL lock the above statement? See below:
This combination looks very similar to combination three, but is very different. A Gap lock is added to the modified combination. This Gap lock is the key to preventing illusory reading at RR level compared to RC level. In essence, a Gap lock is not for the record itself, but for the Gap between the records. The so-called phantom read is the same transaction, continuous current read, and read a range of records (including direct query of all record results or aggregation statistics), found inconsistent results (standard file generally refers to the increase of records, the reduction of records should also be regarded as phantom read). So how to solve this problem? How to ensure that multiple current reads return consistent records requires that no new records that meet the condition are inserted and committed by other transactions between multiple current reads. In order to achieve this result, Gap lock came into being. As shown in the figure, there are places where new records that meet the condition can be inserted. Considering the orderliness of B+ trees, records that meet the condition must be continuous. [4, b], [10, c], [10, d], [20, e] Insert (10, aa); Insert (10, aa); Insert (10, AA); Insert (10, AA); Insert (10, AA); Therefore, through the first current read, X locks will be added to the records that meet the condition, and three Gap locks will be added to Insert the three Gap locks that may meet the condition, to ensure that the subsequent Insert cannot Insert new records that meet the id = 10, and thus solve the phantom read problem. In combination 5, combination 6, the same RR level, but do not add Gap lock, combination 5 id is the primary key, combination 6 ID is Unique key, can guarantee the uniqueness. An equivalent query can only return one record that meets the condition at most, and new records of the same value cannot be inserted. Conclusion: In RR isolation level, id column has non-unique index, for SQL statement above; First, locate the first record that meets the condition through the ID index, add X lock to the record, and add Gap lock to the Gap lock, and then add X lock to the record that meets the same condition on the primary key cluster index, and return; Then read the next record and repeat. At this point, there is no need to add X lock to the record, but do you need to add Gap lock to the Gap, and finally return the result.
Select * from id where id does not have an index and only full table scan can be performed.
As shown in the figure, this is a very scary thing, the whole table each record to add X lock, each Gap add Gap lock, if there is a large number of data on the table, what is the situation? In this case, the table, except for the snapshot read without locking, any other lock concurrent SQL, cannot be executed, update, delete, insert, and so on, the whole table is locked. Of course, like combination four, MySQL has been optimized to be semi-consistent Read. Semi-consistent Read If semi-Consistent Read is enabled, MySQL will release the lock in advance for records that do not meet the criteria, and the Gap lock will also be released. Semi-consistent Read is triggered either at the Read Committed isolation level; Either innodb_LOCKS_unSAFE_for_binlog is set in Repeatable Read isolation level. Conclusion: In Repeatable Read isolation level, if the current Read of the full table scan is performed, all records on the table will be locked, and all gaps will be locked by Gap, eliminating all delete/update/ INSERT operations. Semi -consistent Read can be triggered in MySQL to alleviate lock overhead and concurrency, but semi-consistent Read itself brings other problems and is not recommended. Combination 4: Serializable In the final combination, for the appeal of the delete SQL statement, the lock process is the same as combination 8. However, query statements (for example, select * from T1 where ID = 10) are read by snapshots at the RC and RR isolation levels and are not locked. At the Serializable isolation level, no matter the query statement is locked, that is, the snapshot read does not exist, and the MVCC is degraded to lock-based CC. Conclusion: In MySQL/InnoDB, the so-called read without lock does not apply to all cases, but depends on the isolation level. At the Serializable isolation level, all operations are locked. A simple delete statement locking is done, but the learning continues. How to lock complex SQL statements? What about index analysis in MySQL? What about performance analysis, performance optimization? Further study and exploration are needed.
Qun7-9-8-9-0-8-4-4