preface

As usual, here are two questions to be answered:

  • What are the isolation levels for transactions in MySQL InnoDB storage engine?
  • What is the implementation mechanism for the corresponding isolation level?

This article will answer the above two questions, analyze the isolation level of transactions and the related locking mechanism.

Introduction to Isolation

Isolation mainly refers to the isolation mechanism provided by the database system to ensure that transactions are executed in an “independent” environment that is not affected by external concurrent operations. This means that when multiple transactions are executed concurrently, the execution of one transaction should not affect the execution of other transactions.

This section describes the four isolation levels

There are four isolation levels defined in the SQL standard:

  • Read Uncommitted: When a Read is not committed, changes in a transaction are visible to other transactions even if they are not committed. There are dirty read
  • Read COMMITTED: Committed reads, the default isolation level of most database systems (MySQL is not), changes made by a transaction from the start until committed are not visible to other transactions. Solve dirty read, phantom read and unrepeatable read
  • Repeatable Read: Repeatable read. This level ensures that the same record is read multiple times in the same transaction with consistent results. Solve dirty read and unrepeatable read, theoretically there is magic, but in the InnoDB engine to solve magic
  • Serializable: Serializable, forcing transactions to be executed serially.

The above four isolation levels are defined by the SQL standard, but are implemented differently in different storage engines. Repeatable Read is the default transaction isolation level in MySQL InnoDB storage engine. Concurrency Control is based on a multi-version Concurrency Control protocol called MVCC (Multi-version Concurrency Control), which solves the illusion problem. There are no dirty or unrepeatable reads. The biggest benefit of MVCC is that read is not locked, read and write conflict, which greatly increases the system’s concurrency performance

Read uncommitted

Uncommitted. In this case, transaction A can see the uncommitted data of transaction B. If transaction B rolls back, transaction A gets dirty data, which is the meaning of dirty read. This isolation level is generally not used in MySQL InnoDB and will not be explained too much.

Read Committed

Commit reads, in which any changes made by a transaction are invisible to other transactions from the start until the commit. Solve the problem of dirty reading, but there is phantom reading phenomenon.

Phantom read refers to the fact that executing the same SQL statement twice in a row in the same transaction may result in different results. The second SQL statement may return a row that did not exist before, also known as a phantom row.

Take this example:

  1. Start by creating a table,
CREATE TABLE `t` (
  `a` int(11) NOT NULL,
  PRIMARY KEY (`a`)
) ENGINE=InnoDB

insert into t(a) values(1);
insert into t(a) values(2);
insert into t(a) values(4);
Copy the code
  1. Execute transaction 1 and transaction 2 respectively:

As you can see from the figure above, Read Committed isolation levels are illusory. In fact, Read Committed can also be a problem. Unrepeatable reads are those in which rows are queried multiple times in a transaction based on the same criteria, but the results are inconsistent because the interval data was modified by other transactions.

A non-repeatable read feels a bit like a phantom read. In fact, the former emphasizes that the same row of data results are different, while the latter emphasizes that multiple queries return different result sets, increasing or decreasing.

Repeatable Read

Repeatable reads, a level that guarantees consistent results from multiple reads of the same record in the same transaction. The InnoDB storage engine solves both illusory and unrepeatable reads. How InnoDB resolves illusory and unrepeatable reads will be revealed later.

Serializable

Serializable, the highest isolation level, avoids the illusion problem by forcing transactions to be executed in serial. However, Serializable locks every row that is read, which can lead to a large number of timeouts and lock contention, resulting in a sharp decrease in concurrency. InnoDB is not recommended in MySQL

Locking analysis at the Read Committed isolation level

InnoDB storage engine provides two standard row-level locks:

  • Shared Lock (S Lock) : Also known as read Lock, multiple transactions can concurrently read the same resource without interfering with each other. That is, if A transaction T adds A shared lock to data A, other transactions can only add A shared lock to data A, but cannot add an exclusive lock. They can only read data but cannot modify data
  • Exclusive Lock (X Lock): Also known as write Lock. After transaction T adds an exclusive Lock to data A, other transactions cannot add any type of Lock to data A. The transaction that obtains the exclusive Lock can read data and modify data.

Note: Shared and exclusive locks are incompatible.

MySQL InnoDB storage engine uses multi-version concurrency control, reads are not locked, and reads and writes do not collide, except in certain scenarios where read locks are displayed (not explored here). This section focuses on locking at the Read Committed isolation level. In MVCC, write operations are generally locked at X.

The locking operation is closely related to the index. When an SQL statement is locked, the index type on its attribute column must be carefully considered. Select * from t1 where id = ‘name’ and id = ‘name’;

insert into t1(name,id) values("a", 10); insert into t1(name,id) values("b"11); insert into t1(name,id) values("c", 13); insert into t1(name,id) values("d", 20);Copy the code

SQL > delete FROM T1 WHERE ID = 10 AND set the isolation level to Read Committed

  • The ID column is the primary key
  • The ID column is a secondary unique index
  • The ID column is a secondary non-unique index
  • There is no index on the ID column

The following is a summary of the above several cases of locking, more detailed content can refer to the database Daniel’s article: MySQL lock processing analysis

The ID column is the primary key

SQL > select * from primary key where id=10

The ID column is a secondary unique index

If the ID column is the unique index and the primary key is the NAME column, then the SQL needs to add two X locks, one for the record on the ID index with id=10 and the other for the record on the primary key index with [name=”a”,id=10]

The ID column is a secondary non-unique index

If there are non-unique indexes on the ID column, all records that meet the SQL query criteria will be locked, and the records on the primary key index will also be locked.

There is no index on the ID column

If there is no index on the id column, SQL will filter the index through the full scan of the cluster. Because the filtering is performed by MySQL Sever level, every record, regardless of whether the condition is met, will be x-locked.

Repeatable Read Locking analysis at isolation level

In Repeatable Read isolation level, InnoDB storage engine can solve phantom Read and unrepeatable Read problems.

Before a brief introduction to InnoDB row locking knowledge, let’s look at three down locking algorithms:

  • Record Lock: A Lock on a single index Record, that is, an X Lock
  • Gap Lock: A Gap Lock that locks a range but excludes the record itself
  • Next-key Lock: Gap Lock + Record Lock, Lock a range, and Lock itself.

A Record Lock will always Lock index records. If InnoDB storage engine builds a table without setting any index, InnoDB will use an implicit primary key to Lock it. (InnoDB adds an implicit primary key index by default if the table has no primary key defined)

Next-key Lock is a Locking algorithm that combines Gap Lock and Record Lock. For example, an index column has 4 values, 10,11,13 and 20, so the interval of next-key Locking for the index is:

  • ($-\infty$, 10)
  • (10, 11]
  • (11, 13]
  • (13, 20]
  • (20,$+\infty$)

Note that InnoDB storage engine optimizes next-key Lock to a Record Lock when the index has a unique attribute (primary Key or unique index), which only locks the index itself, usually with an X Lock.

The next-key Lock mechanism is designed to solve the phantom read problem, especially when the query column index is not unique. Take the following example to illustrate:

  1. First create test table T1, name is primary key index, ID is non-unique index, i.e. secondary index
CREATE TABLE `t1` (
  `id` int(11) NOT NULL,
  `name` varchar(200) DEFAULT NULL,
  PRIMARY KEY (`name`),
  KEY `id_indx` (`id`)
) ENGINE=InnoDB

insert into t1(name,id) values("a",10);
insert into t1(name,id) values("b",11);
insert into t1(name,id) values("c",13);
insert into t1(name,id) values("d",20);
Copy the code
  1. Run delete from T1 where id = 11, as shown in the following figure

This SQL is deleted by index column ID. The index is a non-unique index, so it uses the traditional next-key Locking technology to lock. Moreover, as there are two primary Key indexes and secondary indexes, they need to be locked respectively. For a primary key index (that is, a clustered index), a Record Lock is applied only to the index whose column name = “b”, which is essentially an X Lock.

For non-unique indexes, a next-key Lock is added, and the Lock range is (10,11). A Gap Lock is applied to the index. The Gap Lock is actually the position added between two boundary records. Note that InnoDB also applies a gap lock to the next key in the secondary index, which is a gap lock between (11,13). Add a Record Lock, or X Lock, to the 11 value itself.

Block if another transaction is opened to execute the following statement:

1. select * from t1 where name = "b";
2. insert into t1(name,id) values("c", 12);Copy the code

For example, the first statement cannot be executed because an X lock has been placed on the value of column name=”b” in the clustered index in the initial transaction. So execution is blocked. The second SQL, also unable to execute, inserts the value 12 in the lock range (11,13) and blocks wait.

Therefore, as can be seen from the above example, GAP Lock is used to prevent multiple transactions from inserting records into the same range, which effectively solves the illusion problem.

Isolation Level Summary

The various isolation levels under InnoDB storage engine are summarized below:

Isolation level Dirty read probability Non-repeatable read possibility Illusory possibility Read lock
Read Uncommitted Yes Yes Yes No
Read Committed No Yes Yes No
Repeatable Read No No No No
Serializable No No No Yes

References & acknowledgements

  • InnoDB Storage Engine (Version 2)
  • MySQL lock processing analysis