origin

Recently, my colleague encountered a transaction blocking problem in the development of MySQL. I checked some data and found that the data about MySQL transaction and lock are quite chaotic, making people look at the cloud and fog. Take this opportunity, just to do a summary of this piece, I hope to write a more comprehensive MySQL concurrent transaction processing

Characteristics and isolation levels of transactions

To talk about locking, we must first talk about the characteristics and isolation level of transactions, because locking mechanisms exist to ensure that transactions correspond to the characteristics of the isolation level. Transactions have the following characteristicsIn MySQL, there are several isolation levels RURead uncommitted, as the name implies, is the isolation level at which uncommitted data is read when multiple transactions operate on the same data in parallel, also known as read uncommitted dataDirty read. This isolation level is rarely used in real scenarios because of dirty reads.

RC reads the commit. A transaction can only see the changes made by the committed transaction. However, this isolation level can lead to the phenomenon of non-repeatable reads, that is, the same data can be read multiple times within a transaction, and if the data happens to be modified by another transaction before the end of the transaction, the two reads may be inconsistent in the first transaction.

RR is repeatable, which is the default transaction isolation level of MySQL. In this isolation level, the problem of non-repeatable reads in RC is solved, ensuring that the same rows are seen in the same transaction. However, phantom read may occur, that is, when a transaction performs read operation, after the first query of data total, another transaction performs the operation of adding data and commits, at this time, the total data read by the first transaction is different from the previous count, just like hallucination.

SERIALIZABLE, the highest of the four isolation levels, solves the problem of dirty reads, repeatable reads, and phantom reads, but has the worst performance. It changes the execution of transactions to sequential execution. Compared with the other three isolation levels, the execution of the latter transaction must wait for the completion of the previous one during the parallel transaction execution.

MySQL in the lock

In MySQL, there are the following types of locks by typeWhen it comes to locking types, it is necessary to mention MySQL to storage engine,MySQL commonly used engine has MyISAM and InnoDB, and InnoDB is the default MySQL engine. MyISAM does not support row locking, while InnoDB does support row and table locking.

SELECT * from MyISAM; SELECT * from MyISAM; SELECT * from MyISAM; SELECT * from MyISAM;A read lock blocks writes to the same table, and a write lock blocks both writes to and reads from the same table.

As for InnoDB, it is well known that InnoDB supports transaction and row locking as opposed to MyISAM. A row lock, as the name implies, is a lock on a specific row of data, or more specifically, a lock on an index (this will be discussed in the implementation of the lock below).

Exclusive locks, we usually perform an update operation in the InnoDB, on this line data will hold exclusive lock, and holds exclusive lock, not allowed to add write lock on the data lines and read lock, and other transactions to visit data read and write operations will be blocked, only the current transaction is committed, lock released to allow other affairs to read and write, to avoid dirty reads The effect of

Shared locks, mainly to support concurrent read data and appear, when one transaction holds a lock the sharing data, allowing other transactions to access a Shared lock, but do not allow other transactions for exclusive lock, that is to say, when hold a Shared lock, multiple transactions can read the current data, but not to modify the current data do not allow any affairs operation, Avoid non-repeatable problems

When A transaction needs to acquire A shared lock or an exclusive lock, it needs to acquire the corresponding intent lock first. Why do you want to do this? For example, suppose that in transaction A, A row holds A shared lock and this row can only be read, but not written. Transaction B obtains A write lock on the table. If the lock is successful, transaction B will be able to read and write the entire table, which conflicts with transaction A. This operation is not allowed, so MySQL will obtain the intended lock before applying for a shared or exclusive lock. In other words, if you want to operate on a row in a table, you need to check whether the entire table can be operated. Intent lock application is completed by the database, there is no need to apply for.

Three implementations of row locking

The above analysis of several lock types, in fact, usually in development contact with the most or row lock, row lock implementation has the following several

In InnoDB, the implementation of a Lock is index-based Record Lock, which will Lock index records, such as UPDATE table where id = 1; It’s going to be this implementation

A Gap Lock locks the Gap between the front and back of the index, not the index itself, to prevent phantom reads. When a range condition rather than an equality condition is used to retrieve data and request an exclusive or shared lock, records that do not exist in that range are not allowed to modify inserts. Select * from user where user_id > 100 for update; select * from user where user_id > 100 for update; Insert data with id=102 into another transaction will block and wait for the first transaction to commit. Gap locking is for transactions whose isolation level is repeatable reads or above

A next-key Lock is a combination of a record Lock and a gap Lock, which locks both records and gaps. Data rows are locked in a next-key Lock at the Repeatable Read isolation level

MVCC

The locking mechanism can control concurrent operations to ensure consistency, but the system overhead is high. In the isolation level of RC and RR,MySQL InnoDB uses MVCC (Multi-version Concurrency Control) mechanism to solve phantom read, so that in the concurrent process of transactions,SELECT operations do not lock, read and write conflict, thus improving performance. It works by keeping a snapshot of the data at a point in time. The transaction ID is stored by storing hidden columns after each row so that each transaction corresponds to an increasing transaction ID. If three transactions update the same row at the same time, then there will be three versions of the data, but actually version 1 and version 2 do not exist physically, but are recorded in the Undo log through association, so that undo can pass Log retrieves a historical version of the data, such as a rollback operation, overwriting the data on the data page with the previous version of the data

Below for an example of A snapshot read under the RR isolation level 1: open transaction A according to the condition of A query to the two data, the transaction at this time to insert 1 B data meet the conditions of A data, and to commit the transaction, the transaction again according to the condition of A query, A query to the remains of two data, that is to say, the transaction is not A query to the latest versions of the data, but pass Historical snapshot version implemented by MVCC. This is also a repeatable read implementation.

The above example introduced the read operation, but what about the write operation, do transactions not interfere with each other. Again, for example, an example of the update operation under the RR isolation level 2: hypothesis transaction A perform an update statement, A data is updated condition 2, do not commit the transaction after the update is successful, the transaction B insert A new data, satisfy the conditions for A transaction at this time A press condition A to update the data again, it was found that the transaction B insert new data has been updated. A phantom read occurs, which is the current read, that is, all data modification operations (UPDATE, INSERT, DELETE) read the latest data of committed transactions.

So how to solve the illusion problem of the current read? The problem that MVCC cannot solve is, of course, solved by locks. Transaction B can insert data successfully only after transaction A commits, because transaction A has A Gap Lock. Transaction B can insert data successfully only after transaction A commits. This solves the phantom read problem under the current read operation.

So the MVCC mechanism prevents phantom reads caused by snapshot reads and the next-key lock prevents phantom reads caused by current reads. It should be noted that the MVCC only works at the RC and RR isolation levels. The other two isolation levels are incompatible with MVCC because RU always reads the latest row, not the row that matches the current version of the transaction. SERIALIZABLE locks all rows that are read.

Lock trigger and upgrade

By default InnoDB engine RR level, table Lock can be understood as each row holds a Record Lock. When updating a Record, when the update field does not move the index, it cannot obtain the Record Lock of the corresponding Record. Row Lock will be upgraded to table LockMySQL ExplainTo analyze. Note that when a normal index valueDegree of differentiationThe second transaction blocks when another transaction concurrently operates on different data. This is because MySQL’s execution optimizer considers that locking multiple rows at a time is not as efficient as table locking, so it does not treat this normal index as an indexDegree of differentiationWhen high, it is considered efficient and will not be upgraded to a table lock. Therefore, it is important to create an appropriate index. You are not advised to create an index for a field with low differentiation.

When does a DeadLock occur

What is a deadlock? A deadlock is a phenomenon in which two or more processes are blocked during execution, either by competing for resources or by communicating with each other, and cannot proceed without external action. The system is said to be in a deadlock state or a deadlock occurs in the system. These processes that are always waiting for each other are called deadlocked processes. Example 4: Transaction A acquires the lock whose ID =20, Transaction B acquires the lock with ID =30, and then transaction A tries to acquire the lock with ID =30, which is already held by transaction B, so transaction A waits for transaction B to release the lock, and then transaction B tries to acquire the lock with ID =20 which is held by transaction A, so the two transactions wait for each other, and that causes A deadlock. There are many more deadlock scenarios, which ultimately result from multiple transactions acquiring locks that are mutually exclusive and in an inconsistent order. How to avoid deadlock, usually Record Lock deadlock problems caused by the development will be more careful, but the Gap Lock could lead to a deadlock problem often ignored, so it should pay more attention, another is to establish a suitable index, if no index, then the operating data when locks in each line, will increase the probability of a deadlock.

Command for troubleshooting lock problems

show open tabbles; SHOW OPEN TABLES where In_use > 0;Check which tables are locked

show status like ‘table%’;

table_locks_waitedThe number of waits for table level lock contention to occur. A high value indicates that there is an validated table lock contention

table_locks_immediateIndicates the number of times the table lock was immediately released

show status like ‘innodb_row_lock%’; Innodb_row_lock_current_waitsNumber of locks currently waiting

Innodb_row_lock_timeTotal lockout time since system startup

Innodb_row_lock_time_avgAverage time spent waiting for a callInnodb_row_lock_time_maxThe maximum wait time since system startupInnodb_row_lock_waitsTotal number of waits since system startup

information_schema

Information_schema is a database for recording performance information of MySQL. It is enabled by default after version 5.7

SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCKS;What type of lock is InnoDB? What transaction ID is InnoDB

SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX;Looking at the InnoDB transaction ID, it shows what the operation is and some general information such as whether it is running or waiting for a lock.SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCK_WAITS;Check InnoDB lock wait time and which transaction ID is waiting for the lock