First, the type of lock

Row locks

  • A shared Lock (S Lock) allows a transaction to read a row of data
  • An X Lock allows a transaction to write a row of data

Table lock (intent lock)

Locking allows transaction row-level locks and table-level locks to exist at the same time. To support locking at different granularity, InnoDB storage engine supports an additional locking method

  • A transaction wants to acquire a shared Lock for rows in a table

  • A transaction wants to acquire exclusive locks for rows in a table

Because InnoDB storage engine supports row-level locks, intentional locks don’t actually block any requests other than full table sweeps. Therefore, the compatibility of table level intention lock and row level lock is shown as follows

If the locked object is a tree, then locking the lowest level object, that is, the most fine-grained object, requires locking the coarse-grained object first. In the example above, if you need to lock X on record R on A page, you need to lock IX on database A, table, page, and finally record R on X. If any of these parts cause a wait, the operation needs to wait for the coarse-grained lock to complete. For example, if a transaction had already locked S on table 1 before adding X to r, then the transaction would need to add IX to R on table 1. The transaction would need to wait for the table lock operation to complete due to incompatibility.

What exactly does intent lock do?

Innodb’s intent lock is the main user of multi-granularity lock coexistence. For example, if transaction A wants to lock S on A table, if A row in the table has been locked X by transaction B, the lock application should also be blocked. If there is a lot of data in the table, checking lock flags row by row can be expensive and system performance can suffer. To solve this problem, a new lock type can be introduced at the table level to represent the locking of the row to which it belongs, which leads to the concept of “intentional locking.”

For example, if there are 100 million records in the table, transaction A locks some of them, then transaction B needs to add table level lock to the table, if there is no intention to lock, then we need to check whether the 100 million records in the table are locked. If there is an intent lock, then if transaction A adds an intent lock and then X lock before updating A record, transaction B checks whether there is an intent lock on the table and whether the existing intent lock conflicts with the lock it intends to add. If there is A conflict, transaction B will wait until transaction A releases, without detecting each record. When transaction B updates the table, it doesn’t need to know which row is locked, it just needs to know that one row is locked anyway.

A transaction is holding a lock on a row or is about to hold a lock on a row.

Second, the lock algorithm

  • Record Lock: A Lock on a single row Record

  • Gap Lock: A Gap Lock that locks a range but does not contain the record itself

  • Next-key Lock: Gap Lock+Record Lock locks a range, gaps between indexes, and the Record itself; The purpose is to prevent illusory reading

Mysql: How to do read and write parallel (multi-version control)?

Multi-version concurrency control (MVCC) is a variant of row-level locking, which is similar to row-level locking by saving a snapshot of a node at a certain time. Therefore, different transactions may see different data on the same table at the same time. The data snapshot is realized by adding the creation date version number and deletion date version number after different data rows, and the version number is increasing continuously

The type of reading

  • Consistent Unlocked read (Snapshot read)
    • InnoDB storage engine uses non-locked consistent reads at transaction isolation level commit reads (RC) and repeatable reads (RR)
      • In RC mode, the latest snapshot is read
      • In RR mode, the snapshot at the start of a transaction is read
  • Consistent lock read (current read)
    • Reads with an isolation level of uncommitted reads (RN) are current reads
    • The SELECT… FOR UPDATE (write lock)
    • The SELECT… LOCK IN SHARE MODE

Four, lock processing analysis

The following two simple SQL, they add what lock?

select * from t1 where id = 10

delete from t1 where id = 10

If you want to analyze the locking situation, we must also know the following premise, premise is different, lock processing is different

  • Premise 1: Is the ID column a primary key?
  • Premise two: What is the current system isolation level?
  • If id is not the primary key, is there an index on the ID column?
  • Premise 4: If there is a secondary index on the ID column, is that index unique?
  • Premise 5: What are the execution plans of the two SQL statements? Index scan? Full table scan?

Based on the above, there are several combinations

  • Combination one: the ID column is the primary key, RC isolation level
  • Combination two: the ID column is a secondary unique index, RC isolation level
  • Combination three: the ID column is a secondary non-unique index, RC isolation level
  • Combination four: No index on id column, RC isolation level
  • Combination 5: The ID column is the primary key and RR isolation level
  • Combination 6: THE ID column is a secondary unique index, RR isolation level
  • Combination 7: ID column is level 2 non-unique index, RR isolation level
  • Combination 8: No index on id column, RR isolation level
  • Combination nine :Serializable isolation level

Permutations haven’t been enumerated yet, but it looks like there are a lot of them. Is it really necessary to be so complicated? In fact, that’s how complicated it is to analyze locking. But on the other hand, as long as you select a combination, the SQL needs to add locks, in fact, also determined. Let’s pick a couple of classic combinations

Combination 1: id primary key +RC

This combination, this is the simplest, easiest combination to analyze. SQL: delete from T1 where id = 10; All you need to do is put an X lock on the primary key where id = 10. As shown below:

Conclusion: When id is the primary key, this SQL only needs to place an X lock on the record id=10.

Combination two: id unique index +RC

For this combination, id is not the primary key, but a Unique secondary key. Delete from t1 where id = 10; What kind of lock do I need? See below:

In this combination, the ID is the unique index and the primary key is the name column. At this point, the locking situation is different due to combination one. Select * from primary key where id=10; select * from primary key where id=10; delete from primary key where id=10; Then lock the primary key entry corresponding to name = ‘d’ on the cluster index with X. Why are records on the cluster index also locked? Update t1 set id = 100 where name = ‘d’; update t1 set id = 100 where name = ‘d’; At this point, if the DELETE statement does not lock the record on the primary key index, concurrent UPDATES will not be aware of the existence of the DELETE statement, violating the constraint that updates/deletes on the same record require serial execution.

Conclusion: If the id column is unique, it has a unique index. Select * from [name=’d’,id=10]; select * from [name=’d’,id=10];

Combination 3: id non-unique index +RC

Compared to combination one and two, combination three has changed again, with the isolation level remaining the same as RC, but the constraint on the ID column has been reduced again, so that the ID column is no longer unique and only has a normal index. Delete from t1 where id = 10; Statement, still select the index on the ID column to filter the WHERE condition, so which locks are held? Also see the following figure:

According to the figure, it can be seen that, first of all, on the ID column index, the records that meet the query condition id = 10 are locked. At the same time, the records corresponding to the primary key index are also locked. The only difference with combination 2 is that combination 2 has at most one record that meets the equivalent query, while combination 3 locks all records that meet the query conditions.

Conclusion: If there is a non-unique index on the ID column, all records that meet the SQL query criteria will be locked. At the same time, these records on the primary key index, will also be locked.

Combination 4: ID No index +RC

This is a special case compared to the previous three combinations. Select * from id where id = 10; This filter condition cannot be filtered by index, so it can only be filtered by full table scan. What locks will be added to SQL for this combination? Or, to put it another way, what lock is placed on a full table scan? There are many answers: some people say they will put an X lock on the watch; Some people say that I will select id = 10 from the cluster index; Record plus X lock. What about the reality? See the picture below:

Because there is no index on the ID column, you can only go through the clustered index to perform a full scan. As can be seen from the figure, there are two records that meet the deletion conditions, but all records on the cluster index are locked with X. All records are x-locked whether or not they meet the criteria. Neither a table lock nor a row lock is placed on a record that meets a condition.

One might ask? Why not lock only the records that meet the criteria? This is due to the implementation of MySQL. If a condition cannot be quickly filtered by index, the storage engine layer locks all records back and the MySQL Server layer filters them. That’s why all the records are locked.

Note: In practice, MySQL has some improvements. If the MySQL Server filter criteria are not met, the unlock_row method is called to lock the records that do not meet the criteria (violating the 2PL constraint). This ensures that only the records that meet the criteria will end up holding the lock, but the lock operation for each record cannot be omitted.

Conclusion: If there is no index on the ID column, SQL will filter through the full scan of the clustered index, because filtering is performed at MySQL Server level. So every record, whether it meets the criteria or not, gets an X lock. However, for the sake of efficiency, MySQL has made optimization. For records that do not meet the condition, it will release the lock after judgment. Finally, it holds the lock on the record that meets the condition, but the lock on the record that does not meet the condition will not be omitted. At the same time, the optimization violates the 2PL constraint.

Combination 5: ID primary key +RR

The above four combinations are all locking behaviors at Read Committed isolation level, and the next four combinations are locking behaviors at Repeatable Read isolation level.

Repeatable Read (delete from T1 where id = 10); [id primary key, Read Committed]

Combination 6: ID unique index +RR

[id unique index, Read Committed] [id unique index, Read Committed] Two X locks, one on the record whose ID unique index meets the condition, and one on the corresponding cluster index.

Combination 7: ID non-unique index +RR

Remember the differences between the four isolation levels of MySQL mentioned earlier? The RC isolation level allows phantom reads, while the RR isolation level does not allow phantom reads. However, in combination 5 and 6, the locking behavior is exactly the same as that under RC. So with RR isolation,

Delete from t1 where id = 10; delete from t1 where id = 10; Suppose the index on the ID column is selected for conditional filtering, and the final locking behavior is what? Also look at the picture below:

This graph, relative to combination 3: [not unique lock on ID column, Read Committed] looks the same, but there is a big difference. The biggest difference is that there is a GAP lock in this picture, and the GAP lock looks not to be added to the record, but to load the position between two records, what is the GAP lock used?

In fact, the extra GAP lock is RR isolation level, relative to the RC isolation level, there is no phantom key. Indeed, the position of the GAP lock is not the record itself, but the GAP between two records.

How to ensure that the two current reads return the same record requires that between the first current read and the second current read, no new records that meet the condition are inserted and committed by other transactions. In order to achieve this function, GAP lock came into being.

As shown in the figure, where can I insert new items that meet the condition (id = 10)? Considering the order of B+ tree index, the items that meet the condition must be continuously stored. [6,c] insert id=10; [6,c], [10,b], [10, aa]; [10,bb],[10,c], etc. [10,e],[10,z] can be inserted between [10,d] and [11,f]. And records meeting the conditions will not be inserted after [11,f]. Therefore, to ensure that no new records will be inserted between [6,c] and [10,b], between [10,b] and [10,d], [10,d] and [11,f], MySQL chooses GAP lock to lock these three gaps.

Insert operations, such as Insert [10, AA], first locate between [6,c] and [10,b], and then check whether the GAP is locked before insertion. If locked, Insert cannot Insert records. Therefore, through the current read of the first pass, not only the records that meet the condition are locked (x-lock), similar to combination three. At the same time, add 3 GAP locks to lock the 3 gaps that may Insert records meeting the condition, to ensure that the subsequent Insert cannot Insert new records with ID =10, and thus eliminate the illusion of the second current read of the same transaction.

Interested friends see here, can ask: since to prevent illusionary reading, need to rely on GAP lock protection, why combination five, combination six, also RR isolation level, but do not need to add GAP lock?

First of all, it’s a good question. Second, answering this question is easy. The purpose of the GAP lock is to prevent the illusion of two current reads of the same transaction. And combination five, id is the primary key; Combination 6, id is unique key, can guarantee unique. An equivalent query can only return one record at most, and a new record of the same value will not be inserted, thus avoiding the use of GAP locks. SQL: select * from t1 where id = 10 for update; The first query, did not find to meet the query conditions of the record, then GAP lock can be omitted? I leave it to you to think about.

SQL: delete from T1 where ID = 10 SQL: delete from T1 where ID = 10; First, locate the first record that meets the query condition through the ID index, add the X lock on the record, add the GAP lock on the GAP lock, and then add the record X lock on the primary key cluster index, and then return; Then read the next entry and repeat. Until the first record [11,f] does not meet the conditions, at this point, record X lock is not needed, but GAP lock still needs to be added, and finally the end is returned.

When a GAP lock or nextKey lock is acquired depends on the isolation level. Only specific operations at REPEATABLE READ or above isolation level will acquire a GAP lock or NextKey lock.

Combination 8: ID No index +RR

Combination eight, the last case under Repeatable Read isolation level where there is no index on the ID column. SQL: delete from t1 where id = 10; There is no other path to choose but full table scan. The final locking situation is shown in the figure below:

This is a frightening phenomenon. First, all records on the cluster index are x-locked. Secondly, the GAP between each record in the cluster index is also added with a GAP lock. This example table, with only six records, requires a total of six record locks and seven GAP locks. Imagine if there were 10 million records on a table?

In this case, the table cannot be executed, cannot be updated, cannot be deleted, cannot be inserted, and the entire table is locked.

Of course, similar to combination 4: [ID unindexed, Read Committed], in this case, MySQL has also made some optimizations, which are called semi-consistent Read. Semi-consistent Read If semi-Consistent Read is enabled, MySQL releases locks in advance for records that do not meet query criteria. For the above use case, all records except [d,10], [g,10] will be released without a GAP lock. Semi-consistent Read How to trigger: Either read COMMITTED isolation level; Either Repeatable Read isolation level with innodb_LOCKS_unSAFE_for_binlog parameter set. For a more detailed introduction to semi-Consistent Read, please refer to my previous blog: MySQL+InnoDB semi-Consitent Read principle and implementation analysis.

Conclusion: In Repeatable Read isolation level, if the current Read of the full table scan is performed, all records in the table and all gaps in the cluster index will be locked, preventing all concurrent update/delete/insert operations. Of course, it is possible to trigger semi-consistent Read to mitigate locking overhead and concurrency, but semi-consistent Read itself brings other issues and is not recommended.

Combination 9: Serializable

The final case for the simple SQL mentioned earlier is the Serializable isolation level. For SQL2: delete from t1 where id = 10; The Serializable isolation level is exactly the same as the Repeatable Read isolation level, so this is not introduced.

Serializable isolation level SQL1: select * from T1 where id = 10; At the RC and RR isolation levels, this SQL entry is read by snapshot and not locked. However, at the Serializable isolation level, SQL1 adds read locks, meaning that snapshot reads no longer exist and MVCC concurrency control is downgraded to lock-based CC.

Conclusion: In MySQL/InnoDB, the so-called read without lock does not apply to all cases, but is isolation level dependent. Serializable Isolation level: Reads are not valid without locks. All read operations are current reads.

Deadlock cases

1. Lock conflicts of the same row in different tables

This situation is well understood, with transactions A and B operating on two tables, but in A loop waiting for locks.

2. Row lock conflicts are recorded in the same table

When two jobs are updated in batches, jobA processes ids in the ids list [1,2,3,4] and job processes ids in the ids list [8,9,10,4,2], causing a deadlock.

3. Different index locks conflict

When transaction A is executed, the cluster index will be locked in the order [1,4,2,3,5]. When transaction B is executed, the cluster index will be locked in the order [1,2,3,4,5], which causes the possibility of deadlock.

4. Gap lock conflict

Innodb in RR level, the following situation can also cause deadlock, relatively obscure. Students who are not clear can analyze the principle of gap lock according to the last section.

How to avoid deadlocks as much as possible

  1. Access tables and rows in a fixed order. For example, when the two jobs in Section 2 are updated in batches, the simple method is to sort the ID list first and then execute it. In this way, the situation of cross-lock is avoided. For example, in the case of Section 3.1, adjusting the SQL order of the two transactions to be consistent can also avoid deadlocks.
  2. Great things are divided into small things. Large transactions tend to be deadlocked, and if business permits, large transactions are broken down.
  3. In the same transaction, try to lock all the resources needed at one time to reduce the deadlock probability.
  4. Lower the isolation level. If services allow, it is also a good choice to lower the isolation level, for example, to change the isolation level from RR to RC to avoid many deadlocks caused by gap locks.
  5. Add a reasonable index to the table. You can see that if the index is not moved, a lock will be added to each row of the table, and the probability of deadlock will be greatly increased. =

Seven, how to check the lock

Starting with InnoDB1.0, tables INNODB_TRX, INNODB_LOCKS, and INNODB_LOCK_WAITS were added under the INFORMATION_SCHEMA schema. (See appendix for details.) With these three tables, users can more easily monitor current transactions and analyze possible lock problems.

Show status like 'innodb_row_lock%'; SELECT * FROM information_schema.innodb_trx; SELECT * FROM information_schema.INNODB_LOCKS; SELECT * FROM information_schema.INNODB_LOCK_WAITS; SELECT r.t_id 'wait transaction ID', R.t_x_mysql_thread_id' wait thread ID', r.t_x_query 'wait transaction running statement ', B.trx_id 'block transaction ID', b.trx_mysql_thread_id' block thread ID', Innodb_lock_waits W INNER JOIN information_schema. Innodb_trx b ON b.trx_id = w.blocking_trx_id INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;Copy the code

How does mysql prevent deadlocks?

Innodb_lock_wait_timeout Waits for a lock timeout to roll back a transaction

The intuitive approach is that while two transactions are waiting for each other, when one waits beyond a certain threshold set, one transaction is rolled back and the other transaction continues.

The wait-for graph algorithm proactively performs deadlock detection

The wait-for Graph algorithm is triggered whenever a lock request is not immediately satisfied and enters a wait.

Wait-for Graph requires the database to hold two kinds of information:

  • A linked list of lock information

  • Transaction wait linked list

From the above linked list, a graph can be constructed in which the presence of a loop means that there is a deadlock, so resources are waiting for each other. In a Wait-for graph, transactions are nodes in the graph. In the figure, the definition of transaction T1 pointing to T2 side is:

  • Transaction T1 waits for resources occupied by transaction T2

  • Transaction T1 finally waits for the resources occupied by transaction T2, that is, the same resources are waiting between transactions, and transaction T1 occurs after transaction T2

Example transaction state and lock information

Transaction Wait Lists show that there are 4 transactions T1, T2, T3 and T4, so there should be 4 nodes in wait-for graph. Transaction T2 holds the X lock for Row1 and transaction T1 holds the S lock for Row2. Transaction T1 waits for resources from Row1 in transaction T2, so there is an edge in the Wait-for graph pointing from node T1 to node T2. Transaction T2 needs to wait for row2 objects occupied by transaction T1 and T4, so there are edges from nodes T2 to nodes T1 and T4. Similarly, there are edges from nodes T3 to nodes T1, T2, and T4, so the final wait-for graph is shown in the following figure.

Ps: If there is a deadlock, usually InnoDB storage engine chooses to roll back the least undo transaction and start again

The appendix

INNODB_ROW_LOCK

The column name describe
innodb_row_lock_current_waits Number of locks currently waiting
innodb_row_lock_time Total lockout time since system startup
innodb_row_lock_time_avg Average time spent per wait
innodb_row_lock_time_max The time spent waiting for the most frequent run since system startup
innodb_row_lock_waits The total number of waits since system startup; Directly determine the direction and strategy of optimization

INNODB_TRX

InnoDB provides information about each transaction currently being executed internally, including whether the transaction is waiting to be locked, when the transaction is started, and the SQL statement (if any) that the transaction is executing. See dev.mysql.com/doc/refman/…

The column name describe
TRX_ID Transaction Id
TRX_WEIGHT The weight of the transaction, reflecting (but not necessarily an exact count) the number of rows changed and the number of rows locked by the transaction. To resolve deadlocks, InnoDB selects the least heavily weighted transaction as the “victim” of the rollback. Transactions that have changed non-transactional tables are considered heavier than other transactions, regardless of the number of changed and locked rows.
TRX_STATE Transaction execution status. The allowed value is RUNNINGThe LOCK WAIT,A ROLLING BACK, and COMMITTING.
TRX_STARTED Start time of trading.
TRX_REQUESTED_LOCK_ID The ID of the lock that the transaction is currently waiting on, if TRX_STATEIs the LOCK WAIT; Otherwise, NULL ` `.
TRX_WAIT_STARTED The transaction starts waiting for the lock time if TRX_STATEIs the LOCK WAIT; Otherwise, NULL ` `.
TRX_MYSQL_THREAD_ID MySQL thread ID that corresponds to the ID value in show ProcessList
TRX_QUERY The SQL statement that the transaction is executing
TRX_OPERATION_STATE The current operation of the transaction, if any; Otherwise, NULL ` `.
TRX_TABLES_IN_USE The number of tables InnoDB uses to process the current SQL statement for this transaction.
TRX_TABLES_LOCKED The number of tables with row locks in InnoDB’s current SQL statement. (Because these are row locks, not table locks, it is often still possible to read and write to the table through multiple transactions, although some rows are locked.)
TRX_LOCK_STRUCTS The number of locks retained by the transaction.
TRX_LOCK_MEMORY_BYTES The total size in memory occupied by the lock structure for this transaction
TRX_ROWS_LOCKED The approximate number or number of rows locked for this transaction. This value may include deleted marker rows that actually exist but are not visible to the transaction
TRX_ROWS_MODIFIED The number of rows modified and inserted in this transaction.
TRX_CONCURRENCY_TICKETS A value indicating how much work the current transaction can perform before being swapped out
TRX_ISOLATION_LEVEL The isolation level of the current transaction.
TRX_UNIQUE_CHECKS Whether unique checks are turned on or off for the current transaction. For example, they might be turned off during bulk data loading
TRX_FOREIGN_KEY_CHECKS Whether to turn foreign key checking on or off for the current transaction. For example, they might be turned off during bulk data loading
TRX_LAST_FOREIGN_KEY_ERROR A detailed error message for the last foreign key error, if any; Otherwise, NULL ` `
TRX_ADAPTIVE_HASH_LATCHED Whether the adaptive hash index is locked by the current transaction. When an adaptive hash index searches the system partitioned, a single transaction does not lock the entire adaptive hash index. Adaptive hash index partitioning byinnodb_adaptive_hash_index_parts“, the default setting is 8.
TRX_ADAPTIVE_HASH_TIMEOUT Whether to immediately discard the search latch for adaptive hash indexes or keep it between MySQL calls. When there is no adaptive hash index contention, the value remains zero, and statements retain latches until they are complete. It counts down to zero during contention, and the statement releases the latch immediately after each row lookup. When the adaptive hash index search system is partitioned (controlledinnodb_adaptive_hash_index_parts“), the value remains 0.
TRX_IS_READ_ONLY A value of 1 indicates that the transaction is read-only.
TRX_AUTOCOMMIT_NON_LOCKING A value of 1 indicates that the transaction isSELECT] (https://dev.mysql.com/doc/refman/5.7/en/select.html) do not use FOR UPDATEOr or LOCK IN SHARED MODEClause, and is executing, [autocommitTherefore, the transaction will contain only this statement. When this column and TRX_IS_READ_ONLYWhen both are 1, InnoDBOptimize transactions to reduce the overhead associated with transactions that change table data

INNODB_LOCKS

Provides information about each lock that an InnoDB transaction has requested but not yet acquired, and each lock that the transaction holds that blocks another transaction. See dev.mysql.com/doc/refman/…

The column name describe
LOCK_ID A unique lock ID, internally InnoDB.
LOCK_TRX_ID ID of the transaction that holds the lock
LOCK_MODE How to request a lock. Allows locking mode descriptor S, XThat ISIX,That GAP, AUTO_INC, and UNKNOWN ` `. Lock pattern descriptors can be combined to identify specific locking patterns.
LOCK_TYPE The type of lock
LOCK_TABLE The name of the table that has been locked or contains locked records
LOCK_INDEX The name of the index, if LOCK_TYPE is RECORD; Otherwise, a NULL
LOCK_SPACE The tablespace ID of the locked record if LOCK_TYPEIs a RECORD; Otherwise, NULL ` `
LOCK_PAGE Lock the page number of the record if LOCK_TYPEIs a RECORD; Otherwise, NULL ` `.
LOCK_REC Page lock record within the heap number, if LOCK_TYPEIs a RECORD; Otherwise, NULL ` `.
LOCK_DATA Lock-related data, if any. If LOCK_TYPEIs a RECORDIs the primary key value of the locked record, otherwise NULL. This column contains the value of the primary key column in the locked row in the format of a valid SQL string. If there is no primary key, LOCK_DATAIs the only InnoDBID of an internal row. If a gap lock is performed on key values or ranges higher than the maximum value in the index, LOCK_DATAReport _supremum_pseudo-record. InnoDB when the page containing the locked record is not in the buffer pool (if it is paged to disk while holding the lock)Do not get pages from disk to avoid unnecessary disk operations. On the contrary, LOCK_DATASet to NULL.

INNODB_LOCK_WAITS

Contains one or more rows for each blocked InnoDB transaction, indicating the locks it has requested and any locks that block the request. See dev.mysql.com/doc/refman/…

The column name describe
REQUESTING_TRX_ID ID of the requested (blocked) transaction.
REQUESTED_LOCK_ID ID of the lock on which the transaction is waiting.
BLOCKING_TRX_ID ID of the blocking transaction.
BLOCKING_LOCK_ID ID of the lock held by the transaction that prevents another transaction from proceeding

references

MySQL Tech Insider: InnoDB Storage Engine

MySQL lock processing analysis

Mysql lock procedure

Database transactions and locks (3)

Analysis of the idea of MySQL deadlock problem