0. Tease

I recently encountered a deadlock problem, but it was the theoretical knowledge of “interview building rockets” (mainly mysql transaction locks) that helped me locate the problem. Here is the analysis process

1. Background

1.1 Problems

Recently, I was involved in a new project. One day, I received an email about abnormal deadlock. I asked the elderly person of the project and said that the deadlock alarm had been around for a while. Then locate the code that reported the deadlock and find that there is no database write operation in a large transaction before the line of code that reported the deadlock. And all the transactions in the whole project, there is no such thing as deadlock caused by two tables in one transaction, and write in different order. And in the alarm records, there is only one location, and then the problem location entered a confused period.

1.2 Environment Description

Mysql’s transaction isolation level is repeatable reads

To describe the problem, I created the following table and the data in the table (the structure and data below are fictitious, but the logic is the same as the deadlock I encountered). The last three fields in the table have no indexes.

+----+-------+------+--------+
| id | name  | age  | height |
+----+-------+------+--------+
|  1 | Bob   |   15 |      0 |
|  2 | Alice |   16 |      0 |
|  3 | Candy |   17 |      0 |
|  4 | Dog   |   18 |      0 |
+----+-------+------+--------+
Copy the code

Deadlock SQL

update people set height=0 where id = ( select temp.id from ( select id from people where height=0 order by id asc limit  1 ) temp )Copy the code

2. Theoretical support

Here are some theoretical supplements and examples. Read on as you become familiar with them and the cause of deadlocks will emerge.

2.1 Read operations in transactions

Mysql > lock (‘ lock ‘, ‘lock’, ‘read’, ‘lock’, ‘read’, ‘lock’);

2.1.1 snapshot read

This problem is relatively simple, does not hold the lock, in the RR level is snapshot read, is MVCC. The idea of repeatable readability is pretty obvious, but I’m going to run the demo here.

Transaction 1

begin;
select * from people where age = 15;

select sleep(10);

select * from people where age = 15;

commit;
Copy the code

Transaction 2

begin;
update people set height=10 where age = 15;
commit;
Copy the code

You can see that two transaction pairs operate on the same piece of data, with transaction 1 executing first and then transaction 2. I paused transaction 1 for 10 seconds to ensure that by the second read, transaction 2 had already executed. As you can see, the query results in transaction 1 and transaction 1 are the same. And transaction 2 is not blocked.

The symptom is as expected, and the snapshot is read without lock. Reads in a transaction are actually snapshot reads and current reads. The current read requires a shared lock (read lock /S lock).

2.1.2 the current reading

An S lock will be applied only to the where condition after update or to the select for update keyword. So let’s do that again

Transaction 1

begin; update people set height=0 where age=18; Or the begin. select * from people where age =18 for update;Copy the code

Transaction 2

begin;
update people set height=10 where age = 15;
commit;
Copy the code

I’m going to execute transaction 1, and you can see that I’m updating with age=18, and I’m not committing this transaction. Age =15; age=15; age=15; The range of the read lock is the number of rows scanned by the query, although transaction 2 only updates rows whose age=15. But I didn’t add an index, so I did a full table scan, which would lock everything. Then indexing age does not cause a lock wait. (To reproduce this scenario, be sure to use Explain to view the execution plan and ensure the number of rows for your query.)

2.2 Write operations in transactions

Write locks (exclusive locks /X locks) are added when rows are written to a transaction.

This is essentially a read-write lock. Transaction reads and reads do not block, but read and write locks compete for access.

2.3 Types of row locks and other locks

This part of the knowledge is not related to the deadlock, it is mentioned.

Innodb in RR mode, primary key and unique index are row mode, RC level only this row lock. At the same time, there is a gap lock gap, record lock + gap lock combination called next-key lock. Solve gap lock introduced by phantom reading at RR level. There is lock contention for insert and delete operations in scope. Since this has nothing to do with my deadlock, I will not expand the description.

In addition, there are intent locks, MDL locks. In InnoDB, the intent lock is the table lock, divided into intention shared lock, intention exclusive lock, when adding the corresponding row lock, also need to obtain the table of the intent lock. An MDL lock is a lock that is placed during MDL operations in a transaction, which blocks DDL operations.

Both types of locks are similar in that they prevent DDL operations while a transaction is in progress.

3. All of a sudden

3.1 Fault Location

Now back to the deadlock issue, which mentioned that the business code for two transactions holding each other’s locks was not found through the code layer. For example, when a DBA asks for the deadlock log, it is found that the two deadlock transactions are the same SQL mentioned above. With the support of the above theory, we find the problem.

The update statement uses a subquery, which is the current read and holds the read lock. Transaction B also obtains the read lock successfully. However, when transaction A tries to acquire the write lock, it finds that another transaction holds the read lock for this row, and transaction B also holds the read lock for this row, thus forming A deadlock.

3.2 Problem recurrence

Transaction 1

begin; update people set height=1 where id = ( select temp.id from ( select id,sleep(10) from people where height=0 order by id  asc limit 1 ) temp ); commit;Copy the code

Transaction 2

begin; update people set height=1 where id = ( select temp.id from ( select id,sleep(10) from people where height=0 order by id  asc limit 1 ) temp ); commit;Copy the code

When these two transactions are executed separately on both terminals, a deadlock occurs. Since a business deadlock is triggered only when both statements are read and then write, the sleep function is used for stable repetition to help achieve this scenario. Note that this is the live query sweep, executing sleep in the real phase.

PS: The final solution to the problem has nothing to do with mysql. We took the subquery out, and designed another scheme to ensure that the id is correct, so it is not clear here.