MySQL Index Merge

Preface

Anyone who has studied MySQL has at least a basic understanding of its locking mechanism, and wherever there are locks, deadlocks cannot be ruled out. In practice, MySQL rarely deadlocks in most scenarios (for example, when concurrency is low and the SQL is not written too badly), but in high-concurrency business scenarios a deadlock can occur if you are not careful, and this kind of deadlock is difficult to analyze.

Some time ago, during my internship, I ran into a strange deadlock. I didn't have time to sort it out properly then, but I recently found time to reproduce it and write down what I learned.

The business scenario

First, a brief introduction to the business background. The company does e-commerce live streaming, and I am responsible for the streamer-side business. The deadlock appeared when a streamer updated product information in the admin backend.

Each of our products has two associated IDs. Neither ID alone identifies a unique product (that is, the relationship between each ID and products is one-to-many); only by querying on both IDs at the same time can a single product be identified. The SQL that deadlocked looks like this:

UPDATE test_table SET `name`="zhangsan" WHERE class_id = 10 AND teacher_id = 8;

This SQL is very simple: it updates one field based on two equality conditions.

Perhaps this SQL leaves you as puzzled as it left me. As a rule, a deadlock should only be possible when a transaction contains multiple SQL statements, so how could this single statement deadlock?

I had the same doubts at the time, and even wondered whether the alerting system had raised a false alarm (it turned out it had not). I was really confused. Because of database permissions I could not see the deadlock log, and since it was close to the end of the working day I did not want to bother the DBA, so I went straight to a search engine. The search results pointed to the cause: MySQL's Index Merge optimization.

Index Merge

Index Merge is an optimization introduced in MySQL 5.0 that lets a single SQL statement use multiple indexes on the same table.

Let’s look at the SQL again, assuming class_id and teacher_id each have their own ordinary (non-unique) index:

UPDATE test_table SET `name`="zhangsan" WHERE class_id = 10 AND teacher_id = 8;

Without the Index Merge optimization, MySQL performs the following steps to find the data:

  • Using either class_id or teacher_id (the optimizer decides which index to use based on the actual data; here we assume class_id), look up the primary key IDs of the matching rows in the secondary index
  • Using those primary key IDs, fetch the corresponding data rows (that is, look them up in the clustered index)
  • Read teacher_id from each data row, check whether it equals 8, and return the row if the condition is met

In this mode MySQL uses only one index. It does not use several, simply because each index lives in its own tree, and forcing MySQL to use more than one can degrade performance.

With Index Merge, MySQL executes the query as follows:

  • Using class_id, look up the matching primary keys, then fetch the corresponding data rows by primary key (call this result set A)
  • Using teacher_id, look up the matching primary keys, then fetch the corresponding data rows by primary key (call this result set B)
  • Intersect result set A and result set B to obtain the rows that satisfy both conditions

With Index Merge, MySQL splits one SQL statement into two lookups, one per index, and combines them with an intersection operation to improve performance.
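One way to check whether MySQL actually chooses Index Merge for this statement is to run EXPLAIN on it. The sketch below assumes the test_table from this article with separate single-column indexes on class_id and teacher_id. Whether the optimizer picks Index Merge depends on the data distribution, so the plan on your data may differ.

```sql
-- Ask the optimizer how it plans to execute the UPDATE
-- (EXPLAIN accepts UPDATE statements since MySQL 5.6).
EXPLAIN
UPDATE test_table
SET `name` = 'zhangsan'
WHERE class_id = 10 AND teacher_id = 8;

-- When Index Merge intersection is chosen, the plan row shows
--   type:  index_merge
--   Extra: Using intersect(...); Using where
-- with the names of the two merged indexes inside intersect(...).
```

If the type column shows ref and only one index appears in the key column, the statement is running with a single index and the deadlock described in this article should not apply to it.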

Deadlock analysis

Having walked through the Index Merge steps, let’s go back and think about why the deadlock occurs.

Remember that Index Merge splits one SQL statement into two steps; this is where the problem arises. We know that an UPDATE statement takes row-level exclusive locks. Before analyzing the locking steps, assume we have a table with the properties described at the beginning of the article: neither class_id nor teacher_id alone identifies a single row, only the combination of the two fields does, and class_id and teacher_id each have their own ordinary index.
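To make the analysis concrete, here is a minimal sketch of such a table. The column types, the name column's length, and the four sample rows are assumptions made for illustration; only the table name, the column names, and the two single-column indexes come from the article.

```sql
-- Hypothetical schema: column types and sample rows are assumptions.
CREATE TABLE test_table (
  id         INT         NOT NULL AUTO_INCREMENT,
  class_id   INT         NOT NULL,
  teacher_id INT         NOT NULL,
  `name`     VARCHAR(32) NOT NULL DEFAULT '',
  PRIMARY KEY (id),
  KEY class_id (class_id),       -- ordinary index on class_id
  KEY teacher_id (teacher_id)    -- ordinary index on teacher_id
) ENGINE = InnoDB;

-- Each class_id and each teacher_id appears in more than one row,
-- so only the pair (class_id, teacher_id) identifies a single row.
INSERT INTO test_table (class_id, teacher_id, `name`) VALUES
  (1, 1, 'a'),
  (1, 2, 'b'),
  (2, 1, 'c'),
  (2, 2, 'd');
```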

Suppose two SQL statements run concurrently with completely different parameters. Intuition says there should be no deadlock, but intuition is often wrong:

-- Thread A executes
UPDATE test_table SET `name`="zhangsan" WHERE class_id = 2 AND teacher_id = 1;

-- Thread B executes
UPDATE test_table SET `name`="zhangsan" WHERE class_id = 1 AND teacher_id = 2;

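With Index Merge, when the two statements above run concurrently, each of them takes its locks in two steps: first on the rows found through one index (say class_id), then on the rows found through the other (teacher_id). Because each ID is one-to-many, a row that thread A locks through class_id = 2 may also have teacher_id = 2, which thread B needs, while a row that thread B locks through class_id = 1 may also have teacher_id = 1, which thread A needs.

Eventually, the two transactions wait for each other, creating a deadlock.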
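The lock order can be emulated by hand in two MySQL sessions. This is only a sketch of the interleaving, not what the server does internally: it splits each UPDATE into two locking reads, assumes the indexes are named class_id and teacher_id, and assumes the table contains at least one row with (class_id, teacher_id) = (1, 1) and one with (2, 2).

```sql
-- Session A, step 1: lock the rows found through the class_id index
BEGIN;
SELECT id FROM test_table FORCE INDEX (class_id)
WHERE class_id = 2 FOR UPDATE;

-- Session B, step 1: same step with its own parameter
BEGIN;
SELECT id FROM test_table FORCE INDEX (class_id)
WHERE class_id = 1 FOR UPDATE;

-- Session A, step 2: lock the rows found through the teacher_id index.
-- This blocks, because the row (class_id = 1, teacher_id = 1)
-- is already locked by session B.
SELECT id FROM test_table FORCE INDEX (teacher_id)
WHERE teacher_id = 1 FOR UPDATE;

-- Session B, step 2: needs the row (class_id = 2, teacher_id = 2),
-- which session A holds, so the two sessions now wait on each other.
SELECT id FROM test_table FORCE INDEX (teacher_id)
WHERE teacher_id = 2 FOR UPDATE;

-- InnoDB detects the cycle and rolls back one of the sessions with
-- ERROR 1213 (40001): Deadlock found when trying to get lock;
-- try restarting transaction
```

Note that neither session ever wanted the other's target row: the conflict comes entirely from the extra rows each index lookup locks along the way.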

The solution

Because the deadlock is caused by the Index Merge optimization, the fix in this scenario is to keep MySQL from using Index Merge for this statement.

Plan 1

Do the work in application code. For example, we can first query the primary keys matching class_id, then query the primary keys matching teacher_id, and finally operate on the data by primary key after intersecting the two sets.
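A minimal sketch of this split, assuming the primary key column is named id; the ids in the final statement are placeholders for whatever the application computes as the intersection.

```sql
-- Step 1: primary keys matching class_id (plain, non-locking read)
SELECT id FROM test_table WHERE class_id = 10;

-- Step 2: primary keys matching teacher_id (plain, non-locking read)
SELECT id FROM test_table WHERE teacher_id = 8;

-- Step 3: the application intersects the two id lists, then updates
-- by primary key only. The ids below are placeholders.
UPDATE test_table SET `name` = 'zhangsan' WHERE id IN (101, 205);
```

Since the final UPDATE locks only the rows it actually changes, two concurrent updates of different products no longer touch each other's rows. The trade-off is that the ids are read without locks, so the rows may change between the reads and the update.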

Plan 2

Create a composite index on class_id and teacher_id, so that MySQL can satisfy both conditions from one index and no longer needs Index Merge.
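A sketch of the corresponding DDL; the index name idx_class_teacher is a hypothetical choice.

```sql
-- Composite index covering both equality conditions.
-- idx_class_teacher is a hypothetical index name.
ALTER TABLE test_table
  ADD INDEX idx_class_teacher (class_id, teacher_id);
```

With this index, both equality conditions can be resolved in a single index lookup, so only the matching rows are locked. It is worth confirming with EXPLAIN that the optimizer actually picks the new index.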

Plan 3

Force the statement to use a single index: add FORCE INDEX (class_id) after the table name so that this statement uses only the class_id index.
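Applied to the statement from this article, the hint looks as follows. The sketch assumes the index on class_id is itself named class_id, which is the name MySQL assigns when an index is created without an explicit name.

```sql
-- Use only the class_id index; teacher_id is then checked
-- against each row fetched through that index.
UPDATE test_table FORCE INDEX (class_id)
SET `name` = 'zhangsan'
WHERE class_id = 10 AND teacher_id = 8;
```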

Plan 4

Disable Index Merge optimization:

  • For the whole server or the current session: SET [GLOBAL|SESSION] optimizer_switch='index_merge=off';
  • For a single statement: UPDATE /*+ NO_INDEX_MERGE(test_table) */ test_table SET name="zhangsan" WHERE class_id = 10 AND teacher_id = 8;

Reproducing the scenario

Data preparation

To facilitate testing, here is a SQL script that can be imported using Navicat to get the required test data:

Download address: cdn.juzibiji.top/file/index_…

After importing, we get 10,000 rows of test data in the following format:

The test code

Due to space limitations, only the Gist link to the code is given here: gist.github.com/juzi214032/…

The above code basically starts 100 threads to execute our data modification SQL statement to simulate online concurrency. After a few seconds of running, we get the following error:

com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction

This means that a deadlock has occurred.

Deadlock analysis

Next, let’s look at MySQL’s record of the most recent deadlock, which can be displayed with the following statement:

SHOW ENGINE INNODB STATUS;

In the output, we find the section headed LATEST DETECTED DEADLOCK, which records the most recent deadlock (the one we just produced), and start analyzing from there.

Start with transaction 1: its statement has the condition teacher_id = 16, and it currently holds one row lock. Lines 34 to 39 of the log are the row data, and line 34 is the hexadecimal representation of the primary key; converted to decimal it is 1616. Similarly, line 45 shows that the row transaction 1 is waiting to lock has primary key id 1517.

Analyzing transaction 2 in the same way, we can see that it holds three locks, on the rows with primary keys 1317, 1417 and 1517, and is waiting for 1616.

At this point we can already see that transaction 1 holds 1616 and waits for 1517, while transaction 2 holds 1517 and waits for 1616, so a deadlock is formed. InnoDB resolves this by rolling back one of the transactions, choosing the one it considers smallest, and JDBC then throws the MySQLTransactionRollbackException we saw earlier.
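The hexadecimal-to-decimal conversion can be done in MySQL itself with CONV. The hex strings below are assumed for illustration, taken as the hexadecimal forms of the two primary keys mentioned above. The log line itself may carry extra leading digits: InnoDB typically prints a signed 4-byte integer key with the sign bit flipped, so 1616 would appear as 80000650.

```sql
-- Convert the significant hex digits of the primary key to decimal.
SELECT CONV('650', 16, 10);  -- returns '1616'
SELECT CONV('5ED', 16, 10);  -- returns '1517'
```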

Conclusion

If you don't know about MySQL's Index Merge, you will have no clue where to start when investigating this kind of deadlock, because all you are presented with is one very simple SQL statement, and even the deadlock log does not make the cause obvious.

So handling this kind of problem is mostly a test of your knowledge and experience. Once you have run into it, you will pay more attention to it when writing SQL in the future!

I’m Xiao Jie. Follow my WeChat official account to learn more about front-end and back-end development.