preface

Halo, I believe everyone must have had a very happy Dragon Boat Festival, I see the circle of friends all kinds of sun travel, sun food, really envy ah, unlike me, a cold can only be at home.

Of course, holiday syndrome isn’t just a joke, as much fun you have as you feel depressed when you go to work after the holidays. To which I would like to say, it’s okay, don’t get upset, let’s see how I can make you happy by airing my dirty laundry.

Abnormal SQL statement

Last Thursday lunch break, I was taking a nap on the station, sleep as if to see their own with Li Bai in the glory of the valley of the four scene, just after I took five to lead his teammates to push the opposite crystal, a panic and rapid “bad” woke me up from sleep. I narrowed my dim eyes, only to find that the voice just came from my team leader Zhuang Ge. I saw him nervously opening the log system to check the log, and I had a hunch that something bad was happening. After a careful inquiry, I realized that during my squinting, the CPU of the online database server was full. At the same time, the production database read – only delay is set and an alarm is generated, which lasts for half an hour.

This let me breath in a gasp, for our group to do many systems all use the same database server, amount of active users have several hundred thousand, if the server crashes will make all the system service is not available, so we quickly through SQL log in problem to find, and finally to screen out because a high amount of SQL query didn’t leave the index caused, The log list shows that the number of rows scanned by this SQL statement is up to millions, which is basically the case of full table scan. In addition, tens of thousands of times were queried within half an hour, and each SQL query took more than 3000ms. My god, no wonder the server CPU will be full, such a time-consuming SQL statement query volume is so large, of course, the database resources are directly crashed, this is the SQL query at that time:


Temporary processing

Look at this statement, I gasped again, this is not my system call SQL statement? It’s over. There’s no escaping this time. You’re in your sleep.


Of course, since I wrote the SQL myself, I could tell at a glance that there was something wrong with this statement.

According to my code processing, this SQL call is missing an important parameter user_fruit_id. If this parameter is not passed, it should not go through this SQL query. In my design, this parameter is the leftmost field of a joint index in the data table.

KEY `idx_userfruitid_type` (`user_fruit_id`,`task_type`,`receive_start_time`,`receive_end_time`) USING BTREE
Copy the code

Although positioning in the SQL statement, but the problem of online is urgently needed, always can’t find the bug it redone again online, so, we can only do a temporary processing, is added on the original table a joint index, is actually removed user_fruit_id fields, let these high volume of queries can go new index, as follows

KEY `idx_task_type_receive_start_time` (`task_type`,`receive_start_time`,`receive_end_time`,`created_time`) USING BTREE
Copy the code

By adding indexes, the number of rows scanned is significantly reduced, and the instance can be restarted.

Leftmost matching principle

So why is the leftmost field not indexed? This is because MySQL’s joint index is based on the leftmost matching principle.

As we all know, the bottom of the index is a B+ tree structure, and the structure of the joint index is also a B+ tree, but the number of key values is not one, but multiple, to build a B+ tree can only be built according to one value, so the database builds the B+ tree according to the leftmost field of the joint index.

For example, we use the joint index of two fields (name, age) to analyze,

The picture comes from Teacher Lin Xiaobin’s “MySQL Actual Combat lecture 45” course.


Select ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4, ID4 Select * from age; select * from age; select * from age; select * from age; select * from age; This is why the SQL statement I wrote earlier is missing the leftmost user_fruit_id field and is running a full table scan query.

Normally, if a joint index is designed like (a,b), then a and B will be used as a condition, or A alone as a query condition, in which case we do not need to design a separate index for a.

If (a,b) is not available, you will have to maintain another index, which means you need to maintain both (a,b) and (b).

Find out the bugs

Although the temporary processing, but the problem is not solved, it is obvious that the system has a bug will have such query conditions. Because I wrote the code by myself, I immediately located the specific method in the code after KNOWING which SQL it was. Later, I found out that it was because my nullating of the user_fruit_id field did not take effect.

Since this field is passed from the caller, I annotate it with a non-null limit in the method argument, which is @notnull in the Javax package,

public class GardenUserTaskListReq implements Serializable { private static final long serialVersionUID = -9161295541482297498L; @apiModelProperty (notes = "fruit id") @notnull (message =" fruit ID cannot be empty ") private Long userFruitId; */..................... }Copy the code

Although this annotation is used for non-null validation, I do not add another annotation **@ “Validated” ** to the parameters. If this annotation is not added, the validation rules for calls to the Javax package are invalid. The correct way to write this annotation is to add an annotation before the parameters of the Controller layer method.


In addition, because the user_fruit_id field is the primary key of another table, I do not query the existence of this ID in the code, so that whatever value is passed by the caller directly triggers the SQL query, and directly runs the full table scan without running the index.

I have to say, this is really a rookie mistake, this result also let me feel ironic, no matter how to say also worked for several years, how still make some novice level mistakes, this face hit really let me quite ashamed.


conclusion

Although it was a simple mistake, the consequences were quite serious. This incident also made me more alert, and I must abide by the principles in the future development work, which can be summarized as follows:

1. You can’t trust the calling side. Important parameters must be verified first, even if they are non-null values. If they do not meet the conditions, they must return or throw exceptions directly. They cannot participate in business SQL queries, otherwise frequent access will also cause a burden on the service.

2. Perform performance query before SQL statement. For tables with a large amount of data, after the index is built, all SQL queries are tested with explain and the index is further optimized based on the results.

3. The code must be reviewed. I did not pay much attention to the review of the code before, although it was also due to the tight iteration schedule, but in any case, the bugs were caused by my negligence, especially small errors such as null values that are common in Java. Sometimes a small bug can easily lead to the collapse of the whole system. This problem also makes me deeply understand the importance of reviewing code. No matter how troublesome the workload of business development is, this step should never be ignored.

subsequent

After knowing the cause of the bug, I re-released the code on the same day. Later, Zhuang told me that he asked me to write a summary of the problem record in order to alert others of the problem in the future. I am a little puzzled, this is not my strength ah, high school composition has never exceeded 40 points, but think about it is really their own problem, or honestly write a record. I thought so you can breathe a sigh of relief, but the elder brother of the flat (group a heavyweight) suddenly with strange eyes looked at me and patted me on the shoulder, all of these statements, XXX also because last time online problems to write a report, this time you are no exception, I estimate request not too many words, may also be just ten thousand words or so. I instantly felt a lightning bolt hit my head, oh my God…