1. Background

From a business point of view, slow SQL degrades the user experience of the product, which in turn hurts how users perceive it. From a database perspective, slow SQL hurts overall database performance: every SQL statement consumes a certain amount of I/O resources while it runs. Suppose the total resource budget is 100, and one slow SQL occupies 30 of it for a full minute. During that minute only 70 units are left for all other SQL, and so on; once the resources are exhausted, newly arriving SQL has to queue.

2. Discovery

Before we can govern slow SQL, we need to know which SQL is slow; that is, we need to identify the objects to govern. MySQL provides the slow query log: when a statement's execution time exceeds a specified threshold, it is recorded in the slow query log file, from which users can then extract the slow SQL. The slow query log can be enabled dynamically, but settings changed this way are lost when the server restarts. So, to avoid affecting the online service, you can enable it dynamically first and also add the corresponding items to the my.cnf configuration file so that they persist.
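To persist the settings across restarts, the corresponding entries go under the `[mysqld]` section of my.cnf. The paths and threshold below are illustrative values, not the author's actual configuration:

```ini
[mysqld]
# Enable the slow query log
slow_query_log = 1
# Statements running longer than this many seconds are logged
long_query_time = 1
# Where the slow query log is written
slow_query_log_file = /tmp/mysql_slow.log
```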

  • Check the configuration

    • slow_query_log: whether the slow query log is enabled
    • long_query_time: the slow query threshold, in seconds
    • slow_query_log_file: the location of the slow query log file

```sql
show VARIABLES like '%query%';
```
  • Enable the slow query log

The slow query log is not enabled by default in MySQL; you need to turn it on manually. Note that logging slow queries carries a small performance cost, so it is not necessarily advisable to keep it enabled at all times.

```sql
-- Enable the slow query log
set global slow_query_log = ON;
-- Slow query threshold, in seconds
set global long_query_time = 1;
-- Slow query log file
set global slow_query_log_file = '/tmp/mysql_slow.log';
```

3. Locating

We extract the slow SQL from the slow query log, group the statements by application, and organize them into a document. For each one we locate the corresponding code in the application and record in the document the service it belongs to and the scenario in which it runs (scheduled task, online real-time query, and so on).

4. Analysis

Next, based on the organized document, we analyze these slow SQL statements to find out why they are slow.

4.1 Index level analysis

Use the EXPLAIN command to output the execution plan of a SQL statement. Through the execution plan, we can understand the details of how the slow SQL is executed.

The columns of a MySQL execution plan are described below.

  • Id: indicates a hierarchical id parsed according to the SQL syntax

  • Type: access path type used by the specified table in the execution plan.

    This is a very important field, and it is the main way to judge the efficiency of an SQL execution (here are just a few common ones).

From the best to the worst: system > const > eq_ref > ref > range > index > ALL

System: Reads a row from the system table. This is a special case of the const join type.

Const: The table has at most one matching row, which is read at the start of the query. const is used when all parts of a PRIMARY KEY or UNIQUE index are compared with constant values.

Eq_ref: Used when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE index; at most one matching row is read from this table for each combination of rows from the preceding tables.

Ref: Compared with eq_ref, it does not use a unique index; instead it uses a normal index or a leftmost prefix of a unique index. The index is compared with a value, and multiple matching rows may be found. ref is the common case in daily development and, in principle, the level we aim for: the query hits an index.

Range: indicates the index range scan

Index: scans only the index tree and does not need to query in the table. In this case, the result of the Extra column in Explain is Using index

All: indicates full table scanning.

  • Possible_keys: Optional index when looking up rows in a table.
  • Key: displays the index that MySQL actually decides to use.
  • Key_len: displays the length of the index key used by MySQL, that is, the length of the index selected for this query.
  • Ref: The ref field identifies which fields or constants are used in conjunction with the key to retrieve records from the table
  • Rows: This column indicates the number of rows that MySQL estimates to be read and examined. It is important to note that this is not the number of rows in the result set.
  • Filtered: the percentage of the examined rows (the rows column) that remain after the MySQL-layer WHERE filtering, i.e. an estimate of rows returned over rows scanned. Generally, the lower this percentage, the more rows are scanned only to be discarded, and the more expensive the query.
  • Extra: This column contains the details of the queries that MySQL solves (the following are just a few common ones).
    • Using filesort: MySQL sorts the data with an external sort instead of reading rows in the order of an index in the table.
    • Using index: the query is satisfied from the index alone (a covering index), without accessing the table data.
    • Using temporary: an internal temporary table (a memory-based table) was used. This typically occurs when the query contains GROUP BY and ORDER BY clauses, or DISTINCT over columns from different tables.
    • Using where: the server applies the WHERE filter to rows after the storage engine returns them; it usually means the index alone could not satisfy all the conditions.
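To make the columns above concrete, here is a minimal sketch of queries that would typically produce the common access types. The `orders` table, its columns, and its indexes here are hypothetical, and actual plans vary with MySQL version and table statistics:

```sql
-- Assumed: primary key (id), secondary index on (user_id), no index on note

-- type=const: at most one row, primary key compared with a constant
EXPLAIN SELECT * FROM orders WHERE id = 1;

-- type=ref: non-unique index compared with a constant; several rows may match
EXPLAIN SELECT * FROM orders WHERE user_id = 42;

-- type=range: index range scan over the primary key
EXPLAIN SELECT * FROM orders WHERE id BETWEEN 100 AND 200;

-- type=ALL: no usable index, full table scan
EXPLAIN SELECT * FROM orders WHERE note LIKE '%gift%';
```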

4.2 Analysis at the Business Level

Consider the query in its specific business scenario:

  • Are all of the query conditions really necessary?
  • Can the query's time range be narrowed?
  • If the table contains large fields, can the field lengths be shortened to fit the actual data?
    • If the large fields are not needed on every query, consider splitting them out of the current table: the main table keeps the small fields and a child table holds the large ones, which is more efficient.
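A minimal sketch of such a vertical field split; the table and column names below are made up for illustration:

```sql
-- Main table: the frequently queried small fields only
CREATE TABLE article (
  id BIGINT PRIMARY KEY,
  title VARCHAR(128),
  author_id BIGINT,
  created_at DATETIME
);

-- Child table: the rarely needed large field, fetched only on demand
CREATE TABLE article_content (
  article_id BIGINT PRIMARY KEY,  -- same id as the main table row
  content TEXT
);
```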

5. Solutions

5.1 SQL optimization

5.1.1 Index Optimization

  1. Leftmost prefix rule: with a multi-column index, the query must use the columns starting from the leftmost one; otherwise the index is not used
  2. Inequality operators (!=, <>) prevent the index from being used
  3. IS NOT NULL prevents index use, while IS NULL does not
  4. A LIKE pattern that starts with a wildcard prevents index use
  5. With OR, the fields on both sides must be indexed; otherwise no index is used
  6. Applying a function to an indexed column prevents index use
  7. Avoid implicit type conversions: comparing a string column without quotes prevents index use
  8. Sometimes the MySQL optimizer does not select the optimal index. You can use FORCE INDEX(idx_order_id) to force one. Of course, you must then ensure that the index is never dropped later, or the hint becomes a bug
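A few of the failure patterns above, sketched against a hypothetical table `t` with a composite index on `(a, b)`, an indexed VARCHAR column `code`, and an indexed DATETIME column `created_at`:

```sql
-- 1. Leftmost prefix: uses the (a, b) index
SELECT * FROM t WHERE a = 1 AND b = 2;
-- skips the leftmost column a, so the index is not used
SELECT * FROM t WHERE b = 2;

-- 4. A leading wildcard defeats the index; a trailing one does not
SELECT * FROM t WHERE code LIKE '%abc';   -- full scan
SELECT * FROM t WHERE code LIKE 'abc%';   -- can use the index on code

-- 6. A function on the indexed column defeats the index
SELECT * FROM t WHERE DATE(created_at) = '2021-08-27';
-- rewrite as a range so the index can be used
SELECT * FROM t WHERE created_at >= '2021-08-27 00:00:00'
                 AND created_at <  '2021-08-28 00:00:00';

-- 7. Implicit conversion: code is VARCHAR, so quote the value
SELECT * FROM t WHERE code = 123;     -- index not used
SELECT * FROM t WHERE code = '123';   -- index used
```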

5.1.2 Sub-query optimization

```sql
select something from user_table
where id in (select user_id from order_table where xxx = yyyy);
```

Since 5.7, the MySQL optimizer optimizes such subqueries, automatically converting them to joins before execution. For MySQL versions below 5.7, however, we suggest rewriting the subquery as a join:

```sql
select a.something
from user_table a
inner join order_table b on a.id = b.user_id
where b.xxx = yyyy;
```

5.1.3 Paging optimization

The deeper the page, the more rows MySQL must scan and throw away, so a query like this becomes slow:

```sql
select * from mytbl order by id limit 100000, 10;
```

It can be rewritten to locate the starting id via the index first:

```sql
select * from mytbl
where id >= (select id from mytbl order by id limit 100000, 1)
limit 10;

-- or, as a deferred join
select * from mytbl ori
inner join (select id from mytbl order by id limit 100000, 10) as tmp
  on tmp.id = ori.id;
```

5.1.4 File Sorting Optimization Using filesort

The orders table has an idx_ppo_created_at index on created_at; analyzing the query with EXPLAIN:

```sql
EXPLAIN
SELECT id,
       temp_id,
       pos_id,
       `type`,
       member_id,
       temp_status,
       money_amount,
       trans_amount,
       return_trans_amount,
       coupon_id,
       cash_points,
       is_cancel,
       is_auto_cancel,
       is_compensate,
       is_multi_equity,
       company_id,
       store_id,
       store_type,
       source_orders,
       return_order_id,
       created_at,
       updated_at
FROM orders
WHERE 1 = 1
  AND `created_at` >= '2021-08-27 00:00:00'
  AND `created_at` <= '2021-08-27 23:59:59'
  AND `type` = 0
  AND `is_cancel` = 0
  AND `temp_status` = 0
ORDER BY id DESC
LIMIT 0, 100;
```

The plan shows Using filesort: MySQL sorts the data with an external sort instead of reading rows in index order.

The fix

Because the index leaf nodes are ordered by `created_at`, we can use that ordering to avoid the sort. After replacing `ORDER BY id DESC` with `ORDER BY created_at DESC`, the Using filesort in the plan disappears.
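The rewritten tail of the query would look like this (the SELECT list and WHERE conditions are unchanged from the EXPLAIN example above; only the ORDER BY differs):

```sql
SELECT id, temp_id, pos_id /* ... same column list as above ... */
FROM orders
WHERE 1 = 1
  AND `created_at` >= '2021-08-27 00:00:00'
  AND `created_at` <= '2021-08-27 23:59:59'
  AND `type` = 0
  AND `is_cancel` = 0
  AND `temp_status` = 0
ORDER BY created_at DESC
LIMIT 0, 100;
```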

5.2 Service Transformation

If the SQL itself is already at its maximum performance and still takes too long, we can approach the problem from a business perspective and see whether the business can flex or compromise a little.

5.2.1 Optimization of total display

When we do paging, the total number of matching records is often displayed at the bottom of the page, along with the number of pages. This is no problem when the data set is small, but with a large data set the count(*) itself is not fast, and recomputing it every time the page is opened causes performance problems and slows down page loading.

If the business allows it, we can display the exact count when the data set is small and, when it is too large (users are not sensitive to the exact number anyway), show a fuzzy value such as "1000+". This avoids unnecessary scanning and also speeds up the first page load.

Original SQL:

```sql
select count(*) from table where xxx = yyy;
```

Optimized SQL (note that the derived table needs an alias in MySQL):

```sql
select count(*) from (select id from table where xxx = yyy limit 1000) t;
```

5.2.2 Associated-table Optimization

Because the business is complex, a certain SQL may join many tables, and matching across all of them takes a long time. In that case, see whether the multi-join SQL can be changed into queries that join fewer tables, doing the association in application code instead; the trade-off is an increased number of database requests.
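A sketch of splitting one join into two simpler queries, reusing the user_table/order_table placeholders from the subquery section (the application code then stitches the two result sets together by user_id):

```sql
-- Instead of one multi-table join:
-- select o.*, u.name
-- from order_table o join user_table u on u.id = o.user_id
-- where o.xxx = yyyy;

-- Query 1: fetch the orders
select * from order_table where xxx = yyyy;

-- Query 2: fetch only the users referenced by those orders
-- (the id list is collected from query 1 in application code)
select id, name from user_table where id in (1, 2, 3);
```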

5.3 Reducing Data

When a single table exceeds roughly 5 million rows or 2 GB, SQL optimization alone becomes slow going; at that point the data needs to be split. This is a change at the architecture level with a very wide impact: besides the slow SQL itself, other related SQL may also be "affected". This approach to slow SQL governance can solve the problem to a certain extent. The ways of reducing the data volume are as follows:

  • Vertical split
  • Horizontal split
  • Combined split (vertical + horizontal)
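A minimal sketch of a horizontal split by hash, with the shard chosen in application code; the table names, columns, and shard count below are illustrative:

```sql
-- Two shards with the same schema
CREATE TABLE order_0 (id BIGINT PRIMARY KEY, user_id BIGINT, amount DECIMAL(10,2));
CREATE TABLE order_1 (id BIGINT PRIMARY KEY, user_id BIGINT, amount DECIMAL(10,2));

-- The application routes each query by user_id % 2:
-- user_id 42 -> 42 % 2 = 0 -> order_0
select * from order_0 where user_id = 42;
```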

Conclusion

This article has walked through a practical methodology for slow SQL governance: 1. Discovery: how to find slow SQL. 2. Locating: how to find where the slow SQL is written. 3. Analysis: how to think about a slow SQL once found. 4. Solution: how to fix it. SQL optimization is itself a fairly complex topic, and the cases listed above are only part of it, but every optimization idea boils down to reducing the number of data blocks the SQL scans during execution; as long as you follow this core idea, SQL optimization is not that hard. That is all I wanted to cover today. This article is a summary of the author's experience in actual governance work; if anything is wrong, please point it out.