What is slow query?
The definition of slow query has different conditions in different scenarios, but it is generally defined by the query speed. In fact, the query is very slow, such as the common paging list multi-condition union + sub-query + multi-table association + massive data, common e-commerce platform scenarios. The slow query I have experienced is defined as 2s, and any SQL above 2s is defined as slow. Slow query will not only greatly reduce user experience and consume database and system performance, so, how to solve slow query?
How to optimize
Analysis of SQL
First of all, the SQL structure analysis, related tables, fields, conditions and so on are the key points of analysis, so how to analyze it? For query statements, look at the index first, so the establishment of the index can improve the query efficiency is obvious, such as where, group by, such as whether to use the index, index Settings are generally based on conditions, such as select * from user where age>=20 group by Address, where age and address are both conditions, can be indexed on these two fields. See whether the query field is associated with multiple tables, the foreign key and association relationship of multiple table association, and see the divergence of subqueries. These analyses can be viewed step by step through the SQL execution plan.Copy the code
SQL Execution plan
Here is a brief excerpt of how to analyze SQL using an execution plan
explain select * from user; In the explain result, you can see id, select_type, table, PARTITIONS, type, possible_keys, key, key_len, ref, rows, filtered, and Extra. What do these fields do? What does it mean?
Id indicates the execution sequence of SQL statements. If the IDS are the same, the SQL statements are considered to be in the same group. If the IDS are the same, the SQL statements are executed from top to bottom. For subqueries, the ordinal number of the ID increasesCopy the code
Select_type Select query type, which is mainly used to distinguish common query from complex query such as joint query and subquery. - SIMPLE: SIMPLE select, without UNION or subquery. The outermost query in a subquery. If the query contains any complex subparts, the outermost select is marked as a primary-union: the second or subsequent select statement in the UNION - DEPENDENT UNION: The second or subsequent SELECT statement in the UNION, depending on the external query. -union RESULT: the RESULT of the UNION. -dependent SUBQUERY: The first SELECT in a SUBQUERY, DEPENDENT on the external query -derived: - UNCACHEABLE SUBQUERY: The result of a SUBQUERY cannot be cached and the first row of the outer link must be reevaluatedCopy the code
Table indicates the name of the table accessed by the SQL in this step, that is, the table from which the data in this row comes. Sometimes it is not a real table name, it may be an alias, or it may be an abbreviation of the result of several stepsCopy the code
Partitions, if queried based on partitioned tables, show the partitions accessed by the queryCopy the code
Type Access mode of a table. It is also called access type. Type allows you to quickly view SQL performance. The common types are: ALL, index, range, ref, eq_ref, const, system, NULL (from left to right, performance from bad to good) ALL: Full Table Scan, MySQL will Scan the entire Table to find a matching row index: Range: only rows in a given range are retrieved, using an Index to select rows -ref: Represents the join matching condition of the table, i.e. which columns or constants are used to find values on the index column, looking up the specified value instead of the range -eq_ref: SQL > select * from a table where a primary key or a unique key is used - const, system: select * from a table where a primary key or a unique key is used These types of access are used when MySQL optimizes part of a query and converts it to a constant. MySQL can convert the query to a constant if the primary key is placed in the WHERE list. System is a special case of const type. System-null is used when the table is queried with only one row: MySQL breaks down statements during optimization without even accessing the table or index. For example, picking the minimum value from an index column can be done by a separate index lookup.Copy the code
Possible_keys displays possible indexes, usually one or more. If an index exists on a field involved in an SQL statement, the index is listed but not necessarily used. If the column is NULL, there is no associated indexCopy the code
Key The key column shows the key (index) that MySQL actually decided to use, most likely included in possible_keys. Key values are not necessarily a subset of possible keys. MySQL > select key, key, key, key, key, key, key, key, key, key, key If no index is selected, the key is NULL. To FORCE MySQL to USE or IGNORE indexes in the Possible_keys column, USE FORCE INDEX, USE INDEX, or IGNORE INDEX in the query.Copy the code
Key_len represents the number of bytes used in the index. This column is used to calculate the length of the index used in the query. The shorter the length is, the better, without losing accuracy. Key-len displays the maximum possible length of the index field, not the actual length, that is, key_len is calculated from the table definition, not retrieved from the table.Copy the code
Ref shows which column of the index is used. Sometimes it is a constant that indicates which column or constant is used to find the value on the index columnCopy the code
Rows estimates the number of rows in the result set, indicating that MySQL estimates the number of rows that need to be read to find the desired record based on the table statistics and index selection. In table statistics, there is a noun that represents the number of different values on an index. InnoDB will select N data pages by default, count the different values of these pages, get an average, and then multiply by the number of pages in the index to get the cardinality of the index.Copy the code
Filtered indicates the percentage of the number of returned rows to the number of rows to be read. The greater the filtered column value, the better, is also an estimated valueCopy the code
Using WHERE: specifies whether to use the where filter condition Using temporary (temporary) : MySQL needs to use temporary tables to store result sets. The order by. Group by must follow the order in which the indexes are built. - Using filesort: when a Query contains an order by operation and cannot be sorted Using an index, it is called "filesort". When sorting, it is best to follow the order in which the indexes are created or else you may find a Using filesort. -using Index: Using Where (Index) is used to perform the lookup of Index key value. If (Index) is not used at the same time, the Index is used to perform the lookup of Index key value. - Using join buffer: The change emphasizes that the index is not used when fetching join conditions and that the join buffer is needed to store intermediate results. If this value is present, it should be noted that indexes may be added to improve performance depending on the query. - Impossible WHERE: This value emphasizes that the WHERE statement will result in no eligible rows (the result cannot exist by collecting statistics). -select tables Optimized Away: This value means that the optimizer may return only one line from the aggregate function result by using the index alone. -no tables Used: The Query statement uses from dual or does not contain any from clause. Using filesort, Using temporary, and Using index are the most common expressions. The first two expressions need to be optimized.Copy the code
Execution plan after the analysis of SQL, there is no problem, basically index mechanism and related field optimization should be about the same, but the query is very slow, such as the subquery divergence, table, union, etc. These SQL statements are extremely not recommended if you don’t want to do this operation in database level, recommended the establishment of the view, but view maintenance It is not recommended to use complex SQL queries. You can use Java memory to share some of the SQL load.
Fuzzy search is a convenient and efficient way to query a string using the inverted index of elasticSearch.
Insert and UPDATE statements are generally slow. Consider locking and caching. The database is flushing dirty pages, for example, when the redo log is full and needs to be synchronized to disk. Locks, such as table locks and row locks, are encountered during execution.
Enable slow log query
General test environment and production environment is, be sure to open the test environment is slow response screening test phase interface log, have to open the production environment, users feel very slow, after all, can’t timely tell development, but the development also need to pay attention to this matter, such as the user to reflect the problem belongs to in hindsight, online environment may bring huge losses That will deduct the bonus ah, for the money, this must be opened.
Take mysql as an example:
Add several lines to my.ini: slow_query_log (slow_query_log) and slow_query_log (slow_query_log)
// Define how many seconds the query is slow, Log-slow-queries ="mysql_slow_query.log" #5.5 and later slow_query-log =On Slow_query_log_file ="mysql_slow_query. log" // Record the query log-query-not-using-indexestpspb16glos dndnorte/t without using the indexCopy the code
The optimizer
The optimizer uses the MRR principle: MRR [multi-range Read] reads ids or keys into buffer sort, by converting “random disk reads” to “sequential disk reads”, reducing disk I/O, thereby improving index query performance. mysql >set optimizer_switch**=**’mrr=on’; MRR explain select * from STu where age between 10 and 20; For Myisam, the disk is sequentially read after being sorted by ROWID before fetching complete data from the disk. For Innodb, the cluster index is sorted by the key value of the cluster index, and then the cluster index is read sequentially. Disk prefetch: Request a page of data, you can put back a few pages of data also return together, on the data in the buffer pool, so if the next time you just need the data on the next page, is no longer need to disk read (locality principle) index itself is to reduce the disk IO, speed up the query, and the MRR, is to reduce the index of disk IO, further enlarged
Sub – library sub – table, read and write separation
The above are short time can be optimized to see the effect, the next optimization is from the direction of the architecture. First is the master-slave cluster, separation, speaking, reading and writing, the main library to write, read from the library, if it involves huge amounts of data analysis of kanban report this kind of business, can consider to big data processing, data section, of course this need big data related to the development, also can buy some domestic large data services directly, such as ali oceanDB, kirin data processor (kylin), etc. You depots table if you have time, vertical + level comprehensive scientific analysis after the split, these are all big projects, renovation points is more, the service interface and persistence, master-slave synchronization between, caching, and so on all need to coordinate transformation, of course, the performance of the depots table is a long-term, is necessary, because at the beginning of the business, the business logic and related third-party docking Are uncertain, sub – database sub – table is in later data business continued to accumulate under the circumstances of the beginning.