1. Blow water

A few days ago the boss asked me how to carry out SQL optimization, I answeredThe new index. Ha ha ha, then the boss went out looking for a stick, came in and said to me you know where the door is, you go out or I invite you out?He was forced to go out and get beaten, and when he came back the boss said go and see what was thatExplainAnd turn in a 5000-word review…

The following content is described in MySQL 8.0

2. Basic content

Mysql > alter table create index (); mysql > alter table create index (); First of all, InnoDB’s index model is B+ tree. In InnoDB, tables are stored in the form of indexes according to the order of primary keys, which is called indexed organized tables. Each new index corresponds to a B+ tree in InnoDB. Suppose we have a table with a primary key column ID, a field K, name, and an index on K.

create table T(
id int primary key.k int not null.name varchar(16),
index (k))engine=InnoDB;
Copy the code

Insert three rows into table:

INSERT INTO T (id, k) VALUES (100.1);
INSERT INTO T (id, k) VALUES (200.2);
INSERT INTO T (id, k) VALUES (300.3);
INSERT INTO T (id, k) VALUES (500.5);
INSERT INTO T (id, k) VALUES (600.6);
Copy the code

An example of two trees is shown below:As you can see from the figure, every time you create an indexWe added a new B+ tree, and the index is divided intoThe primary key indexandNon-primary key index:

Primary key index:Leaf nodes indexed by primary keys hold entire rows of data. In InnoDB, the primary key index is also calledClustering index(Clustered index).

Non-primary key index:Leaf node contents that are not primary key indexes are primary key values. In InnoDB, non-primary key indexes are also calledSecondary indexes(Secondary index).

Now execute a query:

select * from T where id=500;
Copy the code

This statement simply searches the B+ tree id and returns all the data for the leaf node.

Next query statement:

select * from T where k=5
Copy the code

Select * from B+ tree where k=5; select * from B+ tree where k=5; select * from B+ tree where k=5;

That is, a query based on a non-primary key index requires one more index tree to be scanned. Therefore, we should try to use primary key queries in our applications.

Of course, I know that primary key queries are rarely used in real development, because primary keys do not normally exist in business processes, so consider the following statement:

select id from T where k=5
Copy the code

Index tree (k, k, k, k, k, k, k, k, k, k, k);

  • Because overwriting indexes can reduce the number of tree searches and significantly improve query performance, it is a common performance optimization method to use overwriting indexes.

Based on the overwriting instructions above, let’s look at another example:

CREATE TABLE `tuser` (
  `id` int(11) NOT NULL.  `id_card` varchar(32) DEFAULT NULL.  `name` varchar(32) DEFAULT NULL.  `age` int(11) DEFAULT NULL. `ismale` tinyint(1) DEFAULT NULL. PRIMARY KEY (`id`),  KEY `id_card` (`id_card`),  KEY `name_age` (`name`.`age`) ) ENGINE=InnoDB Copy the code

There is a primary key index id, two secondary indexes ID_card and name_age. Name_age is an index consisting of two columns. Suppose the following statement is now executed:

select id from tuser where name like'a %';Copy the code

Select * from name; select * from name; select * from name; select * from name;Select ID3, ID3, ID3, ID3, ID3, ID3, ID3, ID3, ID3, ID3

However, if the combined index is defined as name_age (age,name), the above query does not use the combined index, which is called the left-most prefix principle.

  • There is an unwritten rule in the industry that a single table contains no more than 20 fields and no more than 5 indexes. As the amount of data increases, too many indexes occupy too much physical space.

Of course, these are some basic SQL, encounter some more complex SQL statements, how to optimize, this will useExplain execution planThis SQL statement is very complex, is now used by the company, so type a code, sorry ha:Here’s the point: Using the Explain keyword, you can see the number of rows scanned by each query, which index is used, etc. The following article will Explain what each field means and then look at how to optimize...

3, Explain the execution plan

In the case of the image above, Explain mainly displays information about how the optimizer’s expectations match the actual execution times and other iterator-based information through the following fields:

id

Id is the order of execution, which is the priority of each statement, which can be the same (which is determined by the optimizer) or different (the higher the ID value, the higher the priority, and the earlier the statement is executed).

select_type

Select query type, mainly used to distinguish a variety of complex queries, such as ordinary query, federated query, sub-query, etc., the value of the following:

  • SIMPLE: indicates the simplest select query statement, that is, the query does not contain operations such as subquery or union fetch and set.
  • PRIMARY: When the query contains any complex subparts, the outermost query is marked as PRIMARY.
  • SUBQUERY: When a select or WHERE list contains a SUBQUERY, the SUBQUERY is marked as SUBQUERY.
  • DERIVED: Represents the select contained in the SUBquery in the FROM clause. Subqueries contained in the FROM list are marked as DERIVED.
  • UNION: If the UNION is followed by a SELECT statement, it will be marked as UNION. If union is included in the subquery of the FROM clause, the outer select is marked as derived.
  • UNION RESULT: represents reading data from the temporary table of the UNION, and the

    of the table column indicates the UNION operation with the results of the first and fourth select.
    ,4>

table

The name of the table to be queried is not necessarily a real table. If an alias is specified for the table, the alias is displayed. It may also be a temporary table. (Just a table name, just a comment)

partitions

For non-partitioned tables, the value is NULL. If partitioned tables are queried, partitions that are matched by partitioned tables are displayed.

type

What type of query is used, which is a very important metric in SQL optimization, in order of good to bad: system > const > eq_ref > ref > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

  • System: When a table has only one row (system table), the amount of data is small and disk I/O is not required.
  • Const: indicates that the query hit the primary key or unique index, or that the joined part is a const. This kind of scan is highly efficient, returns little data, the speed is very fast.
  • Eq_ref: the primary key or unique key index is matched during query.
  • Ref: Different from eq_ref, ref means that a non-unique index is used and many rows are found that match the condition.
  • Ref_or_null: This connection type is similar to ref, except that MySQL searches for rows containing NULL values in addition.
  • Index_merge: The index merge optimization method is used, and the query uses more than two indexes.
  • Unique_subquery: It’s just that the index lookup function used to completely replace the subquery is more efficient.
  • Index_subquery: Different from unique_subquery, used for non-unique indexes and can return duplicate values.
  • Range: Selects rows using an index, retrieving only rows in a given range. Simply put, this means retrieving data from a given range for an indexed field. Use bettween in the WHERE statement… And, <, >, <=, in and other conditional query types are range.
  • Index: Both index and ALL read the entire table. The difference is that index is read through the index tree, while ALL is read from the hard disk.
  • ALL: traverses the entire table to find matching rows, with the worst performance.

possible_keys

If an index exists on a column, the index will be listed, but this index is not necessarily the index that will be used in the final query. In some cases, you may need to specify force index when the optimizer does not execute the specified index.

key

Unlike possible_keys, key is the actual index used in the query. If no index is used, it is displayed as NULL.

key_len

Key_len: indicates the length (bytes) of the index used in the query. In principle, the shorter the length is, the better.

A single-column index takes the entire index length into account. A multi-column index, not all columns are available, needs to calculate the columns actually used in the query.

ref

Ref: const, func, null

Display const when using constant equivalents. In an associated query, the associated fields of the corresponding associated table are displayed. If the query condition uses expressions, functions, or if the condition column is implicitly converted, it may appear as FUNc. In other cases, null is displayed.

rows

Rows: Estimates the number of rows that need to be read to find the records we need, based on table statistics and index usage.

This is an important data to evaluate the performance of SQL, mysql needs to scan the number of rows, very intuitive display of SQL performance, in general, the smaller the rows value is the better.

filtered

Filtered is a percentage. In simple terms, this field represents the percentage of records returned by the storage engine that satisfy the criteria after the data is filtered. (After MySQL.5.7, default Explain directly displays filtered data for partitions and filtered data).

Extra

This column displays a lot of information that is not suitable for other columns, and a lot of additional information from Explain is displayed in this field:

  • Using index: we used an overwrite index in the corresponding select operation. Generally speaking, it means that the column of the query is overwritten by the index. Using index can be very fast.
EXPLAIN SELECT id FROM s_goods;
Copy the code
image
  • Using WHERE: Not all queries with WHERE statements display Using WHERE when no index is available.
EXPLAIN SELECT * FROM s_goods WHERE cat_id = 1;
Copy the code
image
  • Using temporary: Indicates that the query results need to be stored in a temporary table. This table is used for sorting or group query.
EXPLAIN SELECT shop_id FROM s_goods WHERE cat_id IN (1.2.3.4) GROUP BY shop_id;
Copy the code
image
  • Using filesort: indicates a sort operation that cannot be done Using an index, i.e. an ORDER BY column that does not have an index.
EXPLAIN SELECT * FROM s_goods ORDER BY shop_id;
Copy the code
image
  • Using join buffer: Displays this if the associated field does not use an index.
EXPLAIN SELECT * FROM s_goods a LEFT JOIN s_goods_cats b ON a.shop_id = b.parent_id;
Copy the code
image
  • Impossible WHERE: Indicates that there are no eligible rows when we use the incorrect WHERE statement.
EXPLAIN SELECT * FROM s_goods WHERE 1 <> 1;
Copy the code
image
  • No tables Used: Our query does not have a FROM clause or a FROM DUAL clause.
EXPLAIN SELECT NOW(a);Copy the code
image

There are some other fields, here will not list one by one, if there is a need to click on the following link to view!

MYSQL 8.0 MYSQL 8.0

thanks

  • Lin Xiaobin MYSQL combat 45 talk
  • Will not see Explain execution plan, advise you do not write resume familiar with SQL optimization

At the end

If you think my article is helpful to you, please pay attention to my wechat official account :” A happy and painful programmer “(no advertising, simply share original articles, pJ practical tools, a variety of Java learning resources, looking forward to common progress with you).