Explain do you really know how to use it? August more challenge

preface

Environment: Mysql 8.0.21
Explain results vary with the Mysql version

1. The role of Explain?

EXPLAIN statements provide information about how MySQL executes statements. EXPLAIN can be applied to SELECT, DELETE, INSERT, REPLACE, and UPDATE statements

Explain returns one row of information for each table used in the SELECT statement. It lists the output tables in the order they were read by mysql while processing the statement. Mysql uses the nested loop join method to resolve all joins, which means mysql reads a row from the first table, then finds a matching row in the second table, the third table, and so on. After processing all the tables, Mysql outputs the selected columns and backtracks through the list of tables until it finds a table with more matching rows. Read the next row from the table, and proceed to the next table.

Explain Output column information

1.1 id

This is the sequence number of select in the query, and the row can be Null if it references the union result of other rows

The value of the ID column represents the order in which the table is executed. There are three cases:

1.1.1 same id

EXPLAIN SELECT s.*, t.* FROM student s, teacher t
Copy the code

You can see that the explain command isselectThe identifier statement generates a row of information for each table, where the ID is the same, indicating that the two tables are executed from top to bottom, regardless of the order of execution in the SQL.

1.1.2 id different

EXPLAIN SELECT * FROM teacher t WHERE id = ( SELECT s.tid FROM student s WHERE s.id ="2")Copy the code

The two tables with nested subqueries are executed in different order, so explain resolves the query information ID differently. The larger the ID is, the higher the priority is.

1.1.3 Id is the same or different

EXPLAIN SELECT s.id, s.NAME, c.id, c.NAME FROM class c, student s WHERE s.tid = ( SELECT t.id FROM teacher t WHERE t.id = "1")
Copy the code

If the ids are the same, they can be considered as a group and executed from top to bottom in the order shown in the result set (regardless of the order declared in the SQL). A larger ID indicates a higher priority, and the command is executed first.

1.2 select_type

Select the type of all cases as shown in the table below:

1.2.1 SIMPLE

Simple query without any complex queries such as subqueries or union [see example 1.1.1 for details]

1.2.2 PRIMARY

If the query contains any complex subqueries, the outermost layer is marked as primary, commonly known as egg shell. [See example 1.1.2 for details]

1.2.3 the UNION

EXPLAIN SELECT `NAME` FROM tb_employees_china UNION SELECT `NAME` FROM tb_employees_usa
Copy the code

unionAfter theselectIs marked asunionAnd theunionIn front of theselectIs marked asprimary; ifunionIncluded in thefromClause, the outer layerselectWill be marked asderived

1. The UNION RESULT

The result of the merger of two unions. [See example 1.2.3 for details]

1.2.5 DEPENDENT UNION

EXPLAIN SELECT `NAME` FROM tb_employees_china WHERE `NAME` IN ( SELECT `NAME` FROM tb_employees_china UNION SELECT `NAME` FROM tb_employees_usa )
Copy the code

The first condition to satisfy the UNION, and the second or subsequent select in the UNION, depends on external queries.

1.2.6 SUBQUERY

EXPLAIN SELECT * FROM tb_employees_china WHERE id = ( SELECT id FROM tb_employees_china WHERE `name` = "lisi" )
Copy the code

The first select in a subquery is identified as subquery

1.2.7 DEPENDENT SUBQUERY

Select in a subquery that relies on an external query [see example 1.2.4 for details] The first select of this example neutron query is identified as dependent subQuery

1.2.8 DERIVED

Subqueries included in the FROM clause are marked as derived, and mysql recurses these subqueries to put the results in temporary tables (temporary tables are a system burden, but sometimes have to be used).

Note: In this example, the mysql environment is: 5.7

EXPLAIN SELECT * FROM ( SELECT id FROM tb_course tc ) temp
Copy the code

Because Mysql 8.0 is optimized over Mysql 5.7, the above instance will be explained in the 8.0 environment. The subquery will not be identified as derived. No instance of Mysql 8.0 derived has been found. (Follow-up findings will be supplemented)

1.2.9 DEPENDENT DERIVED

Based on Derived, it relies on external queries.

1.2.10 MATERIALIZED

EXPLAIN SELECT * FROM tb_class WHERE `NAME` IN ( SELECT `NAME` FROM tb_class)
Copy the code

The process of saving records in a subquery result set to a temporary table is called Materialize. The temporary table that stores the result set of the subquery is called the materialized table

When the query optimizer executes a statement containing a subquery and chooses to materialize the subquery and then join it with the outer query, the subquery will be identified as MATERIALIZED

The third record of the execution plan has an ID of 2, indicating that this record was executed first and is a single table query. It is identified as MATERIALIZED, and the query optimizer is intended to convert the subquery to MATERIALIZED tables first. The second record of the execution plan is the record whose ID is 1 and table is

. This record is the physical table generated after the execution of the corresponding subquery with ID 2. Then join the tb_class with the physical table (two tables with ID 1).

1.2.11 UNCACHEABLE SUBQUERY

Subqueries whose results cannot be cached must recalculate the results for each row of the external query

1.2.12 UNCACHEABLE UNION

The second or subsequent select in the union is a non-cacheable subquery

Table 1.3

The name of the table referenced by the output row. There are three cases in addition to the table name

: union of id values between M and N [see example 1.2.4 for details]
: derived table result with id value N [see example 1.2.8 for details]
: result of materialized subquery with id value N [see example 1.2.10 for details]

1.4 type

The official full name is Join Type, which means: join type. There are 12 types of type in Mysql 8.0. The following six types are mainly introduced. From top to bottom, efficiency is enhanced, and we should try to optimize our SQL to make it as good a type as possible, taking into account the actual situation.

1.4.1 all (Full Table Scan)

Full scan, a full table scan is performed for each row combination in the table. If the table is the first table that is not marked const, ALL can usually be avoided by indexes that allow rows to be retrieved from the table based on constant or column values from the previous table.

ALL is a violent and primitive search method that is time-consuming and inefficient. But Mysql official describes a few cases where you can use ALL scans:

The table is so small that performing a table scan is much faster than interrupting a key lookup, which is common for short tables with fewer than 10 rows
For index columns, there are no restrictions available in the ON or WHERE clauses
Comparing index columns to constant values
A low cardinality key (many rows match the key value) is being used through another column, in which case Mysql assumes that multiple key lookups may be required by using keys and that table scans will be faster.

For small tables, table scans are usually appropriate and have a negligible impact on performance, and for large tables, query optimization is a must.

SELECT * FROM `tb_employees_china` WHERE `name` = "zhangsan"
Copy the code

This is because the name column is neither a primary key nor an index, so a full scan is used to find it.

1.4.2 index (Full index Scan)

The difference between index and all is that index scans the index tree and returns data to the table based on the index. Compared with all, they both retrieve data from the full table. In addition, index must read the index first and retrieve data randomly from the table.

SELECT id FROM `tb_employees_usa`
Copy the code

If type is index and Extra is Using index, as shown in the figure above, the overwritten index is used. That is, there is no need to return to the table and the current index tree meets the current query requirements.

1.4.3 range

For example, between… between… between… between… between… between… between… And, <, >, in, or all belong to index range scans.

1.4.4 ref

This join type occurs if the search criteria column uses indexes and does not use primary keys and Unique, that is, ordinary indexes are used instead of primary key indexes and Unique indexes. In this way, even if the first data is quickly found using the index, the scan cannot be stopped. Instead, a small scan near the target value is performed. The advantage is that the entire table does not need to be scanned because the indexes are ordered and even if there are duplicate values, the scan is still a very small range.

Create an index for the name field in tb_employees_USA in the following table and run the following statement:

explain select * from tb_employees_usa where `name` = 'rose'; \Copy the code

1.4.5 eq_ref

What makes eQ_REF so much more interesting than ref is that eQ_REF knows that there is only one result set for this type of lookup, and only one result set if the lookup is using a primary key or a unique index. It is known before the search that there must be only one result set, so the query is stopped immediately the first time a value is found. This type of connection performs a precise query each time and does not require much scanning, so the search is more efficient.

select * from tb_employees_usa join tb_employees_china using(id); \Copy the code

1.4.6 const

In general, if a primary key is placed in the WHERE clause as a query condition, the MySQL optimizer optimizes the query to a constant, i.e., const

select * from tb_employees_usa where `id` = '3'; \Copy the code

1.5 possible_keys

The indexes that the query might use are listed here

1.6 the key

Query the actual index used

1.7 key_len

The smaller the key_len, the better the index. How to calculate the key_len length?

The column type	Whether is empty	The length of the	key_len	note
tinyint	Allow Null	1	key_len = 1+1	Allow NULL and increment the length of key_len by 1
tinyint	Do not allow Null	1	key_len = 1	Do not allow NULL
int	Allow Null	4	key_len = 4+1	Allow NULL and increment the length of key_len by 1
int	Do not allow Null	4	key_len = 4	Do not allow NULL
bigint	Allow Null	8	key_len = 8+1	Allow NULL and increment the length of key_len by 1
bigint	Do not allow Null	8	key_len = 8	Do not allow NULL
char(1)	Allow Null	utf8mb4=4,utf8=3,gbk=2	key_len = 1*3 + 1	Allow NULL, character set UTF8, key_len + 1
char(1)	Do not allow Null	utf8mb4=4,utf8=3,gbk=2	key_len = 1*3	NULL, character set UTf8 not allowed
varchar(10)	Allow Null	utf8mb4=4,utf8=3,gbk=2	key_len = 10*3 + 2 + 1	Dynamic column type, key_len + 2, NULL, key_len + 1 is allowed
varchar(10)	Do not allow Null	utf8mb4=4,utf8=3,gbk=2	key_len = 10*3 + 2	Dynamic column type, key_len length + 2

Calculate the length of an index using our execution plan above.

Where, the index name isidx_nameThe index type of thevarchar(30)According to the above chart,varchar(30)Key_len as follows:30 * 3 + 2 + 1 = 93

Key_len can also be used to determine whether the union index is in effect and which union index is being overridden, as shown in an example.

We will betb_employees_usaDelete all indexes from the table and create a joint index(name,age)

Execute the following SQL to view the execution plan

explain select * from tb_employees_usa where name='rose';

1Copy the code

key_lenThe value is 93, and the SQL is calculated to use a single index of the federated indexesname

Then run the following SQL to view the execution plan

explain select * from tb_employees_usa where name='rose';
Copy the code

key_lenThe value is 98, and the SQL is calculated to use the federated index.

1.8 ref

The value of the ref field is a column or constant, which, along with the value of the key, selects row data from the table. As in the above example

select * from tb_employees_usa join tb_employees_china using(id);
Copy the code

refThe field values fortest.tb_employees_usa.idOn behalf oftb_employees_usa 的 idThe column withtb_employees_chinaTo filter the row data with the primary key of

If the condition in the WHERE clause is an equivalent constant, then the ref value is const

1.9 rows

Indicates the number of rows to be scanned during SQL execution. A larger value indicates the number of rows to be scanned and the corresponding time is longer. However, it is important to note that the rows output in EXPLAIN is only an estimate and cannot be fully trusted.

1.10 filtered

Represents the percentage of the data returned by the storage engine that meets the number of queried records after being filtered by the Server layer. This value is a percentage, not a specific number of records.

1.11 Extra

Performance from good to bad:

using index > using where > using temporary > using filesort
Copy the code

using index: overwrites the index and does not need to return to the table
using where: Column data from only used
using temporary: indicates that MySQL needs to use temporary tables to store result sets. This is common in sorted and grouped queries
using filesort: MySQL cannot use the index to complete the sort operation called “file sort”, generally has this value, it is recommended to use the index to optimize
using join buffer: highlights the fact that indexes are not used when fetching join conditions and that connection buffers are needed to store intermediate results. If this value occurs, indexes should be added on a case-by-case basis to improve performance
distinct: The distInc keyword is used in the select section

summary

This article describes what each field represents in EXPLAIN. For example, if you are interested in MySQL, stay tuned to the MySQL column.