MySQL Execution Plan

I. concepts

An execution plan, simply put, is the execution of SQL in a database. It is usually used for SQL performance analysis, optimization, and other scenarios. Use the explain keyword +SQL statement in MySQL to view the SQL execution plan.

The test data

Mysql officially provides a test library data on Github, so we can use it to learn (all the following SQL is based on the employees database)

II output

The EXPLAIN output column is as follows:

Column	meaning
`id`	the`SELECT`identifier
`select_type`	`SELECT`type
`table`	If the table has an alias, the alias is displayed
`partitions`	Matching partitions when using partitioned tables
`type`	Connection type
`possible_keys`	Alternative possible indexes
`key`	The actualSelected index
`key_len`	The length of the selected index
`ref`	withColumns for index comparisons
`rows`	Estimate the number of rows to check
`filtered`	Table the percentage of rows filtered by criteria
`Extra`	Additional information

A, id has

The order of query, SQL execution order is based on the highest ID to the lowest, peer (same ID) from top to bottom. Can be null when the result is union, in which case the value of the table column is

indicates that the result of the query with ids M and N is merged.
,n>

Second, the select_type has

This indicates the query type. The specific query type is as follows:

type	instructions
SIMPLE	Simple query (not used`union`And subqueries)
PRIMARY	The outermost layerThe query
UNION	`union`At the back of the select
DEPENDENT UNION	`union`Later select and depend on the outer query (e.g., union in the sub-query and use the outer query’s columns for filtering criteria)
UNION RESULT	`union`The result set
SUBQUERY	The subquery
DEPENDENT SUBQUERY	Subqueries and depend on outer queries
DERIVED	Derive table (from select or join select)
MATERIALIZED	Materialized subquery (materialized means generating subquery results intoA temporary table)
UNCACHEABLE SUBQUERY	Subqueries that cannot be cached
UNCACHEABLE UNION	`union`Subsequent subqueries cannot be cached

`SIMPLE`

Simple queries do not use unions or subqueries

SELECT * FROM employees 
Copy the code

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	employees		range	PRIMARY	PRIMARY	4		149711	100	Using where

`PRIMARY`

Outermost queries can have multiple (in the outermost select, each join or Cartesian product is marked as PRIMARY) and have the same ID

SELECT
	e.*,
	a.sum_salary 
FROM
	employees e
	LEFT JOIN ( SELECT s.emp_no, sum( s.salary ) AS sum_salary FROM salaries s GROUP BY s.emp_no ) a ON e.emp_no = a.emp_no
Copy the code

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	e		ALL					299423	100
1	PRIMARY		ref	<auto_key0>	<auto_key0>	4	employees.e.emp_no	10	100
2	DERIVED	s		index	PRIMARY	PRIMARY	7		2838462	100	Using aggregate

`UNION`and`UNION RESULT`

The select tag after the union keyword becomes union, and the final RESULT of the union is marked as Union RESULT

SELECT
	e.* 
FROM
	employees e 
WHERE
	e.emp_no = 10001 UNION
SELECT
	e.* 
FROM
	employees e 
WHERE
	e.emp_no = 10002 UNION
SELECT
	e.* 
FROM
	employees e 
WHERE
	e.emp_no = 10003
Copy the code

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	e		const	PRIMARY	PRIMARY	4	const	1	100
2	UNION	e		const	PRIMARY	PRIMARY	4	const	1	100
3	UNION	e		const	PRIMARY	PRIMARY	4	const	1	100
	UNION RESULT	< union1, 2, 3 >	ALL								Using temporary

`DEPENDENT UNION`

The select tag after the union keyword becomes union, or DEPENDENT Union if it depends on the outer query condition. Mysql uses EXISTS to optimize the IN subquery

SELECT
	* 
FROM
	employees e1 
WHERE
	e1.emp_no IN (
        Mysql in optimization
SELECT e2.emp_no  FROM employees e2  WHERE	e2.first_name = 'Matt' 
UNION
SELECT e3.emp_no  FROM employees e3  WHERE  e3.first_name = 'Bezalel' 
)
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- after optimization is equivalent to the following SQL -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
SELECT
	* 
FROM
	employees e1 
WHERE
 EXISTS (
SELECT e2.emp_no  FROM employees e2  WHERE	e2.first_name = 'Matt'  and e1.emp_no=e1.emp_no
UNION
SELECT e3.emp_no  FROM employees e3  WHERE  e3.first_name = 'Bezalel' and e3.emp_no=e1.emp_no
)
Copy the code

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	e1		ALL					299423	100	Using where
2	DEPENDENT SUBQUERY	e2		eq_ref	PRIMARY	PRIMARY	4	func	1	10	Using where
3	DEPENDENT UNION	e3		eq_ref	PRIMARY	PRIMARY	4	func	1	10	Using where
	UNION RESULT	3 > < union2,	ALL								Using temporary

`SUBQUERY`

Subquery select in SELECT or where (without optimization)

SELECT
	d.*,
	(SELECT  COUNT(*)  FROM departments ) as dept_num
FROM
	departments d
WHERE
 d.dept_no = (SELECT dept_no FROM dept_emp de  WHERE  de.emp_no=10001)
Copy the code

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	d	const	PRIMARY	PRIMARY	12	const	1	100
3	SUBQUERY	de	ref	PRIMARY	PRIMARY	4	const	1	100	Using index
2	SUBQUERY	departments	index		dept_name	122		9	100	Using aggregate; Using index

`DEPENDENT SUBQUERY`

Dependent subqueries depend on external query conditions based on subqueries.

 SELECT
	e.*.-- Subqueries depend on outer queries
	( SELECT sum( s.salary ) FROM salaries s WHERE s.emp_no = e.emp_no ) AS 'sum' 
FROM
	employees e
Copy the code

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	e		ALL					299423	100
2	DEPENDENT SUBQUERY	s		ref	PRIMARY	PRIMARY	4	employees.e.emp_no	10	100	Using aggregate

`DERIVED`

A derived table is a temporary table that can be used in a select from or join

SELECT
	e.*,
	a.sum_salary
FROM
-- from select 
  ( SELECT emp_no,COUNT(*) as total_titles FROM	titles  GROUP BY emp_no ) e
-- join select
	LEFT JOIN ( SELECT s.emp_no, sum( s.salary ) AS sum_salary FROM salaries s GROUP BY s.emp_no ) a ON e.emp_no = a.emp_no
Copy the code

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	d	const	PRIMARY	PRIMARY	12	const	1	100
3	SUBQUERY	de	ref	PRIMARY	PRIMARY	4	const	1	100	Using index
2	SUBQUERY	departments	index		dept_name	122		9	100	Using aggregate; Using index

`MATERIALIZED`

Materialized subqueries, which the optimizer uses to achieve more efficient subquery processing. Materialization speeds query execution by generating subquery results into temporary tables (usually in memory). The first time MySQL needs a subquery result, it externalizes that result into a temporary table. MySQL references the temporary table again any time the result is needed later. For example,

SELECT * FROM t1
WHERE t1.a IN (SELECT t2.b FROM t2 WHERE where_condition);
Copy the code

When materialization was used to optimize subqueries, In this case, the select type of explain may be SUBQUERY, and materialize and Materialized-subQuery will appear in the Message of show WARNINGS, indicating that the current SUBQUERY result is materialized.

Third, the table

The name of the table involved or a custom alias (as XXX). This can also be one of the following values:

<unionM.N>😕idforMThe results ofunion Id forNThe end.
<derivedN>😕idTo *N*Derived tables. For example, derived tables may come fromFROMSubqueries in the (including join) clause.
<subqueryN>😕idTo *N* the materializedThe subquery.

Fourth, the partitions

The record will match the partition of the query. For non-partitioned tables, this value is NULL.

Five, the type has

This property represents the type of query, from worst to best:

`ALL`

The worst-case scenario for a full table scan is to be scolded, such as using a composite index that does not match leftmost

SELECT * FROM titles  WHERE title ='Senior Engineer'
Copy the code

Title field is in the middle of the titles table and cannot be matched by index, so full table scan is required. The extra column in the execution plan Using WHERE indicates that the results are filtered.

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	titles		ALL					433321	10	Using where

`index`has

Index-based full table scan can be divided into the following two situations:

If the index is a cover index of the query (that is, the index value can satisfy all the data needed for the query), then only the index tree is scanned. In this case, the Extra column shows Using Index. An index-only scan is usually faster than an ALL because the data in the index is usually smaller than the data in the entire table.
```
SELECT dept_name FROM departments  ORDER BY  dept_name desc
Copy the code
```
Dept_name is the unique index. SQL > select depT_name (depT_name, depT_name, depT_name);

id select_type table partitions type possible_keys key key_len ref rows filtered Extra

1 SIMPLE departments index dept_name 122 9 100 Using index
Rows of data are indexed sequentially and then used for a full table scan. Uses Index does not appear in the Extra column.
```
SELECT * FROM titles  WHERE emp_no= 10001 ORDER BY  emp_no desc
Copy the code
```
PRIMARY KEY (emp_no,title,from_date), emp_no, from_date);

id select_type table partitions type possible_keys key key_len ref rows filtered Extra

1 SIMPLE titles index PRIMARY 159 433321 100

Type =index is mostly used when sorting by index.

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	departments		index		dept_name	122		9	100	Using index

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	titles		index		PRIMARY	159		433321	100

`range`

A range scan retrieves only rows whose index values are in a given range. The key column in the output row indicates which index to use. Key_len contains the longest key part used. For this type, ref is NULL. Range can be used when comparing key columns with constants using any of the = (non-unique index), <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, LIKE, or IN () operators.

-- emp_no Primary key
SELECT * FROM `employees` WHERE emp_no>10001 and emp_no<10012
Copy the code

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	employees	range	PRIMARY	PRIMARY	4		10	100	Using where

`unique_subquery`

This type replaces the following form of eq_ref for some IN subqueries:

value IN (SELECT primary_key FROM single_table WHERE some_expr)
Copy the code

Unique_subquery is just an index lookup function that completely replaces subqueries for efficiency.

`index_merge`

Indicates that index merge optimization is used. In this case, the columns in the key output row contain the list of indexes used, and key_len contains the list of the longest key portion of the index used. See “Official Documentation” for index merge optimization details.

`ref_or_null`

Similar to ref, but Mysql will additionally search for row records that contain NULL

SELECT * FROM ref_table
  WHERE key_column=expr OR key_column IS NULL;
Copy the code

`fulltext`

Represents executed using the FULLTEXT index

`ref`

For the previous table, all rows matching index values are read from the table. Ref indicates that only the left-most prefix of the KEY is used or that the KEY is not a PRIMARY KEY or UNIQUE index (in other words, the previous table cannot match only one row based on the index value)

Emp_no is just the left-most prefix on primary keys
SELECT * FROM employees a,titles b 
  WHERE a.emp_no = b.emp_no
Copy the code

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	a		ALL	PRIMARY				299423	100
1	SIMPLE	b		ref	PRIMARY	PRIMARY	4	employees.a.emp_no	1	100

`eq_ref`

For the previous table, the table matches only one row. It is used when the matched index is **PRIMARY KEY or UNIQUE** and not null index. Eq_ref can be used for indexed columns that are compared using the = operator, and the comparison value can be either a constant or an expression.

SELECT
	* 
FROM
	employees a
	-- dept_emp emp_no prefix for primary key (may match more than one
	LEFT JOIN dept_emp b ON a.emp_no = b.emp_no
	The departments table where dept_NO uniquely matches a row is eq_ref
	LEFT JOIN departments c ON b.dept_no = c.dept_no
Copy the code

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered
1	SIMPLE	a	ALL					299423	100
1	SIMPLE	b	ref	PRIMARY	PRIMARY	4	employees.a.emp_no	1	100
1	SIMPLE	c	eq_ref	PRIMARY	PRIMARY	12	employees.b.dept_no	1	100

`const`

Const denotes that the table has at most one matching row, and const denotes which rows are constant value ** equivalent to the **PRIMARY KEY or UNIQUE index.

# formatSELECT * FROM tbl_name WHERE primary_key=1;
SELECT * FROM tbl_name WHERE primary_key_part1=1 AND primary_key_part2=2; # for the sampleSELECT * FROM	departments a WHERE a.dept_name = 'Finance'
-- --------- above is equivalent to ---------- below
SELECT a.dept_name,a.dept_no  FROM	departments a WHERE a.dept_name = 'Finance'
Copy the code

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	a		const	dept_name	dept_name	122	const	1	100	Using index

Note: Because departments only have two fields: Select * from innoDb where dept_NO = dept_name (select * from innoDb where dept_no = dept_name) Using index is displayed in Extra.

`system`

Denotes that the table has only one row (system table), which is a const special case.

Six, possible_keys

As the name implies, possible_Keys is not null and possible_key is empty because it only provides reference indexes (such as the names of indexes on certain fields) that the query can possible_Keys is not null.

Seven, the key

The actual used index, which includes the PRIMARY key index, or the name of the self-created index.

Eight, key_len

Represents the number of bytes that decide to use the index. With no loss of accuracy, the shorter the length, the better

Nine, ref

Shows which columns (column names) or constants (const) are compared to the index specified in the key column.

Ten, rows has

Number of rows scanned. That is, how many rows need to be scanned to get the data you need. In general, the smaller the rows, the more efficient they are, and most SQL optimizations reduce the size of this value (the metric of optimization). In InnoDB, this is just an estimate and it’s not necessarily accurate.

Eleven, filtered,

Percentage value: Represents the estimated percentage of rows that will be filtered by criteria. The maximum value is 100, which means that rows are not filtered. A decrease from 100 indicates an increase in the amount of filtering. Rows shows the estimated number of rows that have been checked, and rows× Filtered shows the number of rows that will be joined with the table below. For example, if rows is 1000 and filtered is 50.00 (50%), the number of rows connected using the following table is 1000 x 50% = 500.

Twelve, Extra has

This property is very important because it contains real information about what happened when the SQL was executed

Child of ‘table’ pushed join@1

This table is referenced as a child node of the table in the connection and can be pushed down to the NDB kernel. This is only applicable to NDB clusters if the pushed-down connection is enabled.

const row not found

For things like SELECT… FROM tbl_name. This table is empty.

Deleting all rows

For DELETE, some storage engines, such as MyISAM, support a handler method that deletes all table rows in a simple and quick manner. This added value is shown if the engine uses this optimization.

Distinct

MySQL is looking for different values, so it stops searching for more rows for the current row combination after it finds the first matching row.

FirstMatch(tbl_name)

Half connection FirstMatch connection shortcut

Full scan on NULL key

Subquery optimization occurs as a fallback strategy when the optimizer cannot be accessed using index-lookup.

Impossible HAVING

HAVING clause always false cannot query any rows.

Impossible WHERE

The WHERE clause is always false and no rows can be queried.

Impossible WHERE noticed after reading const tables

MySQL has read all const (and system) tables and noticed that the WHERE clause is always false.

LooseScan(m.. n)

Use the half-join LooseScan policy. M and N are key component numbers.

No matching min/max row

No rows meet the criteria for the query, such as SELECT MIN (…) The FROM… The WHERE condition.

No matching rows after partition pruning

For DELETE or UPDATE, the optimizer does not find anything to DELETE or UPDATE after the partition pruning. It has a similar meaning to the Impossible WHERE of the SELECT statement.

No tables used

The query statement does not specify a FROM clause or a FROM dual clause.

For INSERT or REPLACE statements, EXPLAIN displays this value when there is no SELECT part. For example, it appears in EXPLAIN INSERT INTO VALUES (10) because it is equivalent to EXPLAIN INSERT INTO SELECT 101 FROM DUAL.

Not exists

MySQL is able to perform LEFT JOIN optimization on the query and does not check for more rows in the table for the previous row combination once it finds a row that matches the LEFT JOIN condition. Here are examples of the types of queries that can be optimized in the following ways:

SELECT * FROM t1 LEFT JOIN t2 ON t1.id=t2.id
WHERE t2.id IS NULL;
Copy the code

Suppose t2.id is defined as NOT NULL. In this case, MySQL scans T1 and looks for rows in T2 using the value of T1.id. If MySQL finds a matching row in T2, it knows that T2.id can never be NULL and will not scan the remaining rows in T2 with the same ID value. In other words, for every row in T1, MySQL needs to do only one lookup in T2, regardless of how many rows are actually matched in T2.

The Plan isn ‘t ready yet

This value appears in EXPLAIN FOR CONNECTION when the optimizer has not finished creating an execution plan FOR a statement executed in a named CONNECTION. If the execution plan output contains more than one line, any or all of them can have this Extra value, depending on how far along the optimizer is in determining the full execution plan.

Range checked for each record

MySQL found that there were no good indexes to use, but found that some indexes might be used after the column values of the previous table were known. For each combination of rows in the table above, MySQL checks whether the rows can be retrieved using the range or Index_MERGE access methods. This is not fast, but it is faster than performing a join without an index.

Scanned N databases

This represents the number of directory scans performed by the server while processing queries for the INFORMATION_SCHEMA table. The value of N can be 0,1 or all.

Select tables optimized away

One of the better results is that when you use some aggregate function to access a field where an index exists, the optimizer completes the entire query by locating the index directly to the desired data row at once, such as MIN()\MAX()

Skip_open_table, Open_frm_only, Open_full_table

These values represent file open optimizations for queries that apply to the INFORMATION_SCHEMA table.

Skip_open_table: There is no need to open the table file. This information is already available in the query by scanning the database directory.
Open_frm_only: Just open the.frm file for the table.
Open_full_table: Unoptimized information lookup. You must open.frm,.myd and. MYI file.

Start temporary, End temporary

This indicates that temporary tables are used in the half-join Duplicate Weedout policy.

unique row not found

For things like SELECT… Select * FROM tbl_name; no rows meet the UNIQUE index or PRIMARY KEY condition on the table.

Using filesort has

Indicates that MySQL uses an external index sort for the results, rather than reading the contents from the table in index order. It may be sorted in memory or on disk. A sort operation in MySQL that cannot be done with an index is called “file sort”. When this value is present, you need to try to index the sorted field.

-- emp_no Primary key
SELECT emp_no FROM employees  WHERE emp_no> 10001 ORDER BY  first_name desc
Copy the code

At this time, type in the execution plan is range and Using WHERE in Extra; Using filesort, you need to sort the query results in memory.

Using the index has

Overwriting index cases. Indicates that only the information in the index tree is used to retrieve column information from the table, without having to look separately to read the actual rows (no need to go back to the table). This policy can be used when queries use only columns belonging to a single index.

For InnoDB tables with a user-defined clustered index, this index can be used even if it does not exist in the Extra column. This happens if type is index and key is PRIMARY.

Using the index condition has

Read the table by accessing the indexes and testing them to determine if the full table row is read. In this way, index information is used to defer (” push down “) reading of full table rows,

Using index for group-by

Similar to the Using Index Table access method, Using Group for group-by means that MySQL has found an index that can be used to retrieve all columns of a GROUP BY or DISTINCT query without any additional disk access to the actual table. In addition, indexes are used in the most efficient way, so only a few index entries are read for each group.

Using join buffer (Block Nested Loop), Using join buffer (Batched Key Access)

The sections of the tables in the join are read into the join buffer, and their rows are used from the buffer to perform the join with the current table. Block Nested Loop indicates that the Block Nested Loop algorithm is used, and Batched Key Access indicates that the batch Key Access algorithm is used. That is, the keys in the table from the row preceding the EXPLAIN output are buffered, and the matching rows are extracted in bulk from the table of rows in the join buffer.

Using MRR

Read tables using a multi-range read optimization strategy

Using sort_union (…). , Using the union (…). , Using intersects (…).

Shows how to merge index scans for the Index_Merge connection type.

Using temporary has

Indicates that MySQL uses temporary tables to store query results. Common in sort order by and group by queries

Using the where has

Indicates that the returned results are filtered

Using where with pushed condition

This item only applies to NDB tables. This means that NDB Cluster is pushing optimizations under use conditions to improve the efficiency of direct comparisons between non-indexed columns and constants.

Zero limit

This query has a LIMIT 0 clause and cannot select any rows.