The interviewer said: work so long, should know SQL execution plan, talk about SQL execution plan! I looked at the big flower arm tattooed on the interviewer’s arm and a string of Korean characters I couldn’t understand. I swallowed and suggested to calm down. After collecting my thoughts, I slowly said to the interviewer: I can’t. I: %^ & %$! @ #

One, foreword

When we work for a certain number of years, some knowledge points should be mastered, we need to know, for example, the interviewer asked about SQL execution plan today, when we execute a SQL, we can directly corresponding results, but you do not know, it will go through how deep dark tunnel, Connectors, query caches, analyzers, optimizers, and executors are all used to filter and re-filter. Sometimes, when you wait for N, you will want to break the computer. But when you look at today’s SQL execution plan, you will no longer need to break the computer. Read this article and you’ll know that’s not true. Let’s find out why

To optimize SQL statement execution in actual application scenarios, you need to view the execution process of SQL statements to speed up the execution efficiency. Explain +SQL statements are often used to simulate the optimizer’s execution of SQL queries to see how mysql handles SQL statements.

Website address: dev.mysql.com/doc/refman/…

Let’s start with the following SQL statement, which will haveId, select_type, table, etcThese columns, this is the information that we have in our execution plan, and what we need to figure out is what these columns are for, and how many values each column might have.

explain select * from emp;

Information contained in the execution plan

Column (Column) Meaning (fancy)
id The SELECTIdentifier (The id of each SELECT clause)
select_type The SELECTType (The type of the SELECT statement)
table The table for The output row (The current name of the table)
partitions The matching partitions (Shows the partition the query will visit if your query is based on a partitioned table)
type The join type (The current access mode in the table)
possible_keys The possible indexes to choose (Possible indexes to use)
key The index actually chosen (Indexes that are evaluated by the optimizer for final use)
key_len The length of The chosen key (The length of the index used)
ref The columns compared to The index (Column referenced to the previous table)
rows Estimate of Rows to be examined (To get the final record, ask for the number of records scanned)
filtered Percentage of Rows filtered by table condition (After the data returned by the storage engine is filtered at the Server layer, the ratio of the number of records that satisfy the query remains)
extra Additional information (Additional information notes)
Intimate small farmers have put the SQL copy out, only need to put into the database to execute, convenient and simple - small farmersCopy the code

Construction sentences:

SET FOREIGN_KEY_CHECKS=0;
DROP TABLE IF EXISTS `dept`;
CREATE TABLE `dept` (
  `DEPTNO` int NOT NULL,
  `DNAME` varchar(14) DEFAULT NULL,
  `LOC` varchar(13) DEFAULT NULL.PRIMARY KEY (`DEPTNO`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `dept` VALUES ('10'.'ACCOUNTING'.'NEW YORK');
INSERT INTO `dept` VALUES ('20'.'RESEARCH'.'DALLAS');
INSERT INTO `dept` VALUES ('30'.'SALES'.'CHICAGO');
INSERT INTO `dept` VALUES ('40'.'OPERATIONS'.'BOSTON');

DROP TABLE IF EXISTS `emp`;
CREATE TABLE `emp` (
  `EMPNO` int NOT NULL,
  `ENAME` varchar(10) DEFAULT NULL,
  `JOB` varchar(9) DEFAULT NULL,
  `MGR` int DEFAULT NULL,
  `HIREDATE` date DEFAULT NULL,
  `SAL` double(7.2) DEFAULT NULL,
  `COMM` double(7.2) DEFAULT NULL,
  `DEPTNO` int DEFAULT NULL.PRIMARY KEY (`EMPNO`),
  KEY `idx_job` (`JOB`),
  KEY `jdx_mgr` (`MGR`),
  KEY `jdx_3` (`DEPTNO`),
  KEY `idx_3` (`DEPTNO`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `emp` VALUES ('7369'.'SMITH'.'CLERK'.'7902'.'1980-12-17'.'800.00'.null.'20');
INSERT INTO `emp` VALUES ('7499'.'ALLEN'.'SALESMAN'.'7698'.'1981-02-20'.'1600.00'.'300.00'.'30');
INSERT INTO `emp` VALUES ('7521'.'WARD'.'SALESMAN'.'7698'.'1981-02-22'.'1250.00'.'500.00'.'30');
INSERT INTO `emp` VALUES ('7566'.'JONES'.'MANAGER'.'7839'.'1981-02-02'.'2975.00'.null.'20');
INSERT INTO `emp` VALUES ('7654'.'MARTIN'.'SALESMAN'.'7698'.'1981-09-28'.'1250.00'.'1400.00'.'30');
INSERT INTO `emp` VALUES ('7698'.'BLAKE'.'MANAGER'.'7839'.'1981-01-05'.'2850.00'.null.'30');
INSERT INTO `emp` VALUES ('7782'.'CLARK'.'MANAGER'.'7839'.'1981-09-06'.'2450.00'.null.'10');
INSERT INTO `emp` VALUES ('7839'.'KING'.'PRESIDENT'.null.'1981-11-17'.'5000.00'.null.'10');
INSERT INTO `emp` VALUES ('7844'.'TURNER'.'SALESMAN'.'7698'.'1981-09-08'.'1500.00'.'0.00'.'30');
INSERT INTO `emp` VALUES ('7900'.'JAMES'.'CLERK'.'7698'.'1981-12-03'.'950.00'.null.'30');
INSERT INTO `emp` VALUES ('7902'.'FORD'.'ANALYST'.'7566'.'1981-12-03'.'3000.00'.null.'20');
INSERT INTO `emp` VALUES ('7934'.'MILLER'.'CLERK'.'7782'.'1982-01-23'.'1300.00'.null.'10');

DROP TABLE IF EXISTS `emp2`;
CREATE TABLE `emp2` (
  `id` int NOT NULL AUTO_INCREMENT,
  `empno` int DEFAULT NULL.PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
INSERT INTO `emp2` VALUES ('1'.'111');
INSERT INTO `emp2` VALUES ('2'.'222');

DROP TABLE IF EXISTS `salgrade`;
CREATE TABLE `salgrade` (
  `GRADE` int NOT NULL,
  `LOSAL` double DEFAULT NULL,
  `HISAL` double DEFAULT NULL.PRIMARY KEY (`GRADE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `salgrade` VALUES ('1'.'700'.'1200');
INSERT INTO `salgrade` VALUES ('2'.'1201'.'1400');
INSERT INTO `salgrade` VALUES ('3'.'1401'.'2000');
INSERT INTO `salgrade` VALUES ('4'.'2001'.'3000');
INSERT INTO `salgrade` VALUES ('5'.'3001'.'9999');

DROP TABLE IF EXISTS `t_job`;
CREATE TABLE `t_job` (
  `id` int NOT NULL AUTO_INCREMENT,
  `job` varchar(9) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL.PRIMARY KEY (`id`),
  KEY `j` (`job`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Copy the code

2.1 id

The serial number of a SELECT query, which contains a set of numbers indicating the order in which the select clause or operation table is executed in the query

Id numbers fall into three categories:

1. If the ID is the same, the execution order is from top to bottom

- left associated
explain select * from emp e left  join dept d on e.deptno = d.deptno;
- right associated
explain select * from emp e right join dept d on e.deptno = d.deptno;
Copy the code

Verify by left JOIN and right Join; Left JOIN scans table E first and then table D. Right JOIN scans table D and then table E

2. If the IDS are different, the id sequence number increases. The higher the ID value, the higher the priority and the earlier the query is executed

explain select * from emp e where e.deptno = (select d.deptno from dept d where d.dname = 'SALES');
Copy the code

Select * from table where id = 2;Mysql > select_type SUBQUERY; select_type SUBQUERYSelect * from d where deptno = deptno

3. The same and different ids exist at the same time: The same ids can be regarded as a group and are executed from top to bottom. In all groups, the higher the ID value, the higher the priority, the earlier the execution

explain select * from emp e join dept d on e.deptno = d.deptno join salgrade sg on e.sal between sg.losal and sg.hisal where e.deptno = (select d.deptno from dept d where d.dname = 'SALES');
Copy the code

In this case, execute from id 2 first, if the IDS are the same, execute in order

2.2 select_type

It is mainly used to distinguish the type of query, whether it is common query, joint query or sub-query

Select_type value Meaning (fancy)
SIMPLE Simple queries do not contain UNION and subqueries
PRIMARY If the query contains any complex subqueries, the outermost query is marked as Primary
UNION If the second select appears after the union, it is marked as union
DEPENDENT UNION Similar to union, depentent here means that the result of a union or union all is affected by an external table
UNION RESULT Select to get the result from the union table
SUBQUERY Include subqueries in select or WHERE lists
DEPENDENT SUBQUERY Subquery subqueries are affected by external table queries
DERIVED Subqueries that occur in the FROM clause are also called derived classes
UNCACHEABLE SUBQUERY Indicates that results using subqueries cannot be cached
UNCACHEABLE UNION The result of the union query cannot be cached
--simple: simple query, excluding subquery and union
explain select * from emp;
Copy the code

--primary: If the query contains any complex subqueries, the outermost query is marked as primary
explain select * from emp e where e.deptno = (select d.deptno from dept d where d.dname = 'SALES');
Copy the code

--union: If the second select appears after the union, it is marked as union
explain select * from emp where deptno = 10 union select * from emp where sal >2000;
Copy the code

-- Dependent union: similar to union, depentent means that the result of union or union all is affected by the external table
explain select * from emp e where e.empno  in ( select empno from emp where deptno = 10 union select empno from emp where sal >2000);
Copy the code

--union result: Select the result from the union table
explain select * from emp where deptno = 10 union select * from emp where sal >2000;
Copy the code

--subquery: Contains subqueries in select or WHERE lists
explain select * from emp where sal > (select avg(sal) from emp) ;
Copy the code

--dependent subquery: Subquery subqueries are affected by external table queries
explain select * from emp e where e.deptno = (select distinct deptno from dept where deptno = e.deptno);
Copy the code

--DERIVED: subqueries that occur in the --DERIVED: from clause, also called DERIVED classes,
explain select * from (select ename staname,mgr from emp where ename = 'W' union select ename,mgr from emp where ename = 'E') a;
Copy the code

-- UNCACHEABLE SUBQUERY: indicates that the results of a SUBQUERY cannot be cached
 explain select * from emp where empno = (select empno from emp where deptno=@@sort_buffer_size);
Copy the code

--uncacheable Union: indicates that the query results of the union cannot be cached
explain select * from emp where exists (select 1 from dept where  emp.deptno = dept.deptno union select 1 from dept where  deptno = 10);
Copy the code

Table 2.3

Which table is being accessed by the row, table name or alias, possibly temporary table or union merge result set

1. If it is a specific table name, it indicates that the data is obtained from the actual physical table, or it can be the alias of the table

explain select * from emp where sal > (select avg(sal) from emp) ;

2. The table name is the form derivedN. Explain select * from (select ename staname, MGR from emp where ename = ‘WARD’ union select ename staname,mgr from emp where ename = ‘SMITH’) a;

Derived2:Indicates that we need to fetch data from derived table 2

3. When there is a union result, the table name is in the form of union N1,n2, etc. N1 and n2 represent the IDS participating in the union

explain select * from emp where deptno = 10 union select * from emp where sal >2000;

2.4 type

Type shows the access type, which indicates how I access our data. The most easy to think of is full table scan, which directly and violently traverses a table to find the data I need. It is very inefficient.

system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

In general, make sure the query is at least range and preferably ref

--all: full table scan. Generally, such SQL statements and large data volumes need to be optimized.
explain select * from emp;
Copy the code

--index: Full index scan is more efficient than all. There are two main situations: one is that the current query overwrites the index, that is, the data we need can be obtained in the index, or the index is used for sorting, so as to avoid data reordering
explain  select empno from emp;
Copy the code

Range: =, <>, >, >=, <, <=, IS NULL, BETWEEN, LIKE, or IN()
explain select * from emp where empno between 7000 and 7500;
Copy the code

Index_subquery: Associates subqueries with indexes instead of scanning the entire table
explain select * from emp where emp.job in (select job from t_job);
Copy the code
--unique_subquery: This join type is similar to index_subquery and uses a unique index
 explain select * from emp e where e.deptno in (select distinct deptno from dept);
Copy the code
-- ref_OR_NULL: The query optimizer selects this access mode for fields that require both a condition and a null value
explain select * from emp e where  e.mgr is null or e.mgr=7369;
Copy the code

--ref: a non-unique index is used to find data
 create index idx_3 on emp(deptno);
 explain select * from emp e,dept d where e.deptno =d.deptno;
Copy the code

--eq_ref: Data lookup using a unique index
explain select * from emp,emp2 where emp.empno = emp2.empno;
Copy the code

--const: This table has at most one matching row,
explain select * from emp where empno = 7369;
Copy the code

--system: The table has only one row of records (equal to the system table). This is a special case of const type
Copy the code

2.5 possible_keys

Shows one or more indexes that may be applied to the table. If an index exists on a field involved in the query, the index will be listed but not necessarily used by the query

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;
Copy the code

2.6 the key

If the index in use is null, no index is used. If an overwrite index is used, the index overlaps with the select field of the query.

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;
Copy the code

2.7 key_len

Represents the number of bytes used in the index. Key_len can be used to calculate the length of the index used in the query. The shorter the length is, the better, without losing precision. 1. Generally, key_len is equal to the length of the index column type in bytes, for example, int is 4 bytes and bigint is 8 bytes. For the utF8 character set, one character is three bytes; for the GBK character set, one character is two bytes. 3. If the column type is NULL, add another bytes to key_len. For example, VARCHAR (TEXT\BLOB does not allow whole columns to be indexed, and if partial indexes are also considered dynamic column types) requires an additional 2 bytes character set for key_len to affect index length, data storage space, and the appropriate character set for the column. Variable length fields require an extra 2 bytes, fixed length fields do not. Nulls require 1 byte of extra space, so the old saying is that index fields should not be null because null complicates statistics and requires an extra byte of storage.

-- varchar(len) variable length fields that allow NULL: len*(Character Set: utf8=3,gbk=2,latin1=1) +1(NULL)+2-- varchAR (len) Variable length fields and NULL: len*(Character Set: UTf8 =)3,gbk=2,latin1=1) +2(Variable length field) --char(len) Fixed field and allowed NULL: len*(Character Set: utf8=3,gbk=2,latin1=1) +1(NULL)
-- char(len) Fixed field and not allowed NULL: len*(Character Set: utf8=3,gbk=2,latin1=1)
Copy the code
explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;
Copy the code

2.8 ref

Shows which column of the index is used, if possible, as a constant

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;
Copy the code

2.9 rows

Based on table statistics and index usage, it is important to estimate the number of rows that need to be read to find the required record. This parameter is a direct reflection of how much data is found in the SQL, and as little as possible to accomplish the purpose

explain select * from emp;
Copy the code

2.10 extra

Contains additional information.

Mysql can not use index to sort data, but can only use the sorting algorithm to sort data
explain select * from emp order by sal;
Copy the code

--using temporary: Create a temporary table to store intermediate results and delete the temporary table after the query is complete
explain select ename,count(*) from emp where deptno = 10 group by ename;
Copy the code

--using index: this indicates that the current query overwrites the index and reads data directly from the index without accessing the table. If a using WHERE table name index is present at the same time, it is used to perform a lookup of the index key value; if not, the surface index is used to read the data, rather than actually lookup it
explain select deptno,count(*) from emp group by deptno limit 10;
Copy the code

--using WHERE: Use where to filter conditions
explain select * from emp2 where empno = 1;
Copy the code

Three summary

SQL execution plan is not very difficult, mainly remember the meaning of each column and how to optimize, this is a lot of training and practical implementation, interested partners can try, or very interesting, this article is just a brief introduction to MySQL execution plan. For a comprehensive and in-depth understanding of MySQL, please read the official MySQL manual first