My original blog address: blog.csdn.net/weixin_3931…

Explain Tool Introduction

Using the Explain keyword, you can simulate the optimizer to execute SQL statements. And the optimization results performed by the optimizer are fed back. Using these results we can find bottlenecks in our SQL or tables. Note that Explain doesn’t really do the SQL for us, just the analysis.

Explain analysis examples

The official document: dev.mysql.com/doc/refman/…

DROP TABLE IF EXISTS `actor`;
 CREATE TABLE `actor` (
 `id` int(11) NOT NULL,
 `name` varchar(45) DEFAULT NULL,
 `update_time` datetime DEFAULT NULL.PRIMARY KEY (`id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `actor` (`id`, `name`, `update_time`) VALUES (1.'a'.'2017 ‐ ‐ 22 15:27:18 12'), (2.'b'.'2017 ‐ ‐ 22 15:27:18 12'), (3.'c'.'2017 ‐ ‐ 22 15:27:18 12');

 DROP TABLE IF EXISTS `film`;
 CREATE TABLE `film` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(10) DEFAULT NULL.PRIMARY KEY (`id`),
 KEY `idx_name` (`name`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

 INSERT INTO `film` (`id`, `name`) VALUES (3.'film0'), (1.'film1'), (2.'film2');

 DROP TABLE IF EXISTS `film_actor`;
 CREATE TABLE `film_actor` (
 `id` int(11) NOT NULL,
 `film_id` int(11) NOT NULL,
 `actor_id` int(11) NOT NULL,
 `remark` varchar(255) DEFAULT NULL.PRIMARY KEY (`id`),
 KEY `idx_film_actor_id` (`film_id`,`actor_id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `film_actor` (`id`, `film_id`, `actor_id`) VALUES (1.1.1), (2.1.2), (3.2.1);
Copy the code
 explain select * from actor;
Copy the code

The explain variant

explain extended:

Explain Extended: Provides additional query optimization information in addition to EXPLAIN. This is followed by the show Warnings command to get an optimized query statement to see what the optimizer optimized. There are also filtered columns, which are half-scale values, and rows * filtered/100 estimate the number of rows that will be connected to the previous table in Explain (the previous table in Explain has an ID value smaller than the current table ID value).

explain partitions

Explain Partitions have one more partition field than Explain, which shows the partitions to be accessed by the query if it is based on a partitioned table.

Core columns in EXPLAIN

1. The id column

The id column is numbered as the sequence number of the select, and the number of ids increases in the order in which the select appears. The larger the ID column is, the higher the execution priority is. If the IDS are the same, the execution is performed from the top down. If the ID is NULL, the execution is performed last.

2.select_type

  • Simple: indicates simple query. The query does not contain subqueries and unions
  • Primary: the outermost select in a complex query
  • Subquery: a subquery contained in a select (not in the FROM clause)
  • Derived: Subquery contained in the FROM clause. MySQL stores the results in a temporary table, also known as a derived table

Use this example to understand the primary, SubQuery, and Derived types

set session optimizer_switch='derived_merge=off'; Close # mysql57.New features optimize explain for merge of derived tablesselect (select 1 from actor where id = 1) from (select * from film where id = 1) der;
Copy the code

  • Union: The second and subsequent select in the union
 explain select 1 union all select 1;
Copy the code

3. The table columns

This column indicates which table is being accessed by the EXPLAIN row. When there is a subquery in the FROM clause, the table column is format, indicating that the current query depends on the query with id=N, so the query with ID =N is executed first. When there is a union, the value of the table column of the Union RESULT is <union1,2>, and 1 and 2 represent the IDS of the select rows participating in the union.

4. Type column (key of key)

This column represents the association type or access type, which is the approximate range of data row records that MySQL determines how to find rows in the table. System > const > eq_ref > ref > range > index > ALL Generally speaking, it is necessary to ensure that the query reaches the range level, and ref is the best

NULL

Mysql can decompose queries during the optimization phase without having to access tables or indexes during the execution phase. For example, if you select a minimum value in the index column, you can do this by looking up the index separately without accessing the table during execution

const, system

Mysql can optimize part of a query and convert it to a constant (see the result of show Warnings). Select * from primary key; select * from constant; select * from primary key; System is a special case of const. If only one tuple in the table matches, it is system.

explain extended select * from (select * from film where id = 1) tmp;
show WARNINGS;
Copy the code

eq_ref

All parts of the primary key or unique key index are joined, and at most one qualifying record is returned. This is probably the best type of join besides const, and it does not appear in simple SELECT queries.

explain select * from film_actor left join film on film_actor.film_id = film.id;
Copy the code

ref

Instead of using a unique index, eq_ref uses a normal index or a partial prefix of a unique index, which is compared to a value and may find multiple rows that match the condition

  1. Simple SELECT query, name is normal index (not unique index) explain select * from film where name = 'film1';
  2. Associative table query, idx_FILm_actor_id is the joint index of film_id and actor_id, where the left prefix film_id of film_actor is used.
explain select film_id from film left join film_actor on film.id = film_actor.film_id;
Copy the code

range

Range scanning usually occurs in(), between,>,<, >=, and so on. Use an index to retrieve rows in a given range.

explain select * from actor where id > 1;
Copy the code

index

Scan the whole index can get the result, general is to scan a secondary indexes, the scan will not from the index began to quickly find root node, but directly to the secondary index leaf node traversal and scanning (back), speed or slower, the query is commonly used to cover index, secondary indexes is compared commonly small, so this is usually faster than ALL.

 explain select * from film;
Copy the code

ALL

A full table scan scans all the leaf nodes of your cluster index. Normally this would need to be optimized by adding an index

explain select * from actor;
Copy the code

Conclusion:

Eq_ref: scans one data based on the primary key index or the unique key index ref: scans one data based on the index range Scans some data based on the index Index: scans all data based on the index All: partially scans all data based on the index

It’s enough to focus on the above routines.

Possible_keys column

This column shows which indexes the query might use to look up. Possible_keys may have columns while key is NULL when explaining possible_keys. In this case, it is because there is not much data in the table. Mysql thinks that the index is not helpful for the query and selects the full table query. If the column is NULL, there is no associated index. In this case, you can improve query performance by examining the WHERE clause to see if you can create an appropriate index, and then using Explain to see the effect.

The key column

This column shows which index mysql actually uses to optimize access to the table, or NULL if no index is used. If you want to force mysql to use or ignore indexes in the possible_keys column, use force index and ignore index in the query.

Key_len column

This column shows the number of bytes used by mysql in the index. This value can be used to figure out which columns are used in the index. For example, the joint index idx_FILM_actor_id for film_actor consists of two int columns film_id and actor_id, and each int is 4 bytes. From key_len=4 in the result, it can be inferred that the query uses the first column, the film_ID column, to perform the index lookup.

explain select * from film_actor where film_id = 2;
Copy the code

Key_len is computed as follows:

  1. String, char(n) and vARCHar (n). In versions after 5.0.3, n represents the number of characters, not the number of bytes. In UTF-8, a number or letter is 1 byte and a Chinese character is 3 bytes
  • Char (n) : contains 3n bytes if Chinese characters are stored
  • Varchar (n) : if storing Chinese characters, the length is 3n + 2 bytes. The extra 2 bytes are used to store the length of the string, because
  • Varchar is a variable length string
  1. Numeric types
  • Tinyint: 1 byte
  • Smallint: 2 bytes
  • Int: 4 bytes
  • Bigint: 8 bytes
  1. Time to type
  • Date: 3 bytes
  • Timestamp: 4 bytes
  • Datetime: 8 bytes

The maximum length of the index is 768 bytes. If the string is too long, mysql extracts the first half of the string from the index.

Ref column

This column shows the columns or constants used by the table to find values in the index of the key column. Common examples are const (constant), field names (e.g. Film.id).

Rows column

This column is the number of rows that mysql expects to read and detect, note that this is not the number of rows in the result set.

Extra column

Extended column, with a lot of additional information.

  1. Using index: Uses an overwrite index

Mysql > select * from ‘select’; mysql > select * from ‘select’; mysql > select * from ‘select’; mysql > select * from ‘select’; Overwrite index is generally aimed at the secondary index, the entire query result can be obtained only through the secondary index, there is no need to find the primary key through the secondary index tree, and then go to the primary key index tree to obtain other field values

explain select film_id from film_actor where film_id = 1;
Copy the code

2. Using WHERE: A where statement is used to process the result, and the queried column is not overwritten by the index

explain select * from actor where name = 'a';
Copy the code

  1. Using index condition: the column in the query is not completely covered by the index. The where condition is a range of leading columns.
 explain select * from film_actor where film_id > 1;
Copy the code

  1. Using temporary: mysql needs to create a temporary table to process queries. This situation is generally to optimize, the first is to think of using indexes to optimize.
  • Actor.name has no index and a temporary table is created to be distinct
 explain select distinct name from actor;
Copy the code

  • Film. name select idx_name, extra select using index
explain select distinct name from film;
Copy the code

  1. Using filesort: Sort external rather than indexed, from memory when data is small or on disk otherwise. In this case, index optimization is also generally considered.
  • Actor.name does not create an index, which browses the entire actor table, saves the sort keyword name and the corresponding ID, then sorts the name and retrieves the row record
 explain select * from actor order by name;
Copy the code

  1. Film. name select idx_name and extra select using index
explain select * from film order by name;
Copy the code

  1. Select Tables Optimized Away: Use aggregation functions (such as Max, min) to access a field that has an index
 explain select min(id) from film;
Copy the code

Indexing best practices

CREATE TABLE `employees` (
`id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(24) NOT NULL DEFAULT ' ' COMMENT 'name',
 `age` int(11) NOT NULL DEFAULT '0' COMMENT 'age',
`position` varchar(20) NOT NULL DEFAULT ' ' COMMENT 'job',
 `hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Entry Time'.PRIMARY KEY (`id`),
 KEY `idx_name_age_position` (`name`,`age`,`position`) USING BTREE
 ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COMMENT='Staff Record';

 INSERT INTO employees(name,age,position,hire_time) VALUES('LiLei'.22.'manager',NOW());
 INSERT INTO employees(name,age,position,hire_time) VALUES('HanMeimei'.23.'dev',NOW());
 INSERT INTO employees(name,age,position,hire_time) VALUES('Lucy'.23.'dev',NOW());
Copy the code

Values match

The query condition happens to be the index field.

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei';
Copy the code

 EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22;
Copy the code

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';
Copy the code

Left prefix rule

If you index multiple columns, follow the leftmost prefix rule. Indicates that the query starts at the left-most front of the index and does not skip columns in the index.

EXPLAIN SELECT * FROM employees WHERE name = 'Bill' and age = 31;
EXPLAIN SELECT * FROM employees WHERE age = 30 AND position = 'dev';
EXPLAIN SELECT * FROM employees WHERE position = 'manager';
Copy the code

Not doing anything on the index column (calculation, function, (automatic or manual) type conversion) will cause the index to fail and move to a full table scan

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';
EXPLAIN SELECT * FROM employees WHERE left(name,3) = 'LiLei';
Copy the code

Storage engines cannot use columns to the right of range conditions in indexes (left prefix rule variant)

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age > 22 AND position ='manager';
Copy the code

You can see that the middle age uses a range query, and the position after age does not use the index.You can see this through the key_len fieldHere we can do our own deduction with the data structure of the index:

If name and age >20 in b+ tree search results. Positions are not necessarily ordered. So I have to iterate over the result set of these two conditions.

For example, here you can filter out the three underground by name and age. But the first efG > EEG > ABC. Position is not ordered. Then you can’t filter it according to certain conditions.

Minimize select * statements by using overridden indexes (queries that access only the index (index columns contain query columns) to avoid back to the table)

EXPLAIN SELECT name,age FROM employees WHERE name= 'LiLei' AND age = 23 AND position='manager';
Copy the code

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';
Copy the code

You can see that the second query does not use the usering index. Indexes are used only for WHERE filtering

Mysql in use does not equal (! If an index cannot be used, a full table scan will occur

< <, > >, <=, >=, the mysql internal optimizer evaluates whether to use an index or not based on multiple factors, such as the percentage of searches, table size, and so on

EXPLAIN SELECT * FROM employees WHERE name ! = 'LiLei';
Copy the code

Although name is indexed, key_len is null, indicating that no index is used

Is null,is not NULL Generally, indexes cannot be used

 EXPLAIN SELECT * FROM employees WHERE name is null
Copy the code

Like begins with a wildcard (‘$ABC… Mysql index failure is a full table scan operation

EXPLAIN SELECT * FROM employees WHERE name like '%Lei'
Copy the code

EXPLAIN SELECT * FROM employees WHERE name like 'Lei%'
Copy the code

Going back to the B + tree, we can use the Lei prefix to filter out all elements in the index tree whose first three letters are Lei, as long as it doesn’t start with Lei. Because the index is in order.

Like ‘% string %’ index not used A) To use an overwrite index, the query field must be an established overwrite index field

 EXPLAIN SELECT name,age,position FROM employees WHERE name like '%Lei%';
Copy the code

B) Search engines may be required if overwriting indexes are not available

The index of a string without single quotation marks is invalid

EXPLAIN SELECT * FROM employees WHERE name = '1000';
EXPLAIN SELECT * FROM employees WHERE name = 1000;
Copy the code

The main thing is that mysq does the conversion automatically. It is also said above that index fields should not use functions. Hidden here is the function that helps us use the type conversion.

The mysql internal optimizer will evaluate whether to use an index based on multiple factors, such as the percentage of searches and the size of the table. For details, see Scope query optimization

 EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' or name = 'HanMeimei';
Copy the code

This is because the amount of data in the table is so small that the mysql engine does not use indexes to calculate the data, which is faster. Because using an index you need to return the table to the primary key index.

Range query optimization

Add a single value index to the age

ALTER TABLE `employees` ADD INDEX `idx_age` (`age`) USING BTREE ;
explain select * from employees where age > =1 and age < =2000;
Copy the code

The mysql internal optimizer evaluates whether to use an index based on multiple factors such as the percentage of retrieves, table size, and so on. In this example, the optimizer may choose not to use index optimization because of a single large query: large ranges can be split into smaller ranges

 explain select * from employees where age > =1 and age < =1000;
 explain select * from employees where age > =1001 and age < =2000;
Copy the code

Restores the original index state

ALTER TABLE `employees` DROP INDEX `idx_age`;
Copy the code

Index Usage summary

Where a=3 and b like ‘kk%’ and c=4

**like KK% = constant ** If a string is too long, we always truncate it as an index field. So if you were to intercept the first two fields that would be the case.

The regression to the B+ tree is filtered by the first two letters being KK, and the resulting data is partially ordered C is ordered if the first two elements are equal. This leaves space for mysql queries to exploit indexes.

Differ on %KK and %KK% equivalent ranges

The rest is a matter of interpretation.