MySQL Explain details and index best practices

This is the first day of my participation in the August More text Challenge. For details, see: August More Text Challenge

Create a new sample table and data

- cast
DROP TABLE IF EXISTS `actor`; 
CREATE TABLE `actor` ( 
`id` int(11) NOT NULL.`name` varchar(45) DEFAULT NULL.`update_time` datetime DEFAULT NULL, 
PRIMARY KEY (`id`))ENGINE=InnoDB DEFAULT CHARSET=utf8; 

INSERT INTO `actor` (`id`.`name`.`update_time`) VALUES 
(1.'a'.NOW()), 
(2.'b'.NOW()), 
(3.'c'.NOW()); 

- the movie table
DROP TABLE IF EXISTS `film`; 
CREATE TABLE `film` (`id` int(11) NOT NULL AUTO_INCREMENT, 
`name` varchar(10) DEFAULT NULL, 
PRIMARY KEY (`id`), 
KEY `idx_name` (`name`))ENGINE=InnoDB DEFAULT CHARSET=utf8; 

INSERT INTO `film` (`id`.`name`) VALUES
(3.'film0'),
(1.'film1'),
(2.'film2'); 

-- Movie list of actors
DROP TABLE IF EXISTS `film_actor`; 
CREATE TABLE `film_actor` ( 
`id` int(11) NOT NULL.`film_id` int(11) NOT NULL.`actor_id` int(11) NOT NULL.`remark` varchar(255) DEFAULT NULL, 
PRIMARY KEY (`id`), 
KEY `idx_film_actor_id` (`film_id`.`actor_id`))ENGINE=InnoDB DEFAULT CHARSET=utf8; 

INSERT INTO `film_actor` (`id`.`film_id`.`actor_id`) VALUES 
(1.1.1),
(2.1.2),
(3.2.1);
Copy the code

Explain the column,

explain select * from actor;

The id column

The id column number is the serial number of select. For every select there are several ids, and the order of ids is increments in the order in which the select appears. The larger the ID column is, the higher the priority is. If the ID is the same, the execution starts from the top down. If the ID is NULL, the execution ends.

Select_type column

1) Simple: Simple query. 2) Primary: the outermost select in a complex query. 3) Subquery: the subquery contained in the SELECT clause. 4) Derived: the subquery contained in the from clause. MySQL stores the results in a temporary table, also called a derived table

set session optimizer_switch='derived_merge=off'; # Turn off the mysql5.7 feature for merging tables
explain select (select 1 from actor where id = 1) from (select * from film where id = 1) t;
set session optimizer_switch='derived_merge=on'; Restore the default configuration
Copy the code

5) Union: the second and subsequent select in the union

explain select 1 union all select 1;
Copy the code

The table column

This column indicates which table is being accessed by the row of explain. When there are subqueries in the FROM clause, the table column is in format, indicating that the current query depends on the query with ID =N, so the query with ID =N is executed first. When there is a union, the value of the table column of the union RESULT is

, where 1 and 2 represent the IDS of the select rows that participate in the union.
,2>

Partitions are listed

Table partition (few people do table partition nowadays, personal opinion)

The type column

This column represents the association type or access type, which is the approximate range of data row records that MySQL determines how to look up rows in the table. System > const > eq_ref > ref > range > index > ALL

1) NULL: mysql can decompose the query statement during the optimization phase without having to access the table or index during the execution phase.

For example, selecting the minimum value in an index column can be done by looking up the index alone, without accessing the table at execution time

explain select min(id) from film;
Copy the code

2) const, system

Mysql can optimize a part of a query and convert it to a constant. When all columns used for primary or unique keys are compared to constants, the table has at most one matching row and is read once, which is relatively fast. System is a special case of const; if there is only one matching tuple in the table, it is system

set session optimizer_switch='derived_merge=off'; # Turn off the mysql5.7 feature for merging tables
explain select * from (select * from film where id = 1) tmp;
set session optimizer_switch='derived_merge=on'; Restore the default configuration
Copy the code

3) eq_ref

If all parts of a primary key or unique key index are connected, at most one record will be returned. This is probably the best join type outside of const, and would not appear in a simple SELECT query.

explain select * from film_actor left join film on film_actor.film_id = film.id
Copy the code

4) ref

In contrast to eq_ref, instead of using a unique index, use a normal index or a partial prefix of a unique index. The index is compared to a value and may find more than one row that meets the criteria.

Select index (name); select index (name);

explain select * from film where name = 'film1';
Copy the code

2. Query the associated table. Idx_film_actor_id in table film_actor is the combined index of film_id and actor_id.

explain select film_id from film f left join film_actor fa on f.id = fa.film_id;
Copy the code

5) range

Range scanning usually occurs in(), between,>,<, >=, and so on. Use an index to retrieve rows of a given range.

explain select * from actor where id > 1;
Copy the code

6) index

Scan the whole index can get the result, general is to scan a secondary indexes, the scan will not from the index began to quickly find root node, but directly to the secondary index leaf node traversal and scanning, speed or slower, the query is commonly used to cover index, secondary indexes is compared commonly small, so this is usually faster than ALL.

explain select * from film;
Copy the code

7) ALL

A full table scan, which scans all the leaves of your clustered index. Often this requires adding indexes to optimize.

explain select * from actor;
Copy the code

Possible_keys column

This column shows which indexes the query might use to find. In this case, there is not much data in the table. Mysql considers that the index is not helpful for this query, so it selects the full table query. If the column is NULL, there is no associated index. In this case, you can improve query performance by checking the WHERE clause to see if an appropriate index can be created, and then using explain to see the effect.

The key column

This column shows which index mysql actually uses to optimize access to this table. If no index is used, the column is NULL. If you want to force mysql to use or ignore the index in possible_keys, use force index, ignore index in your query.

key_len

This column shows the number of bytes used by mysql in the index. Using this value, you can determine which columns of the index are used.

For example, the federated index idX_FILM_actor_id for film_actor consists of two int columns film_id and actor_id, and each int is 4 bytes. From the key_len=4 in the result, you can infer that the query uses the first column: film_id column to perform the index lookup.

explain select * from film_actor where film_id = 2;
Copy the code

The calculation rules for key_len are as follows:

String, char(n) and varchar(n). In versions after 5.0.3, n represents the number of characters, not the number of bytes. In UTF-8, a number or letter is 1 byte, and a Chinese character is 3 bytes
- Char (n) : 3n bytes if Chinese characters are stored
- Varchar (n) : 3n + 2 bytes for Chinese characters, with the additional 2 bytes used to store the length of the string, because varchar is a variable-length string
Numeric types
- Tinyint: 1 byte
- Smallint: indicates 2 bytes
- Int: 4 bytes
- Bigint: 8 bytes
Time to type
- Date: 3 bytes
- Timestamp: 4 bytes
- Datetime: 8 bytes
If the field is allowed to be NULL, 1 byte is required to record whether it is NULL

The maximum length of the index is 768 bytes. When a string is too long, mysql will do something similar to the left prefix index, extracting the first half of the characters for the index.

Ref column

This column displays the column or constant used to find the value of the table in the index of the key column record.

Rows column

This column is the number of rows that mysql expects to read and detect. Note that this is not the number of rows in the result set.

Filtered column

Rows * filtered/100 can estimate the number of rows to be connected to the previous table in explain (the previous table is the table whose ID value in explain is smaller than the current table ID value).

Extra column

This column shows the additional information. Common importance values are:

1) Using the index

Mysql > Select * from ‘extra’ using index; mysql > Select * from ‘extra’ using index; mysql > Select * from ‘extra’; The overwritten index is generally targeted at the secondary index. The whole query result can be obtained only through the secondary index. There is no need to find the primary key through the secondary index tree and then obtain the values of other fields in the primary key index tree through the primary key

2) Using the where

The WHERE statement is used to process the results, and the queried columns are not overwritten by the index

3) Using index condition:

The column in the query is not completely overwritten by the index, and the where condition is a range of leading columns;

explain select * from film_actor where film_id > 1;
Copy the code

4) Using temporary

Mysql needs to create a temporary table to handle the query. In this case, it is usually necessary to optimize, and the first thing to do is to think of using indexes to optimize.

Actor. Name has no index. Create a temporary table to distinct

explain select distinct name from actor; 
Copy the code

Select * from ‘extra’ using index where ‘idx_name’ is used

explain select distinct name from film;
Copy the code

5) Using filesort

External sort is used instead of index sort. Data is sorted from memory when it is small, otherwise it needs to be sorted on disk. In this case, you should also consider using indexes to optimize.

No index is created for actor-name, the entire actor table is browsed, the sort keyword name and corresponding ID are saved, and then the name is sorted and the row record is retrieved

explain select * from actor order by name;
Copy the code

2. Film.name creates an idx_name index, and extra is using index

explain select * from film order by name;
Copy the code

6) Select tables optimized away

Use some aggregate functions (such as Max, min) to access a certain field that has an index

explain select min(id) from film;
Copy the code

Indexing Best Practices

CREATE TABLE `employees` ( 
`id` int(11) NOT NULL AUTO_INCREMENT, 
`name` varchar(24) NOT NULL DEFAULT ' ' COMMENT 'name'.`age` int(11) NOT NULL DEFAULT '0' COMMENT 'age'.`position` varchar(20) NOT NULL DEFAULT ' ' COMMENT 'job'.`hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Entry Time', 
PRIMARY KEY (`id`), 
KEY `idx_name_age_position` (`name`.`age`.`position`) USING BTREE 
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COMMENT='Staff Records'; 

INSERT INTO employees(name,age,position,hire_time) VALUES('LiLei'.22.'manager'.NOW()); 
INSERT INTO employees(name,age,position,hire_time) VALUES('HanMeimei'.23.'dev'.NOW()); 
INSERT INTO employees(name,age,position,hire_time) VALUES('Lucy'.23.'dev'.NOW());
Copy the code

1. Full value matching

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei';
Copy the code

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' and age = 22;
Copy the code

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' and age = 22 and position ='manager';
Copy the code

2. Left-most prefix rule

If you index more than one column, follow the leftmost prefix rule. This means that the query starts at the top left of the index and does not skip the columns in the index.

EXPLAIN SELECT * FROM employees WHERE name = 'Bill' and age = 31; 
EXPLAIN SELECT * FROM employees WHERE age = 30 AND position = 'dev'; 
EXPLAIN SELECT * FROM employees WHERE position = 'manager';
Copy the code

3. Do not do anything on the index column

Add a normal index to hire_time:

ALTER TABLE `employees` ADD INDEX `idx_hire_time` (`hire_time`) USING BTREE ; 
EXPLAIN select * from employees where date(hire_time) ='2018 ‐ ‐ 09 30';
Copy the code

Translate to date range query, it is possible to walk the index:

EXPLAIN select * from employees where hire_time >='2021 ‐ to ‐ 08 09 00:00:00' and hire_time <='2021 ‐ 08 ‐ 20 23:59:59';  
Copy the code

Restore the index to its original state

ALTER TABLE `employees` DROP INDEX `idx_hire_time`;
Copy the code

4. The storage engine cannot use the column to the right of the range condition in the index

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager'; 
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age > 22 AND position ='manager';
Copy the code

5. Use overwrite indexes (queries that only access indexes (index columns contain query columns)) to reduce select * statements

EXPLAIN SELECT name,age FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';
Copy the code

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';
Copy the code

6. Mysql in use does not equal (! = or <>), not in, not EXISTS can result in a full table scan < <, > >, <=, >=. The mysql internal optimizer evaluates the use of indexes based on multiple factors such as retrieval ratio, table size, and so on

EXPLAIN SELECT * FROM employees WHERE name! ='LiLei';
Copy the code

7. Is NULL,is Not NULL Generally, indexes cannot be used

EXPLAIN SELECT * FROM employees WHERE name is null;
Copy the code

8. Like ‘$ABC… ‘) Mysql index failure causes full table scan operation

EXPLAIN SELECT * FROM employees WHERE name like '%Lei';
Copy the code

EXPLAIN SELECT * FROM employees WHERE name like 'Lei%';
Copy the code

Problem: How to resolve index like’% string %’ not being used?

A) To use the overwrite index, the query field must be the overwrite index field

EXPLAIN SELECT name,age,position FROM employees WHERE name like '%Lei%';
Copy the code

B) Search engines (ES, etc.) may be needed if overwrite indexes are not available

9. The index of a string without single quotes is invalid

EXPLAIN SELECT * FROM employees WHERE name = '1000'; 
EXPLAIN SELECT * FROM employees WHERE name = 1000;
Copy the code

10. Avoid using OR or in. Mysql does not necessarily use indexes when using it to query

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' or name = 'HanMeimei';
Copy the code

11. Scope query optimization

Adds a single value index to the age

ALTER TABLE `employees` ADD INDEX `idx_age` (`age`) USING BTREE ; 
explain select * from employees where age >=1 and age <=2000;
Copy the code

The mysql internal optimizer evaluates the use of indexes based on multiple factors, such as retrieval ratio and table size. In this example, the optimizer may have chosen not to go through the index because the single data volume query was so large

Optimization: A large scope can be broken down into smaller areas

explain select * from employees where age >=1 and age <=1000; 
explain select * from employees where age >=1001 and age <=2000;
Copy the code

Restore the original index state

ALTER TABLE `employees` DROP INDEX `idx_age`;
Copy the code

Summary of Index Usage

Let’s say index(a, b, c)

Where clause	Whether the index is in use
where a=3	Y, use a
where a=3 and b=4	Y, use a,b
where a=3 and b=4 and c=5	Y, use a,b,c
Where b=3 or c= 4 or c=5	N
where a=3 and c=5	Y, use a, but not c
where a=3 and b>4 and c=5	Y, use a,b, but not C
where a=3 and b like ‘k%’ and c=5	Y, use a,b,c
where a=3 and b like ‘%k’ and c=5	Y, only use a
where a=3 and b like ‘%k%’ and c=5	Y, only use a
where a=3 and b like ‘k%kk%’ and c=5	Y, use a,b,c

Like KK% is equivalent to = constant, %KK and %KK% are equivalent to ranges