preface

Recently, an old friend went to Baidu for an interview. He should be interviewing for a senior engineer position. He told me that he was asked about mysql index knowledge points. In fact, the interviewer is mainly looking at the performance of mysql tuning related to the theoretical knowledge in fact also want to know your understanding of the principle, to confirm whether you have relevant tuning experience. The friend said that his answer is ok, and then very smoothly carried out three four sides. Here’s how to answer this question!

  • Public account: I am a mu
  • Amu is not only a programmer, but also a prose enthusiast

The following is my own understanding, if the following is wrong, please spray my duck!

Talk about index classification

According to data structure, it can be classified into B+TREE index, HASH index, and FULLTEXT index. According to index type can be divided into: common index, primary key index, unique index, full text index, combined index, primary index, secondary index.

What’s the difference between these two? There must be: one is the index implementation type; One is the type used to create the index

  • Common INDEX :(INDEX) an INDEX based on a common field is called a common INDEX
ALTER TABLE `table_name` ADD INDEX idx_name ( `user_name` ) 
Copy the code
  • PRIMARY KEY index :(PRIMARY KEY) an index based on a PRIMARY KEY is called a PRIMARY KEY index. A table can have only one PRIMARY KEY index
ALTER TABLE `table_name` ADD PRIMARY KEY ( `user_id` ) 
Copy the code
  • A table can have multiple UNIQUE indexes. The column value of the index is allowed to be null. Multiple null values in the column value will not cause repeated conflicts
ALTER TABLE `table_name` ADD UNIQUE (`user_name`)
Copy the code
  • The full text indexing: (FULLTEXT) based onvarchar,char,textFull-text indexes on columns; Cooperate withmatch againstUse, like a search engine, when the data is large, it takes up space and time
ALTER TABLE `table_name` ADD FULLTEXT ( `user_desc` )
Copy the code
  • Composite indexes: Indexes built on multiple columns are called composite indexes and follow the left-most prefix principle
ALTER TABLE `table_name` ADD INDEX idx_name_age ( `user_name`, `user_age` )
Copy the code
  • Level 1 index: Indexes and data are stored together in the same B+tree leaf node. A primary key index is usually a primary index
  • Secondary index: The leaf nodes of a secondary index tree store primary keys rather than data. In other words, after finding the index, obtain the corresponding primary key, and then go back to the primary index to find the data record corresponding to the primary key

Note: Do not abuse indexes; Don’t create too many indexes; Do not create duplicate indexes

① Although indexes improve query speed, they also reduce table update speed

② Create index files that occupy disk space. Minimize creating too many composite indexes on large tables;

Index structure B+ Tree index structure B+ Tree index structure

Indexes cover

As the name implies: overwrite index is the data column of the query only need to be retrieved from the index, no need to read the data row; In plain English, the data in our SQL query needs to be overwritten by the index.

Mysql > select * from B+Tree; We all know the principle of B+Tree. I won’t repeat it here. Say what it’s for:

  • No need to return to the table, fast query

  • Reduce system call and data copy to cache wait time

Select * from XXX (select * from XXX); select * from XXX (select * from XXX

How to confirm that SQL triggered index overwrite

Trigger Index overwrite: Index overwrite can be triggered when the output of the explain SQL statement is Using Index.

Then let’s look at Explain keyword analysis:

Common type describe descr
Using Index Index overwrites are used without the need to query back to the table There is no
Using Index Condition Index push-down is used (version 5.6+) There is no
Using Where Sever layer filter data using the WHERE condition Isn’t the best
Using Filesort An additional sort operation was taken because the index tree could not be utilized Need to optimize
Using Temporary Temporary tables are used to hold intermediate result sets Need to optimize
Using Join Buffer Circular nested scan is used for continuous table queries Need to optimize

② Index coverage example

Create a temporary table to test index coverage and create an index for the nickname user_nameCREATE TABLE `user` (
  `user_id`   int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'user id',
  `user_name` varchar(125)     NOT NULL DEFAULT ' '     COMMENT 'User name',
  `user_pwd`  varchar(64)      NOT NULL DEFAULT ' '     COMMENT 'User password',
  `user_sex`  tinyint(1)       NOT NULL DEFAULT '0'    COMMENT 'User gender 0- Confidentiality; 1 - male; 2 - women ',
  `create_at` int(10)          NOT NULL DEFAULT '0'    COMMENT 'Creation time'.PRIMARY KEY (`user_id`),
  KEY `idx_name` (`user_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='User table';
Copy the code

Let’s look at our first SQL statement:

mysql> explain select user_id,user_name from user where user_name = 'Li Ah Mu' limit 1;
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+---------- ---+
| id | select_type | table | partitions | type | possible_keys | key      | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+---------- ---+
|  1 | SIMPLE      | user  | NULL       | ref  | idx_name      | idx_name | 377     | const |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+---------- ---+
1 row in set.1 warning (0.01 sec)
Copy the code

We can see from the above result that Extra = ‘Using index’, Using idx_name normal index. We know that the index in the leaf node of B+Tree can also be used as a data page, storing common directory entry records. The index tree of idx_name stores the primary key ID and user_name. In this way, no operation is required to return to the table and the query efficiency is high.

Let’s look at another SQL statement:

For example, we sometimes query the user's PWD mysql by the user's nickname> explain select user_pwd from user where user_name = 'Li Ah Mu';
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key      | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | user  | NULL       | ref  | idx_name      | idx_name | 377     | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
1 row in set.1 warning (0.01 sec)
Copy the code

Let’s take a look at the process direction:

  • throughidx_nameIndex tree, find it from the treeUser_name = 'user_name'Id of the corresponding primary key
  • Find the data that meets the condition in the primary key index tree by performing a back-table operation and return

SQL = idx_name; SQL = idx_name; SQL = idx_name; SQL = idx_name; SQL = idx_name; This operation does not comply with the principle of index coverage, because it goes through the table, thus affecting the query efficiency. Completely inconsistent with the idea of index coverage.

Here it is necessary to explain the meaning of the fields in the explain result set to deepen the impression:

field description
id SELECT identifier, the identification ID of each SELECT clause
select_type SELECT the type; For example: ① Simple simple select(do not use UNION or subquery); Select * from primary; ③ The second or subsequent select statement in the union; ④ The second or subsequent query statement of the dependent union depends on the external query; ⑤ The result set of union result; ⑥ The first select query in the subquery; The dependent subquery is a subquery, and the dependent subquery is a subquery. Select * from (select * from);
table The current name of the table
partitions Displays partitions accessed by the query
type Join type in the current table; For example, the system table has only one row; Const table has at most one matching row; ③ eq_ref reads a row from each combination of rows from the previous table; ④ ref: For each row combination from the previous table, all rows with matching index values are read from the table; ⑤ ref_or_NULL is the same as ref, but mysql can search for rows containing NULL values; ⑥ Index_Merge uses the index merge optimization method; ⑦ range only retrieves a given range of rows, using an index to select rows; This join type is the same as ALL, except that only the index tree is scanned. This is usually faster than ALL because index files are usually smaller than data files; ⑨ all Performs a full table scan for each combination of rows from previous tables
possible_keys Displays the indexes that may be used
key Displays the index used by mysql after the optimizer evaluation. If no index is selected, the key is null
key_len The length of the index used, or null if the key is null
ref Column referenced to the previous table
rows Get the number of records that need to be scanned for the result set
filtered The percentage of the record data returned by the storage engine that meets the query requirement after the data is filtered at the server layer
Extra Query additional information

An index pushdown

Before introducing index push-down, we add a user_age field to the table above:

Add field user_age:

alter table `user` add column `user_age` smallint(5) not null default '0' comment 'User age'; Mysql > execute mysql> alter table `user` add column `user_age` smallint(5) not null default '0' comment 'User age';
Query OK, 0 rows affected, 1 warning (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 1

Copy the code

Add index name + age:

alter table `user` addindex idx_name_age (`user_name`, `user_age`); Mysql > execute mysql> alter table `user` add index idx_name_age (`user_name`, `user_age`);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0
Copy the code

View table structure:

Mysql > alter table user_age; mysql > alter table idx_name_age> show create table user\G
*************************** 1. row ***************************
       Table: user
Create Table: CREATE TABLE `user` (
  `user_id` int unsigned NOT NULL AUTO_INCREMENT COMMENT 'user id',
  `user_name` varchar(125) NOT NULL DEFAULT ' ' COMMENT 'User name',
  `user_pwd` varchar(64) NOT NULL DEFAULT ' ' COMMENT 'User password',
  `user_sex` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'User gender 0- Confidentiality; 1 - male; 2 - women ',
  `create_at` int NOT NULL DEFAULT '0' COMMENT 'Creation time',
  `user_age` smallint NOT NULL DEFAULT '0' COMMENT 'User age'.PRIMARY KEY (`user_id`),
  KEY `idx_name_age` (`user_name`,`user_age`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='User table'
1 row in set (0.00 sec)
Copy the code

Index push down introduction

Index Condition Pushdown (ICP) is available on Mysql5.6+ to optimize queries. Reasonable optimization of index push-down on non-primary key indexes can effectively reduce The Times of back to the table and reduce The Times of the mysql server receiving data from the storage engine, greatly improving the query efficiency

No index push-down process occurs

Before there is no index pushdown, the execution process is like this, for example, executing the following SQL:

## Based on the user nickname+Age query matches the userselect * from user where `user_name` like 'l %' and `age` = 26; Because the above joint index idx_name_age has been established, the name index must be matched according to the leftmost matching principle for query. Otherwise, the index is not moved, resulting in full table scanCopy the code

First, the storage engine pulls data from the storage engine according to the name index. After retrieving the data through the index, the storage engine finds the data records that meet the requirements by constantly returning the table to the primary key index one by one. Then the data is loaded to the server layer and starts filtering the data that meet the requirements through the user_age condition.

Push down Figure 3-1 without index

The age field is ignored by the storage engine when querying data. ② The results of three compound requirements can be queried directly in the index tree of IDx_name_age by name index. ③ The system starts to query the age value of the primary key user_id three times. (4) Filter through age condition in server layer to obtain the final result set that meets the requirements;

The index push-down process appears

The comparison with the figure shows that: after the data is obtained through the index tree, the index push-down operation is carried out, and the internal condition filtering of the index matches the data results.

Use the index to push down Figure 3-2

Select idx_name_age, idx_name_age, idx_name_age, idx_name_age, idx_name_age, idx_name_age; ② The matching records are found in the index tree, and the corresponding field values are queried in the back table of the primary key index tree by the primary key ID. ③ There is no need to pull data from the storage engine to the server layer for filtering

practice

To do this, insert a few pieces of data:

## Insert data into mysql> insert into `user`(`user_name`,`user_pwd`,`user_sex`,`user_age`,`create_at`) VALUES('Li Ah Mu'.'123'.1.26.1624182989), ('Seven Plums'.'123'.1.31.1624182989), ('Li Jia Qi'.'123'.1.29.1624182989), ('High Fire'.'123'.1.25.1624182989);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0Mysql > select * from 'mysql'> select * from user;
+---------+-----------+----------+----------+------------+----------+
| user_id | user_name | user_pwd | user_sex | create_at  | user_age |
+---------+-----------+----------+----------+------------+----------+
|       1 |O li mu| 123      |        1 | 1624182989 |       26 |
|       2 |Plum and pure| 123      |        1 | 1624182989 |       31 |
|       3 |Jia-qi li| 123      |        1 | 1624182989 |       29 |
|       4 |High fire fire| 123      |        1 | 1624182989 |       25 |
+---------+-----------+----------+----------+------------+----------+
4 rows in set (0.00 sec)
Copy the code

SQL execution in mysql 5.5

## Check the version5.5Less than5.6Version of mysql> select version();
+------------+
| version()  |
+------------+
| 5.519.-log |
+------------+
1 row in set (0.00SEC) ## View execution Explain result set graph1

mysql> explain select * from user where `user_name` like 'l %' and `user_age` = 26;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | user  | ALL  | idx_name_age  | NULL | NULL    | NULL |    4 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00SEC) ## Another execution result graph2
mysql> explain select user_id,user_name from user where `user_name` like 'l %' and `user_age` = 26;
+----+-------------+-------+-------+---------------+--------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key          | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+--------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | user  | range | idx_name_age  | idx_name_age | 379     | NULL |    3 | Using where; Using index |
+----+-------------+-------+-------+---------------+--------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
Copy the code

Using where; Using where; The storage engines in Figure 1 and Figure 2 show possible indexes, but Figure 1 does not walk the indexes and scans the entire table. Figure 2 scans only a few of the indexes; So one conclusion can be drawn: like queries with percentage signs leading, not 100% will not go to the index. ① Less data directly back to the full table scan; Select * from primary key; select * from primary key

Mysql 5.6 mysql 5.6 SQL

Check the mysql version first8.0

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.019.    |
+-----------+
1 row in set (0.00SEC) ## View execution explain result set mysql> explain select * from user where `user_name` like 'l %' and `user_age` = 26;
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+------ -----------------+
| id | select_type | table | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+------ -----------------+
|  1 | SIMPLE      | user  | NULL       | range | idx_name_age  | idx_name_age | 379     | NULL |    3 |    25.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+------ -----------------+
1 row in set.1 warning (0.01 sec)

Copy the code

Extra = Using index condition uses index push-down. So we later write SQL statement, need to appropriate according to where conditions do reasonable index, try to make our SQL statement is the optimal state; Of course, this is often required by our company, and bad SQL statements will be taken out as cases.

Little summary index push down

  • Push-down optimization without index: Queries records based on indexes, returns to tables, and filters records based on WHERE conditions
  • Push-down optimization with indexes: When retrieving records from the index tree, detect whether the data can be filtered using the WHERE condition and queried in the back table

Leftmost matching principle

If (a, c, b) is set in the table, will you continue to use (a, C, b) or (b, a, c) in the SQL query? At least the basic meeting in my interview is asked, especially some big factory! Basically, they want to see if you understand the leftmost matching principle. To see if you normally tune SQL.

Why use federated indexes

Alter table joint index field;

Mysql > alter table select * from user where id = 1; mysql > alter table where id = 1alter table `user` drop index idx_name_age;

alter table `user` addindex idx_name_age_sex ( `user_name`, `user_age`, `user_sex` ); Mysql > execute mysql> alter table `user` drop index idx_name_age;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table `user` add index idx_name_age_sex ( `user_name`, `user_age`, `user_sex` );
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0
Copy the code
  • Idx_name_age_sex Joint index. It is equivalent to three indexes: IDx_name, IDx_NAMe_AGE, and idx_AGe_sex, saving disk space and write operation costs
  • If an overwrite index occurs,select user_name,user_age,user_sex from xxx where xxx, obtain data directly through index traversal, and query data disordered back to the table.Reduce I/O operations and table return times

Leftmost matching rule: mysql always follows the leftmost matching rule when creating federated indexes. Any consecutive index from the leftmost starting point will be matched successfully; However, if there is a range query (like, >, <, between), index matching will be stopped.

Full value matching query

Name_age_sex Index order

mysql> explain select * from user where `user_name` ="Li Ah Mu"and `user_age` = 26 and `user_sex` = 1;
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
| id | select_type | table | partitions | type | possible_keys    | key              | key_len | ref               | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
|  1 | SIMPLE      | user  | NULL       | ref  | idx_name_age_sex | idx_name_age_sex | 380     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
1 row in set.1 warning (0.00 sec)
Copy the code

Name_sex_age Index order

mysql> explain select * from user where `user_name` ="Li Ah Mu"and `user_sex` = 1 and `user_age` = 26;
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
| id | select_type | table | partitions | type | possible_keys    | key              | key_len | ref               | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
|  1 | SIMPLE      | user  | NULL       | ref  | idx_name_age_sex | idx_name_age_sex | 380     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
1 row in set.1 warning (0.00 sec)
Copy the code

Sex_age_name Index order

mysql> explain select * from user where `user_sex` = 1 and `user_age` = 26 and `user_name` ="Li A Mu ";+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
| id | select_type | table | partitions | type | possible_keys    | key              | key_len | ref               | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
|  1 | SIMPLE      | user  | NULL       | ref  | idx_name_age_sex | idx_name_age_sex | 380     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
1 row in set.1 warning (0.00 sec)

Copy the code

Select idx_name_AGe_SEX from idx_name_age_sex; It may not be easy to see what kind of index it is going through by observing that key_len and ref are exactly the same as the first one.

Some of you might be wondering, “Oh my God, shouldn’t you go to the far left? How come the next two names don’t match to the left, but both go through the index?”

We can’t ignore mysql’s own query optimizer. We don’t need to write where conditions in order, because the query optimizer automatically checks the SQL to see which way it executes most efficiently, and finally generates the actual execution plan.

Matches the column on the left

① Match name in sequence

mysql> explain select * from user where `user_name` ="Li A Mu ";+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+---------- +-------+
| id | select_type | table | partitions | type | possible_keys    | key              | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+---------- +-------+
|  1 | SIMPLE      | user  | NULL       | ref  | idx_name_age_sex | idx_name_age_sex | 377     | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+---------- +-------+
1 row in set.1 warning (0.00 sec)
Copy the code

It is clear from the above figure that the union index is used according to the leftmost matching principle and the name index is used, no other indexes are used. Key_len is in bytes:


377 = 125 3 + 2 377 is 125 times 3 plus 2

Just right is the name index length value.

② Match name_age in sequence

mysql> explain select * from user where `user_name` ="Li Ah Mu"and `user_age` = 26;
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------+------+---- ------+-------+
| id | select_type | table | partitions | type | possible_keys    | key              | key_len | ref         | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------+------+---- ------+-------+
|  1 | SIMPLE      | user  | NULL       | ref  | idx_name_age_sex | idx_name_age_sex | 379     | const,const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------+------+---- ------+-------+
1 row in set.1 warning (0.00 sec)
Copy the code

Select name + age from left to right; select name + age from left to right; Key_len is in bytes:


379 = 125 3 + 2 + 2 ( s m a l l i n t The length of the ) Smallint size (smallint size) smallint size (smallint size)

Name + age index length value.

③ Match name_age_sex in sequence

mysql> explain select * from user where `user_name` ="Li Ah Mu"and `user_age` = 26 and `user_sex` = 1;
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
| id | select_type | table | partitions | type | possible_keys    | key              | key_len | ref               | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
|  1 | SIMPLE      | user  | NULL       | ref  | idx_name_age_sex | idx_name_age_sex | 380     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------------------+----- -+----------+-------+
1 row in set.1 warning (0.00 sec)
Copy the code

Select name + age from left to right; select name + age from left to right; Key_len is in bytes:


380 = 125 3 + 2 + 2 ( s m a l l i n t The length of the ) + 1 ( t i n y i n t The length of the ) Smallint size + 1(smallint size) + 1(smallint size)

Name + age + age index length value.

④ If they are not matched in sequence

mysql> explain select * from user where `user_age` = 26;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    25.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set.1 warning (0.00 sec)
Copy the code

Because none of the above matches start from the left, so there is no joint index, use index full index scan.

⑤ Add the index length calculation method

- 1. If not null is set for all index fields, a byte -2 is required. Int (4 bytes), date (3 bytes), char(n) (n characters) - 3 For the field vARCHar (n), there are n characters + two bytes - 4. The number of bytes a character occupies varies depending on the character set; Utf8mb4-1 occupies 4 bytes utF8-1 occupies 3 bytes GBK-1 occupies 2 bytes latin1-1 occupies 1 byteCopy the code

It is not clear which indexes are used by index length:


k e y l e n = 125 ( v a r c h a r type 3 byte ) 3 + 2 ( extra 2 byte ) + 2 ( s m a l l i n t type 2 byte ) + 1 ( t i n y i n t type 1 bytes ) = 380 ( byte ) Key_len = 125(vARCHar type 3 bytes)*3 + 2(smallint type 2 bytes) + 1(tinyint type 1 bytes) = 380(bytes)
Matching column prefix

Let’s look at the final execution results of the following three SQL statements:

① Prefix Matching

mysql> explain select * from user where `user_name` like 'l %';
+----+-------------+-------+------------+-------+------------------+------------------+---------+------+------+---------- +-----------------------+
| id | select_type | table | partitions | type  | possible_keys    | key              | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+------------------+------------------+---------+------+------+---------- +-----------------------+
|  1 | SIMPLE      | user  | NULL       | range | idx_name_age_sex | idx_name_age_sex | 377     | NULL |    3 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+------------------+------------------+---------+------+------+---------- +-----------------------+
1 row in set.1 warning (0.00 sec)
Copy the code

② Suffix matching

mysql> explain select * from user where `user_name` like '% l';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    25.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set.1 warning (0.00 sec)
Copy the code

③ Infix matching

mysql> explain select * from user where `user_name` like '% % li';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    25.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set.1 warning (0.00 sec)
Copy the code

From the above result we can see: the first prefix is sorted, so go to the index query; The second and third are full table scan queries

Note: The comparison rule looks like this if the column is a string; The first character of the string is compared, the second character is compared if they are the same, and so on.

Matching range value
Mysql > alter table select * from mysql. mysql; alter table select * from mysql. mysql> explain select * from user where `user_age` > 1 and `user_age` < 30;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+------- ----------------+
| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+------- ----------------+
|  1 | SIMPLE      | user  | NULL       | range | idx_age_sex   | idx_age_sex | 2       | NULL |    3 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+------- ----------------+
1 row in set.1 warning (0.00The SEC) in user_age # #>1 and user_age < 30The range sex is in order, so the rang range is used to query mysql> explain select * from user where `user_age` > 1 and `user_age` < 30 and `user_sex` > 0;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+------- ----------------+
| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+------- ----------------+
|  1 | SIMPLE      | user  | NULL       | range | idx_age_sex   | idx_age_sex | 2       | NULL |    3 |    33.33 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+------- ----------------+
1 row in set.1 warning (0.00 sec)
Copy the code

Conclusion: When a range query is performed on multiple columns, the index is used only for the leftmost column; If sex is ordered in the range 1<age<20, a rang range query is used; Otherwise, the unordered records can be filtered by sex > 0 after age query.

Matches exactly one column and ranges exactly the other
## If the left column is exactly searched, the right column can be scoped to mysql> explain select * from user where `user_name` ="Li Ah Mu"and `user_age` < 29;
+----+-------------+-------+------------+-------+------------------------------+------------------+---------+------+----- -+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys                | key              | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+------------------------------+------------------+---------+------+----- -+----------+-----------------------+
|  1 | SIMPLE      | user  | NULL       | range | idx_name_age_sex,idx_age_sex | idx_name_age_sex | 379     | NULL |    1 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+------------------------------+------------------+---------+------+----- -+----------+-----------------------+
1 row in set.1 warning (0.01 sec)
Copy the code

As you can see from the above, the left column values are exactly searched, and the left column values are ordered. A range lookup will follow the union index. Of course, this can be seen from the length of key_len:


k e y l e n = 125 3 + 2 + 2 = 377 + 2 = 399 key_len = 125*3 + 2 + 2 = 377 + 2 = 399

Back to the table

What is a table-back query? Simply put, it is a query that obtains a large number of non-indexed columns. In this case, the value information of the related columns in the table is obtained according to the primary key index tree.

Let’s look at the following two SQL statements:

No need to return the table, becauseselectColumns are all indexed columnsselect user_id, user_name from user where `user_name` ="Li A Mu "; orselect * from user where `user_id` = 1; Do not return to the table cause: Query by primary key ID in primary key B+Tree to search for data. ## need to return table, becauseselectIf a column has a non-index column, you need to query information in the table based on the primary key index. Actually two index queries are usedselect user_id, user_name, user_pwd from user where `user_name` ="Li A Mu ";Copy the code

Why minimize back to table operations, especially when the amount of data in the table is getting larger and larger?

For example: now we have scanned 10W pieces of data through idx_NAMe_age index. The primary key index is queried through the index and the related column information is queried in the back table. Mysql assumes a separate I/O operation cost for each return to the table.

Index query cost
  • CPU operating cost

r e s u l t = 100000 0.2 = 20000 Result = 100000 * 0.2 = 20000
  • I/OOperating costs

r e s u l t = 100000 1 = 100000 result = 100000 * 1 = 100000
  • The cost of two operations

r e s u l t = 20000 + 100000 = 120000 result = 20000 + 100000 = 120000
Full table scan query cost
mysql> SELECT TABLE_NAME,DATA_LENGTH,INDEX_LENGTH,(DATA_LENGTH+INDEX_LENGTH) as length,TABLE_ROWS,concat(round((DATA_LENGTH+INDEX_LENGTH)/1024/1024.3), 'MB') as total_size FROM information_schema.TABLES WHERE TABLE_SCHEMA='test' order by length desc;
+------------+-------------+--------------+--------+------------+------------+
| TABLE_NAME | DATA_LENGTH | INDEX_LENGTH | length | TABLE_ROWS | total_size |
+------------+-------------+--------------+--------+------------+------------+
| test       |       16384 |        16384 |  32768 |          3 | 0.031MB    |
| user       |       16384 |        16384 |  32768 |          4 | 0.031MB    |
+------------+-------------+--------------+--------+------------+------------+
2 rows in set (0.01SEC) to view the number of bytes in the full table by commanduserTable Data_length= 16384 
Copy the code

From the above we can calculate how many record pages need to be read for a full table scan:


p a g e = 16384 / 1024 / 16 = 1 page = 16384 / 1024 / 16 = 1

You can see that MY local record page is too few; Suppose the local full table has 16384000 bytes and there are 300000 records in the table; So we need to scan 1000 record pages; Now calculate the cost:


r e s u l t = C P U The cost of ( 300000 0.2 = 60000 ) + I / O The cost of ( 1000 1 = 1000 ) = 61000 Result = CPU cost (300000 x 0.2=60000) + I/O cost (1000 x 1 = 1000) = 61000

It looks like there are times when back table operation queries are more costly than full table scans; Therefore, when we do business requirements, we first estimate the amount of tables, and then reasonably build indexes, and see the specific performance through explain analysis.

Query cost composition and calculation
  • CPU cost
  • I/O costs
  • Be aware that mysql states that the cost of reading a page is1.0It’s not random; The cost of reading and checking whether records are indexed is0.2
  • Note: The cost is whether or not records are checked to meet the index criteria0.2; Just remember it

So you can see the calculation logic of the table, as follows:


Back to the table costs = C P U The cost of ( Record number 0.2 ) + I / O The cost of ( Record the number of pages 1 ) Table back cost = CPU cost (number of records * 0.2) + I/O cost (number of pages of records * 1)
A small summary

A clustered index (primary key or first unique index) does not return to the table, whereas a normal index does.

Minimize back table query to reduce query cost: ① can use primary key index or unique index is not secondary index; ② Overwrite indexes can be used

Reference and thanks

  • Mysql official

  • Mysql left-most matching principle