One, foreword

When doing SQL optimization in MySQL, there are often situations where there is some confusion about whether MySQL can take advantage of indexes. MySQL > select * from ‘mysqld’; select * from ‘mysqld’; select * from ‘mysqld’; MySQL > select * from ‘LIKE’; MySQL > select * from ‘LIKE’ 3, in what case can MySQL use index to sort? Today, I’m going to use a model to answer all of these questions so you don’t have to worry about using MySQL indexes.

Second, knowledge supplement

A key_len column in the EXPLAIN execution plan is used to indicate the length of the selected index in bytes for this query, which is usually used to determine how many columns of the federated index were selected. Here the key_len size is computed as follows:

  • Generally, key_len is equal to the length of the index column type in bytes, for example, int is 4 bytes and bigint is 8 bytes.
  • If the value is a string, consider the character set. For example, CHAR(30) UTF8. Key_len must be at least 90 bytes.
  • If the column type definition allows NULL, its key_len needs an additional bytes.
  • If the column type is variable length, such as VARCHAR (TEXT\BLOB does not allow indexes to be created for whole columns; creating partial indexes is also considered a dynamic column type), its key_len needs an additional 2 bytes.

What criteria can be used for indexes

First of all, THANK You very much to Dembo, who gave me a good inspiration. I made this picture based on his article _ and my own understanding:



At first glance, is not very dizzy, not urgent, we take a look slowly

The figure is divided into three parts:

1, Index Key: MySQL is used to determine the range of data to be scanned, which is actually the available MySQL Index part, reflected in the Key Length.

2, Index Filter: MySQL determines which data can be filtered by Index. After ICP is enabled, the Index can be used.

3. Table Filter: MySQL cannot use index Filter. After retrieving row data from the Table, Filter data from the server layer.

Now let’s expand it out.

Index Key is used to determine a scan range for MySQL, divided into upper and lower boundaries.

MySQL > select first key from where (‘ = ‘, ‘>’); MySQL > select first key from where (‘ = ‘, ‘>’); If it exists and is (>), the key value is added to the lower boundary to stop matching the next index key. If not, stop matching the lower boundary.

exp: Idx_c1_c2_c3 (c1,c2,c3) where c1>=1 and c2>2 and c3=1 and c2>2 and c1 =1 --> c1 Add lower bounds to stop matchingCopy the code

The last key is similar to the first key. First, check whether it is one of (=,<=). If yes, add delimit and continue with the next key value matching

exp: Idx_c1_c2_c3 (c1,c2,c3) where c1<=1 and c2=2 and c3<3 --> last key (c1,c2,c3) Continue to match the next --> c2 for '=' and join the upper bound. Continue to match the next --> C3 for '<' and join the upper bound and stop the matchCopy the code

Note: If the comparison symbol contains’ = ‘sign,’ >= ‘also contains’ =’, then the index key can be used to continue matching the following index key value; If there is no ‘=’, that is, ‘>’, ‘<‘, the following key will not match. At the same time, upper and lower boundaries can not be mixed, which boundary can use the index of the key value is the number of final use index key value.

Index Filter

You can filter by index. That is, the part of the field in the Index Key value that cannot be used to determine the Index Key.

exp:
idex_c1_c2_c3
where c1>=1 and c2<=2 and c3 =1
index key --> c1
index filter--> c2 c3
Copy the code

Why is the index key just c1? C2 is used to determine the upper boundary, but c1 does not appear on the upper boundary (<=,=), while c1 is >= on the lower boundary,c2 does not appear, so the index key only has c1 field. C2 and C3 both appear in the index and are treated as index filters.

Table Filter

If the index cannot be used for filtering, only table filter can be used. At this point, the engine layer will return the row data to the Server layer, and then the Server layer performs the Table filter.

4. Handling Between and Like

If there is a “between” and “like” in the query, how does MySQL handle it?

Between

Where c1 between ‘a’ and ‘b’ is equivalent to where c1>= ‘a’ and c1 <= ‘b’

Like

Select * from c1 where c1 like ‘%a’ where c1 like ‘%a’; Where c1 like ‘a%’ is the same as where c1>= ‘a’ and c1< ‘b’.

5. Sort the index

In a database, if the index cannot be used to complete the sorting, the cost of sorting will increase with the increase of the data volume of filtered data. Even if limit is adopted, the database will choose to sort all the result sets and then take the sorted limit records. MySQL has also been optimized for limits that can be sorted using indexes to reduce costs. Make sure it uses index It is very important to have ORDER BY with LIMIT executed without scanning and sorting full result set, So it is important for it to use index — in this case index range scan will be started and query execution stopped as soon as soon as required amount of rows generated.

CREATE TABLE `t1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `c1` int(11) NOT NULL DEFAULT '0', `c2` int(11) NOT NULL DEFAULT '0', `c3` int(11) NOT NULL DEFAULT '0', `c4` int(11) NOT NULL DEFAULT '0', `c5` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`id`), KEY `idx_c1_c2_c3` (`c1`,`c2`,`c3`) ) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8mb4 select * from t1; +----+----+----+----+----+----+ | id | c1 | c2 | c3 | c4 | c5 | +----+----+----+----+----+----+ | 1 | 3 | 3 | 2 | 0 | 0 | | 2 | 2 | | | | 0 5 4 0 | | | | 2 | 3 | 3 | 4 0 0 | | | 1 | 3 | 4 2 | | 0 0 | | | 1 | 3 | 3 | 0 5 | 0 | | 2 | 3 | | 6 5 | | 0 0 | | 7 | 3 | | | | | 0 0 6 + 2 - + - + - + - + - + - + 7 rows in the set (0.00 SEC) select c1, c2 and c3 from t1; +----+----+----+ | c1 | c2 | c3 | +----+----+----+ | 1 | 3 | 2 | | 1 | 3 | 3 | | 2 | 3 | 5 | | 2 | 4 | 5 | | 3 | 2 | 4 | 6 | | 2 | 3 | | 2 | 3 | 3 | + - + - + - + 7 rows in the set (0.00 SEC)Copy the code

Select c1,c2,c3 from T1; Select c1,c2,c3 from T1 order by c1,c2,c3; select c1,c2,c3 from T1 order by c1,c2,c3; Results.

So what are the order rules for indexes?

C1 =3,c2=2 — > c3 c1 in(1,2) — > c2 disordered, c3 disordered

There is a small rule, idx_C1_C2_C3, so how do you determine that a field is ordered? C1 is first in the index, and c2 is second. C2 is ordered only if C1 uniquely determines a value. If C1 has multiple values, then C2 is not necessarily ordered, and so is C3

Six, the summary

As for MySQL indexes, I have only mentioned the model in the case of single table queries. From this article, you should get an idea of how MySQL uses indexes in most cases.