preface

I have been trying to understand what the leftmost prefix principle is, and then I found many documents on the Internet. The explanation of the leftmost prefix (leftmost matching) principle in each document is vague, which makes people confused after reading it. The only conclusion I can come out is that the leftmost prefix principle may be related to the joint index.

In that case, only official documents are available.

Official Document Learning

We searched leftmost prefix to find the 8.3.6 multiple-column Indexes document, which has a paragraph that describes it like this:

MySQL can create composite indexes (that is, indexes on multiple columns). An index may consist of up to 16 columns. For certain data types, you can index a prefix of the column (see Section 8.3.5, “Column Indexes”).

Let’s translate this passage.

MYSQL can create federated indexes (that is, indexes with multiple columns). An index can contain up to 16 columns. For some data types, you can index the prefix of a column.

So how do we understand “for some data types, you can index the prefix of the column.” How about this sentence?

Let’s look at index prefixes in 8.3.5 Column Indexes:

With col_name(N) syntax in an index specification for a string column, you can create an index that uses only the first N characters of the column. Indexing only a prefix of column values in this way can make the index file much smaller. When you index a BLOB or TEXT column, you must specify a prefix length for the index. For example:

CREATE TABLE test (blob_col BLOB, INDEX(blob_col(10)));

When creating an index on a string column, we can use the col_name(N) syntax to create an index that uses only the first N characters of the column. Index files created this way are much smaller. When indexing BLOB or TEXT columns, you must specify a prefix length for the index.

This is the introduction of prefix index, let’s go back to the left – most prefix index.

MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on. If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table.

The MYSQL joint index can be used for queries that contain all the columns in the index, or just the first column, first two columns, first three columns, and so on. If you specify the columns in the right order in the index definition, a joint index can speed up many different types of queries on the same table.

The sample

Let’s look again at the example of the leftmost prefix given in the official documentation.

First we have the following table:

CREATE TABLE test (
    id         INT NOT NULL,
    last_name  CHAR(30) NOT NULL,
    first_name CHAR(30) NOT NULL,
    PRIMARY KEY (id),
    INDEX name (last_name,first_name)
);
Copy the code

Name is a federated index containing the last_name and first_name columns. This index can be used with a range of last_name and first_name, or with only the last_name column. Since this column is a left-most prefix of the index, the index name can be used with the following query statements:

SELECT * FROM test WHERE last_name='Jones';

SELECT * FROM test
  WHERE last_name='Jones' AND first_name='John';

SELECT * FROM test
  WHERE last_name='Jones'
  AND (first_name='John' OR first_name='Jon');

SELECT * FROM test
  WHERE last_name='Jones'
  AND first_name >='M' AND first_name < 'N';
Copy the code

However, index name cannot be used for the following query:

SELECT * FROM test WHERE first_name='John';

SELECT * FROM test
  WHERE last_name='Jones' OR first_name='John';
Copy the code

If the table has a federated index, the left-most prefix of any index is used by the optimizer to find the column. For example, if you create a joint index with three columns (col1, col2, col3), your index is valid for (col1), (col1, col2), and (col1, col2, col3).

Intellectual development

Why use federated indexes

  • To reduce overhead. Create a federated index(col1,col2,col3)In fact, it has been built(col1).(col1,col2).(col1,col2,col3)Three indexes. Each additional index increases the overhead of write operations and disk space. For tables with a lot of data, using a federated index can greatly reduce overhead!
  • Cover index. Pair joint index(col1,col2,col3)If you have the following SQL:select col1,col2,col3 from test where col1=1 and col2=2;. MySQL can then retrieve data directly by traversing the index without returning to the table, which reduces a lot of random I/O operations. Reducing IO operations, especially random IO, is a major optimization strategy for DBAs. Therefore, in real application, overwriting index is one of the main optimization methods to improve performance.
  • High efficiency. The more indexed columns, the less data is filtered through the index. SQL > create table with 1000W rowsselect from table where col1=1 and col2=2 and col3=3, assuming that each condition can screen out 10% of the data, if there is only a single value index, then the index can screen out 1000W10%= 100W data, and then go back to the table to find data matching COL2 =2 and COL3 =3 from 100W data, and then sort, and then paging; If it is a joint index, through the index screening 1000W * 10% * 10% * 10% = 1W, efficiency improvement can be imagined!

Go index problem

SELECT * FROM test WHERE first_name=’John’; SQL > select * from ‘name’ where index = ‘name’;

Let’s look at the execution of the following two SQL statements.

mysql> explain select * from test where last_name='Jones';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | test  | NULL       | ref  | name          | name | 90      | const |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
1 row in set.1 warning (0.01 sec)
Copy the code
mysql> explain select * from test where first_name='Jones';
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------- ------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------- ------------+| | 1 SIMPLE | test | NULL | index | name | name | 180 | NULL | 1 | | 100.00 Using the where; Using index | +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------- ------------+
1 row in set.1 warning (0.00 sec)
Copy the code

SQL > select name, name, name, name, name, name, name, name, name, name, name, name, name, name, name, name, name, name, name, name, name, name

No, we can take a closer look at the results of the above two SQL statements. We will see that their types are different, one is ref and the other is index.

Mysql Explain (ref and index)

8.8.2 EXPLAIN Output Format (ref)

All rows with matching index values are read from this table for each combination of rows from the previous tables. ref is used if the join uses only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index (in other words, if the join cannot select a single row based on the key value). If the key that is used matches only a few rows, this is a good join type.

That is, only matching rows are read, and ref is used when the query column is the left-most prefix of an index or a normal index (not a primary key or unique index). That is, only a few rows will be queried.

The introduction to ref is as follows:

The index join type is the same as ALL, except that the index tree is scanned. This occurs two ways:

  • If the index is a covering index for the queries and can be used to satisfy all data required from the table, only the index tree is scanned. In this case, the Extra column says Using index. An index-only scan usually is faster than ALL because the size of the index usually is smaller than the table data.

  • A full table scan is performed using reads from the index to look up data rows in index order. Uses index does not appear in the Extra column.

MySQL can use this join type when the query uses only columns that are part of a single index.

The index connection type is the same as all, except that it scans the index tree. There are two main ways:

  • If the queried index overwrites the index and meets the data requirements to be queried from the table, only the index tree is scanned. In this case,ExtraAs aUsing index.An index scan is only faster than ALL because the index tree is usually smaller than the entire table.
  • By performing a full table scan on index reads to find rows in index order,Using indexWill not appear inExtraThe column.

MYSQL uses this type when the queried column is part of an index.

From the above comparison, we can see that ref scans only a few rows, while index scans the whole index. Although the same index is used, the execution efficiency is different.

Select * from test where first_name=’ liu ‘; select * from test where first_name=’ liu ‘; select * from test where first_name=’ liu ‘; All of our data is retrieved from the index, so we don’t need to scan the entire table, just scan the index tree. So type is index.

When we insert another field, such as description, there is no description field in the index tree at this time, so we cannot get the complete result by scanning the index tree. In this case, the type should be all.

Let’s verify:

mysql> alter table test add column `description` char(30) NOT NULL;


mysql> explain select * from test where first_name='liu';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+| | 1 SIMPLE | test | NULL | | NULL ALL | NULL | NULL | NULL | | | 20.00 5 Using the where | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set.1 warning (0.00 sec)
Copy the code

conclusion

Above, we have studied the introduction of documents related to the left-most prefix in official documents, and we can draw the following conclusions:

  • Left-most prefix rule: that is, when you create a federated index, any left-most prefix of the index can be used for queries. Like when you have a federated index (col1, col2, col3), all prefixes of the index are(col1),(col1, col2),(col1, col2, col3), all queries containing these columns will be queried using this index.
  • A federated index can contain up to 16 columns.
  • BLOBTEXTHow many bits before the index creation must be performed

Reference documentation

  • Mysql federation index leftmost matching principle
  • How to understand the MySQL index leftmost matching principle?
  • 8.3.6 Multiple – Column Indexes
  • The official documentation explains MySQL’s leftmost match (leftmost prefix) principle
  • 8.3.5 Column Indexes