You need to be aware of these "tricks" to work with MySQL syndication!

preface

What does MySQL data look like on disk? In this article we talk about how data in MySQL is stored on disk. Those of you who are interested can review it.

However, there is still a bit of a tail to it: the most used joint index and the left-most prefix principle in the work.

define

An index consisting of multiple fields.

We’ve been talking about single-valued indexes, where you only have one field, and now you have multiple fields.

The SQL statement

Table name (1, 2…); table name (1, 2…); USING BTREE

CREATE TABLE t_employee  (
  id bigint(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  create_time datetime NOT NULL,
  update_time datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    
  name varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT 'name',
  age int(10) UNSIGNED NOT NULL COMMENT 'age',
  dept varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT 'department'.PRIMARY KEY (id) USING BTREE,
  INDEX idx_name_age_dept(name, age, dept) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci COMMENT = 'Staff list' ROW_FORMAT = Compact;
Copy the code

To create a joint index on an existing table, execute the following statement: create index index name on table name (field name 1, field name 2…) ;

create index idx_name_age_dept on t_employee(name, age, dept);
Copy the code

If you want to create a unique index, use unique before index, otherwise it is normal.

UNIQUE INDEX uk_name_age_dept(name, age, dept) USING BTREE
Copy the code

create unique index uk_name_age_dept on t_employee(name, age, dept);
Copy the code

The form of storage on disk

Insert 10 records

As mentioned earlier, MySQL’s data structure on disk is stored as B+tree.

And all indexes except the primary key index are called secondary indexes: the data on the leaf node stores only the primary key value except the index itself.

The name field of the bottom leaf node is orderly, increasing from left to right.

Globally, the AGE and DEPT fields are unordered;

But locally, when the first field name is equal, the value of the age field is sorted. If age is still equal, sort by the third field.

Underlying lookup process

The first field (name) is compared, the second field (age) is compared, and the third field (dept) is compared.

In the data page directory (data node), the binary method is used to quickly locate the data, and then the specific data item can be found by traversing the records in the corresponding interval.

After the data is found, the primary key value (1) stored on the leaf node is returned to the table to find the complete record in the clustered index tree and read it out.

Left-most prefix rule

The leftmost prefix principle is the way data is searched in the associative index B+tree.

If the leftmost index column is not used, the index is invalid, but if it is used, it may be valid.

Idx_name_age_dept (name, age, dept)

statements	The effect	instructions
select * from t_employee	❌	No query conditions, direct full table lookup
~ where name = ‘HanMeimei’	✔	The index uses the name field
~ where name = ‘HanMeimei’ and age = 18	✔	The index uses the name and age fields
~ where age = 18 and name = ‘HanMeimei’	✔	The index uses the name and age fields. The query optimizer analyzes the search criteria and decides which search criteria to use first and which to use later, based on the order of columns in the available indexes.
~ where name = ‘HanMeimei’ and age < 18	✔	The index uses the name and age fields
~ where name = ‘HanMeimei’ and age = 18 and dept = ‘finance’	✔	The index uses the name, age, and DEPT fields
~ where name = ‘HanMeimei’ and dept = ‘finance’	✔	The index uses only the name field; through`B+tree`Select * from dept where name = ‘HanMeimei
~ where age = 18	❌	Query condition does not bring the leftmost name field, direct full table lookup
~ where name < ‘WangWu’ and age = 18	✔	The index uses only the name field; through`B+tree`Find the node whose name < ‘WangWu’ and compare the age field from it
~ where name like ‘%Han%’	❌	On the far left is %. The exact value is ambiguous and cannot be found`B+tree`Is used to compare values.
~ where name like ‘Han%’	✔	The index uses the name field; through`B+tree`Find data where name starts with ‘Han’
~ where name = ‘HanMeimei’ or age = 22	❌	SQL: index(name) index(age);
~ where substring(name, 3) = ‘hhh’	❌	Indexes are not used, and implicit conversions, calculations, or function operations on index columns are invalid

advantages

Reduce index building overhead

For example, idx_name_AGe_DEPT (name, age, dept) creates 3 indexes at the same time

idx(name)
idx(name, age)
idx(name, age, dept)

Cover index

An override index is a select column of data that can be retrieved only from the index, not from the rows.

That is, the field information we want to query can be directly obtained from the joint index tree structure, instead of the primary key value after the query table.

Select * from T_employee where XXX; select * from T_employee where XXX;

Select name, age, dept from T_employee where name =’HanMeimei’ and age = 18 select dept from T_employee where name =’HanMeimei’ and age = 18

The name, age, and DEPT fields can be retrieved directly from their own federated index structures.

This is an important means of SQL optimization, sometimes also artificially constructed overwrite index, this query method, is called overwrite index.

Effectively narrow down the screening field

Select * from T_employee where name = ‘HanMeimei’ and age = 18 and dept = ‘finance’ select * from T_employee where name = ‘HanMeimei’ and age = 18 and dept = ‘finance’ Suppose 10% of the data can be filtered out for each condition.

When there is no joint index, only single value index, then filter 10% of the data through the single value index, and then query the table from the 100W data.
When there is a joint index, it is much more efficient to select 10% × 10% × 10% data from the index, and then query the table from 1W data.

Sorting faster

SELECT * FROM T_employee WHERE name = ‘HanMeimei’ ORDER BY create_time;

Sort the results by create_time field.

Execute the plan using Explain to see the results

SELECT * FROM T_employee WHERE name = ‘HanMeimei’ ORDER BY age; SELECT * FROM T_employee WHERE name = ‘HanMeimei’ ORDER BY age;

Using index condition: The query uses the index, but the data needs to be retrieved from the table.

Using filesort: Use file sorting.

You can see that the latter sort by one step less than the formerfilesortOperation.

Filesort sort is a way to sort data in memory or on disk after a query. Sometimes, if the result set is too large to be sorted in memory, it is possible to temporarily use disk space to store intermediate results, and then return the sorted result set to the client.

When you need to sort the fields in the index, because the index B+tree is already in order, eliminating file sorting will also improve efficiency.

Pay attention to

The order of the syndicated index columns

The order of the columns in the federated index is important

Index (name, age) is not the same as index(name) + index(age)

Index (name, age) is not the same as index(age, name)

The value of the indexed field is NULL

Read a lot of information, say what all have… Experiment is the sole criterion for testing truth

Now change the name and DEPT fields to null

And add an HHH field, the union index and the single-value index are set to unique index.

insertnullData, see the effect 👇

As a result, null values can be inserted into both unique indexes.

Query null data

Use the Explain execution plan to verify that the index is valid when the query is executed.
- Validate federated indexuk_name_age_deptThe validity of
```
EXPLAIN SELECT * FROM t_employee WHERE ISNULL(name);
Copy the code
```
Here you can see the use of the union index;
- Validate a single-valued indexidx_hhhThe validity of
```
EXPLAIN SELECT * FROM t_employee WHERE ISNULL(hhh);
Copy the code
```
Here you can see that single-valued indexes are also used;

Now let’s make a change: set all the first 10 rows to NULL

Go through the plan again

Select idX_hhh, idx_hHH, idx_hHH, idx_hHH, idx_hHH Here’s why:

In MySQL, there is something called the query optimizer, which calculates statistics on the records in the table and then uses those statistics to calculate the number of records that need to be returned to the table based on the query criteria.

The more records need to be returned to the table, the more inclined to use full table scan;

Instead, the secondary index + back table approach is preferred.

Of course the optimizer does more analysis than that, but it’s basically the process.

In general, limiting the number of records retrieved by a query makes the optimizer more inclined to use the secondary index + back table approach, because the fewer records returned to the table, the higher the performance improvement.

Select * from HHH; select * from HHH; select * from HHH;

SQL > alter table HHH; alter table HHH; alter table HHH;

Actually,MySQLThere are also explanations on the official website

UNIQUE index allows columns that can contain NULL to use multiple NULL values.

MySQL can use indexes to search for NULL data.

But when we design the database table, it is better to have a field set toNOT NULLAs for the habit ofNULLAnother article today will tell you the answer.

The more indexes, the better

The index itself is also data, and also needs to be stored.

It takes time to create an index, and it takes time to maintain the index as the table inserts or updates data, both of which increase with the volume of data.

So when every field is indexed, it’s not necessary.

The resources

How MySQL Works: Understanding MySQL from the Root
www.jb51.net/article/157…
Dev.mysql.com/doc/refman/…
Dev.mysql.com/doc/refman/…

The last

Index is a good thing, but do not greedy ah, to consider the big picture ~ the author’s ability is limited, if the wrong place to write welcome in the comments section to communicate.

If this article helped you, please like 👍.

Share technology, hold on, we can win 💪!

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

You need to be aware of these “tricks” to work with MySQL syndication!

preface

define

The SQL statement

The form of storage on disk

Underlying lookup process

Left-most prefix rule

advantages

Reduce index building overhead

Cover index

Effectively narrow down the screening field

Sorting faster

Pay attention to

The order of the syndicated index columns

The value of the indexed field is NULL

The more indexes, the better

The resources

The last

You need to be aware of these “tricks” to work with MySQL syndication!

preface

define

The SQL statement

The form of storage on disk

Underlying lookup process

Left-most prefix rule

advantages

Reduce index building overhead

Cover index

Effectively narrow down the screening field

Sorting faster

Pay attention to

The order of the syndicated index columns

The value of the indexed field is NULL

The more indexes, the better

The resources

The last

Related Posts

Implementation of accounting system Demo based on SSM

Paxos Distributed consistency protocol

SpringBoot lock – Mybatis