This is the 15th day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021
Use of prefix indexes
I’m sure you all know the login function? Many login functions support mailbox login, then we will appear in the background such a SQL. Suppose our table structure looks like this:
create table t_User( ID bigint unsigned primary key, name varchar(64), email varchar(64), ... ) engine=innodb;Copy the code
We know that if we do not index the email field, the SQL will scan the whole thing. So we index the email as follows:
alter table t_User add index index1(email);
Copy the code
The index tree created will look like the figure below:
We execute this SQL:
select x1 from t_User where email = '[email protected]';
Copy the code
- When we use the index1 index, the engine is executed in the following order:
1. Find the record whose index is’ [email protected] ‘from the index1 index tree and obtain the value of ID1.
2. Find the row whose primary key value is ID1, check whether the value of email is correct, and add the row to the result set.
3. Select the next record in the index1 index tree and find that the condition of email=’[email protected] ‘is no longer met. The loop ends.
In this process, we only need to go back to the primary key index once, so the system thinks that only one row has been scanned, and we only need to go back to the table once.
But MySQL supports prefix indexing, which means that we can index a part of a string, like this:
alter table t_User add index index2(email(6));
Copy the code
This statement takes the first 6 bytes of the email field as the index.
The index tree created will look like the figure below:
Obviously, index2 with the first six bytes as the index takes less space, but may increase the number of additional scans.
We execute this SQL:
select x1 from t_User where email = '[email protected]';
Copy the code
- When we use the index2 index, the engine is executed in the following order:
-
Select ‘budong’ from index tree (ID1, ID1);
-
The primary key is ID1, and the value of email is not ‘[email protected]’. This row is discarded.
-
Under take index2 just to check the position of a record, the discovery is still budong, remove ID2, to take the entire line ID index and judgment, value for this time, to add the row to the result set;
-
Repeat the previous step until the value on IDxe2 is not ‘budong’, the loop ends.
In this process, the primary key index is fetched four times, or four rows are scanned. So we find that using prefix indexes causes us to read data more often.
But if we define index2 as email(9), then the index will fetch ‘budong112’ for the first time. There is only one prefix for this, so it’s done with just one scan.
summary
That is, using a prefix index with a defined length can save space without adding too much query cost.