Did you know that strings can be indexed like this? MySQL series 7

When you index a string column, have you ever set a length for the index?

One, how to build an index

Most systems have a user table. Suppose the system was initially designed for login by mobile phone number.

Then the product team comes up with a requirement for the system to support email login as well.

As you know, if the email field is not indexed, every lookup by email performs a full table scan.

Add an index to the email field and you're done! But the goal is to handle complex requirements well and simple requirements as well as possible, so that the system is under as little pressure as it can be.

alter table table_name add index idx_field (field);

The statement above creates an index on the field.

What's the difference between a whole-string index and a prefix index?

An index created using the alter table table_name add index idx_field (field) command contains the entire string by default.

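You can also give the index a length, so that only the first 6 characters of the field are indexed:

alter table table_name add index idx_field (field(6));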

A picture is worth a thousand words, so let's see what the two index structures look like.

Index 1 structure diagram

Index 2 structure diagram

As you can see from the figures, if you specify an index length of 6, only the first 6 characters of the email field are stored. Each index entry is smaller, so each node holds more entries than when the index contains the entire string.
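To get a feel for the difference, here is a back-of-the-envelope calculation in Python. It assumes InnoDB's default 16 KB page size, a 4-byte integer primary key, single-byte characters, and a hypothetical 20-character email. It ignores all per-record and per-page overhead, so the numbers are only indicative.

```python
PAGE_SIZE = 16 * 1024  # InnoDB default page size in bytes
PK_BYTES = 4           # assumption: 4-byte integer primary key


def entries_per_page(key_bytes, page_size=PAGE_SIZE, pk_bytes=PK_BYTES):
    """Rough upper bound on index entries per page, ignoring all overhead."""
    return page_size // (key_bytes + pk_bytes)


full_string = entries_per_page(20)  # hypothetical 20-character email
prefix_6 = entries_per_page(6)      # only the first 6 characters

print("whole string:", full_string, "entries per page")
print("prefix of 6: ", prefix_6, "entries per page")
```

Under these assumptions the prefix index fits more than twice as many entries into one page.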

The earlier article on indexes also taught you that index entries should be as small as possible.

But every coin has two sides. In the sixth article of this series, one of the factors that led the optimizer to pick the wrong index was the number of scanned rows.

Reducing the index length lowers the index cardinality (there are fewer distinct key values), which increases the number of extra rows scanned (the rows column in the output of explain).

select id,name,email from mac_user where email='[email protected]';

Execution flow with an index on the entire string

1. Search the email index tree for the record matching [email protected] and obtain its primary key, id 1.

2. Look up the row with id 1 in the primary key index, confirm that its email value is correct, and add the row to the result set.

3. Move to the next record in the email index tree and repeat until the query condition is no longer met, at which point the loop ends.

Execution flow with a prefix index of a specified length

1. Find the record with the prefix 139739 in the email index tree and obtain its primary key, id 1.

2. Look up the row with id 1 in the primary key index, find that its email value does not match, and discard the row.

3. Move to the next record in the email index tree, find that its prefix is still 139739, take out id 2, and check that row in the primary key index.

4. Repeat the previous step until the prefix no longer matches, at which point the loop ends.
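The two flows above can be sketched in Python to count how many times each one goes back to the primary key. This is only a rough simulation, not how InnoDB is implemented: the rows, the example.com addresses, and the names build_index and lookup are made up for illustration, and a sorted list stands in for the index tree.

```python
from bisect import bisect_left

# Hypothetical rows: primary key id -> email (illustrative data only)
rows = {
    1: "139739001@example.com",
    2: "139739002@example.com",
    3: "139739003@example.com",
    4: "158000001@example.com",
}


def build_index(rows, prefix_len=None):
    """Sorted (key, id) pairs; prefix_len=None indexes the whole string."""
    return sorted((email[:prefix_len], pk) for pk, email in rows.items())


def lookup(rows, index, target, prefix_len=None):
    """Return (matching ids, number of primary key lookups)."""
    key = target[:prefix_len]
    pos = bisect_left(index, (key,))  # first index entry with this key
    matches, table_reads = [], 0
    while pos < len(index) and index[pos][0] == key:
        pk = index[pos][1]
        table_reads += 1              # go back to the primary key index
        if rows[pk] == target:        # verify the full email value
            matches.append(pk)
        pos += 1
    return matches, table_reads


target = "139739002@example.com"
full_result = lookup(rows, build_index(rows), target)
prefix_result = lookup(rows, build_index(rows, 6), target, prefix_len=6)
print("whole string:", full_result)
print("prefix of 6: ", prefix_result)
```

With this sample data, the whole-string index reads one row, while the length-6 prefix index reads all three rows that share the prefix 139739 and discards two of them.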

Conclusion

Walking through the execution, it is easy to see that using a prefix index increases the number of rows read. Does that mean a prefix index always increases the query cost?

Of course not. The length here was defined as 6; imagine setting it to 7 or 8 instead. Wouldn't that be much better? The figure uses three records with the same prefix for convenience, but that rarely happens with real data.

What matters when building an index is its selectivity: the higher the selectivity, the fewer duplicate key values, and the more efficient the query.

So with a prefix index, as long as the length is chosen well, you can save space without adding much extra query cost.

Two, how to determine the length of the prefix to use

The MySQL keyword distinct returns the set of different values in a column.

For example, to count the distinct values in the email column:

select count(distinct email) as num from mac_user;

How do you count the distinct values for a given prefix length?

select count(distinct left(email,4)) as num4 from mac_user;

Then divide this value by the total number of distinct emails to get the selectivity ratio for that prefix length. The closer the ratio is to 1, the less selectivity the prefix loses; what ratio is acceptable depends on the business situation.
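The same calculation can be tried out on a small sample in Python before running it against a real table. This is a minimal sketch: the function name prefix_selectivity and the example.com addresses are made up, and the ratio is defined here as the distinct prefix count divided by the distinct value count, mirroring count(distinct left(email, n)) / count(distinct email).

```python
def prefix_selectivity(values, lengths):
    """Map each prefix length to distinct-prefix count / distinct-value count."""
    total = len(set(values))
    return {n: len({v[:n] for v in values}) / total for n in lengths}


# Hypothetical sample of the email column
emails = [
    "139739001@example.com",
    "139739002@example.com",
    "139745003@example.com",
    "158000004@example.com",
]

ratios = prefix_selectivity(emails, [4, 6, 9])
for length, ratio in ratios.items():
    print(f"prefix length {length}: selectivity {ratio:.2f}")
```

One way to use the result is to pick the shortest length whose ratio stays above a threshold you are comfortable with, for example 0.95.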

Three, the impact of using prefix indexes

Using a prefix index increases the number of scanned rows and prevents the use of a covering index.

Why does it affect covering indexes?

Suppose the statement is select id,email from mac_user where email='[email protected]';

With an index on the entire string, the query can use a covering index: the result is read from the email index and returned directly, without going back to the table.

With a prefix index, after the email index returns a candidate, MySQL has to go back to the primary key index to check whether the full email value matches.

Even if the prefix length is greater than the length of every email, MySQL still goes back to the table, because it cannot tell whether the prefix definition truncated the full value.

Conclusion

Using a prefix index increases the number of scanned rows and rules out covering indexes. Take both factors into account when deciding whether to use a prefix index.

If you don’t know whether to use a prefix index or a full string index, test locally and choose a suitable solution for the production environment.

Four, what to do when a prefix index does not work well

Suppose an identity verification system stores ID card numbers. As you probably know, the first 6 digits of an ID card number are the address code, so people from the same county generally share the same first 6 digits.

In this case a prefix index of length 6 has very low selectivity. So many rows share the same prefix that the index hardly speeds up the query.

If you make the prefix longer instead, each node stores fewer index entries, and query efficiency drops.

How do you handle this scenario?

First option

Store the data in reverse order, and convert it back to the normal order when reading it. The trailing digits of an ID card number vary far more than the first 6, so a prefix index on the reversed value is much more selective.
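The effect of reversing can be checked with a few lines of Python. This is only an illustration: the ID card numbers below are invented values that happen to share one address code, and distinct_prefixes is a hypothetical helper.

```python
def distinct_prefixes(values, n):
    """Count how many different n-character prefixes the values have."""
    return len({v[:n] for v in values})


# Made-up ID card numbers from the same county (same first 6 digits)
id_numbers = [
    "110101199001011234",
    "110101199202025678",
    "110101198503039012",
]

# What would be written to the table under the reverse-order scheme
reversed_numbers = [v[::-1] for v in id_numbers]

print("as stored normally:", distinct_prefixes(id_numbers, 6))
print("stored reversed:   ", distinct_prefixes(reversed_numbers, 6))
```

In this sample the normal order gives a single 6-character prefix for all three rows, while the reversed values give three different prefixes. A lookup then has to reverse the search value in the same way before comparing it with the stored column.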

Second option

Add a field to the table that stores a hash value of the data, and index the hash field.
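Here is a rough Python sketch of how a lookup through such a hash field behaves. It is an in-memory stand-in, not MySQL: the class HashIndexedTable and the sample numbers are made up, and zlib.crc32 plays the role of the hash function. Because different values can share a hash, the full value is always compared after the hash matches.

```python
import zlib
from collections import defaultdict


def crc(value):
    """32-bit checksum used as the stored hash value."""
    return zlib.crc32(value.encode("utf-8"))


class HashIndexedTable:
    """Toy table with an extra hash field and an index on that field."""

    def __init__(self):
        self.rows = {}                    # primary key -> id_card
        self.by_hash = defaultdict(list)  # hash value -> primary keys

    def insert(self, pk, id_card):
        self.rows[pk] = id_card
        self.by_hash[crc(id_card)].append(pk)

    def find(self, id_card):
        # Equality lookup on the hash, then verify the full value
        # to filter out rows whose hash merely collides.
        candidates = self.by_hash.get(crc(id_card), [])
        return [pk for pk in candidates if self.rows[pk] == id_card]


table = HashIndexedTable()
table.insert(1, "110101199001011234")  # made-up ID card numbers
table.insert(2, "110101199202025678")
print(table.find("110101199202025678"))
```

The lookup only works for exact matches: hash values carry no ordering information about the original data, which is why this scheme cannot serve range queries.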

Differences between the two options

What the two schemes have in common is that neither supports range queries; both can only serve equality queries.

In terms of space: reverse-order storage does not add an extra field, while the hash scheme does. However, the reversed value usually needs a longer prefix, so the two end up roughly equal in space.

In terms of CPU consumption: the reverse-order scheme calls reverse and the hash scheme calls crc32. Both consume little CPU.

In terms of query efficiency: hash queries are more stable. Although the values computed by crc32 can collide, the probability is very small, so the average number of scanned rows per query is close to 1. The reverse-order scheme still uses a prefix index, so it can still scan extra rows.

Five, the summary

Indexing the whole string directly takes more space.

A prefix index saves space but increases the number of scanned rows, and covering indexes cannot be used.

Reverse-order storage combined with a prefix index solves the problem of low prefix selectivity.

A hash field gives stable query performance but does not support range queries.

Persisting in learning, writing, and sharing is the belief Kaka has upheld since starting his career. I hope this article, somewhere on the vast Internet, can give you a little help. I am Kaka, see you next time.
