Sometimes you need to use a string field, such as an email address, directly as a query condition. If the field has no index, the statement can only do a full table scan. So how should you index a string field?
Methods for adding indexes to string fields
- Prefix index: defines only the leading part of the string as the index. If the statement that creates the index does not specify a prefix length, the index contains the entire string.
1. alter table SUser add index index1(email);
2. alter table SUser add index index2(email(6));
- The pros and cons of prefix indexes
- Pro: it takes up less space
- Con: it may increase the number of records scanned
- Example query: select id, name, email from SUser where email='[email protected]';
- index1 (index on the entire email string)
  1. In the index1 tree, find the record whose index value is '[email protected]' and get its primary key value, ID2.
  2. In the primary key index, find the row whose primary key is ID2, check that its email value is correct, and add the row to the result set.
  3. Take the next record in the index1 tree. It no longer satisfies email='[email protected]', so the loop ends.
This process reads from the primary key index only once, so the system considers that only one row was scanned.
- index2 (index on email(6))
  1. In the index2 tree, find the first record whose index value is 'zhangs'; its primary key is ID1.
  2. In the primary key index, find the row whose primary key is ID1. Its email is not '[email protected]', so the row is discarded.
  3. Take the next record in the index2 tree. It is still 'zhangs', so take its primary key, ID2.
  4. Fetch the whole row from the primary key index and check its email. This time the value matches, so the row is added to the result set.
  5. Repeat steps 3 and 4. When the value read from the index2 tree is no longer 'zhangs', the loop ends.
This process reads from the primary key index 4 times, so 4 rows are scanned.
With a prefix index, the number of times a query has to read data may increase. If the prefix length is chosen well, however, a prefix index saves space without adding much extra query cost.
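
The two walk-throughs above can be reproduced on a small test table. The following is only a minimal sketch, not the author's setup: the column types and the four sample rows (made-up emails that all share the prefix 'zhangs') are assumptions for illustration. It uses force index to select each index in turn and the Handler_read% session counters to compare how much each plan reads, assuming the account is allowed to run flush status.

```sql
-- Hypothetical schema: only the table and column names come from the text,
-- the column types are assumed.
create table SUser (
  ID    bigint unsigned primary key,
  name  varchar(64),
  email varchar(64)
) engine=innodb;

alter table SUser add index index1(email);     -- indexes the whole string
alter table SUser add index index2(email(6));  -- indexes the first 6 characters

-- Made-up sample rows: every email starts with 'zhangs'.
insert into SUser (ID, name, email) values
  (1, 'user1', 'zhangsa@example.com'),
  (2, 'user2', 'zhangsb@example.com'),
  (3, 'user3', 'zhangsc@example.com'),
  (4, 'user4', 'zhangsd@example.com');

-- Lookup through the full-string index.
flush status;
select id, name, email from SUser force index (index1)
 where email = 'zhangsb@example.com';
show session status like 'Handler_read%';

-- Same lookup through the prefix index: every 'zhangs' entry must be
-- checked against the full row in the primary key index.
flush status;
select id, name, email from SUser force index (index2)
 where email = 'zhangsb@example.com';
show session status like 'Handler_read%';
```

Both queries return the same single row; the counters after the second query are expected to be higher, which reflects the extra lookups in the primary key index.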
How do I determine how long a prefix to use?
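What matters most for an index is selectivity: the higher the selectivity, the fewer duplicate key values. You can therefore choose the prefix length by counting how many distinct values the index would contain.
- Count the distinct values of the full column; call this number L.
- Count the distinct values for several candidate prefix lengths.
- A prefix index always loses some selectivity, so set an acceptable loss ratio in advance, for example 5%, and pick a prefix length whose distinct count is not less than L*95%.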
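
The counting described above can be written directly in SQL. This is a sketch that assumes the SUser table and email column used earlier in the text; the candidate lengths 4 to 7 and the ratio aliases r4 to r7 are arbitrary choices for illustration.

```sql
-- L: number of distinct full email values.
select count(distinct email) as L from SUser;

-- Distinct counts for several candidate prefix lengths.
select
  count(distinct left(email, 4)) as L4,
  count(distinct left(email, 5)) as L5,
  count(distinct left(email, 6)) as L6,
  count(distinct left(email, 7)) as L7
from SUser;

-- The same comparison as a ratio of L. With a 5% acceptable loss,
-- pick the shortest length whose ratio is at least 0.95.
select
  count(distinct left(email, 4)) / count(distinct email) as r4,
  count(distinct left(email, 5)) / count(distinct email) as r5,
  count(distinct left(email, 6)) / count(distinct email) as r6,
  count(distinct left(email, 7)) / count(distinct email) as r7
from SUser;
```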
- Effect of prefix indexes on covering indexes: with a prefix index, the covering index optimization of query performance cannot be used. The system cannot be sure whether the prefix definition truncated the complete value, so the query always goes back to the primary key index.
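
One way to observe this is to compare explain output for a query that needs only id and email. This is a sketch that assumes index1 and index2 from the alter table statements earlier in the text both exist on SUser, and it uses a made-up email value. With the full-string index, the Extra column normally reports Using index (a covering index); with the prefix index it should not, because the row has to be read to verify the full value.

```sql
-- index1 stores the full email plus the primary key, so it can answer
-- this query on its own.
explain select id, email from SUser force index (index1)
 where email = 'zhangsb@example.com';

-- index2 stores only the first 6 characters, so the full row must be fetched
-- to check the email, even if a longer prefix happened to hold every value.
explain select id, email from SUser force index (index2)
 where email = 'zhangsb@example.com';
```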
- Other approaches, for when the prefix is not selective enough
- Reverse storage: for example, the last 6 digits of an ID card number are selective enough while the leading digits are not, so the number is stored reversed and a prefix of the reversed value is indexed
- Hash field: add an integer field to the table that stores a checksum of the ID card number (for example, from crc32()), and create an index on this field
- Similarities and differences between reverse storage and hash fields:
- Similarity: neither of them supports range queries
- Reverse storage: the index is ordered by the reversed string, so a range in the original order cannot be looked up
- Hash field: only equality queries are supported
Aspect | Reverse storage | Hash field |
---|---|---|
Extra space | No extra storage is consumed on the primary key index | A field needs to be added. Reverse storage needs a slightly longer prefix, though, so overall the two consume about the same space |
CPU consumption | Every read and write calls the reverse() function one extra time | Requires an extra call to the crc32() function. Judged by computational complexity alone, reverse() costs less additional CPU |
Query efficiency | Still a prefix index, so it can increase the number of rows scanned | Query performance is relatively stable. Values computed by crc32() can collide, but the probability is very small, so the average number of rows scanned per query is close to 1 |
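
Both alternatives can be sketched in SQL. The tables t_rev and t_hash, their column and index names, the prefix length 6, and the sample number are all hypothetical, chosen only for illustration; the text does not specify them.

```sql
-- Option 1: reverse storage (hypothetical table t_rev).
create table t_rev (
  id          bigint unsigned primary key,
  id_card_rev varchar(32)              -- holds the reversed number
) engine=innodb;
alter table t_rev add index idx_rev(id_card_rev(6));  -- prefix length is an assumption

insert into t_rev (id, id_card_rev)
values (1, reverse('123456789012345678'));

-- Reverse the input the same way before comparing.
select id, reverse(id_card_rev) as id_card
  from t_rev
 where id_card_rev = reverse('123456789012345678');

-- Option 2: hash field (hypothetical table t_hash).
create table t_hash (
  id          bigint unsigned primary key,
  id_card     varchar(32),
  id_card_crc int unsigned             -- crc32() returns a 32-bit unsigned value
) engine=innodb;
alter table t_hash add index idx_crc(id_card_crc);

insert into t_hash (id, id_card, id_card_crc)
values (1, '123456789012345678', crc32('123456789012345678'));

-- crc32() values can collide, so the full value is compared as well.
select id, id_card
  from t_hash
 where id_card_crc = crc32('123456789012345678')
   and id_card = '123456789012345678';
```

In both cases the where clause is an equality test, which matches the point above that neither approach supports range queries.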