Quote:
Recently I read high Performance mysql. Although I haven’t finished reading it yet, I think it’s really well written. Mysql > create index (); mysql > create index (); I made some notes on the index to refer back to when I had a problem.
1. Advantages of indexes:
If the basic concepts of mysql indexes are not clear, take a look at my previous two blog posts. Mysql > alter table btree; mysql > alter table btree;
1.1 Indexes greatly reduce the number of rows the server needs to scan 1.2 Indexes can help the server avoid sorting and temporary tables 1.3 Indexes can change random I/O to sequential I/O
2. Use index policies
2.1 Independent columns
If the index column is a separate column, the index cannot be part of an expression or a parameter to a function.
1. select user_id from user whereuser_id + 1 = 7; Select * from user_id where user_id = 6; where to_days(current_date)-to_days(date_col) <=10Copy the code
Note: If the column of the index, as in the above example, is indexed, but because the column of the index is manipulated (to user_id + 1), or because the column of the index is manipulated using a function. This will result in index invalidation.
2.2 Prefix index and index selectivity
2.2.1
When there are some long columns to be queried, we can create a prefix index, which is the first length of the column, such as Alibabayushishidadao, we create the first eight digits of the index, which is Alibaba, and use this prefix to search the corresponding column. But the problem here is, how do you determine the length of this prefix index? There’s a concept called index selectivity. Index selectivity refers to the ratio of non-duplicate indexes to the total number of data records T. The value ranges from 1/T to 1. The more selective the index is, the more efficient the query is, because it means that the index covers more non-duplicate data and filters out more rows during the query. With an index selectivity of 1, this index is extremely high performance. Therefore, we need to set a reasonable prefix index length to make the index selection rate higher. Example:
select count(distinct left(phone, 3))/count(*) as prefix3,
count(distinct left(phone, 5))/count(*) as prefix5,
count(distinct left(phone, 7))/count(*) as prefix7
from table;
Copy the code
Let’s assume that the field phone is prefixed and indexed. Count can be used to calculate which ratio is closest to using the index selection rate of the complete phone directly. In this way, we can change the index to prefix index, so as to reduce the length of the index, improve the query efficiency.
2.2.2 Creating prefix Indexes:
Alter table user add key (user_name(7)
2.2.3 Disadvantages of prefix Indexes:
1. The prefix index is not the full length of the column, so the group by and order by 2 cannot be performed. Also, because it is not the full length of the column, the overwrite index cannot be reached
2.2.4 note:Other uses of prefix indexes are to use a unique ID for long data, or sometimes to use a suffix index (of course mysql does not support this, but we store the data upside down).
2.3 Multi-column indexes
If you find type=index_merge in Explain, then you need to consider the rationality of index creation. This kind of index merging usually consumes a lot of CPU and memory resources, and more importantly, the optimizer doesn’t factor this into the cost of the query. The optimizer only cares about how much data is read from random pages.
2.4 Select an appropriate index order
There is no fixed rule for the order of indexes, which should be created based on actual usage. However, in general, we put the fields with high frequency of use and high index selection rate in the front, and the fields involving range query and low frequency of use in the back.
3.5 Cluster Index
3.5.1 track of
The primary key should be a self-incrementing ID, so that each time new data comes in, the data in the cluster index should only be added to the last column, even if the current data page is full, and the data should only be added from the new page. If the uUID is not in order, the data is inserted in order in InnoDB. If the primary key of the uUI is smaller than the previous primary key, the previous data will be moved and new data will be inserted. If the data page is full, it will consume more resources to deal with such situation. Will keep have split page, page of split will lead to debris, so will than normal on the primary key to take up more space.
3.5.2
Disadvantages of auto-increment primary keys: in the case of concurrency, it can lead to resource contention, because the upper bound of auto-increment ID is contested by each thread, all inserts need to get the latest maximum auto-increment ID, and concurrency keeps this upper bound constantly changing.
3.5.3
Mysql can no longer perform a like operation on an index. Comparisons with the leftmost prefix of a like operation can be performed on the index because they will be converted to a simple comparison. SQL > select * from ‘% XXX %’; SQL > select * from ‘% XXX %’; Because search engines can’t compare wildcards with specific indexes.
tips:select sum(description = 3),sum(category_type = 2) from shop_page_field; This allows you to count the number of columns that belong to a value, which is easier to write than count, but it’s not clear how well it performs.
4.5. Overwrite indexes
4.5.1 Benefits of overwriting indexes:
1. The number of entries covering index data is smaller than the total amount of data, so the query speed is faster 2. Innod does not need to run a second query because it obtains data directly from the index
4.6. Unused indexes
Open userStates in the server (off by default) and let the server run for a while. The usage of an index can be queried at information_schemaindex.statistcs. If an index is not in use, it can be dropped.
4.7 Indexes and Locks
Innodb locks are granular to row level, row level and table level. Here, the row lock must be added to the index to achieve, because the index stores the primary key and index and information, in order to precisely lock the row data. If no index is added to the table, the table will be locked during operations such as update.
4.8 Indexing and Sorting
If a particular field is often used for sorting, add an index to it. The extra field will return filesort (mysql is called file sorting, although disk files may not be used). Suppose we have the index (A, B, C) and then we have the query statement, whether the sort under the object works
(1)select * from table where A = 'a'order by B, C; (Index takes effect) (2) Select * from tablewhere A = 'a'order by B; (3)select * from tablewhere A = 'a'order by A, B; (index takes effect) (4) Select * from tablewhere A = 'a'order by C; Select * from table (select * from table)where A = 'a'Select * from table order by B, D (select * from tablewhere A > 'a'Select * from table where (select * from table where (select * from table where (select * from table where (select * from table)where A = 'a' and B in ('b1'.'b2'B) in order by C) in order byinIndex invalidation is also the case for range query.Copy the code
4.9 Other Optimization Strategies
When the query content is similar to THE URL, the efficiency of using Btree is not high, because the URL is usually long, and the index search times and efficiency are not good. At this point we can perform crc32 or crc64 on the URL, calculate its hash value and store it. However, crc32/crc64 will collide, so the query condition must be the original URL; select * from url_table where url_hash = “1342134234” and url = “www.baidu.com”. First, it will find the corresponding URL according to url_hash, which may have collision but the query is fast. Then it will filter according to the URL value, so that the query performance will be high. Here url_hash is still the btree index, but filtering urls is much faster than direct long urls. Can be a pseudo hash index
Note: both crC64 and fnv64() require additional mysql plugins. So if it’s not installed, we can save a hash value by doing MD5 or something like that while writing data in the program.