What is an index
An index is a data structure derived from source data to speed up retrieval
The organizational structure of the index
Hash index based on hash achieve O1 time complexity of the query, the obvious shortcomings only support equivalent query, and to solve the conflict of the hash b + tree index b + tree is a tree of N, each node has N N KeZi keyword tree, leaf node to store the actual data and leaf nodes are strung together by chain table, the leaf nodes stored key. Advantages logNch query is supported. The disadvantage is that frequent insertion and deletion will lead to leaf splitting and merging
Why is it not a red black tree, why is it a B+ tree
Each time data is read, data is read from disk to memory through disk I/O, and the number of disk I/O depends on the height of the tree. The height of B+ tree is much lower than that of red-black tree, and the leaf nodes of B+ number are some and connected by linked list, supporting range retrieval. The range retrieval of red-black tree can only obtain the structure through middle order traversal
Optimization of indexes
- Since the keys in B+ tree are sorted from left to right according to keywords, and the comparison is made by comparing K1,k2 and k3 first to determine whether the node is left or right, k1 cannot be skipped and k2 can be directly compared, which is the left-most principle
- You can’t use left fuzzy matching because it doesn’t follow the leftmost rule
- Do not use ambiguous words, such as! =, not null and not in, etc
- You cannot use a function on an index field because it would destroy the index structure of the key
- For vARCHAR fields, it is an integer comparison because int cannot be strongly converted to String
- You can avoid back to the table by index overwriting
- Back to the table can be reduced by index recursion
- Through MRR, the discrete table read can be improved to a circular read
Index creation principles
- Indexes are a space-for-time strategy, so don’t create too many
- Since the size of each page is 16KB, if there are too many index fields, each node will have fewer keys, resulting in a higher tree
- The distinction must be high. If the distinction is low, the index will be less able to filter data
- Index index_email(email(6));
Order by process
Full-field sort fetches all the data directly, puts the data into sort Buffer, and returns the data sorted by the keywords in order by. If the sort buff is not enough, it will do an external sort. Primary key sort will not load all the data into sort buffer, but sort the primary key and the keywords in order by, and then return the table after the primary key