Mysql Study Notes: B* tree, principles and Use of indexes

1. Search process without index:

The primary key to find: Search page by page, locate the page where the record is located, and use binary bits in the page to the slot to facilitate the search of records in the slot.

Non-primary key lookup: Page by page to find data.

2. Index scheme:

The primary key indexPage: in the actual data of the upper, with the smallest primary key data page and the page number to build a directory, the directory data is stored in the data page, if too much data will continue to split directory page, each time to find according to the primary key layer orientation to the data in the directory page number, and then to data pages using binary search groove, locating the data; When the number of pages in the directory layer is greater than 1, an advanced directory layer is constructed on the upper layer of the directory page according to the primary key and directory page number, which is used to locate the lower directory item page number. It is constructed into a B+ tree, and searched from top to bottom every time to find the complete data of leaf nodes. The primary key index is automatically created by Innodb engine.

Secondary indexes: The secondary index needs to rebuild a B+ index tree. The leaf node records the primary key data sorted according to the search criteria. The directory layer also sorts according to the search criteria and records the page numbers of the primary key and the leaf node (the primary key is recorded to prevent the excessive repetition of search criteria, which reduces the index search efficiency). The secondary index search first traverses the secondary index tree to find the corresponding primary key, and then searches the primary key index tree to find the specific data (back to the table).

Joint index: Sort by the size of multiple columns, such as the joint index of C1 and C2, that is, the records are sorted by C1, and if C1 is the same, they are sorted by C2. A union index is essentially a secondary index that uses multiple fields as a sort. Build only one index tree. 3. Index tree generation process:In the primary key index, for example, the start will be generated for the index page, a root node behind when inserting data, are stored in the page, when data is increasing, with node space is used up, the root node data will be copied to a new distribution page a, then the new page for page divided operation, get page b, the newly inserted record page size according to the key value assigned to a or b, The root node page is upgraded to a page that stores directory entry records.

4.MyISAM index scheme:MyISAM also adopts the index scheme, but the complete data is stored in a complete data file out of order, no data pages, each data record a row number;

Index information is stored in index files in the format of primary key + row number. During query, search for the corresponding row number through the index tree, and then search for records based on the row number.

5. Index cost:(1) Index tree takes up space. (2) All index trees need to be modified each time the add, delete, or change operation.6. Full value matching:

If the search criteria are consistent with the index column, it is called full-time matching. When writing a query statement, the order of the query criteria does not need to be exactly the same. The Mysql optimizer will process the query.

7. Match the left column:If the query condition contains the leftmost column field of the union index, the union index can also be used. If the query condition does not contain the leftmost column field, the union index cannot be used even if the following index field is included.

8. Match column prefixes:Because string sorting is done left-to-right character by character, the first character is compared first, and if the first character is the same, the second character is compared, so in addition to matching the leftmost column, you can also use the index based on the prefix of the query condition, for example: SELECT * FROM person_info WHERE name LIKE ‘As%’; You can also use the leftmost index of the name field.

9. Matching range value:If the range of a single field matches, the index can be used. If the range of multiple fields matches, the index can be used only for the leftmost column of the index. For example, if the idx_name_birthday_phone index and name and birthday are both range queries, only the name part of the index can be used. The birthday part is sorted after obtaining the name query result. Because birthday in the index can only be sorted if name is the same, the name range lookup results in multiple name values and cannot use the birthday order in the index.

10. Exactly match one column and range match another column:

For example, idx_name_birthday_phone, name = ‘Ashburn’ AND birthday > ‘1980-01-01’ because the leftmost name is an exact query AND the value of name is the same. So the Birthday field can directly use the birthday field with the same name in the index without reordering.11. Sort by index:

If the sort field is the same as the index field in the same order or the column to the left of the index, you can use the index and directly fetch it without sorting it in memory.

However, if asC and desc are used in a sort condition, the index cannot be used directly.

12. For grouping:

If the order of grouping is the same as that of index columns, it can be used in grouping.

13. Return table cost:

When the table is returned, it obtains some consecutive primary key values based on the range lookup, and then searches for data in the primary key index tree based on the primary key values. If these primary keys are connected, they may be stored in contiguous data pages that can be read from disk using sequential IO. If the data is out of order and distributed on different data pages, the I/O access is random and the speed is slower. If a large amount of data needs to be returned to the table, full table scan may be used, which is inefficient. Therefore, using limit to limit the number of entries can reduce the number of entries back to the table and improve efficiency.

14. Overwrite index:

If the index list contains the data to be queried, the required data is directly fetched from the secondary index tree without returning to the table, which is called an overwrite index. It is also recommended to query specific data without *.

14. Index selection:

(1) Create index for search, sort, and group columns.

(2) Create indexes for columns with a large cardinality, that is, columns with fewer duplicate values, which provides better performance.

(3) The type of index column should be smaller, saving index storage space and faster comparison operation.

(4) When the value of the query column is very long, only the prefix can be saved in the index to save space and speed up the string comparison. However, you cannot use an overwrite index and must return to the table. (5) The index column must appear alone and cannot be called as an expression or function.

14. Primary key insert order:

The primary key should be AUTO_INCREMENT order to reduce page splitting when data is inserted.14. Redundant and duplicate indexes:

In the leftmost column of the federated index, you do not need to define another normal index.

Mysql Study Notes: B* tree, principles and Use of indexes

Related Posts

Build your own desktop wechat using Vue + Electron from 0 to 1

A brief introduction to the principle and use of Java thread pool

Asynchronous community this week half price ebook