Image source network, delete. Image from gold prospectus
The index
There are many different types of indexes in Mysql: Hash index, B tree index, B+ tree index, and full-text index. Mysql has multiple storage engines, and each storage engine may have different support for indexes.
What
Mysql index is a data structure that can improve the random access speed of database tables. Indexes can be created by specifying one or more specified columns.
So an index is a data structure.
Why
When the amount of data reaches a certain level, if all data is scanned in sequence, the data query effect is low and the query time is long, resulting in poor user experience. In order to improve the efficiency of data access, indexes are introduced.
How
When creating an index, you need to specify at least one column of data to organize the data in a B+ tree manner according to certain rules. When querying data, the Mysql query optimizer determines whether the statement has available indexes, calculates scores based on available indexes, compares the scores of each query result, and selects the best query method to query data. Therefore, after the index is created, Mysql does not necessarily use the index to perform queries.
The following are the InnoDB engine and MyIsam engine based on Mysql
Execution plan View
When you need to execute an SQL statement, you can view the execution plan generated by Mysql through explain provided by Mysql.
-
select_type
Type of query: Indicates the type of query statement, whether it is a simple query, or whether it uses a federated query, etc.
-
table
The name of the table
-
type
Access mode for a single table
- const
- eq_ref
- Ref: Queries using ordinary secondary index columns
- Fulltext: indicates the full-text index
- Ref_or_null: Normal secondary index, but the index may be Null
- Index_merge: query in index merge mode
- Unique_subquery: The cluster index of the ID column is used by the subquery
- Index_subquery: Subqueries use normal indexes
- Range: indicates the range query
- Index: Overwrite indexes can be used, but all indexes need to be scanned
- All
-
possible_keys
Possible indexes
-
key
The actual index used
-
key_len
The index length
-
ref
When the index column equivalence query is used, the information about the object matching the index column equivalence.
Its values are displayed depending on type and are displayed only when const, eq_ref, ref, ref_OR_NULL, unique_subquery, and index_subquery are displayed.
-
rows
Estimate the number of records to be read
-
filtered
The percentage of the estimated number of records that need to be read after being filtered by search criteria
-
extra
Additional information
The index type
Mysql divides an index into two types: clustered index and non-clustered index, based on whether there is data in the index.
-
Clustering index
A clustered index is one that stores complete data information in the leaf nodes of an index tree, and is called a clustered index. The primary key index created by Mysql’s InnoDB engine is a cluster index, and the data file is an index file. Primary key indexes in MyIsam engine are also non-clustered indexes.
-
Non-clustered index
A non-clustered index is one in which the leaf node of the index tree does not hold complete data information, but a pointer to the data or address information (MyIsam’s primary key index is the address information to the data, InnoDB’s primary key Id). If you want to obtain all data information, you need to perform a table back operation.
-
Joint index
A federated index is an index that is created on multiple columns at the same time. The leaf nodes in the index tree contain the values of multiple columns at the same time. The sorting between leaf nodes is based on the order of the columns when the index is created, and the sorting satisfies the left-most prefix rule.
Leftmost prefix rule: when there is A joint index, for example, including A, B, C three columns, if the query is to use the A, [A, B], [A, B, C] way of query, you can use this index, if it is used as query conditions B, C 】 【, does not meet the leftmost prefix principle, will not be using this index.
When Mysql creates a joint index, it compares the values of the first column. When the values of the first column are identical, it compares the values of the second column. When the values of the second column are identical, it compares the values of the third column. Therefore, when the value of the first column is skipped and the value of the second column is directly matched, the value of the second column is not necessarily ordered, so it cannot be used.
-
Cover index
If all the fields that you want to query already exist in the index, you can directly return to the table without performing operations based on the primary key ID. This is called an overwrite index.
If we create an index for name and age. Overwrite indexes are used when we query the age and Id of data whose name is the specified value. Since the name and age values already exist in the index data (the primary key ID value is present in every index), the data will not be queried back into the table cluster index based on the ID.
If the queried field data does not exist in the index, the system searches for other information in the table cluster index based on the primary key ID of the data after finding the data that meets the conditions, and finally returns the data.
-
The only index
The unique index requires that the values of the specified column cannot have the same value. The database checks the data every time it adds or modifies the data. If the data has duplicate values, the database directly returns a failure.
The index principle
The data type used in Mysql indexes is B+ tree.
why
-
Why is the data structure used for the index a B+ tree instead of a binary or a B tree?
When binary tree query is used, the query time complexity is O(log N), and the query time efficiency is very fast. However, the problem of binary tree is that each branch has only two branches at most. When there is a large amount of data, the height of the tree will be very high, and the NUMBER of I/OS will increase during query, and the query efficiency will decrease. Using B tree or B+ tree, a node can have multiple branches, which can greatly reduce the height of the tree, reduce the number of I/OS, and improve the efficiency of query.
-
Why did you choose a B+ tree over a B tree?
- The characteristic of A B tree is that each node stores keys and data, while only leaf nodes in a B+ tree store data information, while other nodes only store key information. Mysql can load more key data in memory at a time for query because the data volume of other nodes is small.
- The leaf nodes of the B tree are independent, and there are Pointers between the leaf nodes of the B+ tree to connect the leaf nodes. Mysql is a relational database, there may be a certain relationship between multiple data, when querying a data, it may query and some other data related to it, which can well support range query.
- The query efficiency of B tree is unstable, but that of B+ tree is stable. When querying data, the B-tree will return data information after meeting the requirements, and will not go to the leaf node. However, when B+ tree is querying, it will go to the leaf node anyway, and will get the data and return the data information.
- Since the leaf nodes of a B+ tree do not store data information, there is more space to store key information, which can lower the height of the tree, reduce the number of I/OS, and increase efficiency.
How
When adding data or modifying index column data, Mysql needs to maintain index information. The following is based on the clustered index, in the non-clustered index, the leaf node stores the primary key ID (if the primary key ID is not specified manually, Mysql will actively maintain a primary key ID), the other is the same.
The basic unit of Mysql data stored in the index tree is a page, that is, multiple Mysql data items (let’s call both user data and directory data items) are grouped into a page. When the page runs out of size, a new column is split and then connected to the previous column using Pointers to form a linked list.
Because the location of these nodes in memory is not necessarily contiguous, if you want to quickly find the corresponding data in memory, you need to create a directory for these data, each page corresponds to a directory entry. Then assemble the entries into a page, which forms the following format:
The minimum value and page number of each data page recorded in the directory entry can be compared to find the corresponding page, and the corresponding data can be quickly located.
When there are insufficient spatial addresses in the directory page, the directory page is split and then, as with the data page, the two directory pages are concatenated using Pointers to form a bidirectional linked list.
And so on, until you have a root directory (actually, you create the root directory first, and then gradually add the following directory pages and data pages). And then you have a B+ tree. During the query, you can compare the size of the Id recorded in the root directory to find the next level of the directory page corresponding to the Id, and then find the final leaf node by analogy, so that you can quickly find the corresponding data.
Therefore, from the above process, it is better to create the primary key Id, the self-increment Id, because the order of self-increment Id only needs to be appended at the end, if the data is irregular, the next Id order may be smaller than the previous one, Mysql to maintain the order of Id, the data will be moved. It involves data migration, page splitting, etc.
Index of the failure
An index failure means that an index is created for a data column, but the function of the index does not take effect even though the index column is used. In the final query, the index tree is not used.
Index failure conditions
-
Operations on index columns
- Type conversion
- To calculate
-
A like query starts with “%”
Mysql compares strings from left to right, which complies with the leftmost matching rule. If a query starts with “%”, the leftmost matching rule will be broken, so the index cannot be used.
-
Breaking the leftmost prefix rule: If the leftmost prefix rule is not met in a federated index query, the index will not be used
-
The right index of a range query is invalidated
When there is A joint index of A, B, and C, if
where A = 123 and B > 20 and C = 456; Copy the code
The index of C is invalid. Because B will return multiple values, the matching query will be performed based on the returned multiple values. Because in the case of multiple values, the sort of C is not necessarily ordered, may be out of order, can not use the index.
-
The or query
An OR query produces multiple pieces of data that are sorted in an indeterminate way.
-
Judge with ≠
-
String without single quotes: implicit conversion of type occurs
-
A full table scan is expected to be faster than an index
Sort the cause of filesort
When sorting a query, if the sorted fields do not use indexes or do not meet the left-most prefix rule, file sorting is used to complete sorting.
An index pushdown
what
Index push-down means that the Mysql server pushes queries down to the storage engine.
How
If all the query conditions in the query statement exist in the index, the Mysql server pushes the query conditions to the storage engine. The storage engine returns data if the conditions are met. If the conditions are not met, the storage engine continues to the next data.
If some of the query conditions are not the column data information in the index, the storage engine will return the data after meeting the index conditions, and the Mysql server will obtain the whole data and determine whether the data meets the requirements. Otherwise, the data will be discarded.
Why
Index push-down can reduce the number of I/OS to a certain extent, because the returned data meets the conditions of the data, the Mysql server does not need to obtain data judgment again.
The index combined
www.cnblogs.com/digdeep/p/4…
When we perform a data query, we may have multiple query criteria in our WHERE statement, which may go to multiple indexes. Prior to Mysql 5.1, only one index was selected for data query and then handed over to Mysql server for data filtering. Index Merge, introduced after 5.1, allows multiple indexes to query data and then compute the results.
- Index-Intersection
- Index-Union
- Index-Sort-Union