“This is my 32nd day of participating in the First Challenge 2022. For details: First Challenge 2022”
preface
This article mainly introduces the index structure of Mysql, will focus on the following questions:
- What types do InnoDB index structures support?
- What is the structure of a B tree?
- What is the structure of a B+ tree?
- What is the structure of a Hash index?
- Why y index structure uses B+ tree, not B tree, Hash, binary, balanced binary, red black tree?
- What is the composite index structure?
- What are the structural differences between clustered indexes and non-clustered indexes?
- How to calculate the B+ tree index hierarchy given the amount of data?
- Why are B+ trees always 3-4 stories high in production environments?
- Why can’t index keys be too long?
The InnoDB index structure supports those types
The following table shows the index types supported by each storage engine
Storage Engine | Permissible Index Types |
---|---|
InnoDB |
BTREE |
MyISAM |
BTREE |
MEMORY /HEAP |
HASH .BTREE |
NDB |
HASH .BTREE (See note in text) | |
Then let’s look at InnoDB’s introductory feature table
Characteristics of the | support |
---|---|
B-tree indexes | Yes. |
Backup/point-in-time recovery(Implemented in server, not storage engine.(Generally, binary logs are generated in the server layer)) | Yes. |
Cluster Database Support | Don’t |
The aggregation index | Yes. |
Compressed data | Yes. |
Data cache | Yes. |
Encrypt the data | Yes (implemented in the server via encryption functions; Static data encryption is supported in MySQL 5.7 and later.) |
Foreign key support | Yes. |
Full-text search index | Yes (MySQL 5.6 and later provide support for FULLTEXT indexes.) |
Geospatial data type support | Yes. |
Geospatial index support | Yes (MySQL 5.7 and later provides support for geospatial indexes.) |
A Hash index | No (InnoDB uses hash indexes internally to implement its adaptive hash index functionality.) |
The index buffer | Yes. |
Locking granularity | line |
MVCC | Yes. |
Replication support(Implemented in the server, not in the storage engine.(Generally, binary logs are generated in the server layer)) | Yes. |
Storage limits | 64TB |
T – tree indexes | Don’t |
The transaction | Yes. |
Update statistics for the data dictionary | Yes. |
InnoDB supports only B-tree indexes, and Hash indexes are used internally.
What is the structure of a B tree
A B-tree is a self-balancing tree that can keep data in order. This data structure allows data lookup, sequential access, insertion, and deletion to occur in logarithmic time. B tree is a generalized binary search tree. A node can have more than two children. Unlike self-balanced binary search trees, B-trees are suitable for storage systems that read and write relatively large blocks of data, such as disks. B trees reduce the intermediate process of locating records and thus speed up access. A B-tree is a data structure that can be used to describe external storage. This data structure is applied to database and file system implementations.
define
A b-tree of order m is a tree with the following properties:
- Each node has at most m child nodes
- Each non-leaf node (except root node) has at least ⌈m/2⌉ child node (This feature allows the tree to be adjusted to preserve its properties when new values are deleted or inserted into the tree.)
- If the root node is not a leaf node, it has at least two child nodes
- A non-leaf node with K children has k-1 keys
- All the leaf nodes are in the same layer
The keys of each internal node separate the subtree of the node. For example, if an internal node has three subtree nodes (subtrees), it must have two keys: A1 and A2. All values of the left subtree must be less than A1, all values of the middle subtree must be between A1 and A2, and all values of the right subtree must be greater than A2.
Some balanced trees store values only in leaf nodes, and leaf nodes and inner nodes use different structures. B trees store values in each node, and all nodes have the same structure. However, leaf nodes have no children, so the performance of b-trees can be improved by using specialized structures.
The complexity of the
algorithm | On average, | The worst | |
---|---|---|---|
space | O(n) | O(n) | |
search | O(log n) | O(log n) | |
insert | O(log n) | O(log n) | |
delete | O(log n) | O(log n) |
Using the concept of
- The values are kept in order, traversed sequentially
- Use hierarchical indexes to minimize disk reads
- Use incomplete filled blocks to speed up inserts and deletions
- Index balance is maintained through an elegant traversal algorithm
In addition, B-trees optimize the minimum space waste by ensuring that internal nodes are at least half full. A B-tree can have any number of insertions and deletions.
disadvantages
The maximum length of a key cannot be changed unless the database is rebuilt. This causes many database systems to truncate people’s names to less than 70 characters.
variant
The term B-tree can specify a particular scheme, it can specify a general class of schemes. In a narrow sense, a B-tree stores key values in its internal nodes, but does not need to store records of those key values in leaf nodes. The general class includes variations such as B+ trees and B* trees.
Now let’s move on to B+ trees.
What is the structure of a B+ tree
A B+ tree is a data structure commonly used in databases and file systems of operating systems. The characteristic of B+ tree is that it can keep the data stable and orderly, and its insertion and modification have relatively stable logarithmic time complexity. B+ tree elements insert themselves up, as opposed to binary trees.
B+ trees have real advantages over alternative implementations when the node access time is much longer than the in-node access time. This usually occurs when most nodes are in this level of storage such as hard disks. By maximizing the number of children within each internal node to reduce the height of the tree, balancing operations occur less often, and efficiency increases. Establishing this value usually requires each node to occupy a full disk block or approximate size in secondary storage.
The idea behind B+ is that internal nodes can have a variable number of destination children within a predetermined range. Therefore, B+ trees do not need to be rebalanced as often as other self-balanced binary search trees. The low and high bounds on the number of children are fixed for a particular implementation. For example, in a 2-3 B tree (often referred to simply as a 2-3 tree), each internal node may have only two or three child nodes. If a node has an invalid number of children, it is considered to be in an illegal state.
define
Nodes in a B+ tree are usually represented as an ordered set of elements and subneedles.
A B+ tree of order m is a tree with the following properties:
- Each node except the root node contains at least ⌊m/2⌋ elements and at most m-1 elements
- There are at most m child Pointers for any node.
- For all internal nodes, there is always one more child pointer than there is element.
- All leaves are at the same height, and the leaf nodes themselves are linked in order of keyword size from small to large.
Note:
Here, M means that a node can have at most M child nodes rather than the hierarchy of the tree, which is the maximum number of elements on each node. M in the figure above is 4, so there are at most 4 subtrees, and at most 3 elements. The most subtrees are definitely 1 greater than the most elements.
To find the
The lookup is done in a typical manner, similar to a binary lookup tree. Starting at the root node, traverse the tree from the top down, selecting the child pointer on either side of the value you want to split. Binary lookups are typically used within nodes to determine this location.
insert
For a node to be in an offending state, it must contain an unacceptable number of elements.
- First, find the location of the node to insert. The value is then inserted into the node.
- If no node is in the violation state, the handling is complete.
- If a node has too many elements, split it into two nodes, each with a minimum number of elements. Continue up the tree recursively until you reach the root node, and if the root node is split, create a new root node. For this to work, the minimum and maximum number of elements typically must be selected to be at least half of the maximum number.
delete
- First, find the value to delete, and then remove it from the node that contains it.
- If no node is in the violation state, the handling is complete.
- If the node is in an illegal state, there are two possible situations:
- Its sibling nodes, children of the same parent node, can transfer one or more of its children to the current node and return it to the legal state. If so, the processing ends after changing the separation values for the parent and two siblings.
- Its sibling nodes have no additional children because they are on low boundaries. In this case the two siblings are merged into a single node, and we recurse to the parent because it removes a child node. This continues until the current node is legal or reaches the root node, on which the children of the root node are merged and the merged node becomes the new root node.
Structural comparison of B+ trees and B trees
The tree type | Any node | Internal node (non-leaf node (except root node)) | The root node | A leaf node |
---|---|---|---|---|
B tree | Each node has at most m child nodes (The same) | Each non-leaf node (except root node) has at least ⌈m/2⌉ child node; There are kChild nodesNon-leaf nodes have k-1 keys | If the root node is not a leaf node, it has at least two child nodes; There are kChild nodesNon-leaf nodes have k-1 keys | All the leaf nodes are in the same layer |
B + tree | There are at most m child Pointers for any node (The same) | For all internal nodes, there is always one more child pointer than there is element. Each node except the root node contains at least ⌊m/2⌋ elements and at most m-1 elements | All leaves are at the same height, and the leaf nodes themselves are linked from small to large according to the keyword size; Each node except the root node contains at least ⌊m/2⌋ elements and at most m-1 elements |
It should be noted that B+ tree is a deformation of B tree, so B+ root node should also meet the requirements of B root node.
You can see that there’s not much difference between a B+ tree and a B tree. In addition, the requirement of the minimum number of internal nodes is more strict than that of B-tree (at most one more), and at most M-1; Leaf nodes are also more stringent than B-trees, requiring all keywords, small to large links, at least ⌊m/2⌋ elements, and at most m-1 elements.
What is the structure of a Hash index? (Adaptive Hash index)
As mentioned above, InnoDB indexes do not support Hash indexes, but use Hash indexes internally to implement adaptive Hash indexes. Let’s take a quick look.
Adaptive indexes InnoDB is designed to help B+ tree indexes build hash indexes with prefixes of indexes that are frequently accessed. The key is a hash and the value is a pointer to the index value (the index leaf node).
InnoDB has a mechanism to monitor index searches and will do so automatically if InnoDB mainly benefits from queries that can build hash indexes.
However, in the case of high concurrency, the access to the adaptive hash index sometimes becomes a competing resource; And hashing can only locate explicit values. So the innodb_adaptive_hash_INDEx_parts variable can be turned on or off as needed, MySql8.0 is turned on by default.
Why index structure using B+ tree, not B tree, Hash, binary, balanced binary, red black tree?
You need to know why, you need to know about other data structures. Red black tree see Red black Tree Understanding and Manual implementation.
Why not Hash?
Hash computes Hash mappings based on keywords. It is suitable for equivalent query, but cannot compare the size of a range of queries, cannot be partially queried, cannot be sorted, must be returned to the table. Indexes cannot be directly used to return results or join, and the Hash is duplicate and the same key.
Why not use a binary tree?
Directly compared with balanced tree, binary tree cannot self-balance, data is not uniform, query efficiency is unstable and prone to extreme situations, so it is better to consider balanced binary tree.
Why not balance a binary tree?
Only one keyword can be stored on a node, which is not conducive to the use of complete block storage and cannot give full play to the efficiency of sequential disk read and prefetch. A node has a maximum of two subtrees. In the case of a large amount of data, the tree level is very deep, which makes it difficult to read and construct the tree from the disk. At the same time, it is inefficient to adjust the tree structure to maintain balance.
Balanced binary tree, if the extreme case is encountered, each insertion of data is larger than the last insertion, thus becoming a single-chain tree, time complexity O(N).
Why not use a red black tree?
Red-black trees are less expensive to maintain than balanced binary trees (they don’t need to be constantly tweaked to maintain balance), but they have the disadvantages of balanced binary trees.
Why not use B trees?
From the comparison of the B+ tree and the B+ tree above, it can be found that the B+ tree is more strict than the B tree.
- The maximum number of nodes is M-1, which does not cause too many internal nodes.
- The leaf node includes all the keywords to make it easy to fetch bulk data directly from the leaf node.
- Leaf nodes from small to large, easy to sort and search in order.
What is the composite index structure?
Composite index of the underlying data structures or B + tree, only more than one column of B + tree child nodes of the keyword, such as A, B, C three fields of composite index, then the order of the keywords is A B after the final C first, that is to say, only under the condition of same B and orderly, under the condition of A and B are the same C will only be orderly, in fact is easy to understand, Just like comparing the size of a string, only the preceding characters determine the order, no matter how long the string is.
The general structure is shown in the figure below, which is the combined index of first name, first name and year of birth.
What are the structural differences between clustered indexes and non-clustered indexes?
You can see the difference between InnoDB aggregated index and non-aggregated index in the figure above.
Clustering index
In InnoDB, clustered index is not a single index type. The underlying structure is still B+ tree, but the whole row of data is stored in the leaf node. The difference from MyISAM is obvious (index structure and data storage are separated). If there is no primary key, the first unique index is used. If there is no primary key, InnoDB creates an implicit clustered index -Row ID is 6 bytes (see InnoDB for details).
The author think
The keyword and row data of the leaf nodes of the clustered index are stored together, which obviously reduces disk I/O operations. But suppose I only need the Id of one table and then go to other tables for associative queries, wouldn’t it save I/O if the leaf nodes of the primary key index were not stored together with the data and didn’t need to traverse all rows? But is there such a scenario? The answer is very few. If you want all ids, why don’t you just check the association table directly? What’s the point of association? Clustered indexes can also be faster if you need to filter with other columns and have no other indexes.
Non-clustered index
A non-clustered index is also called a secondary index, such as the secondary key index in the figure above. A leaf node has a primary key column as well as an index column. To query a row, you need to go through the cluster index to retrieve the row based on the primary key of the leaf node.
The author think
So much trouble with non-clustered indexes? A second search of the B+ tree is required, which is better than MyISAM. If a leaf node holds a row pointer, what if it does not need to return the table and only needs to return the secondary index row value and primary key value? What about syntables? So the primary key optimizes the query, and InnoDB also optimizes the adaptive hash index (see section above), so it’s much better than MyISAM.
How to calculate the B+ tree index hierarchy given the amount of data?
The first thing we need to know is how many bytes a field, an index pointer, or an InnoDB page is.
How to calculate the number of bytes for different types of fields?
Common ones are given below
- Tinyint: 1 byte
- Int: 4 bytes
- Bigint: 8 bytes
- Datetime: 8 bytes
- Varchar: UtF8 plus 1 or 2 bytes of length, depending on the encoding (latin1, 1 byte for a character, GBK, 2 bytes for a character, utF8, 3 bytes for a character)
- Char: based on encoding (latin1 encoding, one byte for a character, GBK encoding, two bytes for a character, utF8 encoding, three bytes for a character), usually added with UTF8.
- Decimal (M,D) : refer to Mysql decimal for details on how to calculate decimal(10,2).
How many bytes does the index pointer take?
Official documents show:
Each index record contains a 6-byte header. The header is used to link together consecutive records, and for row-level locking.
Copy the code
The index takes 6 bytes, which is 2^48=256T, which is plenty.
How many bytes does InnoDB take on a page?
The default length of an official document is 16 bytes.
What is the maximum control InnoDB B+ tree has on the number of children inside a node?
I haven’t found it yet, perhaps because it is difficult to define the index keyword field with different length types, so I directly constrain it according to page size.
Have the author that knows can comment message, the author undertakes updating again.
To calculate
We can calculate the common bigINT autoincrement primary key, how many rows of data can be organized by B+ number, assuming that one row is 1K.
- One page can hold 16 lines of data
- Index child nodes account for 8 (Bigint) +6 (pointer) =14 bytes, because B + tree element +1 equals the number of Pointers. If a page can have x Pointers, then 8*(x-1)+6x=16*1024, the result of x forensics down is 1170. So there’s about 1170 Pointers on a page.
so
- If it’s two layers it’s 16*1170=18720 rows. (More than 10,000)
- If it’s three, it’s six11701170=21902400 rows of data. (more than 20 million)
- If it’s four, it’s six11701170*1170=25625808000 lines of data. (more than 25.6 billion)
3-4 layers can meet most scenarios, and when reaching 4 layers, we need to start to consider the division of libraries and tables. The more layers there are, the lower the efficiency of B+ trees will be, and the more resources will be consumed.
Why are B+ trees always 3-4 stories high in production environments?
In order to verify this, we can query the Mysql metadata table. Here are the results of the author’s query:
mysql> select b.name,a.name,index_id,type,a.space,a.PAGE_NO from information_schema.INNODB_INDEXES a,information_schema.INNODB_TABLES b where a.table_id=b.table_id AND a.space <> 0; +----------------+-----------------+----------+------+-------+---------+ | name | name | index_id | type | space | PAGE_NO | +----------------+-----------------+----------+------+-------+---------+ | sys/sys_config | PRIMARY | 152 | 3 | 1 | 4 | | my_test/test | test_name_index | 155 | 0 | 2 | 5 | | my_test/test | PRIMARY | 153 | 3 | 2 | 4 | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + + -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- + 3 rows in the set (0.02 SEC)Copy the code
My_test is my local test library. Test is my test table. You can find the root page of the primary key index on the second page. Let’s look at the page level stored in the root page
First let’s see where the table is stored:
mysql> show global variables like "%datadir%"; +---------------+--------------------------+ | Variable_name | Value | +---------------+--------------------------+ | Datadir | / opt/homebrew/var/mysql / | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + 1 row in the set (0.01 SEC)Copy the code
So table location for/opt/homebrew/var/mysql/my_test/test. The idb
4161024+64=65600 page level is stored at the root offset of 64
xxx xxx % hexdump -s 65600 -n 10 test.ibd
0010040 00 00 00 00 00 00 00 00 00 99
001004
Copy the code
It can be found that the page level is 0, the page level is the layer where the leaf is removed, and the index is 1, so the index is not used because the author’s data is only a few lines.
Why can’t index keys be too long?
As can be seen from the above calculation index hierarchy, the larger the index key (including single column and multiple columns) is, the smaller the number of Pointers of nodes will be. Therefore, in the case of the same B+ number hierarchy, the smaller the number of data rows that can be indexed will be, or the larger the hierarchy of the same data row tree will be, the lower the efficiency will be. Therefore, the smaller the point is set according to the business as much as possible.
reference
B+Tree
B tree
Create a performance index for mysql
High Performance MySQL: Clustering index
Mysql website
Why are B+ trees always 3-4 stories high in production environments?