preface

In MySQL, both Innodb and MyIsam use B+ tree as index structure (hash and other indexes are not considered here). This article will start with the most common binary search tree, and gradually explain the problems solved by various trees as well as the new problems, so as to explain why MySQL chose B+ tree as the index structure.

Binary search tree (BST) : unbalanced

Binary Search Tree (BST), also known as Binary sorting Tree, on the basis of Binary Tree, all nodes in the left subtree of any node must have values not greater than the root node, and all nodes in the right subtree of any node must have values not less than the root node. Here is a BST:

Storing data in BST is a common choice when quick lookups are needed, because the query time is dependent on tree height and the average time complexity is O(LGN). However, the BST may become skewed and unbalanced, as shown in the figure below, when the BST degenerates into a linked list and its time complexity degenerates into O(n).

To solve this problem, a balanced binary tree is introduced.

2. Balanced binary tree (AVL) : rotation time

AVL trees are strictly balanced binary trees, and the height difference between left and right subtrees of all nodes cannot exceed 1. AVL tree lookup, insertion, and deletion are O(LGN) on average and in the worst case.

The key to AVL balance is rotation operations: insertion and deletion can upset the balance of the binary tree, requiring one or more tree rotations to rebalance the tree. When inserting data, only 1 rotation is required at most (single or double rotation); However, when deleting data, the tree will be unbalanced, and AVL needs to maintain the balance of all nodes along the path from the deleted node to the root node with an order of O(LGN) of rotation.

Due to the time consuming of rotation, AVL tree is very inefficient in deleting data; When there are many deletions, the cost of maintaining balance may outweigh the benefits, so AVL is not widely used.

Red and black trees: Trees are too tall

In contrast to AVL trees, red-black trees are not strictly balanced, but roughly balanced: just ensuring that the longest possible path from root to leaf is no more than twice as long as the shortest possible path. From the perspective of implementation, the biggest characteristic of red-black tree is that each node belongs to one of two colors (red or black), and the division of node color needs to meet specific rules (detailed rules are omitted). The following is an example of a red-black tree:

Compared with AVL tree, the query efficiency of red-black tree will decrease, because the balance of the tree becomes worse and the height is higher. However, the deletion efficiency of red-black trees is greatly improved, because red-black trees also introduce color. When inserting or deleting data, only O(1) rotations and color changes are required to ensure basic balance, instead of O(LGN) rotations as AVL trees do. In general, red-black trees have better statistical performance than AVL.

Therefore, in practical applications, AVL trees are used relatively little, while red-black trees are widely used. For example, TreeMap in Java uses red-black trees to store sorting key-value pairs; HashMap in Java8 uses linked lists + red-black trees to resolve hash collisions (linked lists when there are few conflicting nodes, red-black trees when there are many).

Red-black trees perform very well when the data is in memory (such as TreeMap and HashMap above). But in the case of data on secondary storage devices such as disks (such as databases such as MySQL), red-black trees are not good because they are still too tall. When data is on disk, disk I/O becomes the biggest performance bottleneck. The design goal should be to minimize the number of I/OS. The higher the tree height is, the more I/OS required for add, delete, modify, and query, which seriously affects the performance.

B tree: For disk

A B-tree, also known as a B-tree (not a minus sign), is a multi-path balanced search tree designed for auxiliary storage devices such as disks. Compared with a binary tree, each non-leaf node of a tree can have more than one sub-tree. Therefore, when the total number of nodes is the same, the height of B tree is much smaller than AVL tree and red-black tree (B tree is a “dumpy”), and the disk I/O times are greatly reduced.

The most important concept to define B-tree is Order. For an M-order B-tree, the following conditions should be met:

Each node contains a maximum of m child nodes.

If the root node contains child nodes, it contains at least two child nodes. In addition to the root node, each non-leaf node contains at least M /2 child nodes.

A non-leaf node with K child nodes will contain K-1 records.

All leaf nodes are in the same layer.

It can be seen that the definition of B-tree is mainly to limit the number of child nodes and records of non-leaf nodes.

Here is an example of a third-order B-tree:

In addition to the advantages of small tree height, B trees also make use of the principle of access locality. The so-called locality principle means that when a data is used, the nearby data has a high probability of being used in a short time. B tree stores data with similar keys in the same node. When accessing one of the data, the database will read the whole node into the cache. When adjacent data is immediately accessed, it can be read directly from the cache without disk I/O. In other words, b-trees have a higher cache hit ratio.

B tree has some applications in database, such as mongodb index using B tree structure. However, in many database applications, the variant B+ tree is used.

Fifth, B + tree

B+ tree is also a multi-path balanced search tree, which differs from B tree mainly in:

In a B tree, every node (including leaf nodes and non-leaf nodes) stores real data. In a B+ tree, only leaf nodes store real data, while non-leaf nodes only store keys. In MySQL, the real data may be the entire data of the row (as in Innodb’s clustered index), the primary key of the row (as in Innodb’s secondary index), or the address of the row (as in MyIsam’s non-clustered index).

Whereas a record in a B tree appears only once and does not repeat itself, a key in a B+ tree may repeat itself — either at a leaf or at a non-leaf node.

The leaves of a B+ tree are linked by a bidirectional linked list.

The number of non-leaf nodes in B tree is 1 less than the number of child nodes. The number of records in a B+ tree is the same as the number of child nodes.

Therefore, compared with B trees, B+ trees have the following advantages:

** Fewer IO times: ** The non-leaf nodes of a B+ tree only contain keys, not real data, so each node stores a much larger number of records than the B number (that is, order M is larger), so the height of the B+ tree is lower and fewer IO times are required to access. In addition, because there are more records stored per node, the principle of access locality is better utilized and the cache hit ratio is higher.

More suitable for range query: when range query is carried out in B tree, the lower limit to be searched is found first, and then the middle order traversal of the B tree is carried out until the upper limit of the search is found; The range query of B+ tree only needs to traverse the linked list.

More stable query efficiency: The query time complexity of A B tree is between 1 and tree height (recorded at the root and leaf nodes respectively), whereas the query complexity of a B+ tree is stable at tree height because all data is located at the leaf node.

B+ trees also have a disadvantage: they take up more space because the keys are repeated. However, the spatial disadvantage is often acceptable compared to the performance advantage, so B+ trees are more widely used in databases than B trees.

6. Feel the power of the B+ tree

As mentioned earlier, the biggest advantage of a B tree /B+ tree over a binary tree like a red-black tree is that the tree is smaller in height. In fact, for Innodb’s B+ index, the height of the tree is usually 2-4 levels. Now let’s do some concrete estimates.

The height of a tree is determined by the order, and the higher the order, the shorter the tree; The size of the order depends on how many records each node can store. Each Innodb node uses a page, the size of the page is 16KB, the metadata is only about 128 bytes (including file management header information, page header information, etc.), most of the space is used to store data.

For non-leaf nodes, the record contains only the key of the index and a pointer to the node at the next level. Assuming 1000 records are stored per non-leaf page, each record occupies about 16 bytes. This assumption is reasonable when the index is an integer or a short string. By extension, we often hear the advice that the index column length should not be too large for this reason: if the index column is too long and each node contains too few records, the tree will be too tall, the index will be less effective, and the index will waste more space.

For leaf nodes, records contain keys and values of indexes (values may be the primary key of a row, a full row of data, and so on, as described above), and the amount of data is larger. It is assumed that 100 records are stored per leaf page (in fact, this number may be less than 100 when the index is clustered; When the index is secondary, this number can be much higher than 100; Can be estimated according to the actual situation).

For a 3-tier B+ tree, the first tier (root node) has 1 page and can store 1000 records; The second layer has 1000 pages and can store 1000 * 1000 records; The third layer (leaf node) has 1000 * 1000 pages, and each page can store 100 records, so it can store 1000 * 1000 * 100 records, or 100 million records. For binary trees, storing 100 million records requires about 26 layers.

Seven,

Finally, summarize the problems that various trees solve and the new problems they face:

Binary search tree (BST) : solves the basic problem of sorting, but can not guarantee balance, may degenerate into a linked list;

Balanced binary tree (AVL) : the problem of balance is solved by rotation, but the efficiency of rotation operation is too low.

Red-black tree: The problem of low AVL rotation efficiency was solved by eliminating strict balance and introducing red-black nodes, but the tree was still too high and IO times were too many in disk and other scenarios

B tree: the problem of too high tree is solved by changing binary tree to multi-path balanced search tree.

B+ tree: On the basis of B tree, non-leaf nodes are transformed into pure index nodes that do not store data, further reducing the height of the tree; In addition, leaf nodes are linked into linked lists using Pointers, which makes range query more efficient.

Original: www.jb51.net/article/203…

Follow my wechat official number, there are more dry goods waiting for you to take