1. A red-black tree
1.1. Definition and properties of red-black trees
Red-black tree is a binary search tree with self-balancing red-black nodes. It must have the following properties:
- Each node is either black or red.
- The root node is black.
- Each leaf node (NIL) is black.
- Both children of every red node must be black.
- The path from any node to each leaf node contains the same number of black nodes.
1.2. Self-equilibrium of red-black trees
Red-black trees can achieve self-balance by three main operations: left rotation, right rotation, and color change.
Left-rotation: a node is used as the fulcrum (rotation node), and its right child becomes the parent of the rotation node, the left child of the right child becomes the right child of the rotation node, and the left child remains unchanged.
Dextral: a node is used as the fulcrum (rotation node), and its left child becomes the parent of the rotation node, the right child of the left child becomes the left child of the rotation node, and the right child remains unchanged.
Left-rotation only affects the structure of the rotated node and its right subtree, moving the node of the right subtree to the left subtree. Dextral rotation only affects the structure of the rotated node and its left subtree, moving the node of the left subtree to the right.
Discoloration: the color of the node changes from red to black or from black to red.
2.B tree (Multi-path balanced search tree)
B tree is a balanced search tree designed for external storage devices such as disks. When the system reads data from disks to memory, the basic unit is disk blocks. The data in the same disk block is read at a time instead of what is needed.
We can represent data records on disk as a binary of the form [key, val], where key is the primary key value in the table, and data is the data corresponding to the primary key. The key values are different for different records.
In the B tree, we can regard each disk block as a node of the B tree. Each node contains key primary keys in ascending order. These key primary keys contain corresponding data data, and Pointers to child nodes are separated. The pointer on the right points to a larger key than the current value.
As shown in the figure below, simulate the search for data with key 29:
- Locate disk block 1 based on the root node and read it into memory. [Disk I/O operation 1]
- Compare keyword 29 in the interval (17,35) to find pointer P2 to disk block 1.
- Locate disk block 3 according to P2 pointer and read into memory. [Disk I/O operation 2nd]
- Compare keyword 29 in interval (26,30) to find pointer P2 to disk block 3.
- Locate disk block 8 according to P2 pointer and read into memory. [Disk I/O operation 3]
- Find keyword 29 in the keyword list in disk block 8.
3. B + tree
The B+ tree is an optimized version of the B tree to make it more suitable for external storage index structure. InnoDB storage engine uses the B+ tree to achieve its index structure.
Each node in the B tree contains not only the key value of the data, but also the data value. The storage space of each page is limited. If the data is large, the number of keys that can be stored on each node (that is, a page) is small. If the data is large, the B-tree depth is large, which increases the disk I/O times during query and affects the query efficiency.
In a B+ tree: 1. Non-leaf nodes store only key information. 2. There is a chain pointer between all leaf nodes. 3. Data Data records are stored in leaf nodes.
There are usually two head Pointers on a B+ tree, one to the root node and the other to the leaf node with the smallest keyword, and there is a chain-ring structure between all the leaf nodes (that is, the data nodes). So you can do two kinds of lookups on B+ trees: range and paging lookups on primary keys, and random lookups starting at the root node.
4. Summary of B tree and B+ tree
A B tree (B+ tree) is a multi-path balanced search tree. In a B tree, each node contains multiple key keys and data (in ascending order by key value). The key value separates Pointers to child nodes.
All data data in B+ tree are placed in leaf nodes, and a chain ring structure is formed in leaf nodes.