1. What is B+ tree

B+ trees actually evolved from the data structure of trees. At first, trees evolved into binary trees, balanced binary trees, B trees (B- trees), and then evolved into B+ trees. Later, some people evolved, and now there are B* trees. The evolution of this tree is a multi-fork tree designed for disks or other storage devices, primarily to reduce disk IO.

2. Why is B+ more suitable for database index storage

B+ tree non-leaf node values store index without access to relevant data, so non-leaf nodes are smaller. If the indexes of the same internal node are stored in the same disk block, the more indexes are read into memory at a time, and the relative I/O read times are reduced

3. B+ tree search process

Suppose you have the employee table structure

CREATE TABLE `employee` (
  `id` int(11) NOT NULL,
  `age` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_age` (`age`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into employee values(100, 43);
insert into employee values(200, 48);
insert into employee values(300, 36);
insert into employee values(400, 32);
insert into employee values(500, 37);
insert into employee values(600, 49);
insert into employee values(700, 28);
Copy the code

select * from Temployee where age=32; What does it look like

  1. First draw the index structure diagram of idx_age index, roughly as follows:

2. Then draw the primary key index of ID. Let’s draw the cluster index structure diagram as follows:

Therefore, this SQL query is executed roughly as follows: search the idx_age index tree, load disk block 1 into memory, search the left branch to address disk block 2 since 32<37. Age =32; id = 400; age=32; If id=400, go back to the primary key index tree. Search id primary key index tree, load disk block 1 into memory, traversal in memory, find 400, but B+ tree index non-leaf nodes do not save data. The index continues to search the right branch of 400 to address the disk block 3. Load the disk block 3 into memory, traverse the memory, find the record id=400, get the data line R4, ok, done. Therefore, this SQL query performs several tree searches, is not a step. In particular, after the primary key ID is found in the idx_age secondary index tree, the search process of the primary key index is called back to the table.

Reference: juejin. Cn/post / 692378…