Preface:

I have read many articles about B+ Tree before, and I always felt that I understood them at that time, but soon I forgot them again. It wasn’t until I read the Nuggets handbook: How Mysql runs chapters 7 and 8 that I finally understood what B+ Tree was all about. If you are interested in how to implement Mysql internally, you can go to the booklet. I read these 2 chapters by myself, and each chapter took 2 hours. After all, it introduces a lot of concepts. This article is an overview of the two chapters and can also be read as a guide. After reading this, it may not be so difficult to read the booklet. (PS, this is the most detailed series of articles on the inner workings of Mysql I’ve ever seen.)

Question:

1. How is data stored in InnoDB? 2. 3. What is the process of searching for records? 4. What is page splitting?

The preparatory work

Yeah, first of all, forget the data structure. This was mentioned in the introduction to Buffer pools

InnoDB reads data from disk files into memory in pages (typically 16K in size). Multiple records on a page (at least 2 records per page)

CREATE TABLE `school` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

delimiter;;
CREATE PROCEDURE idata ( ) BEGIN
	DECLARE
		i INT;
	
	SET i = 1;
	WHILE
			( i <= 25 ) DO
			INSERT INTO school
		VALUES
			( i, 'school' );
		SET i = i + 1;
		
	END WHILE;
	
END;;

delimiter;
CALL idata ( );
Copy the code

Now we have created a school table and inserted 25 records. In the real world, hundreds or thousands of records can be recorded on one page. For the sake of understanding, we assume that only five records can be stored on a page. Then the storage situation is shown as follows:

  • Page to page isTwo-way linked list, which facilitates the range search.
  • Page numberNot necessarily continuous
  • Records in a page are sorted by Key. Between recordsThe list

Knowing where the records are stored, finding a record with an ID becomes two problems:

  1. How doRapid positioningOn which page does a Key(id in this case) exist?
  2. How to quickly find the record corresponding to the Key in the page?

How to quickly locate a page

For quick location, the easiest thing to think of is to set up a dictionary, such as (1,200),(2,200),(6,210), which does meet the need for quick lookup, but is a waste of space. So switch methods:

  • We use key,value, key for each pageThe minimum value, value Indicates the corresponding recordData pagethePage number. So that’s what happensDirectory entry.For example, now we have 25 records, 5 records per page, thenAt least,It’s five pages, soAt least,There are fiveDirectory entry.
  • Directory entryManagement is also done in the form of pages. Five records make one page, so five entries make onePage directory. Contents pagerecord(that is, table of contents) are also arranged in order

So this is the simplest B+Tree. The leaf node is the data page, storing data. (Why is the index data for clustered indexes?) Non-leaf nodes hold directory entries.

The search process

select * from student where id = 14
Copy the code

So we’re going to start with id = 14. Then start the search from the root node (in this case, page 400), page 400 has 5 records, search through the binary search algorithm, because 11 < 14 < 16 so 14 records correspond to the same page as 11 records correspond to the page. Keep in mind the following two features: A key in a directory entry records the minimum value of a key in the corresponding page. And no matter the data page or table of contents page, the records are in order. So 14 is page 212. Then go to page 212 and do a binary search to find the data.

Page divided

If I insert another piece of data at this point.

insert into student values (28.'Cathy')
Copy the code

Since the data pages are full, a new page must be opened to store the record 28. Let’s say the page is 310. But at the same time, because the directory page 400 is also full, it cannot record the directory item (28,310), so we must add a new directory page such as page 410 to store the new directory item. At this time, because there are two directory pages, so need to create a new directory page to manage the directory page, ‘ ‘yes, the directory page in the records are not necessarily pointing to the data page, may also point to the directory page. It will look like this :(the picture is a little ugly, if you have a good drawing software, please recommend it)



The build process is different from the one I mentioned above.
Whether it's a data page or a catalog page
All roads lead to Rome

  • The root is fixed. (The change of the root node from 400 to page 420 would not exist in a real database.) A real flow would look something like this: suppose you start with only five pieces of data on page 10Page 10 is both the data page and root node.At this point, Innodb will insert a new dataPages 10Replication isData page 20And then toData page 20To divide, to divideData page 20Part of the data migration is obtainedData page 30, and then the newly inserted data is inserted into the corresponding index according to the size of the indexPage 20 or page 30In the. This time the originalPage 10 becomes the contents page,The recorded table of contents entries correspond to pages 20 and 30.But page 10 is still the root node.

conclusion

  • Data records are stored on a page basis, and each page contains multiple records.
  • For convenience from numerousData pageFind the corresponding record in the designPage directory. The catalog and data pages form a B+ Tree.
  • B+ Tree The root node is fixed and data is insertedPages apart.
  • Maybe the whole article did not mention what the data structure of B+ Tree is, but I hope that through the application of B+ Tree in the real scene of InnoDB, you can have a concrete concept of the function of B+ Tree, and what the leaf nodes and non-leaf nodes in B+ Tree are. Have a good understanding of the search process and insert process described below. Go back to the article that introduces the data structure of B+Tree and you will understand it more easily. Any data structure exists to serve certain scenarios,It is easier and more impressive to see the implementation after understanding the application scenario.