In-depth understanding of MySQL index underlying data structures

Author: IT Wang Xiaoer

Blog: itwxe.com

There are two types of data structures associated with MySQL indexes, one is B+tree, the other is Hash, so why 99.99% of the time use B+tree indexes?

What is the underlying data structure of the index?

Next, listen to xiao Er.

What is the index

An index is a sorted data structure that helps MySQL obtain data efficiently. Therefore, it can be concluded that indexes are data structures!

Of course, the above two sentences may seem abstract, but what are some examples of indexes in life?

Small 2 above figure “jin Ping Mei” this is an example, the catalogue of books is arranged sequentially, there is the first chapter, the second chapter… This is a sorted data structure.

Table of contents can quickly help us to quickly locate the chapter we want to read by page number. For example, if we want to read the third chapter of “Jin Ping Mei”, turn to page 55…

Children can not watch “Jin Ping Mei”, or they will be like a small second nosebleed, ha ha ha.

That is a bit more complicated, want to go to the library to find the book “Jin Ping Mei”.

Libraries tend to store books in categories. The index is made up of nodes, the root node (library) has intermediate nodes (categories for each floor), which are followed by sub-nodes (categories for each row of each floor), and the last layer is leaf nodes (specific books).

As you can see, an index is really just an upside-down tree, a data structure.

With a visual data structure website, it’s easy to learn data structures.

Visual data structure website: www.cs.usfca.edu/~galles/vis…

Foreign website access is slightly slow, of course, you are not easy to go in and ask why is not Chinese, that is of course because of the translation of small two points.

Binary tree series

B+tree binary tree binary tree binary tree binary tree

1. The binary tree

What is “dichotomy”?

Each node has at most two subtrees, left and right, and the order cannot be reversed.

Logically there are five basic forms of binary trees: empty binary trees, binary trees with only one root, only left subtrees, only right subtrees, and complete binary trees (full binary trees are the special cases).

Traversal is the most basic operation of the tree. The so-called traversal of binary tree is to walk through all nodes of the binary tree according to certain rules and order, so that each node is visited once, and only once, with pre-order, middle-order and post-order traversal.

However, the small two will not talk about the various forms of binary tree, traversal… B+tree

Of course, this time there will be guests to jump out, small 2 is clearly not proficient in data structure, but also said so good.

Hahaha, there is a reason for this, of course, the data structure of university study is all returned to the lovely teacher.

Wait for small 2 read “the beauty of data structure and algorithm” again a period of data structure topic, now casually look at the drawing of the two graphs will be about it, the structure is relatively simple.

Because the database index is a data structure that requires good ordering, so the binary tree is not satisfied with the use scenario, so in order to solve the problem of good ordering, then the binary search tree is introduced.

2. Binary search tree

What is binary search tree?

Binary search tree is also called binary search tree. Under the condition of binary tree, the node value of the left subtree is always less than the node value of the root, and the node value of the right subtree is always greater than the node value of the root, that is, the node value of the left subtree < the node value of the root < the node value of the right subtree.

The design in the figure below converts a group of data into a binary search tree, and the time complexity of searching a node data in a reasonable binary search tree is the same as that in binary search.

But if design doesn’t care what design looks like?

Poorly designed and unbalanced, the binary search tree can even become sequential rather than binary, as shown in the figure below, the binary search tree eventually designed from the same array.

As you can see, when looking for 69 and this data, in the first graph, it only takes 2 IO to build a reasonable binary search tree to find the data; In the second figure, the highly unbalanced binary search tree takes six disk I/OS to find the data.

Therefore, binary search tree solves the principle of database index ordering, but binary search tree construction may be extremely unbalanced, and finally build a linked list, at this time, we need to use the balanced binary tree, which is commonly referred to as AVL tree.

3. Balanced binary tree (AVL tree)

What is “balanced binary”?

First, it conforms to the definition of a binary search tree, and then it must satisfy that the absolute value of the difference between the heights of the two subtrees of any node does not exceed 1.

Under this condition, the binary search tree is guaranteed to be balanced, similar to the following structure.

Compared with binary search tree, balanced binary search tree has more stable search efficiency and higher overall search speed.

The query speed of balanced binary tree is very fast, but the cost of maintaining a balanced binary tree is very high. In general, one or more left and right rotations are required to achieve tree balance after insertion, update, and deletion.

Balanced binary tree As data increases, the height of balanced binary tree will get higher and higher, about 1000 data has 9-10 layers, that is, it may take 9-10 IO to find a data.

Generally speaking, the general mechanical disk can do at least 100 IO per second, an IO time is basically in 0.01 seconds, that is to say, 1000 pieces of data in the search for 0.1 seconds, that is, if it is 10000 pieces, 1000000 pieces of it.

Therefore, b-tree and B+ Tree data structures are proposed to solve the IO problem caused by high balanced binary tree height.

B+tree B+tree

In order to solve the I/O problem caused by the high height of the balanced binary tree, we thought of a solution, can we put more elements on each node, so as to reduce the height of the balanced binary tree, reduce the DISK I/O, so there are B-tree and B+tree.

1. The b-tree tree (B)

What is the definition of b-tree?

A B-tree is a multi-path search tree. A B-tree of order M has the following features:

Each node has a maximum of M children.

If the root node is not a leaf node, there are at least 2 children.

Each node except the root node and leaf node has at least Ceil(m/2) children.

Each non-leaf node node contains n keyword information (K1, K2… , Kn). The number of keywords n is Ceil(m/2)-1 <= N <= M-1.

Ki(I = 1, 2… N) is the keyword, and the keyword is sorted in ascending order.

Pi(I = 1, 2… , n) is the pointer to the sub-root node. The keywords of all nodes of the sub-tree pointed to by P(i-1) are all less than K(I), but all are greater than K(i-1).

All leaf nodes are in the same layer and contain no other keyword information.

Note:

Order M refers to a node with a maximum of m child nodes, rather than the height of the tree. In other words, order 3 b-tree means that each node has a maximum of three child nodes.

Ceil(m/2) is rounded up, for example, Ceil(3/2)=2, Ceil(5/2)=3

Bala Bala said so many, the passenger officer dizzy did not, the data organization, of course, have to combine the figure, such as a 3 order B-tree structure diagram.

Of course, if you think this map is not good, then small two to change the simplified version of the map, so intimate is worth a praise it.

In the figure, you can see that the B-tree keyword (index) and data (column data except index) are put together. No redundant keyword (index) is stored, and Pointers are used to point to child nodes. The child nodes to the left of the keyword are smaller than the keyword, and the child nodes to the right of the keyword are larger than the keyword.

B-tree greatly reduces the height of the tree by multi-path search, which greatly reduces the disk I/O required to search for a data. For example, if I want to search for the information of element 6, it only takes three disk I/OS to find the desired data.

MySQL > select * from B+tree; MySQL > select * from B+tree; MySQL > select * from B+tree;

2. The B + tree (B + tree)

How do I define B plus tree?

B+ tree is a variant of B-tree and a multi-path search tree. Its definition is basically the same as b-tree, with the following differences:

Non-leaf nodes have the same number of subtree Pointers and keywords

Non-leaf nodes store only keyword information (that is, non-leaf nodes store only indexes and do not store other field information except indexes).

All leaf nodes are connected by Pointers to the next leaf node. (Of course, MySQL has been optimized to make two-way circular lists. See the picture for more information.)

Data records are stored in leaf nodes.

A rank 3 B+tree is shown in the figure below.

Of course, for understanding, the simplified version of the structure of the picture is also prepared.

It can be seen that all data of B+tree is stored in leaf nodes, while non-leaf nodes only store redundant indexes, and bidirectional circular linked lists are used to link leaf nodes.

MySQL > select B+tree from B+tree

Why doesn’t B+tree store data other than indexes in non-leaf nodes?

A: To continue to reduce the height of the tree while allowing non-leaf nodes to store more indexes.

Show global status like ‘Innodb_page_size’; The default is 16384b, which means that the size of a data page is 16KB.

In MySQL InnoDB, a pointer is defined as 6 bytes, so assuming the primary key type is Bigint (8 bytes), a single data in the leaf node is 1K (1024 bytes, usually a table of 20 or 30 fields is not larger than 1K, unless it is a big data type), So how much data can a height 3 B+tree store?

Number of layer 1 storage indexes: 16384/(8+6) = 1170

Number of Layer 2 storage indexes: 16384/(8+6) = 1170

Number of data stored in leaf nodes at layer 3:16384/1024 = 16

That is, a 3-level B+tree can store 1170*1170*16 = 21902400 pieces of data.

If the primary key is int, 1638*1638*16 = 42928704.

If the primary key type is Bigint and the data size is 1K, how many bytes can a 3-storey B-tree store?

Number of data stored at Layer 1:16384/(1024+6) = 15

Number of Data stored at Layer 2:16384/(1024+6) = 15

Number of layer 3 storage data: 16384/(1024) = 16

That is, a 3-level B-tree can store 15*15*16 = 3600 pieces of data.

Compare how much data can be stored between B+ trees and B-trees at the same height. If there is such a huge difference, then of course CHOOSE B+ Tree.

Why do leaf nodes need to be optimized for bidirectional circular lists?

A: Because all the data of B+tree is in leaf nodes, the bidirectional circular linked list between leaf nodes is to improve the performance of interval access and facilitate the range search data.

Reasons for using B+tree Not using B-tree Summary: Further reduce the height of the tree + the number of indexes stored in non-leaf nodes + the bidirectional circular linked list between leaf nodes to improve the performance of interval access and facilitate the range search.

Fourth, the Hash table

MySQL index data structure uses B+tree instead of various binary and b-tree, so why 99.99 times B+tree instead of Hash?

Let’s take a look at what Hash does as an index.

A Hash is performed on the key of the index to determine the location of the data store.

The value can only be “=” or “IN”, but range query is not supported.

Hash conflict problem.

It can be seen that Hash only supports “=” and “IN”, and range query is not supported because it is out of order. IN addition, Hash conflicts are likely to occur when big data is stored IN a database. Therefore, B+ Tree is usually used.

Five, MyISAM index implementation

Although MyISAM was deprecated in MySQL8.0, the current version of MyISAM is MySQL5.7, so let’s talk about the underlying data structure of MyISAM index.

MySQL > install MySQL > install MySQL

Install MySQL5.7.26 on Linux(CentOS7)
Docker builds MySQL and mounts data

Small 2 using the docker installation way, so my data in/itwxe dockerData/mysql/data directory.

Blog_test (test_innoDB, test_myISam); blog_test (test_innoDB, test_myISam);

CREATE TABLE `test_innodb` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `age` int(11) NOT NULL.PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `test_myisam` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `age` int(11) NOT NULL.PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;
Copy the code

Take a look at how MySQL stores the created libraries and tables on disk.

InnoDB storage engine stores two files separately:

FRM: table structure file
Ibd: table index and data file

MyISAM storage engine stores three files respectively:

FRM: table data structure file
MYI: table index file
MYD: table data file

MyISAM storage engine stores index files and data files separately, as shown in the figure.

MyISAM storage engine index also uses B+tree, but the difference is that the data is not stored in the leaf node, the leaf node only stores all the index + data file row records of the disk address.

InnoDB index implementation

We have already seen how InnoDB stores files. Unlike MyISAM, InnoDB indexes and data are placed in leaf nodes.

After looking at the index implementation of the InnoDB storage engine, consider the primary key requirements of a DBA in a company.

In general, corporate DBAs recommend that InnoDB tables must have primary keys and use integral incrementally added primary keys. Why?

Why does the corporate DBA recommend that InnoDB tables have primary keys?
- Table datafile itself is an index structure file organized according to B+tree. If the developer does not create a primary key, MySQL will select a unique index as the primary key. If there is no unique index, MySQL will create a hidden column as the primary key.
- If you can query a single record with a primary key, you can avoid returning to the table.
Why is an integer increment primary key recommended?
- Not under the plastic keys, such as the UUID, when indexing is size problem (plastic compare size faster than UUID, UUID need each character converted to ASCII individually, although little impact performance), at the same time a plastic primary key is usually occupy the space is larger, which means that an index page to store indexes as plastic keys.
- Under non-orthopedic primary keys, the value of the primary key is approximately random. Therefore, each new record is inserted into the middle of the existing index page. MySQL has to move the data in order to insert the new record into the right place, which increases the disk overhead. In the case of orthopedic auto-increment primary key, the data only needs to be kept behind the leaf node, which greatly reduces the disk overhead caused by index page splitting and data moving.

Seventh, secondary index and joint index realization

SQL > select * from B+tree; SQL > select * from B+tree; SQL > select * from B+tree;

1. Secondary index (secondary index)

ALTER TABLE test_innodb ADD INDEX idx_name(name) USING BTREE; For example, create a secondary index B+tree.

SELECT * FROM test_innodb WHERE name = ‘ITwxe’; SELECT * FROM test_innodb WHERE name = ‘ITwxe’; So how do you find it?

First, ITwxe will be queried through secondary index IDx_name (name). After the primary key ID of leaf node is 58, ITwxe will be queried through 58 in B+tree constructed by primary key index, which is called back table.

Isn’t it faster for non-primary key index structure leaves to store full row records? Why does MySQL non-primary key index leaf nodes store primary key values to retrieve table queries instead of storing all row field information?

Consistency problem: if every non-primary key index leaf node stores complete row records, when updating a record, only the record that has been successfully updated in all indexes can be said to have been successfully updated, which increases the complexity and transaction overhead of updating records.
Disk usage problems, if each of the primary key index leaf nodes are stored a complete row, although the speed will be faster than back to the table query, but how many indexes will be how many copies of complete data, so the original occupy 1 gb size table, if there are four primary key index, then the original occupy 1 gb size table will occupy 5 gb of disk, do more harm than good.

2. Combined index (compound index)

ALTER TABLE test_innodb ADD INDEX idex_name_age(name, age) USING BTREE; For example, create a joint index B+tree.

As you can see, when creating a joint index B+tree, the data is sorted in the order in which the index is created, first by name and then by name. If the value of the name field is the same, such as Frank, the row will be sorted in order of age, and so on. Finally, the corresponding primary key ID will be found, and the complete row information will be queried in the B+tree constructed with the primary key.

After reading the associative index structure, we already know that the optimal left prefix principle is why the first field (name) must be in order, because the second field (age) must be in order only if the first field is the same.

If two indexes idx_name(name) and idex_name_age(name, age) exist at the same time, idx_name(name) is a redundant index. Idex_name_age (name, age); idex_name_age(name, age)

3. Clustered index and sparse index

Clustered index means that leaf nodes contain all complete row records, leaf nodes contain indexes and other field information. B+ Tree built with primary key index in InnoDB storage engine is clustered index, and other indexes are sparse. For example, secondary indexes, federated indexes, MyISAM storage engine indexes are all sparse indexes.

Now that you’ve read this, please like, comment, follow and bookmark it!