An index is a structure that sorts the values of one or more columns of a database table. Using an index allows quick access to specific information in a database table. An analogy can be drawn between a database index and a book’s table of contents, in which we can quickly find the location of a chapter, whereas without a table of contents we would have to turn page by page.

Indexed data structure

There are many index structures that can be used to improve query efficiency, including B tree index, hash index and B+ tree index. We’ll take a look at each of these indexes and explain why InnoDB uses B+ trees as indexes.

Disk I/o

Files are stored on the hard disk. Currently, the read speed of hard disks is very limited. Therefore, you should minimize disk I/O times when querying and locating certain data.

Disk to proofread

Due to the characteristics of the storage medium, the disk itself access is much slower than the main memory, coupled with the cost of mechanical movement, the disk access speed is often a few hundredths of the main memory, so in order to improve efficiency, to minimize disk I/O. To achieve this goal, the disk does not read data strictly on demand. Instead, the disk prereads each time. Even if only one byte is required, the disk reads a certain length of data backward from this position into memory. The theory behind this is the famous principle of locality in computer science: when a piece of data is used, nearby data is usually used immediately. The data required during the running of a program is usually concentrated.

Principle of locality: When a CPU accesses memory, whether it is to read or write instructions or data, the accessed memory units tend to be clustered in a small contiguous area.

The prefetch length is an integer multiple of a page. A page is a logical block of computer-managed memory. Hardware and operating systems often divide the main memory and disk storage areas into contiguous, equal-sized blocks. Each block is called a page (in many operating systems, pages are typically 4k in size). When the data that the program is trying to read is not in main memory, a page-missing exception will be triggered. At this time, the system will send a disk read signal to the disk. The disk will find the starting location of the data and load one or more pages back into memory successively.

Make proper use of disk prefetch

Generally, indexes themselves are too large to be stored in memory, so indexes are often stored on disk as index files. Therefore, the disk I/O consumption of index lookups is several orders of magnitude higher than that of memory access, so the most important indicator for evaluating a data structure as an index is the number of disk I/O operations performed during the lookups. In other words, indexes should be structured to minimize the amount of disk I/O needed to access a lookup.

If we can properly use the disk prefetch feature to make every page data read by disk IO useful, we can greatly improve the data query efficiency.

B-tree indexes

B-tree can be regarded as an extension of binary search tree. B-tree allows each node to have m-1 child nodes. B-tree has the following characteristics:

The root node has at least two children;
Each node contains M-1 pieces of data, which are sorted in ascending order by mounting index.
In the node, there are at most M Pointers to the node at the next layer, which are located between multiple data of the node. All data values of the node at the next layer are greater than the data on the left of the pointer and smaller than the data on the right of the pointer.
Each node contains at least M/2 pieces of data.

Next, we use the user data represented in the following example to build a B-tree, as shown in the table. The user data contains three fields, name, gender and age. We take the user age as the primary key of the database (assuming that age is unique), so the structure of the constructed B-tree is shown in the figure below.

| | | | | | | | | | | | — – | — – | — – | — – | — – | — – | — – | — – | — – | — – | — – | | | name Chen Er scattered | | zhang lee thought dancing king | | Zhao Liu colophon | | | sun period weeks wine history | | zheng wu gender | | | | male male female female | | | | male male male female | | | 5 10 20 | | | | | age 28 35 56 25 | | 80 | | 90 | |

! [B tree index]

Compared with common binary trees, a node of B tree stores more data, which can effectively reduce the number of DISK I/OS in a data search process:

The space between nodes is discrete. Therefore, each node corresponds to one DISK I/O, and the I/O times for searching data is O(log2log_2log2N).
The node of the B tree can store M-1 data. If m-1 data can be placed on a page, the IO times for searching data in the B tree are O(logMlog_MlogMN).

The hash index

Hash indexes are implemented based on hash tables, and only queries that exactly match all columns of the index are valid. A hash table is a structure that stores data by key-value. Users can find the corresponding Value by Key within O(1) time complexity.

Hash table is usually an array, the position of data in the array can be installed in accordance with the index value of the hash algorithm to calculate, if two data index values calculated at the same position, then can usually chain address method is used to solve conflict (other solution to address conflict and open the custom method, chain address method, public overflow area method, hashing, etc).

Data is shown in the list below, we still according to the age of the user data for the user to establish index (assuming the user age, but not the same), we adopt the hash algorithm to addr = % age 10, we can build the length of the array as a hash table of 10, according to the hash function a user into the hash table, according to the user age for users, The user’s location can be directly calculated to obtain user information. The final hash table and query process are shown in the figure below.


The name	Chen Er	A loose	Lee thought	The king dance	Zhao Liu	Sun period	Zhou l ‘envoi	Wu wine	The history of zheng
gender	male	male	female	female	male	male	male	female	male
age	5	10	20	28	35	56	25	80	90

Hash indexes have the following advantages:

The extra space required to create a hash index for data is O(N), independent of the length of the index field.
Query speed is very fast, the hash function is reasonable, the program can find data in O(1) disk IO times;

Hash indexes have the following disadvantages:

Range query cannot be performed, the sequence of indexes has been lost in the hashing process;
Unable to sort the data for lookups, such as finding the oldest user;
Unable to use partial index lookup, such as prefix lookup, etc.
When the hash function is unreasonable, it will lead to hash conflict and lower query efficiency.

B + tree index

InnoDB uses index data structure is B+ tree, each index in the database table definition corresponds to a B+ tree, the default cluster index is also a B+ tree, B+ tree has the following characteristics:

The keywords of all nodes are arranged in ascending order and follow the principle of smaller on the left and larger on the right.
The number of child nodes of non-leaf nodes is between 1 and M (M is 3 in the figure below), except for empty trees.
The number of indexes of non-leaf nodes is greater than or equal to CeiL (M/2) and less than or equal to M.
All leaf nodes are in the same layer, and there are Pointers from left to right between leaf nodes.
Data is stored in leaf nodes, while non-leaf nodes only store indexes.

Next, we use several sample user data to construct B+ tree. As shown in the table, the user data contains three fields: name, gender and age. We take the user age as the primary key of the database (assuming that age is unique), so the structure of the constructed B+ tree is shown in the figure below.


The name	Chen Er	A loose	Lee thought	The king dance	Zhao Liu	Sun period	Zhou l ‘envoi	Wu wine	The history of zheng
gender	male	male	female	female	male	male	male	female	male
age	5	10	20	28	35	56	25	80	90

B+ tree index data structures have several advantages listed below:

The query performance is stable. The NUMBER of I/OS required to query a data is usually the height of the tree.
Range query is efficient. When you install index range query, you can search for the first data that meets the requirements, and then traverse backwards until the first data that does not meet the requirements, and the data in the middle meets the requirements.
The query efficiency is high, requiring only two to three DISK I/OS for data query.
Leaf nodes store all data and do not need to look for data outside the B+ tree;

Why InnoDB uses B+ trees

In The InnoDB engine, we create indexes for the database in the form of B+ tree, why InnoDB does not use hash index or B tree index? Mainly based on the following reasons:

Database queries often have non-equivalent queries, and hash indexes do not work in this case;
Compared with a B tree, the non-leaf index nodes of a B+ tree do not store data. Therefore, a disk can read more index data at a time, effectively reducing the number of DISK I/OS.
Range query often occurs in database query. The bottom leaf nodes of B+ tree are arranged in order to achieve range query more effectively.

Since the primary key

From the above we know that B+ trees need to maintain index order.

When the user inserts data into the B+ tree, if the node corresponding to the insertion point has a free position, it only needs to move the data in the node and put the data to be inserted into the B+ tree.
When the user inserts data into the B+ tree, if the node corresponding to the insertion point has no free space, a new node needs to be generated and part of the data needs to be moved. This situation not only affects the insertion efficiency, but also reduces the utilization of space because only part of the data is split from the nodes.
When the user deletes the data in the B+ tree, if the data volume of the node or adjacent nodes is very small, it only needs to delete the data directly and move other data in the node.
When the user deletes data in the B+ tree, if the data volume of the node and its neighboring nodes is small, the node and its neighboring nodes may need to be merged after the deletion to improve space utilization.

Since B+ trees need to maintain index order, we propose the following suggestions for index fields:

For scenarios with a lot of inserts, the primary key index field is best incremented. Each time an incremental primary key inserts a new record, it is an append operation that does not involve moving other records and does not trigger splitting of leaf nodes.
The length of the primary key index should be as small as possible. The smaller the primary key length, the smaller the leaf nodes of the normal index and the smaller the space occupied by the normal index.

In InnoDB, we should use auto-added primary keys as much as possible. Auto-added primary keys have the advantages of high insertion efficiency and small footprint.

Data void and rebuild index

Data is empty

When you make changes to InnoDB, such as deleting rows, the rows are marked as “deleted” and not actually physically deleted from the index, so space is not actually freed. InnoDB’s Purge thread asynchronously cleans up unused index keys and rows, but it still doesn’t return the freed space to the operating system for reuse, resulting in a lot of empty pages. If the table structure contains dynamic length fields, these holes may not even be reused by InnoDB to store new rows because the space space is insufficient.

Problems caused by data holes:

After data in a table is deleted, the space occupied by the table does not decrease, resulting in space waste.
Slows down data queries because voids take up page space;

We can check the size of the void in the database using the following SQL statement. The DATA_FREE in the result indicates the size of the free data block in the table.

select data_length,data_free from information_schema.tables where table_schema='test' and table_name='test';
Copy the code

Rebuilding the index

If there are too many data holes in the index of a table, the efficiency of SQL statement execution will be affected. In this case, we need to clean these data holes.

A better way to clean up data holes is to rebuild the index, because in the process of index reconstruction, the index will be sorted according to the size of the index, and the index created is relatively compact.

Is there any way to rebuild the index? The obvious idea would be to drop the index first and then rebuild it. However, whether a primary key is deleted or created, the entire table is rebuilt. So if you execute these two statements in tandem, the first statement is done for nothing.

alter table user_info drop primary key;
alter table user_info add primary key(id);
Copy the code

All indexes of a table can be rebuilt in InnoDB by using the following statements to transform the data engine. This is because in the process of converting the data engine (even if there is no actual conversion), all the data in the table is read and written again, and the void is released in the process. Note that this method takes a long time to rebuild the index.

alter table test engine=innodb
Copy the code

This article was first published to wechat public account, all rights reserved, reprint prohibited!

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

InnoDB learning (7) index structure

Indexed data structure

Disk I/o

Disk to proofread

Make proper use of disk prefetch

B-tree indexes

The hash index

B + tree index

Why InnoDB uses B+ trees

Since the primary key

Data void and rebuild index

Data is empty

Rebuilding the index

InnoDB learning (7) index structure

Indexed data structure

Disk I/o

Disk to proofread

Make proper use of disk prefetch

B-tree indexes

The hash index

B + tree index

Why InnoDB uses B+ trees

Since the primary key

Data void and rebuild index

Data is empty

Rebuilding the index

Related Posts

How to ensure cache and database consistency?

Rough Processing of video transcoding and cover image by FFmpeg (1)

1202, packaging Docker image still can’t?