Author: liqiangchn

Original: www.cnblogs.com/liqiangchn/…

MySQL index? This thing is easy to talk about? It was obviously digging a hole, but luckily the old man had prepared for it and listened to me.

What is the index?

Indexes are data structures that help MySQL efficiently retrieve data.

What does an index do?

Indexes are critical, especially as the amount of data in a table increases, and their impact on performance becomes more important. Indexes can easily improve query performance by several orders of magnitude, and overall, significantly improve query efficiency.

Iii. Classification of indexes?

1. Storage structure: BTree index (B-tree or B+Tree index), Hash index, full-index, r-tree index. What is described here is the form in which the index is stored,

2, from the application level to divide: common index, unique index, composite index

3, according to the physical order of the data and the logical (index) order relationship of the key value: clustered index, non-clustered index.

The index type usually refers to the division at the application level.

Just like the phone classification: Android phone, IOS phone and Huawei phone, Apple phone, OPPO phone.

Normal index: that is, an index contains only a single column. A table can have multiple single-column indexes

Unique index: The value of the indexed column must be unique, but empty values are allowed

Composite index: An index composed of multiple column values that is specifically used for combined searches and is more efficient than index merges

Clustered index (clustered index) : not a separate type of index, but a way of storing data. The details depend on the implementation, InnoDB’s clustered index is actually a b-tree index (technically B+Tree) and rows in the same structure.

Non-clustered index: if it is not a clustered index, it is a non-clustered index

Fourth, the underlying implementation of index

Mysql default storage engine InnoDB only explicitly supports B-tree (technically B+Tree) indexes. For frequently accessed tables, InnoDB transparently creates adaptive hash indexes, that is, creating hash indexes based on B-tree indexes can significantly improve search efficiency. It is transparent to clients. Out of control, implicit.

Don’t talk about storage engines, just implementation (abstraction)

A Hash index

Based on hash table implementation, only queries that accurately match all columns of the index are valid. For each row of data, the storage engine computes a Hash code for all columns of the index. Hash indexes store all hash codes in the index and store Pointers to each row in the index table.

MySQL > select * from B+Tree;

B-tree speeds up data access because the storage engine does not need to perform a full table scan to obtain data, which is distributed among nodes.

B + Tree index

B-tree is an improved version of the B-tree and is the storage structure used by the database index index. The data is on the leaf nodes, and sequential access Pointers are added, with each leaf node pointing to the address of the adjacent leaf node. Compared with B-tree, only two nodes need to be searched and traversed. B-tree obtains all nodes, which is more efficient than B+Tree.

Discuss with storage engines (generally B+Tree is used by default)

Example: Suppose you have a table of students with id as the primary key

id name birthday
1 Tom 1996-01-01
2 Jann 1996-01-04
3 Ray 1996-01-08
4 Michael 1996-01-10
5 Jack 1996-01-13
6 Steven 1996-01-23
7 Lily 1996-01-25

Implementation in MyISAM engine (secondary index is also implemented in this way)

Implementation in InnoDB

5, why use B+Tree as the default index structure instead of Hash, binary Tree, red black Tree?

B-tree: A B-tree stores data on both leaf nodes and non-leaf nodes. As a result, the number of Pointers that can be saved on non-leaf nodes decreases (some data is also called fan out). If the number of Pointers is small, the height of the tree can only be increased to save a large amount of data, resulting in more I/O operations and lower query performance.

Hash: Fast location, but no sequence, high I/O complexity.

Binary tree: the height of the tree is not uniform, cannot self-balance, search efficiency is related to data (tree height), and IO cost is high.

Red-black tree: The tree height increases with the data volume, and the I/O cost is high.

Why is it officially recommended to use self-growing primary keys as indexes?

In combination with the characteristics of B+Tree, the auto-added primary key is continuous, and the page splitting is minimized during the insertion process. Even if the page splitting is required, only a small part of the page splitting will be performed. And it can reduce the movement of data, every insert is inserted to the end. The idea is to reduce the frequency of splitting and movement.

Insert continuous data:

Insert discontinuous data:

Seven, a brief summary

1, MySQL uses B+Tree as index data structure.

2. When new data is added to a B+Tree, the old B+Tree is adjusted according to the value of the specified column in the index.

3. In terms of physical storage structure, both B-tree and B+Tree divide node size by page (4K). However, since the middle node of B+Tree does not store data, B+Tree can store more keys among nodes of the same size to improve search efficiency.

4, the performance of MySQL lookup is mainly affected by disk I/O times, most of which is the time spent on moving the magnetic head to the specified track.

5. Index and data store are separated under MyISAM storage engine, InnoDB index and data store together. Index under 6, the InnoDB storage engine, auxiliary index () are all dependent on the primary index established (auxiliary index of data is not stored in a leaf node address, or the value of the primary index, as a result, all depends on the secondary index is according to the first auxiliary index to the main index, according to the main index data address).

If InnoDB index is not auto-increment (id as primary key), then every time new data is inserted, it is likely to rearrange the primary index of B+Tree, affecting performance. Therefore, try to use the increment ID as the primary index of InnoDB.


Make progress together, learn and share

Welcome to pay attention to my public number [calm as code], massive Java related articles, learning materials will be updated in it, sorting out the data will be placed in it.

If you think it’s written well, click a “like” and add a follow! Point attention, do not get lost, continue to update!!