Recently write database, many friends leave a message to ask MySQL index bottom implementation, today simple chat, less about “how”, more about “why is designed like this”.

Q.1. Why do databases design indexes?

There are 1000W books in the library, from which we should find the Path of Architects. When should we check them?

So the librarian devised a set of rules:

(1) History on the first floor, literature on the second floor, IT on the third floor…

(2)IT class, and divided into software class, hardware class…

(3) Software class, and in accordance with the title of the phonological order…

To find a book quickly.

Shenjian = ‘shenjian’; shenjian = ‘shenjian’; shenjian = ‘shenjian’;

Therefore, to have an index, used to improve the speed of the database search.

Question 2. Hashes are faster than trees. Why is the index structure designed as a tree?

There are two common types of data structures that speed up lookups:

(1) Hash, such as HashMap, the average time complexity of query/insert/modify/delete is O(1);

(2) trees, such as balanced binary search trees, the average time complexity of query/insert/modify/delete is O(lg(n));

As you can see, whether it’s a read request or a write request, a hash index is faster than a tree index. So why is the index structure tree?

* Voiceover: 80% of the students can't answer the interview. *

The index is designed as a tree, which is relevant to the requirements of SQL.

SQL requirements for such a single-row query:

Select * from t where name= “shenjian”;

It is true that hash indexes are faster because only one record is queried at a time.

* Voiceover: So, if the business requirements are single-line access, such as passport, you can indeed use hash indexes. *

But for the SQL requirements for sorting queries:

  • Group by group
  • Order by
  • Comparison: <, >

A hash index degrades to O(n) in time, but the order nature of the tree still maintains O(log(n)) efficiency.

Any design that deviates from the requirements is rogue.

By the way, InnoDB does not support hash indexes.

Question 3. Why do database indexes use B+ trees?

In order to maintain the integrity of the knowledge system, briefly introduce some trees.

First: binary search tree

Binary search trees, shown above, are one of the most well-known data structures that I won’t go into. Why are they not suitable for database indexing?

(1) When there is a large amount of data, the height of the tree will be relatively high; when there is a large amount of data, the query will be slow;

(2) Each node stores only one record, which may lead to many disk I/OS for a query;

* Voiceover: This tree is best known because it often appears in college textbooks. *

Second: B trees

B tree, as shown above, has the following characteristics:

(1) It is no longer binary search, but M-fork search;

(2) Both leaf and non-leaf nodes store data;

(3) In order traversal, can obtain all nodes;

* Voiceover, I do not want to introduce this feature: the number of non-root nodes containing keywords J meet, (chrysene m/2 gp)-1 <= j <= m-1, node split to meet this condition. *

B-trees were created as data structures to implement indexes because they exploit the “locality principle” perfectly.

What is the locality principle?

The logic of the locality principle goes like this:

(1) memory read and write block, disk read and write slow, and much slower;

(2) Disk prefetch: Disk read and write does not read on demand, but prefetch on page. Data is read one page at a time and more data is loaded each time. If the data to be read in the future is in this page, disk I/O can be avoided and efficiency can be improved.

Voice-over: Typically, one page of data is 4K.

(3) The principle of locality: software design should try to follow the “data read concentration” and “use a data, the probability of using its nearby data”, so that disk prefetch can fully improve disk IO;

Why are B-trees good for indexing?

(1) Due to the bifurcation of M, the height can be greatly reduced;

(2) Each node can store J records. If the node size is set to page size, such as 4K, the prefetch feature can be fully utilized to greatly reduce disk IO;

Third: B+ trees

The B+ tree, as shown in the figure above, is still an M-fork search tree, with some improvements made on the basis of the B tree:

(1) Non-leaf nodes no longer store data, and data is only stored on leaf nodes of the same layer;

* Voiceover: In a B+ tree, the path from the root to every node is the same length, which is not the case in a B tree. *

(2) Between leaves, a linked list is added to obtain all nodes, and middle-order traversal is no longer needed;

These improvements give B+ trees superior properties over B trees:

(1) Range search: after min and Max are located, the middle leaf node is the result set without middle order backtracking;

* Voiceover: Range queries are used a lot in SQL, which is the biggest advantage of B+ trees over B trees. *

(2) The leaf node stores the actual record line, which is relatively close and suitable for disk storage with large data volume; Non-leaf node storage record PK, used for query acceleration, suitable for memory storage;

(3) If the non-leaf node does not store the actual record, but only stores the KEY of the record, then the B+ tree can store more indexes with the same memory;

Finally, to quantify, ** why is the height of the B+ tree of the m fork much lower than that of the binary search tree? **

Do the math:

(1) Locality principle, the size of a node is set as one page, one page 4K, assuming that a KEY has 8 bytes, a node can store 500 keys, that is, J =500

(2)m tree, approximately m/2<= j <=m, that is, it can be almost 1000 fork tree

(3) then:

Layer 1 tree: 1 node, 1 x 500 keys, size 4K

Layer 2 tree: 1000 nodes, 1000500=50W keys, size 10004K=4M

Three-layer tree: 10001000 nodes, 10001000500= 500 million keys, 10001000 x 4K=4G

* VOiceover: Well, please check if there are any mistakes. *

As you can see, storing a large amount of data (500 million) does not require much tree depth (height 3), nor does the index take up much memory (4G).

conclusion

  • Database indexes are used to speed up queries
  • Although the hash index is O(1) and the tree index is O(log(n)), SQL has many “ordering” requirements, so the database uses tree indexes
  • InnoDB does not support hash indexes
  • The idea of data prefetch is as follows: Disk read and write data is prefetch on a page rather than on demand. The system reads one page of data at a time and loads more data each time to reduce disk I/OS in the future
  • Principle of locality: Software design should comply with “data read concentration” and “once a piece of data is used, there is a high probability that nearby data will be used”. In this way, disk prefetch can fully improve disk I/O
  • Database indexes are most commonly used by B+ trees:

(1) Very suitable for disk storage, can make full use of the principle of locality, disk preread;

(2) Very low tree height, can store a lot of data;

(3) The index itself occupies very little memory;

(4) Can well support single point query, range query, ordered query;

Although both are B+ trees, the next chapter will discuss the differences between InnoDB and MyISAM’s index implementation.