The interviewer asked me about the MySQL index and I

Interviewer: I see you write MySQL on your resume. Do you know the index of MySQL InnoDB engine?

Candidate: Well, using indexes speeds up queries, essentially ordering unordered data.

Candidate: In the InnoDB engine, the underlying data structure of the index is a B+ tree

Interviewer: Why not use red black trees or B trees?

Candidate: MySQL data is stored on hard disk, so it is not possible to load all data into memory “at once” during query

Candidate: A red-black tree is a variant of a binary search tree. A Node can store only one Key and one Value

Candidates: B and B+ trees, unlike red-black trees, are “multi-path search trees”. A Node can store more information than a binary search tree, which is lower in height than a binary search tree.

Candidates: to understand the difference, in fact, it is easy to find, the data can not be a loaded into memory, under the scenarios of data need to be retrieved, choose B or B + tree reason is good (a Node Node information stored more (compared with binary search tree), lower the height of the tree, the tree height affect retrieval speed)

Candidates: B+ trees have two more properties compared to B trees.

The non-leaf nodes of a B+ tree do not store data. For the same amount of data, B+ trees are shorter and stronger. (This should not be explained too much, the data is stored in the leaf node, the non-leaf node storage can hold more indexes, so the whole tree is shorter.)

2. Create a linked list between the leaf nodes of the B+ tree to facilitate traversal query (traversal operation is common in MySQL).

Candidate: Let me explain a little bit. You can picture it

Candidate: We create a B+ tree for every index we create in the MySQL InnoDB engine.

Candidate: If the index is a clustered index, the leaf node of the current B+ tree stores primary key and current row data.

Candidate: If the index is a “non-clustered index”, the leaf node of the current B+ tree stores the “primary key and current index column value”.

Select * from user where id >=10 select * from user where id >=10 select * from user where id >=10

Candidates: Since b-trees store data in non-leaf nodes, traversal may require cross-layer retrieval, which is more cumbersome.

Candidate: Based on the hierarchy of trees and the characteristics of business usage scenarios, MySQL has chosen B+ trees as the underlying data structure of the index.

Candidate: For hash structure, InnoDB engine is actually “adaptive” to hash index creation (hash index creation is automatically optimized by InnoDB storage engine engine, we can’t interfere)

Interviewer: HMM… I see. By the way, do you know what a return table is?

Candidate: When we use the index to query data, the retrieved data may contain other columns, but the leaf node of the index tree can only find the current column value and the primary key ID, so we need to look up the data again based on the primary key ID to obtain the required column

Select orderId,orderName from OrderDetail where orderId = 123 from orderDetail where orderId = 123

MySQL > select orderName from orderId; MySQL > select orderName from orderName; MySQL > select orderName from orderName; MySQL > select orderName from orderName; MySQL > select orderName from orderName

Candidate: If you want to avoid a back table, you can also use an overwrite index (use it if you can because you avoid back table operations).

If I create an orderId and orderName joint index, and I need to query orderId and orderName as well, the data will be stored on the leaf node of the index tree, so I don’t need to go back to the table.

Interviewer: Since you mentioned syndication, do you know anything about the leftmost matching principle?

Candidate: Well, it’s easier to illustrate the concept with an example

Candidate: If there are indexes (A, B, C, and D) and the query conditions are a=1 and b=2 and C >3 and d=4, a, B, and C are matched on each node in sequence. D is not matched

Candidates: first match leftmost key, index can only be used to find whether there is a key (equal), in the case of range query (>, <, between, like left match) can not further match, then degenerate to linear search

Candidate: This is the leftmost matching rule

Interviewer: Yes, I also want to ask you how the primary key is generated?

Candidate: primary key increment

Interviewer: So if I don’t use MySQL’s self-added primary key, what do you think might be the problem?

Candidates: The primary key needs to be unique and as short as possible, which are two things to consider.

Candidate: Also, because of the nature of indexes (ordered), the performance of inserts is worse than auto-incrementation if a primary key like uUID is generated

Candidate: Because of the generated UUID, it may be necessary to move the disk block at insert time (for example, if the space in the block is already full at the current time, but the newly generated UUID needs to be inserted into the full block, the data in the block needs to be moved)

Interviewer: OK…

This paper concludes:

Why B+ tree? Data cannot be loaded into memory at one time. B+ tree is a multi-path search tree, and only leaf nodes store data, which are associated with each other in a linked list. (Tree is short and easy to traverse)
What is a return table? Non-clustered indexes only store column values and primary key IDS in leaf nodes. If possible, overwrite indexes can be used to avoid back table operations and improve query speed
What is the leftmost matching principle? Continuous matching from the leftmost starting point is encountered with range query termination
What’s wrong with a non-increment primary key? The insertion efficiency decreases and there are data problems of moving blocks

Welcome to follow my wechat official account [Java3y] to talk about Java interview

Online Interviewer – Mobile seriesTwo continuous updates a week!

Line – to – line interviewers – computer – end seriesTwo continuous updates a week!

Original is not easy!! Three times!!

The interviewer asked me about the MySQL index and I

Related Posts

Python series of simulated login netease cloud personal playlist downloader

GRPC uses Protobuf to build microservices

Figure out eight commonly used algorithm ideas!