MySQL Interview Cheat sheet

I am Fat Brother, an unprofessional interviewer!

I am Embarrassed, a young rookie actively looking for a job. Embarrassed says: what I fear most in an interview is a question so broad that I cannot quickly pin down the key points!


These are the main interview questions:

What is your understanding of the index?
Interviewer: explain, at the hardware level, why an index is fast.
Why not use a hash structure as the index structure?
Why not use a binary tree as the index structure?
Why use B+Tree instead of B-tree?
Interviewer: indexes speed up queries, so should a table have indexes on as many columns as possible?


What is your understanding of the index?

When it comes to indexes, the first thing that comes to mind is probably a dictionary's table of contents. According to MySQL's official definition, an index is a data structure that helps MySQL retrieve data efficiently.

In essence, an index is an ordered data structure designed for fast lookup.

Put simply, it is analogous to a dictionary's table of contents or a train timetable.

Interviewer: explain, at the hardware level, why an index is fast.

When a computer reads data from disk into memory, the operation generally involves three time-consuming steps:

1. Seek time: the time it takes to move the head to the track that holds the data.
2. Rotational delay: the time it takes for the sector that holds the data to rotate under the head.
3. Data transfer time: the time it takes to load the data into memory.

Each such load is called a disk I/O, and the time for one I/O = seek time + rotational delay + data transfer time (the transfer time is short and negligible).

In other words, actually loading the data into memory is very quick, and the time spent on an I/O is dominated by the seek and the rotational delay.

Generally speaking, one I/O takes only a few milliseconds. Take 4 ms as an example: it seems very short, but if a table holds a million rows and each row costs one I/O, reading them all takes 4000 s, which is devastating for a system.

What we need is to reduce the number of disk I/Os, and that is the point of using indexes! With an index, a row in a table of hundreds of millions of rows can typically be found with only 2 to 4 disk I/Os.
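The arithmetic in this section can be checked with a small sketch. The 4 ms figure and the "2 to 4 I/Os" range come from the text; the assumption that a scan without an index costs one separate disk I/O per row is a worst-case simplification, and the function name is made up for illustration.

```python
def total_io_seconds(num_ios: int, ms_per_io: float) -> float:
    """Total time in seconds for num_ios disk I/Os at ms_per_io milliseconds each."""
    return num_ios * ms_per_io / 1000.0


# Worst case without an index: one disk I/O per row, 1,000,000 rows, 4 ms each.
full_scan = total_io_seconds(1_000_000, 4)

# With an index: assume 3 disk I/Os to reach the row (the text says 2 to 4).
indexed = total_io_seconds(3, 4)

print(f"full scan: {full_scan:.0f} s, indexed lookup: {indexed * 1000:.0f} ms")
```

Under these assumptions the gap is between roughly an hour and about a hundredth of a second, which is why the number of I/Os, not the amount of data transferred, is the figure to watch.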

Why not use a hash structure as the index structure?

In everyday business scenarios, most queries are range queries like this one:

select id, name, age from sys_user where age between 18 and 28;

If a hash structure is used as the index, the storage engine computes a hash code from the indexed column values of each row, and the hash index stores these hash codes along with pointers to the rows.

Hash codes are deterministic, but they look random: they do not preserve the order of the original values.

Because hash codes have no ordering, the data ends up scattered, so even two rows with very similar values are likely to land in different buckets (disk blocks).

In the worst case, a range query needs one disk I/O for every record it finds.

Advantages: a hash structure stores key-value pairs, which suits exact lookups and full-value matching very well, so a single-record query is very efficient, with O(1) time complexity on average. However, range lookups are what everyday business uses most, so a hash structure is not suitable as the general index structure.

Just remember that hash indexes are good for exact lookups (full-value matching), not range lookups.

In MySQL, the MEMORY and NDB storage engines support hash indexes.
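The difference between the two lookup styles can be sketched in Python, using a dict as a stand-in for a hash index and a sorted list as a stand-in for an ordered index. This is only an in-memory analogy, not how MySQL implements either structure; the sample rows and function names are hypothetical.

```python
import bisect

# Hypothetical (age, name) rows standing in for the sys_user table.
rows = [(35, "ann"), (18, "bob"), (22, "cat"), (41, "dan"), (28, "eve"), (19, "fay")]

# "Hash index": age -> names. Exact-match lookups are O(1) on average,
# but the keys carry no order, so a range query must test every key.
hash_index = {}
for age, name in rows:
    hash_index.setdefault(age, []).append(name)


def hash_range(lo, hi):
    """Range query on the hash index: visits every key."""
    return sorted(n for age, names in hash_index.items() if lo <= age <= hi for n in names)


# "Ordered index": rows sorted by age. Binary search finds both ends of the
# range, and everything in between is one contiguous slice.
ordered_index = sorted(rows)
ages = [age for age, _ in ordered_index]


def ordered_range(lo, hi):
    """Range query on the ordered index: two binary searches plus a slice."""
    start = bisect.bisect_left(ages, lo)
    end = bisect.bisect_right(ages, hi)
    return sorted(name for _, name in ordered_index[start:end])


print(hash_index[22])         # exact match: where age = 22
print(ordered_range(18, 28))  # range: where age between 18 and 28
```

Both structures return the same rows for the range query; the point is that the hash version has to visit every key to do so, while the ordered version touches only the matching slice.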

Why not use a binary tree as the index structure?

Let’s look at the binary tree structure

Each node of a binary tree has at most two child nodes. As a result, the tree is very tall and a lookup needs more I/Os. In the worst case the tree degenerates into a linked list, and a lookup becomes equivalent to a full table scan with disk I/O at every step.

Suppose a binary tree is used as the index. In the ideal case it is a complete binary tree, and a complete binary tree with n nodes has a depth of $\lfloor \log_2 n \rfloor + 1$

(where $\lfloor x \rfloor$ denotes the largest integer not greater than $x$)
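To get a feel for the depth formula, here is a minimal Python sketch. It relies on the fact that for a positive integer n, `n.bit_length()` equals the floor of log2(n) plus 1, which avoids floating-point rounding; the function name is made up for illustration.

```python
def complete_binary_tree_depth(n: int) -> int:
    """Depth (number of levels) of a complete binary tree with n >= 1 nodes.

    For a positive integer n, n.bit_length() == floor(log2(n)) + 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return n.bit_length()


for n in (1, 7, 8, 1_000_000, 100_000_000):
    print(n, complete_binary_tree_depth(n))
```

A million nodes already need 20 levels and a hundred million need 27, so even the ideal binary tree costs far more than 2 to 4 I/Os if every level is a separate disk read.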

If a piece of data sits at the 100th level of the binary tree, finding it takes 100 disk I/Os. Even worse, the binary tree may degenerate into a linked list, i.e., a skewed binary tree.

Similarly, a balanced binary tree is still very tall, because each node still has at most two children.
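The degenerate case is easy to reproduce. The sketch below builds a plain, non-self-balancing binary search tree twice from the same 1023 keys: once inserting them in sorted order, and once in an order chosen to keep the tree balanced. The `Node` class and helper names are hypothetical, written only for this illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced binary search tree; return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def height(root):
    """Number of levels, counted breadth-first to avoid deep recursion."""
    levels = 0
    frontier = [root] if root is not None else []
    while frontier:
        levels += 1
        frontier = [c for n in frontier for c in (n.left, n.right) if c is not None]
    return levels


def balanced_order(keys):
    """Reorder sorted keys so that inserting them yields a balanced tree."""
    if not keys:
        return []
    mid = len(keys) // 2
    return [keys[mid]] + balanced_order(keys[:mid]) + balanced_order(keys[mid + 1:])


keys = list(range(1, 1024))  # 1023 sorted keys

skewed = None
for k in keys:  # sorted inserts: every key goes to the right
    skewed = insert(skewed, k)

balanced = None
for k in balanced_order(keys):
    balanced = insert(balanced, k)

print(height(skewed), height(balanced))
```

With sorted inserts the tree has one level per key (1023 levels here), while the balanced version has 10. Ten levels for about a thousand keys is still one potential disk read per level, which is the motivation for a much wider tree.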

Why use B+Tree instead of B-tree?

The height of a binary tree increases the disk I/O needed for a query. A B-tree stores more keys per node and is much shorter, so why choose B+Tree rather than B-tree?

A B-tree is a multi-way balanced search tree, which needs far fewer disk I/Os than a binary tree. However, each node of a B-tree contains not only keys (index values) but also data (entire row records). The advantage of a B-tree is that the data record is available as soon as the key is found.

So why not use the B-tree structure? The same old problem: the number of disk I/Os!

We know that MySQL reads data one page (disk block) at a time, and the storage space of each page is limited.

If each data record is large, the number of index entries that fit in a page is small.

Therefore, when a table holds a large amount of data, the B-tree becomes deep, which increases the number of disk I/Os and hurts query efficiency.

B+Tree is an optimization of B-tree that makes it better suited to implementing index structures on external storage:

1. Non-leaf nodes store only key information (index information).
2. Data records are stored in the leaf nodes, in key order.

Benefits: non-leaf nodes of a B+Tree store only keys, so each page holds more index entries, the tree stays short, and fewer disk I/Os are needed. Generally, the desired record can be found after 2 to 4 I/Os.

And because the table data is stored in key order in the leaf nodes of the B+Tree, range lookups are also very efficient.
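A back-of-the-envelope estimate shows how much the page layout matters. The numbers below are assumptions chosen for illustration, not measurements: a 16 KB page (the InnoDB default), an 8-byte key, a 6-byte child pointer, and rows of about 1 KB. Page headers and fill factor are ignored, and the function names are hypothetical.

```python
PAGE = 16 * 1024  # assumed page size in bytes (InnoDB default)
KEY = 8           # assumed key size, e.g. a BIGINT primary key
PTR = 6           # assumed child-pointer size
ROW = 1024        # assumed size of one full row


def bplus_capacity(height: int) -> int:
    """Rows addressable by a full B+Tree: keys only above, rows only in leaves."""
    fanout = PAGE // (KEY + PTR)   # keys + pointers per non-leaf page
    rows_per_leaf = PAGE // ROW    # full rows per leaf page
    return fanout ** (height - 1) * rows_per_leaf


def btree_capacity(height: int) -> int:
    """Rows in a full B-tree whose every node stores whole rows next to its keys."""
    entries = PAGE // (KEY + PTR + ROW)  # entries per node, each carrying a row
    fanout = entries + 1
    nodes = sum(fanout ** level for level in range(height))
    return nodes * entries


def min_height(capacity_fn, rows: int) -> int:
    """Smallest height at which a full tree can hold the given number of rows."""
    h = 1
    while capacity_fn(h) < rows:
        h += 1
    return h


rows = 20_000_000
print("B+Tree height:", min_height(bplus_capacity, rows))
print("B-tree height:", min_height(btree_capacity, rows))
```

Under these assumptions a three-level B+Tree covers about 21.9 million rows, while a B-tree that stores whole rows in every node needs seven levels for 20 million. The exact figures shift with row and key sizes, but the direction of the comparison does not.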

Interviewer: indexes speed up queries, so should a table have indexes on as many columns as possible?

Although indexes speed up queries and reduce disk I/Os, blindly creating too many indexes greatly increases the time and space cost of index maintenance.

First, the benefits of indexing

1. Fewer I/Os and faster retrieval.
2. Lower data-sorting cost, which reduces CPU consumption.

Time cost

Because an index is an ordered structure for fast lookup, keeping it ordered requires constant adjustment, and adjustment costs time.

Creating and maintaining indexes takes time: indexes must be updated whenever rows in the table are inserted, deleted, or modified, which slows down writes.

And this time cost grows with the amount of data!

Space cost

Each index is a B+Tree that holds index keys and references to the table rows, and it takes up space.

With a clustered index, the row data is stored together with the primary key in the index itself, so the space cost is higher.
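The space cost is easy to observe on a small scale. The sketch below uses SQLite from Python's standard library as a stand-in, purely because it needs no server; it is not MySQL, and the absolute sizes will differ, but the principle that every extra index adds its own tree on disk is the same. The column layout of `sys_user` and the helper name are assumptions made for this example.

```python
import os
import sqlite3
import tempfile


def db_size_bytes(index_columns):
    """Create a throwaway SQLite database, load 20,000 rows, return its file size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        conn.execute(
            "CREATE TABLE sys_user (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, city TEXT)"
        )
        # One secondary index per requested column (column names are fixed below).
        for col in index_columns:
            conn.execute(f"CREATE INDEX idx_{col} ON sys_user ({col})")
        conn.executemany(
            "INSERT INTO sys_user (name, age, city) VALUES (?, ?, ?)",
            ((f"user{i}", 18 + i % 50, f"city{i % 100}") for i in range(20_000)),
        )
        conn.commit()
        conn.close()
        return os.path.getsize(path)


no_index = db_size_bytes([])
one_index = db_size_bytes(["age"])
three_indexes = db_size_bytes(["name", "age", "city"])

print(no_index, one_index, three_indexes)
```

The file grows with each added index even though the table data is identical, and every one of those indexes also has to be updated on each insert, which is the time cost described above.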

Stay tuned for Embarrassed's second-round interview on indexes!

For more content, follow the WeChat public account: Jiongmefeishi (or search: Jiongmefeishi)

The MySQL Interview Cheat sheet is a summary of the index questions