preface

MySQL index is the most common question in the interview, the author received a few days ago a HR little sister interview phone, the little sister said that the company implemented 996, asked me if I can accept 😳, I have not 996, where DO I know 996 is what the feeling ah ah, I said a perfunctor should be ok, and then I went home to think carefully, Or politely declined an invitation for an interview.

The most interesting thing was the first technical interview with HR little sister, who asked me a lot of questions about MySQL index. What was the trouble? I was asked a lot of technical questions in an HR interview, which I didn’t agree with (just kidding). It also reflects that the company is really busy, and the HR needs to filter some resumes through technical questions, so as to reduce the time cost of development.

Last week, however, a repository for 996-ICU on Github suddenly became popular and currently has a 11K star. I’ve read a lot of discussions about 996, and the key points I’ve come to conclude are: developers feel uncomfortable about forcing 996 and taking 996 for granted. They think overtime is acceptable if the company is busy, and most development projects will voluntarily work overtime if things are not finished. I personally share these views.

Well, having said that, let’s introduce some types of MySQL indexes, some of which refer to MySQL High Performance.

Type of the MySQL index

There are many types of indexes that can provide better performance for different scenarios. In MySQL, indexes work at the storage engine layer. So there is no universal index standard, different indexes work differently, and not all engines support all index types. The following five types of indexes are commonly used, and we’ll examine their advantages and disadvantages.

Index optimization is probably the most effective way to optimize query performance.

B-tree indexes

When people talk about indexes, they are referring to b-tree indexes unless otherwise specified, which are supported by most MySQL engines. Different MySQL engines implement b-tree indexes in slightly different ways, each with its own advantages. For example, MyISAM uses prefix compression to make index space smaller, while InnoDB stores raw data. For example, MyISAM indexes refer to indexed rows by the physical address of the data, whereas InnoDB indexes refer to indexed rows by primary key fields.

B-tree indexes greatly speed up query efficiency. Because the query process does not need to scan all rows in the whole table, it only needs to search down from the root node, and search to the left child node if the node is smaller than the change node; otherwise, search to the right child node, and finally find the value to be queried quickly.

Assume the following table:

  CREATE TABLE People(
    last_name varchar(20) not null,
    first_name varchar(20) not null,
    dob date null,
    gender enum('m'.'f') not null.key(last_name, first_name, dob)
  );
Copy the code

For each row in the table, the index contains the values of the last_name, first_name, and DOB columns.

Examples of partial entries in an index tree

Select last_name (last_name), first_name (doB) from last_name (last_name), first_name (doB) from last_name (last_name), first_name (doB) from last_name (last_name) When last_name and first_name are equal in the bottom right corner of the figure, the order is sorted by date of birth.

The query type that can be used for the B-tree index

  • Full value of the query

    A full-value query matches all index columns. For example, the aforementioned index can be used to find a person named Cuba Allen, born in 1960-01-01.

  • Matches the leftmost prefix query

    The matching left-most prefix query matches index fields in left-to-right order, such as the aforementioned index used to query for people with the last name Allen.

  • Match column prefix query

    The match column prefix matches the beginning of a column. For example, the aforementioned index can be used to find all last names beginning with Ja. Specific query statements.

    SELECT * FROM People WHERE last_name Like 'Ja%';
    Copy the code
  • Matching range query

    A range query is simply a query of a range, for example, the aforementioned index can be used to query everyone with the last name Allen through Astaire.

  • Matches one column and ranges to another

    The aforementioned index also looks for all last names Allen and names beginning with the letter K (for example, Kim, Karl, and so on). That is, the first column lsit_name all matches, and the second column first_NAME prefix matches.

  • Queries that access only indexes

    B-tree supports index-only query. That is, only indexes are accessed, not rows. This query method is called “overwrite index”. Simply speaking, overwrite index avoids the overhead of querying data rows back to the table and improves the query efficiency. Interested students can search the relevant knowledge of “overwrite index” by themselves.

Some restrictions on the use of b-tree indexes

  • If the b-tree index is not queried from the leftmost column, the index cannot be used. For example, in the previous example, the index could not query for a person with the name Kim, or for a person with a particular birthday, because these two columns are not the leftmost columns. Similarly, you can’t find people whose last names end with certain letters.

  • Columns of the index cannot be skipped. The previous index cannot query all persons whose last name is Allen and birthday is 1960-01-01. In this case, only the first column of the index can be used.

  • If a column in the query uses a range query, all columns to the right of that column cannot use the index optimization query. For example, this query statement:

    SELECT * FROM People WHERE last_name='Allen' AND first_name like 'K%' AND dob='1930-07-12';
    Copy the code

The above query can only use the first two columns of the index because the second query condition is a range query. If the conditions of a range query are limited, multiple equivalent condition queries can be used instead.

summary

The order in which the indexes are created is very important, and the order of the query conditions is also very important. We must avoid these pits when creating MySQL tables at work. We find that because of business changes, we actually need to go back and optimize the structure of the table we created, including the optimization of the index structure,

The hash index

A hash index is implemented based on a hash table, and only queries that accurately match all columns of the index are valid. In MySQL, only the Memory engine supports hash indexes, which is the default index type of the Memory engine. Here is an example. Assume the following table:

CREATE TABLE testhash(
  fname varchar(50) not null,
  lname varchar(50) not null.KEY USING HASH(fname)
)ENGINE=MEMORY;
Copy the code

Insert data:

INSERT INTO testhash(fname, lname) values('Arjen'.'Lentz'), ('Baron'.'Schwartz'), ('Peter'.'Zitsevo'), ('Vadim'.'Tachenkoe');
Copy the code

Table data is as follows

fname lname
Arjen Lentz
Baron Schwartz
Peter Zaitsev
Vadim Tkachenko

Suppose we have a hash function f(), which returns the following data (the hashes are all sample data) :

f(‘Arjen’) = 2098

f(‘Baron’) = 4920

f(‘Peter’) = 8372

f(‘Vadim’) = 5293

Hash indexes are stored as follows:

Groove (Slot) Value (value)
2098 Point to row 1
4920 Point to line 2
5293 Point to line 4
8372 Point to line 3

Now let’s look at the following query:

SELECT * FROM testhash WHERE fname='Peter';
Copy the code

MySQL > select * from ‘Peter’; MySQL > select * from ‘Peter’; Hash indexes only need to store the corresponding hash value, so the hash index structure is relatively compact, which also makes the query speed of hash index very fast. However, hash indexes have some limitations:

  • Hash indexes store row Pointers, not row data, so each query cannot avoid reading rows.
  • Hash indexes are not stored in index order and therefore cannot be used for sorting.
  • Hash indexes do not support partial index column matching queries because hash indexes store all index columns to compute hash values.
  • Hash indexes can only be used for equivalent queries and do not support any range queries.
  • Some indexes can be expensive to maintain if there are many hash conflicts.

Because of these limitations, hash indexes are only suitable for certain application scenarios, and when they are used, the performance gains are significant.

Spatial Data Index (R-tree)

MyISAM tables support spatial indexes that can be used to store geographic data. Unlike the B-tree index, this type of index does not need prefix query. Spatial indexes index data from various dimensions. You can effectively use any combination of dimensions to query. You must use MySQL’s GIS-related functions, such as MBRCONTAINS(), to maintain data. MySQL’s GIS function is not perfect enough, so most people won’t use this feature. One of the best GIS solutions for open source databases is PostgreSQL’s PostGIS.

The full text indexing

Full-text index is a special way of indexing, he is looking for the key words in the text, the general user keyword retrieval business scene, full-text index is more similar to what search engines do. Full text indexing is rarely used in practical work. Generally, the search engine framework of ElasticSearch is used for keyword retrieval with larger data volume.