A question I almost always ask in interviews: what principles do you follow when designing indexes, and how do you avoid index failure?

Earlier articles covered in detail how indexes work and how queries use them. As the saying goes, a craftsman who wants to do good work must first sharpen his tools. While learning, build up your knowledge step by step: don't be overambitious, don't be impatient, master one topic fully before moving to the next, and then put what you have learned to use.

As discussed in the previous article, indexes should be designed according to the WHERE condition and the fields following ORDER BY and GROUP BY. The reasons for this are explained in my previous article on the principles of MySQL indexes. Here’s a quick overview.

For the primary key index, MySQL maintains a B+ tree; this is what we call the clustered index. For a non-primary-key index (usually a composite index), the entries are sorted by the index fields in order: MySQL compares the value of the first field, then, if the first field values are the same, compares the next field, and so on.

If all the fields in the composite index have the same value, the entries are sorted by primary key. In addition, the B+ tree of the clustered index (primary key index) stores all the information of the row, while a non-clustered index (non-primary key index) only stores the values of the index fields and the value of the primary key field.
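As a minimal sketch of the layout described above, assume a hypothetical `employee` table (the table, column, and index names are illustrative and do not come from the earlier article):

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE employee (
    id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(50)     NOT NULL,
    age  INT             NOT NULL,
    city VARCHAR(50)     NOT NULL,
    PRIMARY KEY (id),              -- clustered index: leaf nodes hold the full rows
    KEY idx_name_age (name, age)   -- secondary index: sorted by name, then age, then id
) ENGINE = InnoDB;

-- Can be answered from idx_name_age alone, because the index already
-- contains name, age, and the primary key id.
SELECT id, name, age FROM employee WHERE name = 'Tom';

-- Also needs a lookup into the clustered index, because city is
-- stored only in the full row.
SELECT id, name, age, city FROM employee WHERE name = 'Tom';
```

The second query is the kind that requires going back to the clustered index after the secondary index has located the primary key.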

OK, that is enough review of how indexes work. In this article we move on to the basic principles that should be followed when designing and building indexes, which are easy to understand once you know the underlying structure. Today we will go through all of the index design principles in one go.

One more thing: I often ask candidates about this topic in interviews, to judge whether they really understand indexes rather than simply reciting memorized answers!

The primary key index

The primary key index is actually the simplest, but there are a few caveats here.

It is not recommended to use UUID as a primary key.

Why is that? Because UUIDs are unordered, while MySQL maintains a clustered index that is sorted by primary key. This means that the records in each data page must be sorted by primary key, and the records within a page are linked to one another by a singly linked list. The value of the largest primary key in one data page must be smaller than the value of the smallest primary key in the next data page, and the data pages themselves are linked by a doubly linked list.

Let me draw a picture, as usual, to help you understand

If the primary key is auto-incrementing, MySQL can use the primary key directory to quickly determine that the new record belongs at the end and insert it there. If the primary key is not auto-incrementing, MySQL has to search for the right position on every insert and then place the record there, possibly in the middle of a page that is already full. This seriously hurts efficiency, so the primary key should be designed to be auto-incrementing.
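The two designs can be sketched side by side. The table and column names below are hypothetical, and the sketch only shows the schema difference, not a benchmark:

```sql
-- Recommended: each new row gets the largest key so far,
-- so it is appended at the end of the clustered index.
CREATE TABLE orders_auto (
    id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    order_no CHAR(36)        NOT NULL,  -- a UUID can still be stored as ordinary data
    PRIMARY KEY (id),
    UNIQUE KEY uk_order_no (order_no)
) ENGINE = InnoDB;

-- Not recommended: UUID values do not arrive in sorted order,
-- so new rows land at arbitrary positions in the clustered index.
CREATE TABLE orders_uuid (
    id CHAR(36) NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

INSERT INTO orders_auto (order_no) VALUES (UUID());
INSERT INTO orders_uuid (id)       VALUES (UUID());
```

If the business needs an externally visible UUID, keeping it in a unique secondary index as in `orders_auto` leaves the clustered index append-only.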

In addition, unique indexes are similar to primary key indexes, but unique indexes are not necessarily incrementing, so the cost of maintaining a unique index is certainly greater than that of a primary key index.

The values in a unique index are unique (although a unique index can contain NULL), which makes it easier to identify a single record from the index field, but the query may still need to go back to the table (I won't explain what a back-to-table lookup is here, because the previous article covered it in detail).

Create indexes for frequently queried fields

When creating indexes, we need to create indexes for the fields that are often used as query conditions, which can improve the query speed of the whole table.

However, a query condition is usually not a single field, so composite (joint) indexes are more common.

In addition, query conditions often include fuzzy matches such as LIKE. For fuzzy queries, it is best to follow the leftmost prefix principle, that is, keep the fixed part of the pattern at the start.
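A small sketch of the difference, using a hypothetical `product` table (names are illustrative). Running `EXPLAIN` on both statements is one way to compare the plans on your own data:

```sql
-- Hypothetical table and index.
CREATE TABLE product (
    id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(100)    NOT NULL,
    PRIMARY KEY (id),
    KEY idx_name (name)
) ENGINE = InnoDB;

-- Fixed prefix: the optimizer can seek to 'phone' in idx_name
-- and read a contiguous range of entries.
EXPLAIN SELECT id FROM product WHERE name LIKE 'phone%';

-- Leading wildcard: there is no fixed prefix to seek to,
-- so idx_name cannot be used to narrow the search.
EXPLAIN SELECT id FROM product WHERE name LIKE '%phone';
```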

Avoid indexing large fields

Put another way: try to keep the indexed data as small as possible.

For example, if there are two fields, one varchar(5) and the other varchar(200), indexing the varchar(5) field is preferred, because MySQL stores the field values in the index itself. A large field inevitably makes the index take up more space, and comparisons take more time when sorting.

What if you really need to index a large field? For example, if the address field is of type varchar(200), you can index only a prefix of it:

CREATE INDEX  tbl_address ON dual(address(20));
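How long the prefix should be depends on the data. One possible way to choose it, sketched here with a hypothetical `customer` table, is to compare the selectivity of a few prefix lengths against the full column:

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE customer (
    id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    address VARCHAR(200)    NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- Ratio of distinct values to total rows for the full column and
-- for two candidate prefix lengths; closer to full_sel is better.
SELECT
    COUNT(DISTINCT address)           / COUNT(*) AS full_sel,
    COUNT(DISTINCT LEFT(address, 10)) / COUNT(*) AS sel_10,
    COUNT(DISTINCT LEFT(address, 20)) / COUNT(*) AS sel_20
FROM customer;

-- Index only the first 20 characters once that prefix is selective enough.
CREATE INDEX idx_address_prefix ON customer (address(20));
```

The lengths 10 and 20 are arbitrary starting points; the shortest prefix whose ratio is close to `full_sel` is usually a reasonable choice.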

Select the most selective column as the index

What does that mean? Let me give you an example that I think you’ll see right away.

Suppose you have a “gender” field that holds data that is either male or female. Such fields are not suitable for indexing.

The problem with such a field is that its values do not distinguish rows well enough, which makes it a poor candidate for an index. Why?

Because if the values are almost equally likely to appear, you’re likely to get half the data no matter which value you search for.

In these cases, it is better not to have an index at all. MySQL has a query optimizer, and when the optimizer finds that a value accounts for a high percentage of a table's rows, it generally ignores the index and performs a full table scan.

A commonly cited rule of thumb puts this line at about 30 percent, although the optimizer decides based on estimated cost rather than a fixed threshold. When the amount of matching data exceeds that limit, the query abandons the index (this is one of the scenarios where index failure occurs).

That's why you should try to avoid indexing fields with a small cardinality. "Cardinality" is in fact the proper term MySQL uses for this.
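To make "cardinality" concrete, the following sketch compares a low-cardinality column with a high-cardinality one. The `member` table and its columns are hypothetical:

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE member (
    id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    gender CHAR(1)         NOT NULL,
    email  VARCHAR(100)    NOT NULL,
    PRIMARY KEY (id),
    KEY idx_email (email)
) ENGINE = InnoDB;

-- Selectivity = distinct values / total rows.
-- gender stays close to 0 on a large table; email stays close to 1.
SELECT
    COUNT(DISTINCT gender) / COUNT(*) AS gender_selectivity,
    COUNT(DISTINCT email)  / COUNT(*) AS email_selectivity
FROM member;

-- The Cardinality column shows MySQL's own estimate of the number
-- of distinct values in each existing index.
SHOW INDEX FROM member;
```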

Try to create indexes for fields following ORDER BY and GROUP BY

Create indexes for the fields after ORDER BY so that no extra sort is needed at query time, because, as we already know, the records in the B+ tree are stored in sorted order once the index is built.

GROUP BY and ORDER BY are actually similar, so we cover them together.

This is because GROUP BY also first sorts by the fields that follow GROUP BY and then aggregates.

If the GROUP BY columns are not already sorted, MySQL needs to sort them first: it creates a temporary table, sorts it, and then aggregates over it. If the columns are indexed, MySQL does not need to sort and does not generate a temporary table.

In practice, though, things are rarely that simple. A temporary table is still generated in the following cases:

The GROUP BY columns have no index.
The GROUP BY columns have an index, but the ORDER BY columns do not.
The GROUP BY columns differ from the ORDER BY columns, even if both have indexes.
The GROUP BY or ORDER BY columns do not come from the first table in the JOIN statement.
The GROUP BY and ORDER BY columns are the same and have a primary key, but the SELECT list contains columns other than the GROUP BY columns.
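The first case can be checked with `EXPLAIN`. This is a minimal sketch with a hypothetical `sales` table; the exact plan depends on the MySQL version and the data, so treat the comments as what one would typically expect rather than guaranteed output:

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE sales (
    id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    region VARCHAR(20)     NOT NULL,
    amount DECIMAL(10, 2)  NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- No index on region: the Extra column typically reports "Using temporary".
EXPLAIN SELECT region, SUM(amount) FROM sales GROUP BY region;

-- Index that starts with the GROUP BY column and also covers amount.
CREATE INDEX idx_region_amount ON sales (region, amount);

-- Rows can now be read already grouped by region, so a temporary
-- table is typically no longer needed.
EXPLAIN SELECT region, SUM(amount) FROM sales GROUP BY region;
```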

Do not use functions in conditions

If a function is applied to an indexed field in a query condition, the index on that field can no longer be used.

Why is that?

MySQL builds the B+ tree for the index from the original values of the field. If you wrap the field in a function when querying, MySQL no longer treats the expression as the original field, so it will not use the index.

But what if someone insists that they need the function? You can't change your business logic for the sake of an index, can you? If using a built-in MySQL function defeats the index, you can create the index on the function expression itself (MySQL 8.0.13 and later support such functional indexes).

What does that mean? Suppose you have a field called age and have created an index on it, but this is how you query it:

SELECT * FROM student WHERE round(age) = 2;

The index on age cannot be used at this point. If you really want round(age) to be indexed, you can create the index this way (note that the expression needs its own pair of parentheses):

CREATE INDEX stu_age_round ON student ((ROUND(age)));

Now the query above can use the index. I believe this is easy to understand.
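When the function can be moved off the column, rewriting the condition is another option that needs no extra index. The sketch below assumes a hypothetical `enrollment` table; the two queries return the same rows, but only the second compares the bare column:

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE enrollment (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    enrolled_at DATETIME        NOT NULL,
    PRIMARY KEY (id),
    KEY idx_enrolled_at (enrolled_at)
) ENGINE = InnoDB;

-- The column is wrapped in YEAR(), so idx_enrolled_at cannot be
-- used to seek to the matching rows.
EXPLAIN SELECT id FROM enrollment WHERE YEAR(enrolled_at) = 2023;

-- Equivalent range condition on the bare column, which allows
-- a range scan on idx_enrolled_at.
EXPLAIN SELECT id FROM enrollment
WHERE enrolled_at >= '2023-01-01' AND enrolled_at < '2024-01-01';
```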

Don’t create too many indexes

Maintaining indexes costs both space and performance, because MySQL maintains a separate B+ tree for every index.

So if there are too many indexes, this will definitely add to MySQL’s burden.
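One possible way to look for indexes that may be safe to drop is the `sys` schema that ships with MySQL 5.7 and later. This sketch assumes the `sys` schema is installed and that the server has been running long enough for its usage statistics to be meaningful:

```sql
-- Indexes with no recorded use since the server started.
SELECT * FROM sys.schema_unused_indexes;

-- Indexes that are made redundant by another index on the same table,
-- for example (a) when (a, b) already exists.
SELECT * FROM sys.schema_redundant_indexes;
```

Treat the results as candidates to review against the actual workload, not as a list to drop automatically.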

Do not index fields that are frequently modified

This makes sense, because as we've already explained, MySQL has to update the index whenever the value of an indexed field changes.

If a field changes frequently, the index has to be maintained just as frequently, which inevitably affects MySQL performance. I won't say more about this here.

Most of what I have said so far are principles to keep in mind at design time. In practice, the principles still have to be adapted to the actual business. There is no universal "formula"; the design that suits your actual business scenario is the best one. So we should not chase "optimization" too hard, because that tends to backfire. After all, talking about technology without regard to the business is pointless.

All right, now let's focus on some of the ways in which indexes fail. (PS: This article is almost entirely theory. I wanted to draw pictures to explain it, but found I didn't know where to start. Please stick with it, we are almost done.)

Common scenarios of index failure

Using the OR keyword can invalidate the index. If you want to use OR without invalidating the index, every column in the OR condition must be indexed. This obviously conflicts with the earlier advice not to create too many indexes.
A query on a composite index that does not follow the leftmost prefix rule also cannot use the index.
Fuzzy queries that start with % also invalidate the index (I will not repeat the reasons, as I covered them in the previous article, but listing it here should help you remember).
An implicit type conversion on an indexed column also invalidates the index.

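Suppose the age field is stored as a string type such as varchar

SELECT * FROM student WHERE age='15'

You can use the index in this case, but if you write it this way

SELECT * FROM student WHERE age=15

In this case MySQL has to convert every stored age value to a number before comparing it, so the index on the age column is not used, i.e. it is invalid. (The reverse case, an int column compared with the string '15', does not invalidate the index, because only the constant is converted.)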

A small field cardinality can also lead to index invalidation. As explained in detail earlier in this article, this is a decision made by the MySQL query optimizer.
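Several of the scenarios above can be tried out with `EXPLAIN`. The `staff` table and its indexes below are hypothetical, and the comments describe what one would typically expect; actual plans vary with the MySQL version and the data:

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE staff (
    id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    dept  VARCHAR(30)     NOT NULL,
    name  VARCHAR(50)     NOT NULL,
    phone VARCHAR(20)     NOT NULL,
    note  VARCHAR(200)    NOT NULL,
    PRIMARY KEY (id),
    KEY idx_dept_name (dept, name),
    KEY idx_phone (phone)
) ENGINE = InnoDB;

-- Leftmost prefix respected: both can seek in idx_dept_name.
EXPLAIN SELECT id FROM staff WHERE dept = 'sales' AND name = 'Tom';
EXPLAIN SELECT id FROM staff WHERE dept = 'sales';

-- Leading column dept is skipped: MySQL cannot seek directly by name.
EXPLAIN SELECT id FROM staff WHERE name = 'Tom';

-- OR where every column has a usable index: an index merge is possible.
EXPLAIN SELECT id FROM staff WHERE dept = 'sales' OR phone = '13800000000';

-- OR where one column (note) has no index: expect a full scan.
EXPLAIN SELECT id FROM staff WHERE dept = 'sales' OR note = 'on leave';

-- Implicit conversion: a string column compared with a number,
-- so idx_phone cannot be used for a lookup.
EXPLAIN SELECT id FROM staff WHERE phone = 13800000000;
```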

For the remaining principles, I recommend reading the earlier articles on how indexes work and how queries use them; without that groundwork, this article may seem a little abstract. So please study indexing step by step. It is core knowledge that we rely on whenever we use MySQL.
