This article covers index query optimization. Congratulations: you have reached the intermediate level of MySQL optimization. Here we discuss the key to MySQL optimization, index optimization, a topic that interviewers almost always ask about.

Contents

The discussion of indexes is divided into the following points:

I. Index overview (what an index is, its advantages and disadvantages)

II. Creating an index

III. Basic principles of indexes (interview key point)

IV. Index data structures (B-tree, Hash)

V. Principles for creating indexes (most important; interviewers always ask, so bookmark this!)

VI. How to delete data at the million-row level or above


I. Overview of indexes

1) What is an index?

Indexes are special files (indexes on InnoDB tables are part of the tablespace) that contain pointers to all the records in the table. Put more simply, an index is like a table of contents. Imagine using the Xinhua Dictionary with its table of contents torn out: to find an idiom that starts with a given character, you would have to search from the first page to the thousandth. Exhausting! With the table of contents back, you can locate the entry quickly.

2) Advantages and disadvantages of indexes:

Advantages: indexes can greatly speed up data retrieval, which is the main reason for creating them. In addition, with indexes available the query optimizer has more options when executing a query, which improves system performance. However, indexes also have disadvantages: they carry additional maintenance costs. Because index data is stored separately from the rows, every insert, update, and delete also has to update the index, which consumes extra I/O and reduces the efficiency of writes.

II. Basic use of indexes

1) Creating an index (3 ways)

First method: run the ALTER TABLE command to add an index

ALTER TABLE table_name ADD INDEX index_name (column_list);
ALTER TABLE table_name ADD UNIQUE (column_list);
ALTER TABLE table_name ADD PRIMARY KEY (column_list);

ALTER TABLE can create a normal, UNIQUE, or PRIMARY KEY index.

table_name is the name of the table to which the index is added, and column_list lists the columns to index. If there are multiple columns, they are separated by commas.

The index name index_name is optional. If it is omitted, MySQL assigns a name based on the first indexed column. In addition, ALTER TABLE allows multiple changes to a table in a single statement, so several indexes can be created at the same time.

Third method: run the CREATE INDEX command to create the index

CREATE INDEX can add a normal or UNIQUE index to a table. (However, it cannot create a PRIMARY KEY index.)
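As a hedged illustration of the CREATE INDEX form, the sketch below runs it from Python against an in-memory SQLite database, used here only as a stand-in because it needs no server. The table `users`, its columns, and the index names are hypothetical. The `CREATE INDEX ... ON ... (...)` and `CREATE UNIQUE INDEX` forms are also valid MySQL syntax, while `PRAGMA index_list` is SQLite-only (in MySQL, `SHOW INDEX FROM users` serves the same purpose).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")

# A normal (non-unique) index and a UNIQUE index, both created with CREATE INDEX.
conn.execute("CREATE INDEX idx_users_city ON users (city)")
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")

conn.execute("INSERT INTO users (email, city) VALUES ('a@example.com', 'Beijing')")

# The UNIQUE index rejects a second row with the same email.
try:
    conn.execute("INSERT INTO users (email, city) VALUES ('a@example.com', 'Shanghai')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Map each index name to its "unique" flag (SQLite-specific introspection).
indexes = {row[1]: bool(row[2]) for row in conn.execute("PRAGMA index_list(users)")}
print(indexes, duplicate_rejected)
```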

III. Basic principles of indexes (kept short, without the long-winded filler found in many other articles)

Indexes are used to quickly find records that have specific values. If there is no index, the query will generally traverse the entire table.
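The difference between a full table scan and an index lookup can be observed directly. The sketch below is only an approximation of the MySQL behavior: it uses Python's built-in SQLite and its `EXPLAIN QUERY PLAN` statement as a stand-in for MySQL's `EXPLAIN`, and the table `users` and index `idx_users_email` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable plan lines for a query (SQLite-specific)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM users WHERE email = 'user500@example.com'"

# Without an index on email, the only option is to scan the whole table.
plan_without_index = plan(query)

# After the index is created, the same query can search the index instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan lines differs between SQLite versions, so the example prints them rather than assuming a fixed text.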

The principle of indexing is simple: it turns a search over unordered data into a search over ordered data.

1. Sort the contents of the indexed columns

2. Generate an inverted list from the sorted result

3. Attach the data address chain to each entry of the inverted list

4. At query time, first fetch the matching entry from the inverted list, then follow its data address chain to retrieve the actual rows
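The four steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how MySQL stores indexes on disk: the `rows` list is hypothetical data, a list position stands in for a row address, and `build_index` and `lookup` are illustrative names.

```python
from bisect import bisect_left

# Hypothetical table: the list position plays the role of the row address.
rows = [
    {"name": "li", "city": "Beijing"},
    {"name": "wang", "city": "Shanghai"},
    {"name": "zhao", "city": "Beijing"},
    {"name": "chen", "city": "Guangzhou"},
]

def build_index(rows, column):
    """Steps 1-3: sort the column values and attach each value's row addresses."""
    postings = {}
    for address, row in enumerate(rows):
        postings.setdefault(row[column], []).append(address)
    keys = sorted(postings)                    # step 1: sorted keys
    return keys, [postings[k] for k in keys]   # steps 2-3: inverted list + address chains

def lookup(index, rows, value):
    """Step 4: binary-search the sorted keys, then follow the address chain."""
    keys, chains = index
    pos = bisect_left(keys, value)
    if pos == len(keys) or keys[pos] != value:
        return []
    return [rows[address] for address in chains[pos]]

city_index = build_index(rows, "city")
beijing = lookup(city_index, rows, "Beijing")
print([r["name"] for r in beijing])  # ['li', 'zhao']
```

Because the keys are sorted, the lookup needs a binary search over the distinct values instead of a pass over every row.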

IV. Index data structures (B-tree, Hash)

1) B-tree index

MySQL fetches data through storage engines, and by the author's estimate almost 90% of users choose InnoDB. In terms of implementation, the two InnoDB index types covered here are the BTREE index and the HASH index. The B-tree index is the most frequently used index type in MySQL, and almost all storage engines support it. When people speak of an index without further qualification, they usually mean the B-tree index, which InnoDB actually implements as a B+ tree; MySQL reports the index type as BTREE.

Query modes:

Primary key index area: PI (associated with the address where the data is stored). A query by primary key goes straight to the data.

Ordinary (secondary) index area: SI (associated with the id, which is then used to reach the address above). For this reason, querying by primary key is the fastest.
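The two query modes can be modeled roughly as follows. This is a simplified sketch: plain dictionaries stand in for the B+ trees, and the rows, `find_by_id`, and `find_by_email` are hypothetical. It only illustrates why a secondary-index query costs one extra lookup.

```python
# Hypothetical data: the primary index maps id -> row,
# and the secondary index maps email -> id.
primary_index = {
    1: {"id": 1, "email": "a@example.com", "name": "li"},
    2: {"id": 2, "email": "b@example.com", "name": "wang"},
}
secondary_index = {row["email"]: pk for pk, row in primary_index.items()}

def find_by_id(pk):
    """One lookup: the primary index leads straight to the row."""
    return primary_index.get(pk), 1

def find_by_email(email):
    """Two lookups: secondary index -> id, then primary index -> row."""
    pk = secondary_index.get(email)
    if pk is None:
        return None, 1
    return primary_index[pk], 2

row_a, lookups_a = find_by_id(2)
row_b, lookups_b = find_by_email("b@example.com")
print(row_a == row_b, lookups_a, lookups_b)  # True 1 2
```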

B+ tree properties:

1.) A node with n subtrees contains n keys. These keys do not store data; they store index entries for the data.

2.) The leaf nodes together contain all the keys and the pointers to the records holding those keys, and the leaf nodes themselves are linked in ascending key order.

3.) All non-leaf nodes can be regarded as the index part; each contains only the maximum (or minimum) key of its subtrees.

4.) In a B+ tree, data objects are inserted and deleted only at the leaf nodes.

5.) A B+ tree has two head pointers: one points to the root node of the tree, the other to the leaf node with the smallest key.
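The linked leaf nodes are what make range queries cheap. The sketch below is a deliberately small model, not a full B+ tree: the leaves are built by hand, there is a single non-leaf level (`separators`), and no insertion or node splitting is implemented. All names and values are illustrative.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys, values):
        self.keys = keys        # sorted keys
        self.values = values    # row pointers (plain strings here)
        self.next = None        # link to the next leaf in key order

# Three hand-built leaves, linked in ascending key order.
leaves = [
    Leaf([1, 3, 5], ["r1", "r3", "r5"]),
    Leaf([7, 9, 11], ["r7", "r9", "r11"]),
    Leaf([13, 15], ["r13", "r15"]),
]
for left, right in zip(leaves, leaves[1:]):
    left.next = right

# One non-leaf level: the minimum key of every leaf except the first.
separators = [leaf.keys[0] for leaf in leaves[1:]]

def find_leaf(key):
    """Use the non-leaf level to pick the leaf that may contain key."""
    return leaves[bisect_right(separators, key)]

def range_scan(low, high):
    """Find the leaf for low, then walk the leaf chain until keys exceed high."""
    result = []
    leaf = find_leaf(low)
    while leaf is not None:
        for key, value in zip(leaf.keys, leaf.values):
            if key > high:
                return result
            if key >= low:
                result.append(value)
        leaf = leaf.next
    return result

print(range_scan(4, 13))  # ['r5', 'r7', 'r9', 'r11', 'r13']
```

After the first leaf is located, the scan never returns to the upper level; it only follows the `next` links.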

2) Hash index

Briefly, a hash index works much like the simple hash table taught in data structures courses. When MySQL uses a hash index, it applies a hash algorithm (common ones include the direct addressing method, the mid-square method, the folding method, the division-remainder method, and the random number method) to convert the column value into a fixed-length hash value, and stores that value together with the row pointer in the corresponding slot of the hash table. If a hash collision occurs (two different keys produce the same hash value), the entries are stored in a linked list under the corresponding hash key. Of course, this is only a rough sketch.
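A minimal sketch of such a hash table with chained collisions is shown below. It is an illustration only and makes no claim about MySQL's internal implementation: the `HashIndex` class, the byte-sum hash, and the bucket count of 8 are all assumptions chosen to make a collision easy to see.

```python
class HashIndex:
    """Toy hash index: fixed bucket array, collisions chained in a list."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _slot(self, key):
        # Division-remainder method applied to the sum of the key's bytes.
        return sum(key.encode("utf-8")) % len(self.buckets)

    def insert(self, key, row_pointer):
        self.buckets[self._slot(key)].append((key, row_pointer))

    def lookup(self, key):
        # Compare the full key inside the chain: different keys can share a slot.
        return [ptr for k, ptr in self.buckets[self._slot(key)] if k == key]

index = HashIndex()
index.insert("ab", 101)
index.insert("ba", 102)   # same byte sum as "ab", so it lands in the same bucket
index.insert("zz", 103)

print(index.lookup("ab"), index.lookup("ba"), index.lookup("missing"))  # [101] [102] []
```

Note that a structure like this answers equality lookups only; since the buckets are not ordered by key, it cannot serve a range condition.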

P.S. If you are interested in data structures, see the later [Data Structure] topic; they are not explained in detail here.

V. Principles for creating indexes

Indexes are useful, but they should not be added without limit. It is best to follow these principles:

1) Leftmost-prefix matching: MySQL keeps matching index columns to the right until it hits a range condition (>, <, BETWEEN, LIKE), where it stops. Take the condition a = 1 and b = 2 and c > 3 and d = 4: with an index on (a, b, c, d), the column d cannot use the index, while with an index on (a, b, d, c) all four columns can.

2) Create indexes on fields that are frequently used as query conditions.

3) Frequently updated fields are not suitable for indexing.

4) A column that cannot effectively distinguish rows is not suitable as an index column (for example gender: male, female, or unknown gives at most three values, so the selectivity is too low).

5) Extend existing indexes where possible instead of creating new ones. For example, if the table already has an index on a and you want an index on (a, b), you only need to modify the original index.

6) Foreign key columns must be indexed.

7) Do not create indexes on columns that are rarely used in queries or that contain many duplicate values.

8) Do not index columns whose data type is text, image, or bit.
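Two of the principles above lend themselves to a small worked example: the rule that matching proceeds to the right and stops at a range condition, and the idea that a column with few distinct values distinguishes rows poorly. The sketch below is a simplified model, not the MySQL optimizer; `usable_index_columns` and `selectivity` are hypothetical helper names, and the model ignores optimizer features that can still filter on later columns.

```python
RANGE_OPS = {">", "<", ">=", "<=", "between", "like"}

def usable_index_columns(index_columns, conditions):
    """Return the index columns a query can use under leftmost-prefix matching.

    conditions maps column name -> operator ("=" or a range operator).
    Matching goes left to right and stops after the first range condition.
    """
    used = []
    for column in index_columns:
        op = conditions.get(column)
        if op is None:
            break                 # gap in the prefix: later columns are unusable
        used.append(column)
        if op in RANGE_OPS:
            break                 # a range condition ends the matching
    return used

def selectivity(values):
    """Distinct values divided by total rows; closer to 1.0 distinguishes rows better."""
    return len(set(values)) / len(values)

where = {"a": "=", "b": "=", "c": ">", "d": "="}
print(usable_index_columns(["a", "b", "c", "d"], where))  # ['a', 'b', 'c']
print(usable_index_columns(["a", "b", "d", "c"], where))  # ['a', 'b', 'd', 'c']
print(selectivity(["male", "female", "unknown", "male"] * 250))  # 0.003
```

With the (a, b, c, d) order the range on c cuts the match short, so d is left out; moving c to the end lets all four columns take part. A selectivity of 0.003 for the gender-like column shows why such a column makes a poor index on its own.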

VI. How to delete data at the million-row level or above

About indexes: because indexes carry extra maintenance costs and index data is stored separately, every insert, update, and delete triggers additional operations on the index. These operations consume extra I/O and reduce the efficiency of writes. So, when deleting millions of rows, note that according to the MySQL official manual the time needed to delete data is proportional to the number of indexes on the table.

  1. So when we want to delete millions of rows, we can drop the indexes first.

  2. Then delete the unneeded data (in the author's case this step took less than two minutes).

  3. After the deletion, re-create the indexes. With less data left this is also fast, about 10 minutes.

  4. This is definitely much faster than deleting directly, not to mention that if a direct delete is interrupted, the entire delete is rolled back, which is even worse.
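The drop, delete, and re-create sequence can be sketched as below. This is a small-scale illustration under stated assumptions: it uses Python's built-in SQLite as a stand-in for MySQL, with a hypothetical `logs` table of 10,000 rows rather than millions, so it shows the order of the steps and says nothing about timings. In MySQL the corresponding statements would be `ALTER TABLE ... DROP INDEX`, `DELETE`, and `ALTER TABLE ... ADD INDEX`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT, created_day INTEGER)")
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.execute("CREATE INDEX idx_logs_day ON logs (created_day)")
conn.executemany(
    "INSERT INTO logs (level, created_day) VALUES (?, ?)",
    [("info" if i % 10 else "error", i % 100) for i in range(10_000)],
)
conn.commit()

# 1. Drop the secondary indexes so the delete does not have to maintain them.
conn.execute("DROP INDEX idx_logs_level")
conn.execute("DROP INDEX idx_logs_day")

# 2. Delete the unneeded rows.
deleted = conn.execute("DELETE FROM logs WHERE created_day < 90").rowcount
conn.commit()

# 3. Re-create the indexes on the now much smaller table.
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.execute("CREATE INDEX idx_logs_day ON logs (created_day)")
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list(logs)"))
print(deleted, remaining, index_names)
```

Queries that depend on the dropped indexes will be slow until step 3 finishes, so this approach suits a maintenance window; whether it pays off on a given table is best confirmed by measuring.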

That concludes today's discussion of indexes. To emphasize once more: the basic principles of indexes and the principles for creating indexes are the key points, and they come up in almost every interview.