Blow brag

A few days ago the boss asked me how to carry out SQL optimization, I answered the new index. Ha ha ha, then the boss went out looking for a stick, came in and said to me you know where the door is, you go out or I invite you out?

Then forced to go out to be beaten, after coming back the boss said to see what is Explain, and then handed in a 5000 word review……

The following content is described in MySQL 8.0

2. Basic content

Mysql > alter table create index (); mysql > alter table create index (); First of all, InnoDB’s index model is B+ tree. In InnoDB, tables are stored in the form of indexes according to the order of primary keys, which is called indexed organized tables. Each new index corresponds to a B+ tree in InnoDB. Suppose we have a table with a primary key column ID, a field K, name, and an index on K.

create table T( id int primary key, k int not null, name varchar(16), index (k))engine=InnoDB; Copy the codeCopy the code

Insert three rows into table:

INSERT INTO T (id, k) VALUES (100, 1);
INSERT INTO T (id, k) VALUES (200, 2);
INSERT INTO T (id, k) VALUES (300, 3);
INSERT INTO T (id, k) VALUES (500, 5);
INSERT INTO T (id, k) VALUES (600, 6);Copy the code

An example of two trees is shown below:

As can be seen from the figure, each new index adds a new B+ tree, and the index is divided into primary key index and non-primary key index: primary key index: the leaf node of the primary key index stores the entire row of data. In InnoDB, primary key indexes are also called clustered indexes. Non-primary key index: The content of a leaf node that is not a primary key index is the primary key value. In InnoDB, non-primary key indexes are also called secondary indexes.

Now execute a query:

select * from T where id=500;Copy the code

This statement simply searches the B+ tree id and returns all the data for the leaf node.

Next query statement:

select * from T where k=5Copy the code

Select * from B+ tree where k=5; select * from B+ tree where k=5; select * from B+ tree where k=5;

That is, a query based on a non-primary key index requires one more index tree to be scanned. Therefore, we should try to use primary key queries in our applications.

Of course, I know that primary key queries are rarely used in real development, because primary keys do not normally exist in business processes, so consider the following statement:

select id from T where k=5Copy the code

Index tree (k, k, k, k, k, k, k, k, k, k, k);

  • Because overwriting indexes can reduce the number of tree searches and significantly improve query performance, it is a common performance optimization method to use overwriting indexes.

Based on the overwriting instructions above, let’s look at another example:

CREATE TABLE `tuser` (
  `id` int(11) NOT NULL,
  `id_card` varchar(32) DEFAULT NULL,
  `name` varchar(32) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `ismale` tinyint(1) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `id_card` (`id_card`),
  KEY `name_age` (`name`,`age`)
) ENGINE=InnoDBCopy the code

There is a primary key index id, two secondary indexes ID_card and name_age. Name_age is an index consisting of two columns. Suppose the following statement is now executed:

select id from tuser whereName like 'zhang %';Copy the code

Select * from name; select * from name; select * from name;

Select ID3, ID3, ID3, ID3, ID3, ID3, ID3, ID3, ID3, ID3

However, if the combined index is defined as name_age (age,name), the above query does not use the combined index, which is called the left-most prefix principle.

  • There is an unwritten rule in the industry that a single table contains no more than 20 fields and no more than 5 indexes. As the amount of data increases, too many indexes occupy too much physical space.

Of course, these are some basic SQL, encountered some more complex SQL statement, how to optimize it, this will use Explain execution plan, first let’s look at the example (this SQL statement is very complex, is now used by the company, so type a code, sorry ha) :

Here’s the point:

Using the Explain keyword, you can see the number of rows scanned by each query, which index is used, etc. The following article will Explain what each field means and then look at how to optimize…

3, Explain the execution plan

In the case of the image above, Explain mainly displays information about how the optimizer’s expectations match the actual execution times and other iterator-based information through the following fields:

id

Id is the order of execution, which is the priority of each statement, which can be the same (which is determined by the optimizer) or different (the higher the ID value, the higher the priority, and the earlier the statement is executed).

select_type

Select query type, mainly used to distinguish a variety of complex queries, such as ordinary query, federated query, sub-query, etc., the value of the following:

SIMPLE: indicates the simplest select query statement, that is, the query does not contain operations such as subquery or union fetch and set.

PRIMARY: When the query contains any complex subparts, the outermost query is marked as PRIMARY.

SUBQUERY: When a select or WHERE list contains a SUBQUERY, the SUBQUERY is marked as SUBQUERY.

DERIVED: Represents the select contained in the SUBquery in the FROM clause. Subqueries contained in the FROM list are marked as DERIVED.

UNION: If the UNION is followed by a SELECT statement, it will be marked as UNION. If union is included in the subquery of the FROM clause, the outer select is marked as derived.

UNION RESULT: represents reading data from the temporary table of the UNION, and the

of the table column indicates the UNION operation with the results of the first and fourth select.
,4>

table

The name of the table to be queried is not necessarily a real table. If an alias is specified for the table, the alias is displayed. It may also be a temporary table. (Just a table name, just a comment)

partitions

For non-partitioned tables, the value is NULL. If partitioned tables are queried, partitions in partitioned table names are displayed.

type

What type of query is used, which is a very important metric in SQL optimization, in order of good to bad: system > const > eq_ref > ref > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

System: When a table has only one row (system table), the amount of data is small and disk I/O is not required.

Const: indicates that the query hit the primary key or unique index, or that the joined part is a const. This kind of scan is highly efficient, returns little data, the speed is very fast.

Eq_ref: the primary key or unique key index is matched during query.

Ref: Different from eq_ref, ref means that a non-unique index is used and many rows are found that match the condition.

Ref_or_null: This connection type is similar to ref, except that MySQL searches for rows containing NULL values in addition.

Index_merge: The index merge optimization method is used, and the query uses more than two indexes.

Unique_subquery: It’s just that the index lookup function used to completely replace the subquery is more efficient.

Index_subquery: Different from unique_subquery, used for non-unique indexes and can return duplicate values.

Range: Selects rows using an index, retrieving only rows in a given range. Simply put, this means retrieving data from a given range for an indexed field. Use bettween in the WHERE statement… And, <, >, <=, in and other conditional query types are range.

Index: Both index and ALL read the entire table. The difference is that index is read through the index tree, while ALL is read from the hard disk.

ALL: traverses the entire table to find matching rows, with the worst performance.

possible_keys

If an index exists on a column, the index will be listed, but this index is not necessarily the index that will be used in the final query. In some cases, you may need to specify force index when the optimizer does not execute the specified index.

key

Unlike possible_keys, key is the actual index used in the query. If no index is used, it is displayed as NULL.

key_len

Key_len: indicates the length (bytes) of the index used in the query. In principle, the shorter the length is, the better.

A single-column index takes the entire index length into account. A multi-column index, not all columns are available, needs to calculate the columns actually used in the query.

ref

Ref: const, func, null

Display const when using constant equivalents. In an associated query, the associated fields of the corresponding associated table are displayed. If the query condition uses expressions, functions, or if the condition column is implicitly converted, it may appear as FUNc. In other cases, null is displayed.

rows

Rows: Estimates the number of rows that need to be read to find the records we need, based on table statistics and index usage.

This is an important data to evaluate the performance of SQL, mysql needs to scan the number of rows, very intuitive display of SQL performance, in general, the smaller the rows value is the better.

filtered

Filtered is a percentage. In simple terms, this field represents the percentage of records returned by the storage engine that satisfy the criteria after the data is filtered. (After MySQL.5.7, default Explain directly displays filtered data for partitions and filtered data).

Extra

This column displays a lot of information that is not suitable for other columns, and a lot of additional information from Explain is displayed in this field:

  • Using index: we used an overwrite index in the corresponding select operation. Generally speaking, it means that the column of the query is overwritten by the index. Using index can be very fast.
EXPLAIN SELECT id FROM s_goods;Copy the code

  • Using WHERE: Not all queries with WHERE statements display Using WHERE when no index is available.
EXPLAIN SELECT * FROM s_goods WHERE cat_id = 1;Copy the code

  • Using temporary: Indicates that the query results need to be stored in a temporary table. This table is used for sorting or group query.
EXPLAIN SELECT shop_id FROM s_goods WHERE cat_id IN (1,2,3,4) GROUP BY shop_id;Copy the code

  • Using filesort: indicates a sort operation that cannot be done Using an index, i.e. an ORDER BY column that does not have an index.
EXPLAIN SELECT * FROM s_goods ORDER BY shop_id;Copy the code

  • Using join buffer: Displays this if the associated field does not use an index.
EXPLAIN SELECT * FROM s_goods a LEFT JOIN s_goods_cats b ON a.shop_id = b.parent_id;Copy the code

  • Impossible WHERE: Indicates that there are no eligible rows when we use the incorrect WHERE statement.
EXPLAIN SELECT * FROM s_goods WHERE 1 <> 1;Copy the code

  • No tables Used: Our query does not have a FROM clause or a FROM DUAL clause.
EXPLAIN SELECT NOW();Copy the code

Write in the last

First of all, thank you for being here.

If there is any understanding of this article is not in place, or the explanation is not correct, welcome to comment!

Ps: If you have any other questions, you can also scan and find me. Welcome to talk to me