Analyze MySQL SQL execution with Explain commands

In the previous article, we talked about row locks being added to indexes. However, complex SQL often contains multiple conditions and multiple indexes. It is important to find out which indexes are used in SQL execution to analyze lock scenarios.

For example, SQL like this:

mysql> delete from t1 where id = 1 or val = 1
Copy the code

Where id and val are indexes, which indexes are used and which locks are applied? To do this, we need to use Explain to get the execution plan for MySQL to execute this SQL.

What is an execution plan? In simple terms, it is the performance of SQL executed in the database. It is usually used in SQL performance analysis, optimization, and locking analysis scenarios. The execution process is generated by the parser, preprocessor, and query optimizer during MySQL query.

MySQL > select * from ‘MySQL’;

If you can understand how MySQL optimizes and executes queries, it will be helpful not only to optimize queries, but also to determine the final locking scenario by analyzing the indexes used.

The following figure shows MySQL performing a query. Each step is actually more complicated than you think, and the optimizer, in particular, is more complicated and harder to understand. This article only gives a brief introduction.

The MySQL query process is as follows:

The client sends a query to the server.
The server first checks the query cache, and if a hit is made, it immediately returns the result stored in the cache. Otherwise move on to the next stage.
SQL is parsed and preprocessed on the server, and then the optimizer generates the corresponding execution plan.
MySQL then calls the storage engine API to execute the query based on the execution plan generated by the optimizer.
Returns the result to the client.

The execution plan

MySQL parses queries, creates internal data structures (parse trees), and optimizes them in a variety of ways, including rewriting queries, determining the order in which tables are read, selecting appropriate indexes, and so on.

The user can influence the optimizer’s decision process by giving it a keyword hint. You can also use Explain to understand how the database makes optimization decisions and provide a reference for users to reconstruct the schema of the query and database tables and modify the database configuration to make the query as efficient as possible.

Let’s take a look at the relevant output parameters in Explain and explain what they mean with practical examples.

select_type

The operation types for querying data are as follows

Simple Simple queries, which do not contain subqueries or unions, as shown in the figure below, are the simplest query statements.

Primary is the value that the outermost query is marked with when the SQL contains complex subqueries.
Derived is when the subqueries contained in the FROM clause of SQL are labeled with that value, and MySQL executes those subqueries recursively, putting the results in temporary tables. The following figure illustrates both types.

Subquery is the subquery that SQL contains in select or WHERE and is marked with this value.

Dependent subquery: The first select in a subquery, dependent on the outlying query, usually in.

Union is the second SELECT in SQL that appears after the union keyword and is marked with this value; If union is contained in a subquery from, the outer select is marked as derived.
Union result Select the result from the union table. The following figure shows the SQL case for union and Union result.

The dependent Union is the second or subsequent select statement after the union keyword and, like the dependent subquery, depends on the external query.

type

The join types of the table, in descending order of performance, are system, const, eq_ref, ref, range, index, and all.

System indicates that the table has only one row, which is equivalent to the system table. As shown in the figure below, the table join type of primary is System because the table derived from the subquery from has only one row of data.

Const is found by index at a time, matches only one row of data, and is used for constant value comparisons between PRIMARY keys or UNIQUE indexes.

Eq_ref Unique index scan. For each index key, only one record in the table matches it. Usually used for primary key or unique index scan. For each combination of rows from the preceding table, one row is read from the table. It is the best connection type besides const.

As shown in the figure below, the query type for table T1 is ALL, indicating full table scan. Then, each row in T1 is compared with the primary key index t2.id, so the query for table T2 is eq_ref.

Ref non-unique index scan, which returns all rows matching a single value, differs from eq_REF in that the index is non-unique, as shown in the following example.

Range only examines rows in a given range, using an index to select rows when using operators such as =, between, >, <, and in, and using constants to compare key columns. As shown in the figure below, where ID is the unique index and val is non-unique.

Index is similar to ALL except that it traverses the index tree to read the index value, which is slightly faster than ALL because index files are usually smaller than data files. MySQL index overwrite is involved here
ALL full table scan generally has poor performance and should be avoided.

Possible_keys, key, and key_len

The possible_key column indicates which index MySQL might use to look up in the table. If the column is NULL, the associated index is not used. You need to check where clause conditions to create appropriate indexes to improve query efficiency.

The key column shows the index that MySQL actually decides to use. If no index is selected, the value is NULL.

Key_len displays the length of the index that MySQL decides to use. If the key is NULL, the column is also NULL, and the index length used is as short as possible to ensure accuracy. Because shorter, index files are smaller and require fewer I/O times.

Select * from T2 where id = 1 or val = 1; select * from T2 where id = 1; idx_t2_val;

These are the fields we care about most when we analyze locking scenarios, and a future article will show you how to use these fields in conjunction with other tools to determine which locks have been added to complex SQL.

ref

The REF column indicates which column or constant of the other table is used to select rows from the table. Id = t1.id, so ref is mysql.t1.id

Rows and filtered

The rows column shows the number of rows that MySQL thinks it must check when performing a query.

Filtered list Indicates the percentage of the number of rows returned after SQL statements are executed to the number of read rows. The greater the value, the better. MySQL uses Table Filter to read rows for filtering. Theoretically, the efficiency is highest when the number of rows read is equal to the number of rows returned. The higher the filtering ratio, the lower the efficiency.

As shown in the figure above, there are three rows in table T1, and rows are 3, indicating that all rows are read out. Filtered table filter val = 3 returns only one row of data, so filtered table filter is 33.33%.

extra

Contains additional information that is not suitable for display in other columns, but is important. Common values are as follows

Using index indicates that the select operation uses an overwrite index to avoid accessing rows of the table, which is efficient.
The using WHERE clause is used to restrict which line. That is, Table Filter is used to Filter data.

Select * from extra can also be used to read data using an overwrite index, as shown in the following figure. Because only three rows are read using the val index, filtered through the WHERE clause, which is 55%, extra uses using WHERE.

Using Filesort MySQL uses an external index sort for the data, rather than reading the data in the index order within the table. If this value occurs, you should optimize the SQL statement. As shown in the figure below, the val column has no index, so it cannot be read using index ordering.

Use temporary tables to hold intermediate results. For example, MySQL uses temporary tables to sort query results, often for order BY and group BY. If this value is present, the SQL should be optimized. In my experience, temporary tables are generated when group by is an unindexed column, or when the ORDER by or group by column is not the first table from the JOIN statement sequence.
Using join buffer uses the join buffer. The figure below shows the connection cache and temporary tables. About the content of the connection cache, you can consult, there is time to write an article to explain.

When DISTINCT finds the first match, it stops searching for more rows for the current row combination

Afterword.

After knowing the SQL execution plan through Explain, we can not only understand the index used in SQL execution, determine the locking scenario, but also optimize the SQL for other information, such as optimizing the type from index to ref.

Personal blog, welcome to play

Analyze MySQL SQL execution with Explain commands

MySQL > select * from ‘MySQL’;

The execution plan

select_type

type

Possible_keys, key, and key_len

ref

Rows and filtered

extra

Afterword.

Related Posts

Debug techniques in IDEA

Reids(4) – Magic HyperLoglog solves statistical problems

TCP communication process and tcpdump packet capture analysis