Recently, I checked the TiDB slow SQL in the project. When QUERYING the execution plan, I found that the details of TiDB execution plan are still different from those of mysql. Today, I will learn and share the details
Query Plan Command
EXPLAIN command, you can view the execution plan of TiDB when executing SQL
The EXPLAIN SQL statementsCopy the code
For example (desensitization data)
Execute EXPLAIN
EXPLAIN
select
a0_.id,
a0_.create_time,
a0_.end_time,
a0_.flow_id,
a0_.campaign_id,
a0_.unit_id,
a0_.oa_id,
a0_.org_path_,
a0_.param,
a0_.start_time,
a0_.state,
a0_.user_type,
a0_.update_time,
a0_.user_id
from
table_a a0_
where
a0_.campaign_id = 354361236223
and a0_.user_id = 25325123
and a0_.user_type = 1
and a0_.param = '1'
limit
1000
Copy the code
Execution Plan Result
The execution plan is presented in a tree structure. Let’s talk about what each column means:
id
Is an operator, and is a subtask that needs to be executed at each step during SQL executionestRows
Estimate the number of rows to process for each subtasktask
Is the position of the subtask when it is executedaccess-object
Objects placed by subtasks, such as tables, indexes, etcoperator info
Some information about the execution of subtasks can be regarded as operation logs
I’m going to focus on operators today
Id: indicates an operator. It is a sub-task that needs to be executed at each step during SQL execution
An operator is a specific step performed to return the result of a query
TiDB operators are mainly divided into two types: the first type is the operators of scanning tables, and the second type is the operators of gathering scanned data or calculation results on TiKV/TiFlash
Operators of the first class: operators that scan tables
Table sweep operators have the following classes:
TableFullScan: Full table scan
If an index is not used in a general query condition or is invalid, TableFullScan will appear in the execution plan
TableFullScan chestnuts:
select
*
from
tablea a0_
Copy the code
SQL > select * from table_name where table_name ();
See the execution plan, appearTableFullScan
Id forTableFullScan
+ indicates that the subtask performed in this step performs a full table scan
IndexFullScan: Scans index data in full
IndexFullScan chestnut 1: aggregate query IndexFullScan chestnut, use COUNT:
select
COUNT(user_id)
from
tablea a0_
Copy the code
SQL > select COUNT from user_id; select COUNT from user_id; select COUNT from user_id;
See the execution plan, appearIndexFullScan
Id forIndexFullScan
+ an ordinal indicating that the subtask performed in this step is performing the index columnuser_id
A scan of the full index data was performed
Query IndexFullScan chestnut using group by:
select
user_id
from
tablea a0_
GROUP by
user_id
Copy the code
SQL > select * from user_id where group by is used for index column user_id;
See the execution plan, appearIndexFullScan
Id forIndexFullScan
+ an ordinal indicating that the subtask performed in this step is performing the index columnuser_id
A scan of the full index data was performed
IndexFullScan chestnut 3: Aggregate query IndexFullScan chestnut using min function:
select
MIN(user_id)
from
tablea a0_
Copy the code
SQL > select * from user_id; select * from user_id; select * from user_id;
See the execution plan, appearIndexFullScan
Id forIndexFullScan
+ an ordinal indicating that the subtask performed in this step is performing the index columnuser_id
A scan of the full index data was performed
IndexFullScan chestnut 4: Subquery IndexFullScan chestnut, using the index IN subquery, when the subquery is full:
select
*
from
tablea a0_
where
user_id IN (
select
user_id
from
tablea
)
Copy the code
Select * from user_id; select * from user_id; select * from user_id; select * from user_id;
Select * from TableFullScan_49; select * from TableFullScan_49; select * from TableFullScan_49; select * from TableFullScan_49; select * from TableFullScan_49; Appear IndexFullScan
Query IndexFullScan chestnut 5: JOIN query IndexFullScan chestnut, use left JOIN, when the left join table is full data:
select
a0_.*,
a1_.*
from
tablea a0_
LEFT JOIN (
select
*
from
tablea
) as a1_ ON a0_.user_id = a1_.user_id
Copy the code
In this SQL, user_id is used for left JOIN. When the left linked table is scanned for all index data, IndexFullScan operator will appear, and the execution plan is as follows:
To look at the execution plan, the left row table is a full table scan, so the index column will beuser_id
Perform a scan of full index dataIndexFullScan
TableRowIDScan: According to the rowId scan table data transmitted from the upper layer, in popular terms, the query first obtains the rowId through the index, and then reads the data according to the rowId
According to the rowId scan table data passed from the upper layer, generally speaking, it is to query the index to obtain THE rowId, read data according to the rowId, TableRowIDScan will appear in the execution plan, for example
TableRowIDScan chestnuts:
select
*
from
tablea a1_
where
a1_.user_id = 123214125
Copy the code
For a simple SQL look, the execution plan is as follows:
Because the index column user_id is used, rowId is obtained from the index, and rowId is used to read the table data. Therefore, TableRowIDScan appears in the execution plan, and the ID is TableRowIDScan + a serial number, indicating that, The subtask performed in this step is the rowId scan table data obtained by sending indexes
IndexRangeScan: Index data scan with range
Index data scan with range, use this chestnut
TableRowIDScan chestnuts:
select
*
from
tablea a1_
where
a1_.user_id = 123214125
Copy the code
For a simple SQL look, the execution plan is as follows:
Because the range query is used for index column user_id, IndexRangeScan is displayed in the execution plan. If the ID is IndexRangeScan +, the sub-task of this step is to scan index data with a range
The second type of operator: the operator that aggregates scanned data or calculated results on TiKV/TiFlash
The operators of the data aggregation class are as follows:
TableReader: Summarizes the data obtained by TableFullScan or TableRangeScan
Summarize the data obtained by TableFullScan or TableRangeScan
TableReader converts full table scan TableFullScan
select
*
from
tablea a1_
Copy the code
SQL > select * from table_name where table_name ();See the execution plan, because there is no index query, a full table scan, appearTableFullScan
So it was finally usedTableReader
Operator, for the full table scan data are summarized
IndexReader: Summarizes data obtained by the upper and lower table scan operators IndexFullScan or IndexRangeScan
The data obtained by the upper and lower table scan operators IndexFullScan or IndexRangeScan are summarized
IndexReader converges full index scan chestnut:
select
MIN(user_id)
from
tablea a0_
Copy the code
As the min function is used on the index column, the full index will be scanned and IndexFullScan operator will appear. Therefore, IndexReader operator will be used to aggregate the data obtained by IndexFullScan operator. The execution plan is as follows:
IndexLookUp
First collect the RowID scanned by TiKV at Build end, and then Probe end to read the data on TiKV accurately according to these RowID. Build is an operator of type IndexFullScan or IndexRangeScan, and Probe is an operator of type TableRowIDScan
IndexLookUp chestnuts:
select
*
from
tablea a1_
where
a1_.user_id = 123214125
Copy the code
The implementation plan is as follows:
Look at this SQL, it is a column by indexuser_id
The index range scan is performed, as described above, by performing the same logic as the index columnuser_id
A range scan was performed to get all the eligiblerowId
And then throughrowId
Scan the table to get the data, see the execution is also, first inBuild
End,IndexRangeScan
Operator, for indexed columnsuser_id
We ran a range scan. We got itrowId
In theProbe
End, in passingTableRowIDScan
Operator, throughrowId
Scan the table for data, and finally passIndexLookUp
Operator to aggregate the final data
3. Sequence of the operator execution
- The structure of the operator is a tree, but in the execution of the query, it is not strictly required that the child node tasks be completed before the parent node. Moreover, TiDB executes the nodes of the same query in parallel
- Again, the above SQL is chestnut
select
a0_.*,
a1_.*
from
tablea a0_
LEFT JOIN (
select
*
from
tablea
) as a1_ ON a0_.user_id = a1_.user_id
Copy the code
The implementation plan is as follows:At every level,Build client
Always precededThe Probe tip
Execute, andBuild client
Always present inThe Probe tip
In front of the
The operator in TiDB execution plan is enough for you, and the following information will be filled in for you, etc. Welcome to exchange, point out some mistakes in the article, let me deepen my understanding.