Then a TiDB execution plan (a), the last article mainly introduces the operator involved in the execution plan, today the rest of the execution plan to finish
Query Plan Command
EXPLAIN command, you can view the execution plan of TiDB when executing SQL
The EXPLAIN SQL statementsCopy the code
For example (desensitization data)
Execute EXPLAIN
EXPLAIN
select
a0_.id,
a0_.create_time,
a0_.end_time,
a0_.flow_id,
a0_.campaign_id,
a0_.unit_id,
a0_.oa_id,
a0_.org_path_,
a0_.param,
a0_.start_time,
a0_.state,
a0_.user_type,
a0_.update_time,
a0_.user_id
from
table_a a0_
where
a0_.campaign_id = 354361236223
and a0_.user_id = 25325123
and a0_.user_type = 1
and a0_.param = '1'
limit
1000
Copy the code
Execution Plan Result
The execution plan is presented in a tree structure. Let’s talk about what each column means:
id
Is an operator, and is a subtask that needs to be executed at each step during SQL executionestRows
Estimate the number of rows to process for each subtasktask
Is the position of the subtask when it is executedaccess-object
Objects of subtasks, such as tables, indexes, and so onoperator info
Some information about the execution of subtasks can be regarded as operation logs
The last article has the final say. Today, I will talk about the remaining fields in the execution planestRows
,task
,access-object
,operator info
The meaning of it
estRows
: Estimate the number of rows to process for each subtask
This one’s easy to understand. Just go straight to chestnuts
select
user_id
from
tablea a0_
GROUP by
user_id
Copy the code
SQL > select * from user_id where group by is used for index column user_id;
- Because the execution plan of this SQL is for index columns first
user_id
A full scan of index data was performed, usingIndexFullScan
Operator, soIndexFullScan_11
The estimated number of lines of the operator at this stepestRows
Is indexed columnuser_id
The data amount of full data, 133270314 - The next step in the SQL execution is through
IndexReader
The operator performs an aggregation of the data of the lower operator, soIndexReader_13
The estimated number of lines of the operatorestRows
isuser_id
Group by, 873229.35, 873229.35
task
: indicates the location of the subtask when it is executed
- Is the position of the subtask when it is executed
- There are two main types
cop
“Means to useTiKV
In theCoprocessor
Supports most functions (including aggregate functions and scalar functions),LIMIT
Operation, index scan, and table scanroot
, it is to point to inTiDB
Computing tasks performed in, generally all aggregatedTiKV/TiFlash
The data scanned on or the operator of the calculation result can only be used asroot
Task inTiDB
On execution, all of themJoin
Operations are only asroot
Task inTiDB
Performed on the- One of the goals of SQL optimization for TiDB is to push calculations as far down as possible
TiKV
Performed in the
Take a chestnut
Chestnut 1: aggregate query chestnut, using COUNT:
select
COUNT(user_id)
from
tablea a0_
Copy the code
SQL > select COUNT from user_id; select COUNT from user_id; select COUNT from user_id;
- For indexed columns
user_id
Using theThe COUNT function
, the index column is first evaluateduser_id
Perform full scan of index data,IndexFullScan_19
The execution position of the operator iscop[tikv]
- Subsequent executions
SteamAgg_8
Operator, because it’s an aggregate function, will also becop[tikv]
Performed on the - The final
IndexReader_21
The operator performs an aggregation of the data of the underlying operator atroot
That isTiDB
Performed in the
2: aggregate query chestnut, using group by:
select
user_id
from
tablea a0_
GROUP by
user_id
Copy the code
SQL > select * from user_id where group by is used for index column user_id;
- For indexed columns
user_id
Using thegroup by
, the index column is first evaluateduser_id
Perform full scan of index data,IndexFullScan_11
The execution position of the operator iscop[tikv]
- Subsequent executions
HashAgg_5
Operator, because omega is omegagroup by
, also can be incop[tikv]
Performed on the - The final
IndexReader_13
The operator performs an aggregation of the data of the underlying operator atroot
That isTiDB
Performed in the
Chestnut 3: subquery chestnut, using index IN subquery, when the subquery is full:
select
*
from
tablea a0_
where
user_id IN (
select
user_id
from
tablea
)
Copy the code
Select * from user_id; select * from user_id; select * from user_id; select * from user_id;
- First of all, the subquery is a full table scan with no condition
TableFullScan_49
, since the subquery is full data, will be incop[tikv]
Performed on the - Aggregate the results of subqueries
TableReader_50
Will be inroot
That isTiDB
Performed in the - Look where plan 1 is executed, when the outer SQL is on the index column
user_id
When In is performed, the index column is evaluateduser_id
Perform a scan of full index data,IndexFullScan_40
Will be incop[tikv]
Performed on the - The same aggregation operator at position one
IndexReader_42
inroot
That isTiDB
Performed in the - The final
HashJoin_22
The operator performs an aggregation of the data of the underlying operator atroot
That isTiDB
Performed in the
access-object
: objects of subtasks, such as tables, indexes, and so on
This one’s easy to understand. Just go straight to chestnuts
select
*
from
tablea a1_
where
a1_.user_id = 123214125
Copy the code
The implementation plan is as follows:
- Look at this SQL, it is a column by index
user_id
The index range scan is performed, and its execution logic is to first pass the index columnuser_id
A range scan was performed to get all the eligiblerowId
And then throughrowId
Scan the table to get the data, see the execution is also, first inBuild
End,IndexRangeScan
Operator, for indexed columnsuser_id
We ran a range scan. We got itrowId
In theProbe
End, in passingTableRowIDScan
Operator, throughrowId
Scan the table for data, and finally passIndexLookUp
Operator to aggregate the final data - Look at this SQL execution plan
access-object
- First of all in
Build
End,IndexRangeScan_8(Build)
Operator, for indexed columnsuser_id
Range scan is performed, so the object of the operator istable:a0_,index:idx_user_id(user_id)
, meaning that the object of operation is a tablea0_
The index of theidx_user_id(user_id)
- It is then obtained by a range scan index
rowId
Scan the table for data, soTableRowIDScan_9(Probe)
The object of the operator is a tablea0_
operator info
: Indicates operation logs during the execution of subtasks
This is very easy to understand, basically every step of the operation log, do not lift the chestnut, from the original chestnut can see the understand
TiDB execution plan operator for everyone said here, welcome everyone to exchange, point out some of the paper said wrong place, let me deepen understanding.