Recently, I checked the TiDB slow SQL in the project. When QUERYING the execution plan, I found that the details of TiDB execution plan are still different from those of mysql. Today, I will learn and share the details

Query Plan Command

EXPLAIN command, you can view the execution plan of TiDB when executing SQL

The EXPLAIN SQL statementsCopy the code

For example (desensitization data)

Execute EXPLAIN

EXPLAIN 
select
  a0_.id,
  a0_.create_time,
  a0_.end_time,
  a0_.flow_id,
  a0_.campaign_id,
  a0_.unit_id,
  a0_.oa_id,
  a0_.org_path_,
  a0_.param,
  a0_.start_time,
  a0_.state,
  a0_.user_type,
  a0_.update_time,
  a0_.user_id
from
  table_a a0_
where
  a0_.campaign_id = 354361236223
  and a0_.user_id = 25325123
  and a0_.user_type = 1
  and a0_.param = '1'
limit
  1000
Copy the code

Execution Plan Result

The execution plan is presented in a tree structure. Let’s talk about what each column means:

  • idIs an operator, and is a subtask that needs to be executed at each step during SQL execution
  • estRowsEstimate the number of rows to process for each subtask
  • taskIs the position of the subtask when it is executed
  • access-objectObjects placed by subtasks, such as tables, indexes, etc
  • operator infoSome information about the execution of subtasks can be regarded as operation logs

I’m going to focus on operators today

Id: indicates an operator. It is a sub-task that needs to be executed at each step during SQL execution

An operator is a specific step performed to return the result of a query

TiDB operators are mainly divided into two types: the first type is the operators of scanning tables, and the second type is the operators of gathering scanned data or calculation results on TiKV/TiFlash

Operators of the first class: operators that scan tables

Table sweep operators have the following classes:

TableFullScan: Full table scan

If an index is not used in a general query condition or is invalid, TableFullScan will appear in the execution plan

TableFullScan chestnuts:

select
  * 
from
  tablea a0_
Copy the code

SQL > select * from table_name where table_name ();

See the execution plan, appearTableFullScanId forTableFullScan+ indicates that the subtask performed in this step performs a full table scan

IndexFullScan: Scans index data in full

IndexFullScan chestnut 1: aggregate query IndexFullScan chestnut, use COUNT:

select
  COUNT(user_id)
from
  tablea a0_
Copy the code

SQL > select COUNT from user_id; select COUNT from user_id; select COUNT from user_id;

See the execution plan, appearIndexFullScanId forIndexFullScan+ an ordinal indicating that the subtask performed in this step is performing the index columnuser_idA scan of the full index data was performed

Query IndexFullScan chestnut using group by:

select
  user_id
from
  tablea a0_
GROUP by
  user_id
Copy the code

SQL > select * from user_id where group by is used for index column user_id;

See the execution plan, appearIndexFullScanId forIndexFullScan+ an ordinal indicating that the subtask performed in this step is performing the index columnuser_idA scan of the full index data was performed

IndexFullScan chestnut 3: Aggregate query IndexFullScan chestnut using min function:

select
  MIN(user_id)
from
  tablea a0_
Copy the code

SQL > select * from user_id; select * from user_id; select * from user_id;

See the execution plan, appearIndexFullScanId forIndexFullScan+ an ordinal indicating that the subtask performed in this step is performing the index columnuser_idA scan of the full index data was performed

IndexFullScan chestnut 4: Subquery IndexFullScan chestnut, using the index IN subquery, when the subquery is full:

select
  *
from
  tablea a0_
where
  user_id IN (
    select
      user_id
    from
      tablea
  )
Copy the code

Select * from user_id; select * from user_id; select * from user_id; select * from user_id;

Select * from TableFullScan_49; select * from TableFullScan_49; select * from TableFullScan_49; select * from TableFullScan_49; select * from TableFullScan_49; Appear IndexFullScan

Query IndexFullScan chestnut 5: JOIN query IndexFullScan chestnut, use left JOIN, when the left join table is full data:

select
  a0_.*,
  a1_.*
from
  tablea a0_
  LEFT JOIN (
    select
      *
    from
      tablea
  ) as a1_ ON a0_.user_id = a1_.user_id
Copy the code

In this SQL, user_id is used for left JOIN. When the left linked table is scanned for all index data, IndexFullScan operator will appear, and the execution plan is as follows:

To look at the execution plan, the left row table is a full table scan, so the index column will beuser_idPerform a scan of full index dataIndexFullScan

TableRowIDScan: According to the rowId scan table data transmitted from the upper layer, in popular terms, the query first obtains the rowId through the index, and then reads the data according to the rowId

According to the rowId scan table data passed from the upper layer, generally speaking, it is to query the index to obtain THE rowId, read data according to the rowId, TableRowIDScan will appear in the execution plan, for example

TableRowIDScan chestnuts:

select
  * 
from
  tablea a1_
where
  a1_.user_id = 123214125
Copy the code

For a simple SQL look, the execution plan is as follows:

Because the index column user_id is used, rowId is obtained from the index, and rowId is used to read the table data. Therefore, TableRowIDScan appears in the execution plan, and the ID is TableRowIDScan + a serial number, indicating that, The subtask performed in this step is the rowId scan table data obtained by sending indexes

IndexRangeScan: Index data scan with range

Index data scan with range, use this chestnut

TableRowIDScan chestnuts:

select
  * 
from
  tablea a1_
where
  a1_.user_id = 123214125
Copy the code

For a simple SQL look, the execution plan is as follows:

Because the range query is used for index column user_id, IndexRangeScan is displayed in the execution plan. If the ID is IndexRangeScan +, the sub-task of this step is to scan index data with a range

The second type of operator: the operator that aggregates scanned data or calculated results on TiKV/TiFlash

The operators of the data aggregation class are as follows:

TableReader: Summarizes the data obtained by TableFullScan or TableRangeScan

Summarize the data obtained by TableFullScan or TableRangeScan

TableReader converts full table scan TableFullScan

select
  * 
from
  tablea a1_
Copy the code

SQL > select * from table_name where table_name ();See the execution plan, because there is no index query, a full table scan, appearTableFullScanSo it was finally usedTableReaderOperator, for the full table scan data are summarized

IndexReader: Summarizes data obtained by the upper and lower table scan operators IndexFullScan or IndexRangeScan

The data obtained by the upper and lower table scan operators IndexFullScan or IndexRangeScan are summarized

IndexReader converges full index scan chestnut:

select
  MIN(user_id)
from
  tablea a0_
Copy the code

As the min function is used on the index column, the full index will be scanned and IndexFullScan operator will appear. Therefore, IndexReader operator will be used to aggregate the data obtained by IndexFullScan operator. The execution plan is as follows:

IndexLookUp

First collect the RowID scanned by TiKV at Build end, and then Probe end to read the data on TiKV accurately according to these RowID. Build is an operator of type IndexFullScan or IndexRangeScan, and Probe is an operator of type TableRowIDScan

IndexLookUp chestnuts:

select
  * 
from
  tablea a1_
where
  a1_.user_id = 123214125
Copy the code

The implementation plan is as follows:

Look at this SQL, it is a column by indexuser_idThe index range scan is performed, as described above, by performing the same logic as the index columnuser_idA range scan was performed to get all the eligiblerowIdAnd then throughrowIdScan the table to get the data, see the execution is also, first inBuildEnd,IndexRangeScanOperator, for indexed columnsuser_idWe ran a range scan. We got itrowIdIn theProbeEnd, in passingTableRowIDScanOperator, throughrowIdScan the table for data, and finally passIndexLookUpOperator to aggregate the final data

3. Sequence of the operator execution
  • The structure of the operator is a tree, but in the execution of the query, it is not strictly required that the child node tasks be completed before the parent node. Moreover, TiDB executes the nodes of the same query in parallel
  • Again, the above SQL is chestnut
select
  a0_.*,
  a1_.*
from
  tablea a0_
  LEFT JOIN (
    select
      *
    from
      tablea
  ) as a1_ ON a0_.user_id = a1_.user_id
Copy the code

The implementation plan is as follows:At every level,Build clientAlways precededThe Probe tipExecute, andBuild clientAlways present inThe Probe tipIn front of the

The operator in TiDB execution plan is enough for you, and the following information will be filled in for you, etc. Welcome to exchange, point out some mistakes in the article, let me deepen my understanding.