Then a TiDB execution plan (a), the last article mainly introduces the operator involved in the execution plan, today the rest of the execution plan to finish

Query Plan Command

EXPLAIN command, you can view the execution plan of TiDB when executing SQL

The EXPLAIN SQL statementsCopy the code

For example (desensitization data)

Execute EXPLAIN

EXPLAIN 
select
  a0_.id,
  a0_.create_time,
  a0_.end_time,
  a0_.flow_id,
  a0_.campaign_id,
  a0_.unit_id,
  a0_.oa_id,
  a0_.org_path_,
  a0_.param,
  a0_.start_time,
  a0_.state,
  a0_.user_type,
  a0_.update_time,
  a0_.user_id
from
  table_a a0_
where
  a0_.campaign_id = 354361236223
  and a0_.user_id = 25325123
  and a0_.user_type = 1
  and a0_.param = '1'
limit
  1000
Copy the code

Execution Plan Result

The execution plan is presented in a tree structure. Let’s talk about what each column means:

  • idIs an operator, and is a subtask that needs to be executed at each step during SQL execution
  • estRowsEstimate the number of rows to process for each subtask
  • taskIs the position of the subtask when it is executed
  • access-objectObjects of subtasks, such as tables, indexes, and so on
  • operator infoSome information about the execution of subtasks can be regarded as operation logs

The last article has the final say. Today, I will talk about the remaining fields in the execution planestRows,task,access-object,operator infoThe meaning of it

estRows: Estimate the number of rows to process for each subtask
This one’s easy to understand. Just go straight to chestnuts
select
  user_id
from
  tablea a0_
GROUP by
  user_id
Copy the code

SQL > select * from user_id where group by is used for index column user_id;

  • Because the execution plan of this SQL is for index columns firstuser_idA full scan of index data was performed, usingIndexFullScanOperator, soIndexFullScan_11The estimated number of lines of the operator at this stepestRowsIs indexed columnuser_idThe data amount of full data, 133270314
  • The next step in the SQL execution is throughIndexReaderThe operator performs an aggregation of the data of the lower operator, soIndexReader_13The estimated number of lines of the operatorestRowsisuser_idGroup by, 873229.35, 873229.35
task: indicates the location of the subtask when it is executed
  • Is the position of the subtask when it is executed
  • There are two main types
  • cop“Means to useTiKVIn theCoprocessorSupports most functions (including aggregate functions and scalar functions),LIMITOperation, index scan, and table scan
  • root, it is to point to inTiDBComputing tasks performed in, generally all aggregatedTiKV/TiFlashThe data scanned on or the operator of the calculation result can only be used asrootTask inTiDBOn execution, all of themJoinOperations are only asrootTask inTiDBPerformed on the
  • One of the goals of SQL optimization for TiDB is to push calculations as far down as possibleTiKVPerformed in the
Take a chestnut

Chestnut 1: aggregate query chestnut, using COUNT:

select
  COUNT(user_id)
from
  tablea a0_
Copy the code

SQL > select COUNT from user_id; select COUNT from user_id; select COUNT from user_id;

  • For indexed columnsuser_idUsing theThe COUNT function, the index column is first evaluateduser_idPerform full scan of index data,IndexFullScan_19The execution position of the operator iscop[tikv]
  • Subsequent executionsSteamAgg_8Operator, because it’s an aggregate function, will also becop[tikv]Performed on the
  • The finalIndexReader_21The operator performs an aggregation of the data of the underlying operator atrootThat isTiDBPerformed in the

2: aggregate query chestnut, using group by:

select
  user_id
from
  tablea a0_
GROUP by
  user_id
Copy the code

SQL > select * from user_id where group by is used for index column user_id;

  • For indexed columnsuser_idUsing thegroup by, the index column is first evaluateduser_idPerform full scan of index data,IndexFullScan_11The execution position of the operator iscop[tikv]
  • Subsequent executionsHashAgg_5Operator, because omega is omegagroup by, also can be incop[tikv]Performed on the
  • The finalIndexReader_13The operator performs an aggregation of the data of the underlying operator atrootThat isTiDBPerformed in the

Chestnut 3: subquery chestnut, using index IN subquery, when the subquery is full:

select
  *
from
  tablea a0_
where
  user_id IN (
    select
      user_id
    from
      tablea
  )
Copy the code

Select * from user_id; select * from user_id; select * from user_id; select * from user_id;

  • First of all, the subquery is a full table scan with no conditionTableFullScan_49, since the subquery is full data, will be incop[tikv]Performed on the
  • Aggregate the results of subqueriesTableReader_50Will be inrootThat isTiDBPerformed in the
  • Look where plan 1 is executed, when the outer SQL is on the index columnuser_idWhen In is performed, the index column is evaluateduser_idPerform a scan of full index data,IndexFullScan_40Will be incop[tikv]Performed on the
  • The same aggregation operator at position oneIndexReader_42inrootThat isTiDBPerformed in the
  • The finalHashJoin_22The operator performs an aggregation of the data of the underlying operator atrootThat isTiDBPerformed in the
access-object: objects of subtasks, such as tables, indexes, and so on
This one’s easy to understand. Just go straight to chestnuts
select
  * 
from
  tablea a1_
where
  a1_.user_id = 123214125
Copy the code

The implementation plan is as follows:

  • Look at this SQL, it is a column by indexuser_idThe index range scan is performed, and its execution logic is to first pass the index columnuser_idA range scan was performed to get all the eligiblerowIdAnd then throughrowIdScan the table to get the data, see the execution is also, first inBuildEnd,IndexRangeScanOperator, for indexed columnsuser_idWe ran a range scan. We got itrowIdIn theProbeEnd, in passingTableRowIDScanOperator, throughrowIdScan the table for data, and finally passIndexLookUpOperator to aggregate the final data
  • Look at this SQL execution planaccess-object
  • First of all inBuildEnd,IndexRangeScan_8(Build)Operator, for indexed columnsuser_idRange scan is performed, so the object of the operator istable:a0_,index:idx_user_id(user_id), meaning that the object of operation is a tablea0_The index of theidx_user_id(user_id)
  • It is then obtained by a range scan indexrowIdScan the table for data, soTableRowIDScan_9(Probe)The object of the operator is a tablea0_
operator info: Indicates operation logs during the execution of subtasks
This is very easy to understand, basically every step of the operation log, do not lift the chestnut, from the original chestnut can see the understand

TiDB execution plan operator for everyone said here, welcome everyone to exchange, point out some of the paper said wrong place, let me deepen understanding.