case

Create the DDL for the table

CREATE TABLE 'employees' (' id' INT(10) NOT NULL AUTO_INCREMENT, 'name' VARCHAR(24) NOT NULL DEFAULT COMMENT 'COLLATE 'utf8_general_ci', 'age' INT(10) NOT NULL DEFAULT '0' COMMENT 'age ', 'position' VARCHAR(20) NOT NULL DEFAULT COMMENT 'COLLATE 'utf8_general_ci', 'hire_time' TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'CURRENT_TIMESTAMP ', PRIMARY KEY (' id') USING BTREE, INDEX `idx_name_age_position` (`name`, `age`, `position`) USING BTREE, INDEX 'idx_age' (' age ') USING BTREE) COMMENT=' table '; # random number (randomly generated employee age) drop function if exists' rand_num '; delimiter ;; create function `rand_num`( `start_num` integer, `end_num` integer ) returns int comment '' begin return floor(start_num + rand() * (end_num - start_num + 1)); end ;; Drop function if exists' rand_str '; delimiter ;; create definer = `root`@`localhost` function `rand_str`( `n` int ) returns varchar(255) begin declare chars_str varchar(100) default 'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz'; declare return_str varchar(100) default ''; declare i int default 0; while i < n do set return_str = concat(return_str, substring(chars_str, floor(1 + rand() * 52), 1)); set i = i + 1; end while; return return_str; end ;; Drop procedure if exists' insert_emp '; delimiter ;; create procedure `insert_emp`( in `max_num` int(10) ) begin declare i int default 0; set autocommit = 0; repeat set i = i+1; insert into `employees` (`name`, `age`, `position`, `hire_time`) values (rand_str(6), rand_num(20, 40), 'dev', now()); until i = max_num end repeat; commit; end ;; delimiter ; call insert_emp(10000);Copy the code

The index optimization

An index pushdown

For auxiliary syndication indexes (name,age,position), normally the left-most prefix rule is followed, SELECT * FROM employees WHERE name like ‘LiLei%’ AND age = 22 AND position =’manager’ SELECT * FROM employees WHERE name like ‘LiLei%’ AND age = 22 AND position =’manager’ Age and position in the resulting index row are unordered and cannot be used well. In previous versions of MySQL5.6, this query could only match entries in the federated index with names starting with ‘LiLei’, and then run the primary keys back to the table to find the corresponding entries in the primary key index.

MySQL 5.6 introduces index push-down optimization, which can judge all fields contained in the index during index traversal and filter out the unqualified records before returning to the table, effectively reducing the number of return to the table. With index push-down optimization, the above query matches an index whose name starts with ** ‘LiLei’ in the federated index, and filters age and position** in the index, then returns to the table with the primary key ID of the remaining index.

Index push-down can reduce the number of table returns. For InnoDB engine, table index push-down can only be used for secondary indexes. Innodb primary key index (clustered index) tree leaves store full row data, so index push-down does not reduce the effect of querying full row data.

Why does range lookup Mysql not use index push-down optimization? In most cases, the result set after filtering is small, so Mysql chose to use index push-down optimization for like KK%. Of course, this is not absolute, sometimes like KK% will not be pushed down index.

Several common scenarios

1. The first field in a federated index is a range lookup that does not walk the index

EXPLAIN SELECT * FROM employees WHERE name > 'SAN ZHANG' AND age = 22 AND position ='dev';
Copy the code

Mysql > alter table select * from table select * from table select * from table select * from table select * from table select * from table select * from table select * from table select *

2. Force index drop

EXPLAIN SELECT * FROM employees force index(idx_name_age_position) WHERE name > 'SAN ZHANG' AND age = 22 AND position ='dev';
Copy the code

SQL > alter table select * from table select * from table select * from table select * from table select * from table select * from table select *

#Disabling query caching
set global query_cache_size=0;

#The execution time is 0.1 secondsSELECT * FROM employees WHERE name > 'SAN ZHANG';
#The execution time is 0.15 secondsSELECT * FROM employees force index(idx_name_age_position) WHERE name > 'SAN ZHANG'; ;Copy the code

3. Overwrite index optimization

EXPLAIN SELECT name,age,position FROM employees WHERE name > 'SAN ZHANG' AND age = 22 AND position ='dev';
Copy the code

4. In and OR scan the index when the data volume is large, and select full table scan when the data record is small

EXPLAIN SELECT name,age,position FROM employees WHERE name in ('SAN ZHANG', 'SI Li', 'MAZI WQNAG', 'LIU ZHAO') AND age = 22 AND position ='dev';
Copy the code

EXPLAIN SELECT name,age,position FROM employees WHERE 
(name = 'SAN ZHANG' or name = 'SI Li' or name = 'MAZI WQNAG' or name = 'LIU ZHAO') 
AND age = 22 AND position ='dev';
Copy the code

Create an employees_temp table with a small number of records

CREATE TABLE `employees_temp` ( `id` INT(10) NOT NULL AUTO_INCREMENT, 'name' VARCHAR(24) NOT NULL DEFAULT 'COMMENT ',' age 'INT(10) NOT NULL DEFAULT '0' COMMENT' age ', 'position' VARCHAR(20) NOT NULL DEFAULT '' COMMENT ' 'hire_time' TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'CURRENT_TIMESTAMP ', PRIMARY KEY (' id') USING BTREE, INDEX 'idx_name_age_position' (' name ', 'age', 'position') COMMENT=' BTREE '; insert into `employees_temp`(name, age, position) values ('SAN ZHANG', 23, 'dev'); insert into `employees_temp`(name, age, position) values ('SI Li', 23, 'dev'); insert into `employees_temp`(name, age, position) values ('LIU ZHAO', 26, 'dev'); insert into `employees_temp`(name, age, position) values ('LING AI', 38, 'manager');Copy the code

EXPLAIN SELECT * FROM employees_temp WHERE name in (‘SAN ZHANG’, ‘SI Li’, ‘MAZI WQNAG’, ‘LIU ZHAO’) AND age = 22 AND position =’dev’;

EXPLAIN SELECT * FROM employees_temp WHERE (name = ‘SAN ZHANG’ or name = ‘SI Li’ or name = ‘MAZI WQNAG’ or name = ‘LIU ZHAO’) AND age = 22 AND position =’dev’;

5, like ‘SAN%’ usually select index

EXPLAIN SELECT name,age,position FROM employees WHERE name like 'SAN%' AND age = 22 AND position ='dev';
Copy the code

EXPLAIN SELECT * FROM employees WHERE name like 'SAN%' AND age = 22 AND position ='dev';
Copy the code

Select the appropriate index

Index analysis case

EXPLAIN select * from employees where name > 'a';
Copy the code

If you want to traverse the name joint index tree, and then need to search the final data in the primary key tree according to the traversed primary key value, the cost is higher than the full table scan, you can use overwrite index optimization, so that only need to traverse the name joint index tree can get all the results, as follows:

EXPLAIN select name,age,position from employees where name > 'a' ; 
Copy the code

EXPLAIN select * from employees where name > 'zzz' ;
Copy the code

Use the Trace Tool

Mysql > select * from ‘a’; mysql > select * from ‘ZZZ’; mysql > select * from ‘ZZZ’; Therefore, only temporary analysis SQL use, immediately after the use of trace tool closed:

set session optimizer_trace="enabled=on",end_markers_in_json=on;  --开启trace
select * from employees where name > 'a' order by position;
SELECT * FROM information_schema.OPTIMIZER_TRACE;

查看trace字段:
{
  "steps": [
    {
      "join_preparation": {  --第一阶段:SQL准备阶段,格式化sql
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `employees`.`id` AS `id`,`employees`.`name` AS `name`,`employees`.`age` AS `age`,`employees`.`position` AS `position`,`employees`.`hire_time` AS `hire_time` from `employees` where (`employees`.`name` > 'a') order by `employees`.`position`"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      "join_optimization": {  --第二阶段:SQL 优化阶段
        "select#": 1,
        "steps": [
          {
            "condition_processing": { --条件处理
              "condition": "WHERE",
              "original_condition": "(`employees`.`name` > 'a')",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "(`employees`.`name` > 'a')"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "(`employees`.`name` > 'a')"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "(`employees`.`name` > 'a')"
                }
              ] /* steps */
            } /* condition_processing */
          },
          {
            "substitute_generated_columns": {
            } /* substitute_generated_columns */
          },
          {
            "table_dependencies": [  --表依赖详情
              {
                "table": "`employees`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "ref_optimizer_key_uses": [
            ] /* ref_optimizer_key_uses */
          },
          {
            "rows_estimation": [  --预估表的访问成本
              {
                "table": "`employees`",
                "range_analysis": {
                  "table_scan": {    --全表扫描情况
                    "rows": 120085,  --扫描行数
                    "cost": 24372    --查询成本
                  } /* table_scan */,
                  "potential_range_indexes": [  --查询可能使用的索引
                    {
                      "index": "PRIMARY",  --主键索引
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_name_age_position", --辅助索引
                      "usable": true,
                      "key_parts": [
                        "name",
                        "age",
                        "position",
                        "id"
                      ] /* key_parts */
                    }
                  ] /* potential_range_indexes */,
                  "setup_range_conditions": [
                  ] /* setup_range_conditions */,
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_group_by_or_distinct"
                  } /* group_index_range */,
                  "analyzing_range_alternatives": {  --分析各个索引使用成本
                    "range_scan_alternatives": [
                      {
                        "index": "idx_name_age_position",
                        "ranges": [
                          "a < name"  --索引使用范围
                        ] /* ranges */,
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,  --使用该索引获取的记录是否按照主键排序
                        "using_mrr": false,
                        "index_only": false,     --是否使用覆盖索引
                        "rows": 60042,           --索引扫描的行数
                        "cost": 72051,           --索引使用成本
                        "chosen": false,         --是否选择该索引
                        "cause": "cost"
                      }
                    ] /* range_scan_alternatives */,
                    "analyzing_roworder_intersect": {
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    } /* analyzing_roworder_intersect */
                  } /* analyzing_range_alternatives */
                } /* range_analysis */
              }
            ] /* rows_estimation */
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ] /* plan_prefix */,
                "table": "`employees`",
                "best_access_path": {     --最优访问路径
                  "considered_access_paths": [  --最终选择的访问路径  
                    {
                      "rows_to_scan": 120085,
                      "access_type": "scan",  --访问类型:为 scan, 全表扫描
                      "resulting_rows": 120085,
                      "cost": 24370,
                      "chosen": true,         --确定选择
                      "use_tmp_table": true
                    }
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "condition_filtering_pct": 100,
                "rows_for_plan": 120085,
                "cost_for_plan": 24370,
                "sort_cost": 120085,
                "new_cost_for_plan": 144455,
                "chosen": true
              }
            ] /* considered_execution_plans */
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "(`employees`.`name` > 'a')",
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                {
                  "table": "`employees`",
                  "attached": "(`employees`.`name` > 'a')"
                }
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
          },
          {
            "clause_processing": {
              "clause": "ORDER BY",
              "original_clause": "`employees`.`position`",
              "items": [
                {
                  "item": "`employees`.`position`"
                }
              ] /* items */,
              "resulting_clause_is_simple": true,
              "resulting_clause": "`employees`.`position`"
            } /* clause_processing */
          },
          {
            "reconsidering_access_paths_for_index_ordering": {
              "clause": "ORDER BY",
              "index_order_summary": {
                "table": "`employees`",
                "index_provides_order": false,
                "order_direction": "undefined",
                "index": "unknown",
                "plan_changed": false
              } /* index_order_summary */
            } /* reconsidering_access_paths_for_index_ordering */
          },
          {
            "refine_plan": [
              {
                "table": "`employees`"
              }
            ] /* refine_plan */
          }
        ] /* steps */
      } /* join_optimization */
    },
    {
      "join_execution": {  --第三阶段: SQL 执行阶段
        "select#": 1,
        "steps": [
          {
            "filesort_information": [
              {
                "direction": "asc",
                "table": "`employees`",
                "field": "position"
              }
            ] /* filesort_information */,
            "filesort_priority_queue_optimization": {
              "usable": false,
              "cause": "not applicable (no LIMIT)"
            } /* filesort_priority_queue_optimization */,
            "filesort_execution": [
            ] /* filesort_execution */,
            "filesort_summary": {
              "rows": 120003,
              "examined_rows": 120003,
              "number_of_tmp_files": 34,
              "sort_buffer_size": 262056,
              "sort_mode": "<sort_key, packed_additional_fields>"
            } /* filesort_summary */
          }
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}


结论:全表扫描的成本低于索引扫描,所以mysql最终选择全表扫描

select * from employees where name > 'zzz' order by position;
SELECT * FROM information_schema.OPTIMIZER_TRACE;

查看trace字段可知索引扫描的成本低于全表扫描,所以mysql最终选择索引扫描

set session optimizer_trace="enabled=off";    --关闭trace
Copy the code

Common SQL in-depth optimization

Order by and Group by optimization

Case 1:

explain select * from employees where name = 'SAN' and position = 'dev' order by age desc;
Copy the code

Select * from key_len=74; select * from key_len=74; select * from key_len=74

Case 2:

explain select * from employees where name = 'SAN' order by position;
Copy the code

Select * from ‘explain’ where key_len=74; select * from ‘explain’ where key_len=74;

Case 3:

explain select * from employees where name = 'SAN' order by age, position;
Copy the code

Search only for index name, age, and position, without Using filesort.

Case 4:

 explain select * from employees where name = 'SAN' order by position, age;
Copy the code

Use filesort (name,age,position); select * from explain_index (name,age,position);

Case 5:

explain select * from employees where name = 'SAN' and age = 30 order by position, age;
Copy the code

Analysis: Compared with Case 4, Using filesort does not appear in Extra, because age is a constant and is optimized in sorting, so the index is not reversed and Using filesort does not appear.

Case 6:

explain select * from employees where name = 'SAN' order by age desc ,position asc;
Copy the code

Using filesort is created by ordering columns in the same order as the index, and ordering by is in ascending order. Mysql8 and later have descending indexes to support this query method.

Case 7:

explain select * from employees where name in ('SAN', 'SI', 'WU') order by age ,position;
Copy the code

Analysis: Multiple equality conditions are also range queries for sorting

Case 8:

explain select * from employees where name > 'a' order by name;
Copy the code

You can optimize with overwrite indexes

explain select name, age, position from employees where name > 'a' order by name;
Copy the code

Optimization summary:

MySQL supports two kinds of sorting filesort and index, Using index. Index is efficient, filesort is inefficient. 2, Order by Using index

  1. The ORDER by statement uses the left-most front row of the index.
  2. Use the combination of the WHERE clause and the ORDER by clause condition columns to satisfy the left-most front of the index.

3, as far as possible on the index column to complete the sorting, follow the index creation (index creation order) leftmost prefix rule. 4, If the order by condition is not in the index column, the Using filesort is generated. Order by (group by); order by (group by); For group by optimizations that do not require sorting, we can add order by NULL to disable sorting. Note that where is higher than HAVING, so don’t qualify having in where.

Using filesort to sort files in detail

Filesort Indicates the sorting mode of files

  • Single-way sort: fetch all the fields that meet the condition at one time and sort them in sort buffer. Sort_mode < sort_key, additional_fields > or < sort_key, packed_additional_fields >
  • Double-sort (also called table sort mode) : firstly, the corresponding sort field and the row ID can be directly located according to the corresponding conditions, and then sort in the sort buffer. After sorting, the other fields need to be retrieved again. Sort_mode < sort_key, rowid >
  • MySQL determines which sort mode to use by comparing the size of the system variable ** MAX_LENGTH_FOR_sort_DATA (default 1024 bytes) ** with the total size of the fields to be queried.
  • If the total length of the field is less than max_LENGTH_FOR_sorT_data, single-way sort mode is used;
  • If the total length of the field is greater than max_LENGTH_FOR_sort_data, then the two-way sort mode is used.

Examples verify the various sorting methods: SQL > select * from ‘trace’;

set session optimizer_trace="enabled=on",end_markers_in_json=on;  --开启trace
select * from employees where name = 'SAN' order by position;
select * from information_schema.OPTIMIZER_TRACE;

trace排序部分结果:
{
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `employees`.`id` AS `id`,`employees`.`name` AS `name`,`employees`.`age` AS `age`,`employees`.`position` AS `position`,`employees`.`hire_time` AS `hire_time` from `employees` where (`employees`.`name` > 'a') order by `employees`.`position`"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "(`employees`.`name` > 'a')",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "(`employees`.`name` > 'a')"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "(`employees`.`name` > 'a')"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "(`employees`.`name` > 'a')"
                }
              ] /* steps */
            } /* condition_processing */
          },
          {
            "substitute_generated_columns": {
            } /* substitute_generated_columns */
          },
          {
            "table_dependencies": [
              {
                "table": "`employees`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "ref_optimizer_key_uses": [
            ] /* ref_optimizer_key_uses */
          },
          {
            "rows_estimation": [
              {
                "table": "`employees`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 120085,
                    "cost": 24372
                  } /* table_scan */,
                  "potential_range_indexes": [
                    {
                      "index": "PRIMARY",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_name_age_position",
                      "usable": true,
                      "key_parts": [
                        "name",
                        "age",
                        "position",
                        "id"
                      ] /* key_parts */
                    }
                  ] /* potential_range_indexes */,
                  "setup_range_conditions": [
                  ] /* setup_range_conditions */,
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_group_by_or_distinct"
                  } /* group_index_range */,
                  "analyzing_range_alternatives": {
                    "range_scan_alternatives": [
                      {
                        "index": "idx_name_age_position",
                        "ranges": [
                          "a < name"
                        ] /* ranges */,
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 60042,
                        "cost": 72051,
                        "chosen": false,
                        "cause": "cost"
                      }
                    ] /* range_scan_alternatives */,
                    "analyzing_roworder_intersect": {
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    } /* analyzing_roworder_intersect */
                  } /* analyzing_range_alternatives */
                } /* range_analysis */
              }
            ] /* rows_estimation */
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ] /* plan_prefix */,
                "table": "`employees`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan": 120085,
                      "access_type": "scan",
                      "resulting_rows": 120085,
                      "cost": 24370,
                      "chosen": true,
                      "use_tmp_table": true
                    }
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "condition_filtering_pct": 100,
                "rows_for_plan": 120085,
                "cost_for_plan": 24370,
                "sort_cost": 120085,
                "new_cost_for_plan": 144455,
                "chosen": true
              }
            ] /* considered_execution_plans */
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "(`employees`.`name` > 'a')",
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                {
                  "table": "`employees`",
                  "attached": "(`employees`.`name` > 'a')"
                }
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
          },
          {
            "clause_processing": {
              "clause": "ORDER BY",
              "original_clause": "`employees`.`position`",
              "items": [
                {
                  "item": "`employees`.`position`"
                }
              ] /* items */,
              "resulting_clause_is_simple": true,
              "resulting_clause": "`employees`.`position`"
            } /* clause_processing */
          },
          {
            "reconsidering_access_paths_for_index_ordering": {
              "clause": "ORDER BY",
              "index_order_summary": {
                "table": "`employees`",
                "index_provides_order": false,
                "order_direction": "undefined",
                "index": "unknown",
                "plan_changed": false
              } /* index_order_summary */
            } /* reconsidering_access_paths_for_index_ordering */
          },
          {
            "refine_plan": [
              {
                "table": "`employees`"
              }
            ] /* refine_plan */
          }
        ] /* steps */
      } /* join_optimization */
    },
    {
      "join_execution": {   --SQL 执行阶段
        "select#": 1,
        "steps": [
          {
            "filesort_information": [
              {
                "direction": "asc",
                "table": "`employees`",
                "field": "position"
              }
            ] /* filesort_information */,
            "filesort_priority_queue_optimization": {
              "usable": false,
              "cause": "not applicable (no LIMIT)"
            } /* filesort_priority_queue_optimization */,
            "filesort_execution": [
            ] /* filesort_execution */,
            "filesort_summary": {
              "rows": 120003,               --文件排序信息
              "examined_rows": 120003,      --预计扫描行数
              "number_of_tmp_files": 34,    --参与排序的行
              "sort_buffer_size": 262056,   --使用临时文件的个数,如果这个值为 9 代表全部使用的 sort_buffer 内存排序,否则使用的磁盘文件排序
              "sort_mode": "<sort_key, packed_additional_fields>"  --排序方式,这里采用的是单路排序
            } /* filesort_summary */
          }
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}



      
set max_length_for_sort_data = 2;    --employees表所有字段长度总和肯定大于2字节
select * from employees where name = 'SAN' order by position;
select * from information_schema.OPTIMIZER_TRACE;

trace排序部分结果:
{
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `employees`.`id` AS `id`,`employees`.`name` AS `name`,`employees`.`age` AS `age`,`employees`.`position` AS `position`,`employees`.`hire_time` AS `hire_time` from `employees` where (`employees`.`name` = 'SAN') order by `employees`.`position`"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "(`employees`.`name` = 'SAN')",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "(`employees`.`name` = 'SAN')"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "(`employees`.`name` = 'SAN')"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "(`employees`.`name` = 'SAN')"
                }
              ] /* steps */
            } /* condition_processing */
          },
          {
            "substitute_generated_columns": {
            } /* substitute_generated_columns */
          },
          {
            "table_dependencies": [
              {
                "table": "`employees`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "ref_optimizer_key_uses": [
              {
                "table": "`employees`",
                "field": "name",
                "equals": "'SAN'",
                "null_rejecting": false
              }
            ] /* ref_optimizer_key_uses */
          },
          {
            "rows_estimation": [
              {
                "table": "`employees`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 120085,
                    "cost": 24372
                  } /* table_scan */,
                  "potential_range_indexes": [
                    {
                      "index": "PRIMARY",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_name_age_position",
                      "usable": true,
                      "key_parts": [
                        "name",
                        "age",
                        "position",
                        "id"
                      ] /* key_parts */
                    }
                  ] /* potential_range_indexes */,
                  "setup_range_conditions": [
                  ] /* setup_range_conditions */,
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_group_by_or_distinct"
                  } /* group_index_range */,
                  "analyzing_range_alternatives": {
                    "range_scan_alternatives": [
                      {
                        "index": "idx_name_age_position",
                        "ranges": [
                          "SAN <= name <= SAN"
                        ] /* ranges */,
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 1,
                        "cost": 2.21,
                        "chosen": true
                      }
                    ] /* range_scan_alternatives */,
                    "analyzing_roworder_intersect": {
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    } /* analyzing_roworder_intersect */
                  } /* analyzing_range_alternatives */,
                  "chosen_range_access_summary": {
                    "range_access_plan": {
                      "type": "range_scan",
                      "index": "idx_name_age_position",
                      "rows": 1,
                      "ranges": [
                        "SAN <= name <= SAN"
                      ] /* ranges */
                    } /* range_access_plan */,
                    "rows_for_plan": 1,
                    "cost_for_plan": 2.21,
                    "chosen": true
                  } /* chosen_range_access_summary */
                } /* range_analysis */
              }
            ] /* rows_estimation */
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ] /* plan_prefix */,
                "table": "`employees`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "ref",
                      "index": "idx_name_age_position",
                      "rows": 1,
                      "cost": 1.2,
                      "chosen": true
                    },
                    {
                      "access_type": "range",
                      "range_details": {
                        "used_index": "idx_name_age_position"
                      } /* range_details */,
                      "chosen": false,
                      "cause": "heuristic_index_cheaper"
                    }
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "condition_filtering_pct": 100,
                "rows_for_plan": 1,
                "cost_for_plan": 1.2,
                "chosen": true
              }
            ] /* considered_execution_plans */
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "(`employees`.`name` = 'SAN')",
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                {
                  "table": "`employees`",
                  "attached": null
                }
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
          },
          {
            "clause_processing": {
              "clause": "ORDER BY",
              "original_clause": "`employees`.`position`",
              "items": [
                {
                  "item": "`employees`.`position`"
                }
              ] /* items */,
              "resulting_clause_is_simple": true,
              "resulting_clause": "`employees`.`position`"
            } /* clause_processing */
          },
          {
            "added_back_ref_condition": "((`employees`.`name` <=> 'SAN'))"
          },
          {
            "reconsidering_access_paths_for_index_ordering": {
              "clause": "ORDER BY",
              "index_order_summary": {
                "table": "`employees`",
                "index_provides_order": false,
                "order_direction": "undefined",
                "index": "idx_name_age_position",
                "plan_changed": false
              } /* index_order_summary */
            } /* reconsidering_access_paths_for_index_ordering */
          },
          {
            "refine_plan": [
              {
                "table": "`employees`",
                "pushed_index_condition": "(`employees`.`name` <=> 'SAN')",
                "table_condition_attached": null
              }
            ] /* refine_plan */
          }
        ] /* steps */
      } /* join_optimization */
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "filesort_information": [
              {
                "direction": "asc",
                "table": "`employees`",
                "field": "position"
              }
            ] /* filesort_information */,
            "filesort_priority_queue_optimization": {
              "usable": false,
              "cause": "not applicable (no LIMIT)"
            } /* filesort_priority_queue_optimization */,
            "filesort_execution": [
            ] /* filesort_execution */,
            "filesort_summary": {
              "rows": 0,
              "examined_rows": 0,
              "number_of_tmp_files": 0,
              "sort_buffer_size": 262136,
              "sort_mode": "<sort_key, rowid>"  --排序方式,双路排序
            } /* filesort_summary */
          }
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}

set session optimizer_trace="enabled=off";    --关闭trace
Copy the code

Let’s first look at the detailed process of single-way sorting:

  1. Find the first primary key ID from index name that meets the condition name = ‘SAN’
  2. Extract the entire row based on the primary key ID, extract the values of all fields, and store them in sort_buffer
  3. Find the next primary key ID from index name that meets the condition name = ‘SAN’
  4. Repeat steps 2 and 3 until name = ‘SAN’ is not satisfied
  5. Sort the data in sort_buffer by field position
  6. Returns the result to the client

Let’s look at the detailed process of double sort:

  1. Find the first primary key ID whose name = ‘SAN’ from index name
  2. Insert position and primary key ID into sort Buffer
  3. Get a primary key ID from index name that satisfies the record name = ‘SAN’
  4. Repeat 3, 4 until name = ‘SAN’ is not satisfied
  5. Sort the field position and primary key ID in sort_buffer by field position
  6. Iterate through the sorted ID and field position, return to the original table according to the value of ID, retrieve all field values and return them to the client

Sort sort (primary key, sort); sort (primary key, sort); sort (primary key, sort); sort (primary key, sort); sort (primary key, sort); If the MySQL sort memory sort_buffer configuration is small and there is no condition to continue to increase, you can appropriately configure max_length_FOR_sort_data small point, let the optimizer choose to use the dual sort algorithm, can sort more rows in sort_buffer at one time, You just need to go back to the original table to fetch data based on the primary key. Select * from sort_buffer; select * from sort_buffer; select * from sort_buffer; select * from sort_buffer; This will return the query results directly from memory after sorting. Therefore, MySQL uses the parameter max_LENGTH_FOR_sort_data to control sorting and use different sorting modes in different scenarios to improve sorting efficiency. Note that sort_buffer sorting is usually more efficient than disk sorting, but you should not increase sort_buffer size (default: 1M) because mysql has optimized many parameters.

Index design principles

1, code first, index after

Create an index for a table. Create an index for a table. This is actually wrong, generally should wait until the main business function development is completed, the related SQL table to be taken out after the analysis and then establish an index.

2, joint index to cover conditions as far as possible

For example, you can design one or two or three union indexes (minimize single-value indexes), so that each of the union indexes contains the SQL statement where, ORDER by, group by columns, and ensure that the order of the columns in the union indexes matches the left-most prefix principle of SQL queries.

Do not build indexes on small cardinality fields

The index cardinality refers to the number of different values in the table for this field. For example, if there is a gender field in a table with a total of 1 million rows and its value is either male or female, the cardinality of this field is 2. If you want to index such a small cardinality field, you are better off with a full table scan, because your index tree contains both male and female values and there is no way to do a quick binary lookup, so there is not much point in using an index. General index establishment, try to use those cardinality relatively large fields, that is, the value of the field, so as to give play to the advantages of B+ tree fast binary search.

4. We can use prefix indexes for long strings

Try to index columns with smaller field types, such as tinyint, because smaller field types take up less disk space and your search performance will be better. Of course, the so-called smaller column type is not absolute, many times you will want to index vARCHAR (255), even if some extra disk space is necessary. The first 20 characters of each value in the field can be placed in the index tree, and the first 20 characters of each value in the field can be placed in the index tree. Similar to KEY index(name(20),age,position). At this time, when you search in the WHERE condition, if you search according to the name field, you will first search in the index tree according to the first 20 characters of the name field, locate the partial data matching the prefix of the next first 20 characters, and then return to the cluster index to extract the complete value of the name field for comparison. But if you order by name, then your name contains only the first 20 characters in the index tree, so this sort will not use the index, and the same is true for group by. So you have to get a sense of the prefix index.

5, Where does not match order by

When the index design conflicts between WHERE and ORDER by, should I design the index for WHERE or order by? Where is the index used? Order by is the index used? This is usually done by having the WHERE condition use the index to quickly filter out a portion of the specified data and then sort it. Because in most cases where filtering based on indexes tends to be the fastest way to filter out the small amount of data you want, the cost of sorting can be much lower.

6. Optimize based on slow SQL queries

You can do specific index optimization for slow SQL queries based on some slow SQL queries in the monitoring background. For details on slow SQL queries, please refer to this article: blog.csdn.net/qq_40884473…

Index optimization cases

With social scene APP, for example, we usually go to search some friends, it will involve the selection of user information, there must be for users to search the user table, the table in general will be large amount of data, we won’t consider depots table, for example, we usually screening area (provinces), gender, age, height, hobbies and so on, For some apps, users may also have ratings, such as popularity ratings, and we may also rank them according to ratings and so on.

For background programs, in addition to filtering various conditions of the user, it also needs pagination and other processing, which may generate similar SQL statement execution: select xx from user where xx=xx and xx=xx order by XX limit xx,xx

In this case, how to design a reasonable index? For example, users may give priority to the users in the same city according to province and also according to gender. Should we design a joint index (province,city,sex)? These fields seem to have a small base, but they should, because they are frequently queried. Select * from type where type =xx and city=xx and age>=xx and age<=xx. Select * from type where type =xx and city=xx and age<=xx. In this case, the age index is still not used, because the user did not filter the sex field, so how to optimize? In fact, we can optimize the writing of SQL as follows: Where type =xx and city=xx and sex in (‘female’,’male’) and age>=xx and age<=xx So you can add interest field also to index (province, city, sex, hobby, age)

For example, if you want to filter users who have logged in in the last week, you must want to make friends with active users, so that you can receive feedback as soon as possible. Where province=xx and city=xx and sex in (‘female’,’male’) and age>=xx and age<=xx and latest_login_time>= xx Latest_login_time added to index? Such as (province, city, sex, hobby, age, latest_login_time), is clearly not enough, that how to optimize this kind of situation? In fact, we can try to create another field, is_LOGin_IN_latEST_7_days, which is 1 if the user has logged in within a week, 0 otherwise, Then we can put the index design into (province, city, sex, hobby, is_login_in_latest_7_days, age) to meet the above the scene! In general, it is possible to filter out most of the data through such a multi-field index, and retain a small portion of the data based on disk files for order by statement sorting, and finally based on limit paging, so the general performance is relatively high.

But sometimes the user might query it like this, just look at the most popular female, like SQL: Select * from index where sex = ‘female’ order by score limit xx,xx where sex = ‘female’ order by score limit xx,xx In fact, we can design an auxiliary joint index, such as (sex,score), so that it can meet the query requirements.

Above is to tell everyone about some of the index design train of thought, the core idea is to try to use one or two more complex field joint index, more than 80% of the country’s resistance to your query, then use one or two auxiliary index as far as possible under the resistance to the rest of the some atypical queries, ensure that the large scale data query can make full use of the index as much as possible, This will ensure your query speed and performance!

Reference documentation

  • Index Condition Pushdown
  • MySQL index design principles
  • Use of the MySQL Trace tool
  • The difference between group by and Order by + understanding process