The preparatory work
MySQL > create table MySQL > create table MySQL > create table MySQL > create table
CREATE TABLE `emp` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
`empno` int(11) DEFAULT NULL COMMENT 'Employee Id',
`ename` varchar(255) DEFAULT NULL COMMENT 'Name of Employee',
`job` varchar(255) DEFAULT NULL COMMENT 'work',
`mgr` varchar(255) DEFAULT NULL COMMENT 'Manager's Number',
`hiredate` date DEFAULT NULL COMMENT 'Date of Employment',
`sal` double DEFAULT NULL COMMENT 'wages',
`comm` double DEFAULT NULL COMMENT 'allowance',
`deptno` int(11) DEFAULT NULL COMMENT 'Department No.'.PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Table of Employees';
CREATE TABLE `dept` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
`deptno` int(11) DEFAULT NULL COMMENT 'Department Number',
`dname` varchar(255) DEFAULT NULL COMMENT 'Department Name',
`loc` varchar(255) DEFAULT NULL COMMENT 'address'.PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Department Table';
CREATE TABLE `salgrade` (
`id` int(11) NOT NULL COMMENT 'primary key',
`grade` varchar(255) DEFAULT NULL COMMENT 'class',
`lowsal` varchar(255) DEFAULT NULL COMMENT 'Minimum wage',
`hisal` varchar(255) DEFAULT NULL COMMENT 'Maximum salary'.PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Pay scale';
CREATE TABLE `bonus` (
`id` int(11) NOT NULL COMMENT 'primary key',
`ename` varchar(255) DEFAULT NULL COMMENT 'Name of Employee',
`job` varchar(255) DEFAULT NULL COMMENT 'work',
`sal` double DEFAULT NULL COMMENT 'wages',
`comm` double DEFAULT NULL COMMENT 'allowance'.PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Bonus schedule';
Copy the code
Follow-up execution plan, query optimization, index optimization and other knowledge of the drill, based on the above several tables to operate.
MySQL Execution Plan
To perform SQL tuning, you need to know how the SQL statement to be tuned is executed. You need to view the execution process of the SQL statement to speed up the execution efficiency of the SQL statement.
You can use explain + SQL statements to simulate the optimizer executing SQL queries to see how MySQL handles SQL statements.
Please check out the official website for explain.
Explain output format
mysql> explain select * from emp;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | emp | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
Copy the code
Mysql > select_type; mysql > select_type;
Column | Meaning |
---|---|
id | The SELECT Identifier (The SELECT identifier) |
select_type | The SELECT Type (this SELECT type) |
table | The table for The output row |
partitions | The matching partitions |
type | The join type |
possible_keys | The possible indexes to Choose |
key | The index actually chosen |
key_len | The length of The chosen key |
ref | The columns are compared to The index. |
rows | Estimated of Rows to be examined |
filtered | Percentage of Rows filtered by table condition |
extra | Additional information |
id
The serial number of a SELECT query, which contains a set of numbers indicating the order in which the select clause or operation table is executed in the query.
Id numbers fall into three categories:
- If the IDS are the same, the order of execution goes from top to bottom
mysql> explain select * from emp e join dept d on e.deptno = d.deptno join salgrade sg on e.sal between sg.lowsal and sg.hisal; +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------- --------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + | | 1 SIMPLE | e | NULL | | NULL ALL | NULL | NULL | NULL | 1 | | NULL | 100.00 | | 1 SIMPLE | d | NULL | | NULL ALL | NULL | NULL | NULL | 1 | | 100.00 Using the where; Using the join buffer (Block Nested Loop) | | | 1 SIMPLE | sg | NULL | | NULL ALL | NULL | NULL | NULL | 1 | | 100.00 Using where; Using join buffer (Block Nested Loop) | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------- --------------------------------------+Copy the code
MySQL > select * from explain where id = 1; MySQL > select * from explain where id = 1;
- If the IDS are different, the id sequence increases. A larger ID has a higher priority and is executed earlier
mysql> explain select * from emp e where e.deptno in (select d.deptno from dept d where d.dname = 'SALEDept');
+----+--------------+-------------+------------+------+---------------+------+---------+------+------+----------+-------- --------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+------+---------------+------+---------+------+------+----------+-------- --------------------------------------------+
| 1 | SIMPLE | <subquery2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | 100.00 | NULL |
| 1 | SIMPLE | e | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where; Using join buffer (Block Nested Loop) |
| 2 | MATERIALIZED | d | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+--------------+-------------+------------+------+---------------+------+---------+------+------+----------+-------- --------------------------------------------+
Copy the code
The execution sequence for this example is to execute id 2 first, then ID 1.
- The same and different ids exist at the same time: The same ids can be regarded as a group and are executed from top to bottom. In all groups, the higher the ID value, the higher the priority, the earlier the execution
Again, execute id 2 first, and then execute id 1 from the top down.
select_type
It is mainly used to distinguish the type of query, whether it is common query, joint query or sub-query.
select_type Value |
JSON Name | Meaning |
---|---|---|
SIMPLE | None | Simple SELECT (not using UNION or subqueries) |
PRIMARY | None | Outermost SELECT |
UNION | None | Second or later SELECT statement in a UNION |
DEPENDENT UNION | dependent (true) | Second or later SELECT statement in a UNION, dependent on outer query |
UNION RESULT | union_result | Result of a UNION. |
SUBQUERY | None | First SELECT in subquery |
DEPENDENT SUBQUERY | dependent (true) | First SELECT in subquery, dependent on outer query |
DERIVED | None | Derived table |
MATERIALIZED | materialized_from_subquery | Materialized subquery |
UNCACHEABLE SUBQUERY | cacheable (false) | A subquery for which the result cannot be cached and must be re-evaluated for each row of the outer query |
UNCACHEABLE UNION | cacheable (false) | The second or later select in a UNION that belongs to an uncacheable subquery (see UNCACHEABLE SUBQUERY) |
SIMPLE
A simple query, without subqueries and unions
mysql> explain select * from emp;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | emp | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
Copy the code
primary
If the query contains any complex subqueries, the outermost query is marked as Primaryunion
If the second select appears after the union, it is marked as union
mysql> explain select * from emp where deptno = 1001 union select * from emp where sal < 5000;
+----+--------------+------------+------------+------+---------------+------+---------+------+------+----------+--------- --------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+------------+------+---------------+------+---------+------+------+----------+--------- --------+
| 1 | PRIMARY | emp | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 25.00 | Using where |
| 2 | UNION | emp | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 33.33 | Using where |
| NULL | UNION RESULT | <union1,2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+------------+------+---------------+------+---------+------+------+----------+--------- --------+
Copy the code
Select_type of this statement contains both primary and union
dependent union
Similar to union, depentent here means that the result of a union or union all is affected by an external tableunion result
Select to get the result from the union tabledependent subquery
Subquery subqueries are affected by external table queries
mysql> explain select * from emp e where e.empno in ( select empno from emp where deptno = 1001 union select empno from emp where sal < 5000);
+----+--------------------+------------+------------+------+---------------+------+---------+------+------+----------+--- --------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------+------------+------+---------------+------+---------+------+------+----------+--- --------------+
| 1 | PRIMARY | e | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | emp | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 25.00 | Using where |
| 3 | DEPENDENT UNION | emp | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 25.00 | Using where |
| NULL | UNION RESULT | <union2,3> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------------+------------+------------+------+---------------+------+---------+------+------+----------+--- --------------+
Copy the code
This SQL execution contains the PRIMARY, DEPENDENT SUBQUERY, DEPENDENT UNION, and UNION RESULT
subquery
Include subqueries in select or WHERE lists
For example:
mysql> explain select * from emp where sal > (select avg(sal) from emp) ;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | PRIMARY | emp | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 33.33 | Using where |
| 2 | SUBQUERY | emp | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
Copy the code
DERIVED
Subqueries that occur in the FROM clause are also called derived tablesMATERIALIZED
Materialized subquery?UNCACHEABLE SUBQUERY
Indicates that results using subqueries cannot be cached
Such as:
mysql> explain select * from emp where empno = (select empno from emp where deptno=@@sort_buffer_size);
+----+----------------------+-------+------------+------+---------------+------+---------+------+------+----------+------ -------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+----------------------+-------+------------+------+---------------+------+---------+------+------+----------+------ -------+
| 1 | PRIMARY | emp | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where |
| 2 | UNCACHEABLE SUBQUERY | emp | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 25.00 | Using where |
+----+----------------------+-------+------------+------+---------------+------+---------+------+------+----------+------ -------+
Copy the code
uncacheable union
The result of the union query cannot be cached
table
Which table is being accessed by the row, table name or alias, possibly temporary table or union merge result set.
- If it is a concrete table name, it indicates that the data is fetched from the actual physical table, and of course it can be an alias for the table
- The table name is the form derivedN and represents a derived table that uses a query with ID N
- When there is a union result, the table name is in the form of union N1,n2, etc. N1 and n2 represent the IDS participating in the union
type
Type shows the access type. The access type indicates how I access our data. The most obvious one is full table scan, which is very inefficient to directly and violently traverse a table to find the data I need.
There are many types of access, in order of efficiency from best to worst:
system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL
In general, make sure the query is at least range and preferably ref
all
Full table scan, in general, such SQL statements and large data volume then need to be optimized
In general, you can avoid ALL by adding indexes
index
Full index scanning is more efficient than all in two main cases:- One is that the current query overwrites the index, that is, the data we need is available in the index
- One is to use indexes for sorting, which avoids reordering of data
range
=, <>, >, >=, <, <=, IS NULL, BETWEEN, LIKE, or IN();
Examples on the official website are as follows:
SELECT * FROM tbl_name WHERE key_column = 10;
SELECT * FROM tbl_name WHERE key_column BETWEEN 10 and 20;
SELECT * FROM tbl_name WHERE key_column IN (10,20,30);
SELECT * FROM tbl_name WHERE key_part1 = 10 AND key_part2 IN (10,20,30);
index_subquery
Use indexes to associate subqueries instead of scanning the full table
value IN (SELECT key_column FROM single_table WHERE some_expr)
unique_subquery
This join type is similar to index_SubQuery in that it uses a unique index
value IN (SELECT primary_key FROM single_table WHERE some_expr)
index_merge
Multiple indexes are required to be used in combination during the query processref_or_null
In cases where a field requires both an associated condition and a NULL value, the query optimizer will choose this access method
SELECT * FROM ref_table
WHERE key_column=expr OR key_column IS NULL;
fulltext
Perform a join using the FULLTEXT indexref
Non-unique indexes are used for data lookup
SELECT * FROM ref_table WHERE key_column=expr;
SELECT * FROM ref_table,other_table WHERE ref_table.key_column=other_table.column;
SELECT * FROM ref_table,other_table WHERE ref_table.key_column_part1=other_table.column AND ref_table.key_column_part2=1;
eq_ref
Use unique indexes for data lookup
SELECT * FROM ref_table,other_table WHERE ref_table.key_column=other_table.column;
SELECT * FROM ref_table,other_table WHERE ref_table.key_column_part1=other_table.column AND ref_table.key_column_part2=1;
const
This table has at most one matching row
SELECT * FROM tbl_name WHERE primary_key=1;
SELECT * FROM tbl_name WHERE primary_key_part1=1 AND primary_key_part2=2;
Such as:
mysql> explain select * from emp where id = 1;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | emp | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
Copy the code
system
The table has only one row of records (equal to the system table), which is a special case of const type and does not normally occur
possible_keys
Shows one or more indexes that may be applied to the table. If an index exists on a field involved in the query, the index will be listed but not necessarily used by the query
key
If the index in use is null, no index is used. If an overwrite index is used, the index overlaps with the select field of the query
key_len
Represents the number of bytes used in the index. Key_len can be used to calculate the length of the index used in the query. The shorter the length is, the better, without losing precision
ref
Shows which column of the index is used, if possible, as a constant
rows
Based on table statistics and index usage, it is important to estimate the number of rows that need to be read to find the required record. This parameter is a direct reflection of how much data is found in the SQL, and as little as possible to accomplish the purpose
extra
Contains additional information
using filesort
Note mysql cannot use indexes to sort data, but can only use the sorting algorithm to sort data, which consumes extra spaceusing temporary
Create a temporary table to hold intermediate results and delete the temporary table after the query is completeusing index
This indicates that the current query overrides the index, reading data directly from the index without accessing the table. Using WHERE indicates that the index is being used to perform a lookup of the index’s key value. If not, it indicates that the index is being used to read data rather than actually lookup itusing where
Use WHERE for conditional filteringusing join buffer
Using connection cachingimpossible where
The result of the WHERE statement is always false
MySQL index basics
To understand how indexes are optimized, you must understand the underlying principles of indexes.
Advantages of indexes
- This greatly reduces the amount of data that the server needs to scan
- Helps the server avoid sorting and temporary tables
- Changed random IO to sequential IO (improved efficiency)
Usefulness of indexes
- Quickly find rows that match the WHERE clause
- Remove rows from consideration. If you can choose between multiple indexes, mysql typically uses indexes that find the least rows
- If the table has multiple column indexes, the optimizer can use any left-most prefix of the index to find rows
- Retrieves row data from other tables when there is a table join
- Find min or Max values for a particular index column
- Sort and group tables if sorting or grouping is done on the leftmost prefix of the available index
- In some cases, queries can be optimized to retrieve values without querying rows of data
Classification of indexes
MySQL index data structure inference
Indexes are used to quickly find rows with specific column values.
If there is no index, MySQL must start with the first row and then read through the table to find relevant rows.
The larger the table, the more time it takes. If the table has indexes for related columns, MySQL can quickly determine where to look in the middle of the data file without having to look at all the data. This is much faster than reading each line sequentially.
If MySQL indexes can help us query data quickly, how does it store data underneath?
Several possible storage structures
hash
Index format of the hash table
Disadvantages of storing data in hash tables:
- To use hash storage, all data files need to be added to the memory, which consumes memory space
- If all queries are equivalent queries, then hash is really fast, but in a real work environment where there are more range look-up data than equivalent queries, hash is not a good fit
In fact, when the MySQL storage engine is memory, the index data structure uses hash tables.
Binary tree
The structure of a binary tree looks like this:
Binary trees cause data skew due to the tree depth. If the tree depth is too deep, THE I/O count increases, affecting the data read efficiency.
AVL trees need to be rotated, see the legend:
In addition to the rotation operation, red-black trees also have a color change function (to reduce rotation), so that although the insertion speed is fast, but the loss of query efficiency.
In binary trees, AVL trees, and red-black trees, the I/O count increases because the tree depth is too deep, affecting the data read efficiency.
Let’s look at the B tree
B tree features:
- All the keys are distributed throughout the tree
- It is possible to end the search at non-leaf nodes and make a search in the whole set of keywords. The performance is close to binary search
- Each node has at most m subtrees
- The root node has at least two subtrees
- Branch nodes have at least M /2 subtrees (all branch nodes except root and leaf nodes)
- All leaf nodes are in the same layer, each node can have up to M-1 keys, and are arranged in ascending order
Legend description:
Each node occupies a disk block, and a node has two ascending keywords and three Pointers to the child root node. The Pointers store the address of the disk block where the child node resides.
The three scope fields divided into two keywords correspond to the scope fields of the data of the subtree pointed to by the three Pointers.
For the root node, the keywords are 16 and 34. The data range of the subtree to which the P1 pointer points is less than 16, the data range of the subtree to which the P2 pointer points is 16 to 34, and the data range of the subtree to which the P3 pointer points is greater than 34.
Procedure for finding keywords:
1. Locate disk block 1 based on the root node and read it into the memory. [Disk I/O operation 1]
2. Find pointer P2 to disk block 1 when keyword 28 is in the interval (16,34).
3. Locate disk block 3 according to the P2 pointer and read it into the memory. [Disk I/O operation 2nd]
4. Find pointer P2 to disk block 3 by comparing keyword 28 in interval (25,31).
5. Locate disk block 8 according to P2 pointer and read it into the memory. [Disk I/O operation 3]
6. In the keyword list of disk block 8, locate keyword 28.
Therefore, we can know the disadvantages of B-tree storage:
- Each node has a key and also contains data. The storage space of each page is limited. If data is large, the number of keys stored in each node will be smaller
- When a large amount of data is stored, the depth is large, and the DISK I/O count increases, affecting the query performance
So what is the MySQL index data structure
Most MySQL indexes (PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT) are stored in B-trees
Don’t get me wrong, in fact, MySQL index storage structure is B+ tree, above we know that B tree is not appropriate.
Mysql index data structure –B+Tree
B+Tree is an optimization based on BTree. The changes are as follows:
1. Each node of a B+Tree can contain more nodes for two reasons. The first reason is to reduce the height of the Tree, and the second reason is to change the data range into multiple ranges.
2. Non-leaf nodes store keys, while leaf nodes store keys and data.
3. Two Pointers on leaf nodes are connected to each other (in line with the disk prefetch feature), providing higher sequential query performance.
B+ tree storage search schematic diagram:
Note:
There are two head Pointers on the B+Tree, one to the root node and the other to the leaf node with the smallest keyword, and there is a chain-ring structure between all the leaf nodes (that is, the data nodes).
Therefore, there are two kinds of lookup operations on B+Tree: a range and paging lookup on the primary key, and a random lookup starting from the root node.
Since the leaf nodes of B+ tree only store data and the root node only store keys, let’s calculate that even if there is only 3 layers of B+ tree, it can also produce tens of millions of levels of data.
Zhuang you need to know
Consider the following table where id is the primary key:
mysql> select * from stu;
+------+---------+------+
| id | name | age |
+------+---------+------+
| 1 | Jack Ma | 18 |
| 2 | Pony | 19 |
+------+---------+------+
Copy the code
Back to the table
We create a normal index for a normal column.
select * from stu where name='Pony';
Copy the code
Select * from B+ tree where primary key ID is found; select * from B+ tree where primary key ID is found;
This will eventually go back to the primary key and look up the B+ tree, which is the back table.
Cover index
For this query:
mysql> select id from stu where name='Pony';
Copy the code
There is no need to find the primary key id.
No return table is called an overwrite index.
The left matching
Create a composite index (name, age) with the name and age fields, and then query:
select * from stu where name=? and age=?
Copy the code
Select * from index (name, age); select * from index (age);
select * from stu where age=?
Copy the code
If the index is not queried by name, the index will not be queried by name – this is the left-most matching rule.
Join I have to search by age, but also have an index to optimize? It can be done like this:
- (Recommended) Change the order of the combined index (name, age) and create the index (age, name)
- Or you can just put
age
Create a separate index for the field
An index pushdown
Maybe also called predicate push down…
select t1.name,t2.name from t1 join t2 on t1.id=t2.id
Copy the code
T1 has 10 records, t2 has 20 records.
Let’s guess, this is either executed this way:
Select id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id, id
Or:
First find out t1.name,t2.name, and then associate according to id
If you do not use the index condition push optimization, MySQL can only query the index t1, T2 merged all rows, and then compare all conditions.
When the index condition push-down optimization technology is used, the data stored in the index can be used to determine whether the data corresponding to the current index meets the conditions. Only the data matching the conditions can be queried.
summary
Explain
To optimize the execution of SQL statements, you need to view the execution process of SQL statements to speed up the execution efficiency.- Index advantages and usefulness.
- The data structure used for the index is a B+ tree.
- Back table, overwrite index, left-most match and index push down.
The first public line 100 li ER, welcome the old iron attention reading correction. GitHub github.com/xblzer/Java…