1. Overview of MySQL architecture

The MySQL server as a whole consists of the following components:

  • Connection Pool: the connection pool component
  • Management Services & Utilities: management services and utility tools
  • SQL Interface: the SQL interface component
  • Parser: the query parser component
  • Optimizer: the query optimizer component
  • Caches & Buffers: caches and buffers
  • Pluggable Storage Engines: the pluggable storage engines
  • File System: the file system

1) Connection layer

The top layer contains the client connection and link services, including local socket communication and most client/server communication based on TCP/IP. This layer mainly handles connection processing, authorization and authentication, and the related security schemes. The concept of a thread pool is introduced at this layer to provide threads for clients that pass authentication, and SSL-based secure connections can also be implemented here. The server also verifies the operation permissions of every securely connected client.

2) Service layer

The second layer implements most of the core service functions, such as the SQL interface, query caching, SQL analysis and optimization, and the execution of built-in functions. All cross-storage-engine functionality, such as stored procedures and functions, is also implemented in this layer. Here the server parses the query, builds the corresponding internal parse tree, and performs the corresponding optimizations, such as determining the join order of the tables and whether to use indexes, before finally generating the execution plan. For a select statement the server also checks the query cache first; if the cache is large enough, this can noticeably improve performance in read-heavy environments.

3) Engine layer

The storage engine layer is responsible for the actual storage and retrieval of data in MySQL; the server communicates with the storage engines through APIs. Different storage engines have different capabilities, so you can select the most suitable one for your needs.

4) Storage layer

The data storage layer stores data on file systems and interacts with storage engines.

MySQL differs a bit from other databases in that its architecture works well in many different scenarios. This is mainly reflected in the storage engine layer: the pluggable storage engine architecture separates query processing and other system tasks from data storage and retrieval. With this architecture, you can choose an appropriate storage engine based on your business requirements.

2. Storage engine

2.1 Storage Engine Overview

Unlike most databases, MySQL has a storage engine concept. You can choose the optimal storage engine for different storage requirements.

A storage engine is the implementation of how data is stored, how indexes are built, and how data is queried and updated. Storage engines are table-based, not database-based, so a storage engine is also called a table type.

Databases such as Oracle and SQL Server have only one storage engine, whereas MySQL provides a pluggable storage engine architecture. MySQL therefore ships with multiple storage engines, and you can use them as needed or even write your own.

Storage engines supported by MySQL 5.0 include InnoDB, MyISAM, BDB, MEMORY, MERGE, EXAMPLE, NDB Cluster, ARCHIVE, CSV, BLACKHOLE, FEDERATED, etc. InnoDB and BDB provide transaction-safe tables; the other storage engines provide non-transactional tables.

To query the storage engines supported by the current database, run show engines:
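Running the statement directly lists every available engine together with whether it is supported and whether it is transactional:

show engines;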

If you do not specify a storage engine when creating a new table, the system will use the default storage engine. The default storage engine was MyISAM before MySQL5.5 and changed to InnoDB after 5.5.

To check the default storage engine of the MySQL database:

show variables like '%storage_engine%';

2.2 Storage Engine Features

The following describes several commonly used storage engines and compares their differences.

Feature | InnoDB | MyISAM | MEMORY | MERGE | NDB
Storage limits | 64TB | Yes | Yes | No | Yes
Transaction safety | Supported | - | - | - | -
Locking mechanism | Row locks (suited to high concurrency) | Table locks | Table locks | Table locks | Row locks
B-tree indexes | Supported | Supported | Supported | Supported | Supported
Hash indexes | - | - | Supported | - | -
Full-text indexes | Supported (since 5.6) | Supported | - | - | -
Clustered index | Supported | - | - | - | -
Data indexes | Supported | - | Supported | - | Supported
Index caching | Supported | Supported | Supported | Supported | Supported
Data compression | - | Supported | - | - | -
Space usage | High | Low | N/A | Low | Low
Memory usage | High | Low | Medium | Low | High
Bulk insert speed | Low | High | High | High | High
Foreign key support | Supported | - | - | - | -

Next we will focus on the two most commonly used storage engines, InnoDB and MyISAM, and briefly introduce two others, MEMORY and MERGE.

2.2.1 InnoDB

The InnoDB storage engine is the default storage engine for MySQL. It provides transaction safety with commit, rollback, and crash recovery capabilities. However, compared with MyISAM, InnoDB writes are less efficient and it takes more disk space to keep data and indexes.

The InnoDB storage engine differs from other storage engines in the following ways:

Transaction control

create table goods_innodb(
	id int NOT NULL AUTO_INCREMENT,
	name varchar(20) NOT NULL,
	primary key(id)
)ENGINE=innodb DEFAULT CHARSET=utf8;

start transaction;

insert into goods_innodb(id,name) values(null,'Meta20');

commit;

Testing shows that transactions are indeed supported in InnoDB.
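A minimal sketch of such a test against the goods_innodb table above (the inserted value is illustrative): insert a row inside a transaction, roll it back, and confirm the row is gone.

start transaction;
insert into goods_innodb(id,name) values(null,'Meta30');  -- 'Meta30' is an illustrative value
rollback;
select * from goods_innodb;  -- the rolled-back row does not appear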

Foreign key constraints

InnoDB is the only storage engine in MySQL that supports foreign keys. When creating foreign keys, the parent table must have corresponding indexes, and the child table will automatically create corresponding indexes when creating foreign keys.

In the following two tables, country_innodb is the parent table with country_id as its primary key; city_innodb is the child table, and its country_id field is a foreign key referencing country_id, the primary key of country_innodb.

create table country_innodb(
	country_id int NOT NULL AUTO_INCREMENT,
	country_name varchar(100) NOT NULL,
	primary key(country_id)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;


create table city_innodb(
	city_id int NOT NULL AUTO_INCREMENT,
	city_name varchar(50) NOT NULL,
	country_id int NOT NULL,
	primary key(city_id),
	key idx_fk_country_id(country_id),
	CONSTRAINT `fk_city_country` FOREIGN KEY(country_id) REFERENCES country_innodb(country_id) ON DELETE RESTRICT ON UPDATE CASCADE
)ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- ON DELETE RESTRICT: when deleting from the parent table, the delete is rejected if the child table has associated records
-- ON UPDATE CASCADE: when updating the parent table, associated records in the child table are updated as well

insert into country_innodb values(null,'China'),(null,'America'),(null,'Japan');
insert into city_innodb values(null,'Xian',1),(null,'NewYork',2),(null,'BeiJing',1);

When creating a foreign key, you can specify the action performed on the child table when the parent table is deleted or updated: RESTRICT, CASCADE, SET NULL, or NO ACTION.

RESTRICT and NO ACTION are equivalent: they prevent the parent table from being updated or deleted when the child table has associated records.

CASCADE: When the parent table is updated or deleted, the records corresponding to the child table are updated or deleted.

SET NULL: When the parent table is updated or deleted, the corresponding fields of the child table are set to NULL.

For the two tables created above, the child table's foreign key is defined as ON DELETE RESTRICT ON UPDATE CASCADE. Therefore, when a record is deleted from the parent table, the delete fails if the child table has corresponding records; when a record is updated in the parent table, the corresponding records in the child table are updated as well.

The data in the table is as follows:

You can view foreign key information in either of the following ways:

show create table city_innodb ;
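A second way, sketched here, is to query the information_schema data dictionary (standard MySQL views and columns):

select constraint_name, table_name, referenced_table_name, update_rule, delete_rule
from information_schema.referential_constraints
where constraint_schema = database();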

Delete the record whose country_id is 1 from the parent table country_innodb:

 delete from country_innodb where country_id = 1;

Update the country_id field of the parent table country_innodb:

update country_innodb set country_id = 100 where country_id = 1;

After the update, the data in the child table is:

Storage method

InnoDB stores tables and indexes in two ways:

① Shared tablespace storage: the table structure of a table created this way is stored in a .frm file, while data and indexes are stored in the tablespace defined by innodb_data_home_dir and innodb_data_file_path, which can consist of multiple files.

② Multi-tablespace storage: the table structure of a table created this way is still stored in a .frm file, but the data and indexes of each table are stored separately in a .ibd file.
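Which of the two methods is used is controlled by the innodb_file_per_table variable; a minimal sketch for checking it (and, in versions where the variable is dynamic, enabling it for tables created afterwards):

show variables like 'innodb_file_per_table';
set global innodb_file_per_table = 1;  -- enable multi-tablespace (file-per-table) storage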

2.2.2 MyISAM

MyISAM does not support transactions or foreign keys. Its advantage is fast access. For applications with no requirement for transaction integrity, or applications based mainly on SELECT and INSERT, tables can generally be created with this engine. It has two important characteristics:

Transactions not supported

create table goods_myisam(
	id int NOT NULL AUTO_INCREMENT,
	name varchar(20) NOT NULL,
	primary key(id)
)ENGINE=myisam DEFAULT CHARSET=utf8;

Testing shows that there is no transaction control in the MyISAM storage engine.
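A minimal sketch of that test against the goods_myisam table above (the inserted value is illustrative): the row is still there after rollback, because MyISAM ignores transaction boundaries.

start transaction;
insert into goods_myisam(id,name) values(null,'goods_01');  -- 'goods_01' is an illustrative value
rollback;
select * from goods_myisam;  -- the row is still present: the rollback had no effect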

File storage mode

Each MyISAM table is stored on disk as three files whose names match the table name but whose extensions differ:

.frm (store table definitions);

.myd (MYData, store data);

.myi (MYIndex, store index);

2.2.3 MEMORY

The MEMORY storage engine stores a table's data in memory. Each MEMORY table corresponds to one disk file in .frm format, which stores only the table structure; the data itself is kept in memory, which allows fast data processing and improves the processing efficiency of the whole table. MEMORY tables are very fast to access because the data is in memory and a HASH index is used by default, but the table's data is lost when the service is shut down.
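A minimal sketch (the table and column names are illustrative): create a MEMORY table and confirm that its index type defaults to HASH.

create table tab_memory(
	id int,
	name varchar(20),
	primary key(id)
)ENGINE=memory DEFAULT CHARSET=utf8;

show index from tab_memory;  -- Index_type shows HASH by default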

2.2.4 MERGE

A MERGE storage engine table is a combination of a group of MyISAM tables; the underlying MyISAM tables must have identical structures. The MERGE table itself stores no data.

For insert operations on a MERGE table, the INSERT_METHOD clause defines which underlying table receives the insert. It can take three values: FIRST or LAST means the insert goes to the first or last underlying table, and NO (or omitting the clause) means inserts cannot be performed on the MERGE table.

You can DROP a MERGE table, but this operation simply drops the MERGE table definition and has no impact on the internal tables.

Here is an example of creating and using a MERGE table:

1). Create three test tables order_1990, order_1991, and order_all, where order_all is the MERGE table combining the first two:

create table order_1990(
	order_id int,
	order_money double(10,2),
	order_address varchar(50),
	primary key (order_id)
)engine = myisam default charset=utf8;


create table order_1991(
	order_id int,
	order_money double(10,2),
	order_address varchar(50),
	primary key (order_id)
)engine = myisam default charset=utf8;


create table order_all(
	order_id int,
	order_money double(10,2),
	order_address varchar(50),
	primary key (order_id)
)engine = merge union = (order_1990,order_1991) INSERT_METHOD=LAST default charset=utf8;

2). Insert records into two tables

insert into order_1990 values(1,100.0,'Beijing');
insert into order_1990 values(2,100.0,'Shanghai');
insert into order_1991 values(10,200.0,'Beijing');
insert into order_1991 values(11,200.0,'Shanghai');

3). Query the data in the three tables:

Data from order_1990:

Data from order_1991:

Data in order_all:

4). Insert a record into order_all. Because INSERT_METHOD was set to LAST when the MERGE table was defined, the record goes into the last underlying table.

insert into order_all values(100,10000.0,'xi\'an');
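To see the effect, a quick check of the underlying tables (a sketch): with INSERT_METHOD=LAST the new record should land in order_1991.

select * from order_1991;  -- the new record (order_id = 100) is stored here
select * from order_all;   -- and is also visible through the MERGE table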

2.3 Storage Engine Selection

When selecting a storage engine, select an appropriate one based on the characteristics of the application system. For complex application systems, you can combine multiple storage engines based on actual conditions. The following are some common storage engine environments.

  • InnoDB: the default MySQL storage engine, used for transactional applications, with support for foreign keys. If an application has high requirements for transaction integrity and data consistency under concurrency, and its data operations include many updates and deletes besides inserts and queries, the InnoDB storage engine is a suitable choice. InnoDB not only reduces locking caused by deletes and updates but also guarantees complete commit and rollback of transactions, making it the best choice for systems requiring high data accuracy, such as billing or accounting systems.
  • MyISAM: If the application is mainly read and insert operations, with few updates and deletes, and the transaction integrity and concurrency requirements are not very high, then this storage engine is very suitable.
  • MEMORY: keeps all data in RAM, providing extremely fast access where records and similar data need to be located quickly. Its drawbacks are that table size is limited (a table too large cannot be cached in memory) and that you must be able to restore the table's data after a database restart, since it is lost. MEMORY tables are usually used for small, infrequently updated tables so that results can be obtained quickly.
  • MERGE: Used to combine a series of equivalent MyISAM tables in a logical manner and reference them as an object. MERGE tables have the advantage of overcoming the size limitation of a single MyISAM table and improving the access efficiency of MERGE tables by distributing different tables on multiple disks. This is ideal for VLDB environments such as data warehousing.

3. Optimize SQL steps

During application development, the amount of data in the early stage is small, so developers focus more on functionality when writing SQL statements. Once the application goes into production, however, production data grows quickly, many SQL statements begin to show performance problems, and their impact on production keeps growing. These problematic SQL statements then become the bottleneck of overall system performance, so we must optimize them. This chapter describes in detail how to optimize SQL statements in MySQL.

When facing a database with SQL performance problems, where should we start a systematic analysis so that the problematic SQL can be located and solved as quickly as possible?

3.1 Viewing the SQL Execution Frequency

After the MySQL client connects successfully, server status information can be obtained with the show [session|global] status command. The optional session or global parameter determines whether session-level (current connection) or global-level (since the database last started) statistics are displayed; if omitted, session is used by default.

The following command displays the values of the statement counters for the current session (each underscore matches exactly one character):

show status like 'Com_______';

show status like 'Innodb_rows_%';

Com_xxx represents the number of times each XXX statement is executed. We are usually interested in the following statistical parameters.

Parameter | Meaning
Com_select | Number of SELECT operations executed.
Com_insert | Number of INSERT operations executed; a batch insert counts only once.
Com_update | Number of UPDATE operations executed.
Com_delete | Number of DELETE operations executed.
Innodb_rows_read | Number of rows returned by SELECT queries.
Innodb_rows_inserted | Number of rows inserted by INSERT operations.
Innodb_rows_updated | Number of rows updated by UPDATE operations.
Innodb_rows_deleted | Number of rows deleted by DELETE operations.
Connections | Number of attempts to connect to the MySQL server.
Uptime | Server working time.
Slow_queries | Number of slow queries.

Com_*** : These parameters are accumulated for all storage engine table operations.

Innodb_*** : These parameters are only for the InnoDB storage engine, and the cumulative algorithm is slightly different.

3.2 Locating inefficient SQL Execution

You can locate inefficient SQL statements in either of the following ways.

  • Slow query log: when mysqld is started with --log-slow-queries[=file_name], it writes a log file containing all SQL statements whose execution time exceeds long_query_time seconds (a sketch of enabling it at runtime follows this list). See log management in Chapter 26 of this book for details.
  • show processlist: the show processlist command shows the threads currently in progress, including each thread's state and whether a table is locked, so you can watch SQL execution in real time and optimize some table-locking operations at the same time.
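For the slow query log mentioned in the first item, a minimal sketch of enabling it at runtime (assuming MySQL 5.1+, where these system variables exist):

set global slow_query_log = 1;      -- enable the slow query log
set global long_query_time = 2;     -- log statements that run longer than 2 seconds (illustrative threshold)
show variables like 'slow_query_log_file';  -- where the log file is written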

The columns in the show processlist output are:
1) id: the connection identifier assigned by the system when the user connects to MySQL; it can also be viewed with the connection_id() function.
2) user: the current user. If you are not root, only the SQL statements within your privileges are shown.
3) host: the IP address and port the statement was issued from, useful for tracking down the user who issued a problem statement.
4) db: the database the process is currently connected to.
5) command: the command the connection is currently executing, typically Sleep, Query, or Connect.
6) time: how long the connection has been in its current state, in seconds.
7) state: the state of the SQL statement on the current connection. State describes one step in the execution of a statement; a single SQL statement, such as a query, may pass through states like Copying to tmp table, Sorting result, and Sending data before it completes.
8) info: the SQL statement itself, an important basis for identifying problem statements.

3.3 Analyzing the execution plan with explain

After the preceding steps identify inefficient SQL statements, you can use the EXPLAIN or DESC command to obtain information about how MySQL executes a SELECT statement, including how the tables are joined and in what order.

Query the execution plan of an SQL statement:

explain  select * from tb_item where id = 1;

explain  select * from tb_item where title = 'Alcatel (OT-979) Glacier White Unicom 3G Mobile phone 3';

Field | Meaning
id | The sequence number of the SELECT query: a set of numbers indicating the order in which the select clauses or table operations are executed in the query.
select_type | The type of the SELECT. Common values include SIMPLE (a simple query that uses no table joins or subqueries), PRIMARY (the main, i.e. outermost, query), UNION (the second or later statement in a UNION), and SUBQUERY (the first SELECT in a subquery).
table | The table this row of output refers to.
type | The join (access) type of the table. From best to worst: system > const > eq_ref > ref > ref_or_null > index_merge > index_subquery > range > index > all.
possible_keys | The indexes that might be used for the query.
key | The index actually used.
key_len | The length of the index field used.
rows | The number of rows scanned.
extra | Additional description and notes.

3.3.1 Environment Preparation

CREATE TABLE `t_role` (
  `id` varchar(32) NOT NULL,
  `role_name` varchar(255) DEFAULT NULL,
  `role_code` varchar(255) DEFAULT NULL,
  `description` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_role_name` (`role_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `t_user` (
  `id` varchar(32) NOT NULL,
  `username` varchar(45) NOT NULL,
  `password` varchar(96) NOT NULL,
  `name` varchar(45) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_user_username` (`username`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `user_role` (
  `id` int(11) NOT NULL auto_increment,
  `user_id` varchar(32) DEFAULT NULL,
  `role_id` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_ur_user_id` (`user_id`),
  KEY `fk_ur_role_id` (`role_id`),
  CONSTRAINT `fk_ur_role_id` FOREIGN KEY (`role_id`) REFERENCES `t_role` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
  CONSTRAINT `fk_ur_user_id` FOREIGN KEY (`user_id`) REFERENCES `t_user` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into `t_user` (`id`, `username`, `password`, `name`) values('1','super','$2a$10$TJ4TmCdK.X4wv/tCqHW14.w70U3CC33CeVncD3SLmyMXMknstqKRe','Super Administrator');
insert into `t_user` (`id`, `username`, `password`, `name`) values('2','admin','$2a$10$TJ4TmCdK.X4wv/tCqHW14.w70U3CC33CeVncD3SLmyMXMknstqKRe','System Administrator');
insert into `t_user` (`id`, `username`, `password`, `name`) values('3','itcast','$2a$10$8qmaHgUFUAmPR5pOuWhYWOr291WJYjHelUlYn07k5ELF8ZCrW0Cui','test02');
insert into `t_user` (`id`, `username`, `password`, `name`) values('4','stu1','$2a$10$pLtt2KDAFpwTWLjNsmTEi.oU1yOZyIn9XkziK/y/spH5rftCpUMZa','Student 1');
insert into `t_user` (`id`, `username`, `password`, `name`) values('5','stu2','$2a$10$nxPKkYSez7uz2YQYUnwhR.z57km3yqKn3Hr/p1FR6ZKgc18u.Tvqm','Student 2');
insert into `t_user` (`id`, `username`, `password`, `name`) values('6','t1','$2a$10$TJ4TmCdK.X4wv/tCqHW14.w70U3CC33CeVncD3SLmyMXMknstqKRe','Teacher 1');

INSERT INTO `t_role` (`id`, `role_name`, `role_code`, `description`) VALUES('5','Student','student','Student');
INSERT INTO `t_role` (`id`, `role_name`, `role_code`, `description`) VALUES('7','Teacher','teacher','Teacher');
INSERT INTO `t_role` (`id`, `role_name`, `role_code`, `description`) VALUES('8','Teaching Manager','teachmanager','Teaching Manager');
INSERT INTO `t_role` (`id`, `role_name`, `role_code`, `description`) VALUES('9','Administrator','admin','Administrator');
INSERT INTO `t_role` (`id`, `role_name`, `role_code`, `description`) VALUES('10','Super Administrator','super','Super Administrator');

INSERT INTO user_role(id,user_id,role_id) VALUES(NULL,'1','5'),(NULL,'1','7'),(NULL,'2','8'),(NULL,'3','9'),(NULL,'4','8'),(NULL,'5','10');

3.3.2 The id field of explain

The ID field is the sequence number of the SELECT query and is a set of numbers representing the order in which the select clause or operation table is executed in the query. There are three id cases:

1) If the ids are the same, the tables are loaded in order from top to bottom.

explain select * from t_role r, t_user u, user_role ur where r.id = ur.role_id and u.id = ur.user_id ;

2) If the ids are different, a larger id value means higher priority, and that part is executed earlier.

EXPLAIN SELECT * FROM t_role WHERE id = (SELECT role_id FROM user_role WHERE user_id = (SELECT id FROM t_user WHERE username = 'stu1'))

3) Ids can be both the same and different at the same time. Rows with the same id form one group and are executed from top to bottom; among all groups, a larger id value means higher priority and earlier execution.

EXPLAIN SELECT * FROM t_role r , (SELECT * FROM user_role ur WHERE ur.`user_id` = '2') a WHERE r.id = a.role_id ; 

3.3.3 The select_type field of explain

Indicates the type of the SELECT. Common values are shown in the following table (efficiency decreases from top to bottom):

select_type | Meaning
SIMPLE | A simple SELECT query that contains no subqueries or UNIONs.
PRIMARY | If the query contains any complex subqueries, the outermost query is marked PRIMARY.
SUBQUERY | A subquery contained in the SELECT list or WHERE clause.
DERIVED | A subquery contained in the FROM list is marked DERIVED; MySQL executes these subqueries recursively and puts the results in a temporary table.
UNION | If a second SELECT appears after UNION, it is marked UNION; if the UNION is contained in a subquery of the FROM clause, the outer SELECT is marked DERIVED.
UNION RESULT | The SELECT that retrieves the result from the UNION temporary table.

A value such as <derived2> in the table column refers to the derived table produced by the query whose id is 2.

3.3.4 The table field of explain

Shows which table this row of output refers to.

3.3.5 The type field of explain

Type indicates the access type and is an important indicator. The value can be:

type | Meaning
NULL | MySQL does not access any table or index and returns the result directly.
system | The table has only one row (a system table); a special case of const that rarely appears.
const | Used when a primary key or unique index is compared with a constant. Because at most one row matches, it is very fast. Placing the primary key in the WHERE list lets MySQL convert the query into a constant; const compares every part of a primary key or unique index with constant values.
eq_ref | Similar to ref, but a unique index is used: for each key from the preceding table, only one record matches. Common for primary key or unique index scans in joins.
ref | A non-unique index scan that returns all rows matching a single value. Essentially an index access that may return multiple matching rows.
range | Retrieves only rows in a given range, using an index to select them. Typically seen with where conditions such as between, <, >, in, etc.
index | Differs from ALL in that index traverses the index tree; this is usually faster than ALL, which traverses the data file.
all | A full table scan: the whole table is traversed to find matching rows.

The resulting value, from best to worst, is:

NULL > system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

Simplified: system > const > eq_ref > ref > range > index > ALL

In general, we need to ensure that the query is at least range and preferably ref.

3.3.6 The key field of explain

possible_keys: displays one or more indexes that might be applied to this table.
key: the index actually used; if NULL, no index was used.
key_len: the number of bytes used in the index. This value is the maximum possible length of the index fields, not the actual length used; the shorter the better, provided accuracy is not compromised.

3.3.7 The rows field of explain

Number of scanned rows.

3.3.8 The extra field of explain

Additional execution plan information is displayed in this column. The first two values below indicate something that should be optimized; try to achieve the last one, using index.

extra | Meaning
using filesort | MySQL sorts the data with an external sort instead of reading it in index order. Inefficient.
using temporary | A temporary table is used to hold intermediate results; MySQL uses a temporary table when sorting query results, commonly with order by and group by. Inefficient.
using index | The SELECT operation uses a covering index and avoids accessing the table's data rows. Efficient.

3.4 Analyzing SQL with show profile

MySQL supports the show profiles and show profile statements starting with version 5.0.37. show profiles helps us understand where the time goes when doing SQL optimization.

The have_profiling parameter allows you to see if MySQL currently supports profiles:
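A minimal check:

select @@have_profiling;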

Profiling is turned off by default and can be turned on at the Session level with the set statement:

set profiling=1;  -- turn on the profiling switch

Profiles provide a clearer picture of how SQL is executed.

First, we can perform a series of operations, as shown below:

show databases;
use db01;
show tables;
select * from tb_item where id < 5;
select count(*) from tb_item;

After executing the commands above, run the show profiles statement to see how long each SQL statement took to execute:
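The statement itself takes no arguments and lists each recorded statement with its Query_ID and duration:

show profiles;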

The show profile for query query_id statement displays the status and time consumed by each thread during the SQL execution:

TIP: Sending data indicates that the MySQL thread has started accessing data rows and returning results to the client, not merely sending results back to the client. Because the MySQL thread often needs to perform a large amount of disk reading in the Sending data state, it is frequently the longest-running state of the whole query.

After finding the state that consumes the most time, MySQL supports further drilling down by all, cpu, block io, context switch, page faults, and other detailed types to see where MySQL is spending too much time on which resources. For example, to view the CPU time:
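A sketch of both forms (the query_id value 6 is illustrative; take it from the Query_ID column of show profiles):

show profile for query 6;      -- per-stage status and duration for statement 6
show profile cpu for query 6;  -- additionally shows CPU time per stage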

Field | Meaning
Status | The state of the SQL statement's execution.
Duration | The time spent in each step of the SQL execution.
CPU_user | User CPU time consumed.
CPU_system | System CPU time consumed.

3.5 Analyzing the optimizer execution plan with trace

MySQL 5.6 provides tracing of SQL statements; the trace output lets you further understand why the optimizer chose plan A rather than plan B.

Turn on trace, set the format to JSON, and set the maximum memory size trace may use, to avoid the output being truncated during parsing because the default memory is too small.

SET optimizer_trace="enabled=on",end_markers_in_json=on;
set optimizer_trace_max_mem_size=1000000;

Execute SQL statement:

select * from tb_item where id < 4;

Finally, check information_schema.optimizer_trace to see how MySQL executes SQL:

select * from information_schema.optimizer_trace\G;*************************** 1. row ***************************QUERY: select * from tb_item where id < 4TRACE: {  "steps": [    {      "join_preparation": {        "select#": 1,        "steps": [          {            "expanded_query": "/* select#1 */ select `tb_item`.`id` AS `id`,`tb_item`.`title` AS `title`,`tb_item`.`price` AS `price`,`tb_item`.`num` AS `num`,`tb_item`.`categoryid` AS `categoryid`,`tb_item`.`status` AS `status`,`tb_item`.`sellerid` AS `sellerid`,`tb_item`.`createtime` AS `createtime`,`tb_item`.`updatetime` AS `updatetime` from `tb_item` where (`tb_item`.`id` < 4)"          }        ] /* steps */      } /* join_preparation */    },    {      "join_optimization": {        "select#": 1,        "steps": [          {            "condition_processing": {              "condition": "WHERE",              "original_condition": "(`tb_item`.`id` < 4)",              "steps": [                {                  "transformation": "equality_propagation",                  "resulting_condition": "(`tb_item`.`id` < 4)"                },                {                  "transformation": "constant_propagation",                  "resulting_condition": "(`tb_item`.`id` < 4)"                },                {                  "transformation": "trivial_condition_removal",                  "resulting_condition": "(`tb_item`.`id` < 4)"                }              ] /* steps */            } /* condition_processing */          },          {            "table_dependencies": [              {                "table": "`tb_item`",                "row_may_be_null": false,                "map_bit": 0,                "depends_on_map_bits": [                ] /* depends_on_map_bits */}]/* table_dependencies */          },          {            "ref_optimizer_key_uses": [            ] /* ref_optimizer_key_uses */          },          {            "rows_estimation": [              {                "table": "`tb_item`",                "range_analysis": {                  "table_scan": {                    "rows": 9816098,                    "cost": 2.04 the e6                  } /* table_scan */,                  "potential_range_indices": [                    {                      "index": "PRIMARY",                      "usable": true,                      "key_parts": [                        "id"                      ] /* key_parts */}]/* potential_range_indices */,                  "setup_range_conditions": [                  ] /* setup_range_conditions */,                  "group_index_range": {                    "chosen": false,                    "cause": "not_group_by_or_distinct"                  } /* group_index_range */,                  "analyzing_range_alternatives": {                    "range_scan_alternatives": [                      {                        "index": "PRIMARY",                        "ranges": [                          "id < 4"                        ] /* ranges */,                        "index_dives_for_eq_ranges": true,                        "rowid_ordered": true,                        "using_mrr": false,                        "index_only": false,                        "rows": 3,                        "cost": 1.6154,                        "chosen": true}]/* range_scan_alternatives */,                    "analyzing_roworder_intersect": {                      "usable": false,                      "cause": "too_few_roworder_scans"                    } /* analyzing_roworder_intersect */                  } /* 
analyzing_range_alternatives */,                  "chosen_range_access_summary": {                    "range_access_plan": {                      "type": "range_scan",                      "index": "PRIMARY",                      "rows": 3,                      "ranges": [                        "id < 4"                      ] /* ranges */                    } /* range_access_plan */,                    "rows_for_plan": 3,                    "cost_for_plan": 1.6154,                    "chosen": true                  } /* chosen_range_access_summary */                } /* range_analysis */}]/* rows_estimation */          },          {            "considered_execution_plans": [              {                "plan_prefix": [                ] /* plan_prefix */,                "table": "`tb_item`",                "best_access_path": {                  "considered_access_paths": [                    {                      "access_type": "range",                      "rows": 3,                      "cost": 2.2154,                      "chosen": true}]/* considered_access_paths */                } /* best_access_path */,                "cost_for_plan": 2.2154,                "rows_for_plan": 3,                "chosen": true}]/* considered_execution_plans */          },          {            "attaching_conditions_to_tables": {              "original_condition": "(`tb_item`.`id` < 4)",              "attached_conditions_computation": [              ] /* attached_conditions_computation */,              "attached_conditions_summary": [                {                  "table": "`tb_item`",                  "attached": "(`tb_item`.`id` < 4)"                }              ] /* attached_conditions_summary */            } /* attaching_conditions_to_tables */          },          {            "refine_plan": [              {                "table": "`tb_item`",                "access_type": "range"              }            ] /* refine_plan */}]/* steps */      } /* join_optimization */    },    {      "join_execution": {        "select#": 1,        "steps": [        ] /* steps */      } /* join_execution */}]/* steps */}

4. Use of indexes

Indexing is one of the most common and important methods of database optimization, and it usually helps users solve most of the performance optimization problems of MySQL.

4.1 Verifying Indexes Improves query efficiency

In our prepared table structure, TB_item, we store 3 million records;

1). Query by id

select * from tb_item where id = 1999\G;

The query is very fast, close to 0 seconds, mainly because id is the primary key and therefore has an index.

2). Perform precise query according to title

select * from tb_item where title = 'iphoneX Mobile 3G 32G941'\G; 

To view the execution plan of an SQL statement:

Create an index on the title field:

create index idx_item_title on tb_item(title);

After the index is created, query again:

Using Explain to view the execution plan, the SQL was executed using the index just created

4.2 Use of Indexes

4.2.1 Preparing the Environment

create table `tb_seller` (
	`sellerid` varchar (100),
	`name` varchar (100),
	`nickname` varchar (50),
	`password` varchar (60),
	`status` varchar (1),
	`address` varchar (100),
	`createtime` datetime,
	primary key(`sellerid`)
)engine=innodb default charset=utf8mb4;

insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('alibaba','Alibaba','Ali Store','e10adc3949ba59abbe56e057f20f883e','1','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('baidu','Baidu Technology Co., LTD.','Baidu Store','e10adc3949ba59abbe56e057f20f883e','1','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('huawei','Huawei Technologies Co., LTD.','Huawei Store','e10adc3949ba59abbe56e057f20f883e','0','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('itcast','Transwise Podcast Education Technology Limited','Pass the Wisdom Podcast','e10adc3949ba59abbe56e057f20f883e','1','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('itheima','Dark Horse Programmer','Dark Horse Programmer','e10adc3949ba59abbe56e057f20f883e','0','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('luoji','Logitech Technologies Limited','Logitech Shop','e10adc3949ba59abbe56e057f20f883e','1','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('oppo','OPPO Technology Limited','OPPO official Flagship Store','e10adc3949ba59abbe56e057f20f883e','0','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('ourpalm','Zhangqu Technology Co., LTD.','Palm Fun Shop','e10adc3949ba59abbe56e057f20f883e','1','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('qiandu','Thousand Degree Technology','Thousand Degree Store','e10adc3949ba59abbe56e057f20f883e','2','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('sina','Sina Technology Limited','Sina official Flagship Store','e10adc3949ba59abbe56e057f20f883e','1','Beijing','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('xiaomi','Xiaomi Technology','Xiaomi Official Flagship Store','e10adc3949ba59abbe56e057f20f883e','1','Xian','2088-01-01 12:00:00');
insert into `tb_seller` (`sellerid`, `name`, `nickname`, `password`, `status`, `address`, `createtime`) values('yijia','IKEA Home','Ikea Flagship Store','e10adc3949ba59abbe56e057f20f883e','1','Beijing','2088-01-01 12:00:00');

create index idx_seller_name_sta_addr on tb_seller(name,status,address);

4.2.2 Avoiding Index Failure

1). Full value matching, specify a specific value for all columns in the index. In this case, the index takes effect and the execution efficiency is high.

explain select * from tb_seller where name='Xiaomi Technology' and status='1' and address='Beijing'\G;

2). Left-most prefix rule

If the index covers multiple columns, follow the leftmost prefix rule: the query must start from the leftmost column of the index and must not skip columns in the index. (Think of climbing stairs.)

Queries that match the leftmost prefix rule use the index:

Queries that violate the leftmost prefix rule do not use the index:

If the leftmost column is present but a middle column is skipped, only the index on the leftmost column takes effect:
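A sketch of the three cases against the tb_seller table and the composite index idx_seller_name_sta_addr created above (the expected behavior follows from the leftmost prefix rule):

-- matches the leftmost prefix: the composite index is used
explain select * from tb_seller where name='Xiaomi Technology' and status='1';

-- violates the leftmost prefix (the leading column name is missing): the index is not used
explain select * from tb_seller where status='1' and address='Beijing';

-- leftmost column present but status skipped: only the name part of the index takes effect
explain select * from tb_seller where name='Xiaomi Technology' and address='Beijing';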

3). Range query: columns in the index that come after a range-query condition do not use the index.

The query uses the index for name and status, but the last condition, address, does not use the index.
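For instance (a sketch; status is compared with a range operator here):

-- status > '1' is a range condition, so the address column after it does not use the index
explain select * from tb_seller where name='Xiaomi Technology' and status > '1' and address='Beijing';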

4). Do not perform operations on the index column, the index will be invalid.
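For example (a sketch; the substring arguments and comparison value are illustrative):

-- applying a function to the indexed column prevents the index from being used
explain select * from tb_seller where substring(name,1,3) = 'Xia';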

5). The string is not quoted, which invalidates the index.

Because the string is not quoted in the query, the MySQL optimizer performs an implicit type conversion on the comparison, which invalidates the index.
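A sketch against tb_seller (status is a varchar column):

-- the string value is quoted: the index can be used
explain select * from tb_seller where name='Xiaomi Technology' and status='0' and address='Beijing';

-- status is not quoted: an implicit cast is applied and the status/address parts of the index are no longer used
explain select * from tb_seller where name='Xiaomi Technology' and status=0 and address='Beijing';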

6). Use covering indexes and avoid select *

Try to use covering indexes (queries that access only the index, where the index columns completely contain the queried columns) and reduce the use of select *.

If the queried columns go beyond the index columns, performance drops as well; for example, adding the password column, which is not in the index, to the select list forces a lookup back to the table.

TIP:
using index: appears when a covering index is used.
using where: an index is used for the lookup, but the needed data has to be read back from the table.
using index condition: an index is used for the lookup, but the data still has to be queried back from the table.
using index; using where: an index is used for the lookup, and the needed data can all be found in the index columns, so no lookup back to the table is required.
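A sketch of the difference (column names refer to tb_seller and the composite index created above):

-- covering index: all selected columns are contained in the index, so extra shows Using index
explain select name, status, address from tb_seller where name='Xiaomi Technology';

-- select * needs columns outside the index (e.g. password), so a lookup back to the table is required
explain select * from tb_seller where name='Xiaomi Technology';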

7). If the column before or has an index but the column after it does not, none of the indexes involved will be used.

Example: name is an indexed column while createtime is not; joining the two conditions with or means the index is not used:

explain select * from tb_seller where name='Dark Horse Programmer' or createtime = '2088-01-01 12:00:00'\G;

8). like queries that begin with % do not use the index.

If it is only a tail fuzzy match, the index is not invalidated. If it is a header fuzzy match, the index is invalid.

Workaround: use a covering index.
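A sketch of the workaround (the search term is illustrative; sellerid is the primary key, so it is available through the secondary index):

-- leading %: full table scan, the index is not used
explain select * from tb_seller where name like '%Technology%';

-- covering index workaround: only columns contained in the index (plus the primary key) are selected
explain select sellerid, name, status, address from tb_seller where name like '%Technology%';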

9). If MySQL estimates that using an index is slower than a full table scan, it does not use the index.

For example, if most of the rows in tb_seller have address = 'Beijing' and there is an index on address, then for the condition address = 'Beijing' MySQL will not use that index and will choose a full table scan instead.

10). Is NULL, is NOT NULL

MySQL decides automatically based on the data distribution. If most of the values in the indexed column are non-null, then is null (which matches few rows) uses the index, while is not null (which matches most rows) falls back to a full table scan because the scan is faster; and vice versa.

11). in uses the index; not in invalidates the index.

12). Single column index and compound index.

Use composite indexes rather than single-column indexes.

Creating a composite index

create index idx_name_sta_address on tb_seller(name, status, address);

-- This is equivalent to creating three indexes:
-- name
-- name + status
-- name + status + address

Create a single-column index

create index idx_seller_name on tb_seller(name);
create index idx_seller_status on tb_seller(status);
create index idx_seller_address on tb_seller(address);

When multiple single-column indexes are available, the database chooses the most selective (most discriminating) one; it does not use all of them.

4.3 Checking index Usage

show status like 'Handler_read%';         -- check index usage for the current session
show global status like 'Handler_read%';  -- check global index usage

Handler_read_first: the number of times the first entry in an index was read. A high value suggests the server is doing many full index scans (lower is better).
Handler_read_key: the number of requests to read a row based on a key. If indexes are working, this value represents the number of rows read via index values; a low value indicates the indexes are not bringing much benefit (higher is better).
Handler_read_next: the number of requests to read the next row in key order. This value increases when you query an indexed column with a range constraint or perform an index scan.
Handler_read_prev: the number of requests to read the previous row in key order. This read method is mainly used to optimize ORDER BY ... DESC.
Handler_read_rnd: the number of requests to read a row based on a fixed position. This value is high if you run many queries that need to sort results; it may mean many queries require MySQL to scan whole tables, or that joins do not use keys properly. A high value implies inefficiency and should usually be remedied with indexes.
Handler_read_rnd_next: the number of requests to read the next row in the data file. This value is high if you do many table scans, which generally suggests your tables are not properly indexed or your queries are not written to take advantage of the indexes.

5. SQL optimization

5.1 Inserting Data in batches

Environment Preparation:

CREATE TABLE `tb_user_2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `username` varchar(45) NOT NULL,
  `password` varchar(96) NOT NULL,
  `name` varchar(45) NOT NULL,
  `birthday` datetime DEFAULT NULL,
  `sex` char(1) DEFAULT NULL,
  `email` varchar(45) DEFAULT NULL,
  `phone` varchar(45) DEFAULT NULL,
  `qq` varchar(32) DEFAULT NULL,
  `status` varchar(32) NOT NULL COMMENT 'User status',
  `create_time` datetime NOT NULL,
  `update_time` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_user_username` (`username`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

When using the load command to import data, proper Settings can make the import more efficient.

For innoDB-type tables, there are several ways to improve import efficiency:

1) Sequential insertion of primary keys

Since innoDB-type tables are stored in the order of primary keys, sorting imported data in the order of primary keys can effectively improve the efficiency of data import. If an InnoDB table does not have a primary key, the system automatically creates an internal column as the primary key by default, so if you can create a primary key for the table, you can use this to improve the efficiency of data import.

The test script files are: sql1.log ----> primary keys in order; sql2.log ----> primary keys out of order.

Importing the data whose IDs are in order:

Importing the data whose IDs are out of order:

【LOAD DATA 】
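A minimal sketch of the load command itself (the file path, field terminator, and line terminator are illustrative assumptions):

load data local infile '/root/sql1.log' into table tb_user_2
fields terminated by ','
lines terminated by '\n';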


2) Turn off the uniqueness check

Before importing data, run SET UNIQUE_CHECKS=0 to disable the uniqueness check. After importing data, run SET UNIQUE_CHECKS=1 to restore the uniqueness check to improve import efficiency.

3) Manually commit transactions

If the application uses automatic commit, you are advised to run SET AUTOCOMMIT=0 before the import to disable automatic commit, and run SET AUTOCOMMIT=1 after the import to enable automatic commit to improve the import efficiency.

5.2 Optimizing insert statements

There are several optimizations to consider when performing insert operations on data.

  • If you need to insert many rows into a table, try to use a single insert statement with multiple value lists. This greatly reduces the cost of establishing and closing connections between the client and the database, and is faster than executing each single-row INSERT statement separately.

    For example, the original format is:

    insert into tb_test values(1,'Tom');
    insert into tb_test values(2,'Cat');
    insert into tb_test values(3,'Jerry');

    The optimized scheme is as follows:

    insert into tb_test values(1,'Tom'),(2,'Cat'),(3,'Jerry');
  • Insert data in a transaction.

    start transaction;
    insert into tb_test values(1,'Tom');
    insert into tb_test values(2,'Cat');
    insert into tb_test values(3,'Jerry');
    commit;
  • Sequential insertion of data

    insert into tb_test values(4,'Tim');
    insert into tb_test values(1,'Tom');
    insert into tb_test values(3,'Jerry');
    insert into tb_test values(5,'Rose');
    insert into tb_test values(2,'Cat');

    The optimized

    insert into tb_test values(1,'Tom');
    insert into tb_test values(2,'Cat');
    insert into tb_test values(3,'Jerry');
    insert into tb_test values(4,'Tim');
    insert into tb_test values(5,'Rose');

5.3 Optimize the Order BY statement

5.3.1 Environment Preparation

CREATE TABLE `emp` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(100) NOT NULL,
  `age` int(3) NOT NULL,
  `salary` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

insert into `emp` (`id`, `name`, `age`, `salary`) values('1','Tom','25','2300');
insert into `emp` (`id`, `name`, `age`, `salary`) values('2','Jerry','30','3500');
insert into `emp` (`id`, `name`, `age`, `salary`) values('3','Luci','25','2800');
insert into `emp` (`id`, `name`, `age`, `salary`) values('4','Jay','36','3500');
insert into `emp` (`id`, `name`, `age`, `salary`) values('5','Tom2','21','2200');
insert into `emp` (`id`, `name`, `age`, `salary`) values('6','Jerry2','31','3300');
insert into `emp` (`id`, `name`, `age`, `salary`) values('7','Luci2','26','2700');
insert into `emp` (`id`, `name`, `age`, `salary`) values('8','Jay2','33','3500');
insert into `emp` (`id`, `name`, `age`, `salary`) values('9','Tom3','23','2400');
insert into `emp` (`id`, `name`, `age`, `salary`) values('10','Jerry3','32','3100');
insert into `emp` (`id`, `name`, `age`, `salary`) values('11','Luci3','26','2900');
insert into `emp` (`id`, `name`, `age`, `salary`) values('12','Jay3','37','4500');

create index idx_emp_age_salary on emp(age,salary);

5.3.2 Two sorting methods

1). The first method sorts the retrieved data before returning it; this is a filesort. All sorts that do not return ordered results directly via an index are called filesort sorts. This is the less efficient method.

2). The second method directly returns ordered data through ordered index sequential scan, which is using index, without additional sorting and with high operation efficiency.

Multifield sort

Now that you understand how MySQL sorts, the optimization goal is clear: minimize extra sorting and return ordered data directly through the index (using index). This requires the where condition and order by to use the same index, the column order in order by to match the index column order, and all order by columns to be sorted in the same direction (all ascending or all descending). Otherwise extra work is needed and FileSort appears.

5.3.3 Optimization of Filesort

It is possible to reduce the presence of Filesort by creating appropriate indexes, but in some cases, it is necessary to speed up the sorting operation of Filesort. MySQL has two sorting algorithms for Filesort:

1) Two-scan algorithm: used before MySQL 4.1. First the sort fields and the row pointer are taken out and sorted in the sort buffer (if the sort buffer is not big enough, intermediate results are stored on disk); after sorting, the rows are read back from the table according to the row pointers, which can cause a large number of random I/O operations.

2) Single-scan algorithm: take out all fields that meet the conditions at one time, and then output the result set directly after sorting in sort area sort buffer. The memory cost of sorting is high, but the sorting efficiency is higher than that of two-scan algorithm.

MySQL decides which algorithm to use by comparing the system variable max_length_for_sort_data with the total size of the columns retrieved by the query: if max_length_for_sort_data is larger, the second (single-scan) algorithm is used; otherwise the first (two-scan) algorithm is used.

You can appropriately raise the sort_buffer_size and max_length_for_sort_data system variables to enlarge the sort area and improve sorting efficiency.
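A quick way to check the current values of the two variables mentioned above:

show variables like 'sort_buffer_size';
show variables like 'max_length_for_sort_data';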

5.4 Optimizing the Group BY statement

Since GROUP BY actually performs the same kind of sort operation, and compared with ORDER BY it merely adds a grouping step after sorting (plus some aggregate-function computation if aggregate functions are used), indexes can be used for GROUP BY in the same way as for ORDER BY.

If a query contains group by but the user wants to avoid the cost of sorting the result, order by null can be used to disable the sort, as follows:

drop index idx_emp_age_salary on emp;

explain select age,count(*) from emp group by age;

The optimized

explain select age,count(*) from emp group by age order by null;

As you can see from the above example, the first SQL statement requires “filesort”, while the second SQL statement does not require “filesort” due to order by NULL.

Create index:

create index idx_emp_age_salary on emp(age, salary);
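After recreating the index, the same grouping query can be checked again (a sketch; the expectation is that extra no longer shows Using temporary or Using filesort):

explain select age,count(*) from emp group by age;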

5.5 Optimizing nested Queries

MySQL began supporting SQL subqueries in version 4.1. This technique uses a SELECT statement to produce a single-column result that is then used as a filter condition in another query. Subqueries let you accomplish in one step what would otherwise logically require several steps, can avoid transactions or table locks, and are easy to write. In some cases, however, a subquery can be replaced by a more efficient join.

Example of finding all user information with roles:

 explain select * from t_user where id in (select user_id from user_role );

The execution plan is:

After the optimization:

explain select * from t_user u , user_role ur where u.id = ur.user_id;

Join queries are more efficient because MySQL does not need to create temporary tables in memory to perform a logical two-step query.

5.6 Optimization of OR conditions

For query clauses that contain or, if indexes are to be used, every condition column joined by or must have an index, and composite indexes cannot be used for this; if there is no such index, consider adding one.

View all indexes on the emp table:
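For example:

show index from emp;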

Example:

explain select * from emp where id = 1 or age = 30;

It is recommended to replace or with union:
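A sketch of the rewrite for the query above:

explain select * from emp where id = 1
union
select * from emp where age = 30;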

Let’s compare the important indicators and find that the main differences are type and REF

Type shows the type of access, which is a more important indicator. The result values in descending order are:

system > const > eq_ref > ref > fulltext > ref_or_null  > index_merge > unique_subquery > index_subquery > range > index > ALL

The UNION statement has a type value of ref and the OR statement has a type value of range, which is a significant difference

The ref value of the UNION statement is const, while the ref value of the OR statement is null. const indicates a constant-value reference, which is very fast.

The difference between these two terms indicates that UNION is superior to OR.

5.7 Optimizing Paging Query

In general paging queries, performance can be improved by creating covering indexes. A common and very troublesome scenario is limit 2000000,10: MySQL has to sort the first 2,000,010 records but returns only records 2,000,001 to 2,000,010; the other records are discarded, so the cost of the query plus sorting is very high.

5.7.1 Optimization Idea 1

Sort paging is performed on the index, and then the primary key is associated back to the original table to query the other column contents.
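A sketch against the tb_item table used earlier (the offset is illustrative): do the sorting and paging on the index, then join back to the base table by primary key.

explain select * from tb_item t,
        (select id from tb_item order by id limit 2000000,10) a
where t.id = a.id;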

5.7.2 Optimization Idea 2

This scheme applies to tables with auto-increment primary keys: it converts the limit query into a query starting at a known position. (Limitation: the primary keys must not have gaps.)
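A sketch of the idea (again using tb_item; the boundary value is illustrative):

explain select * from tb_item where id > 2000000 limit 10;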

5.8 Using SQL Prompts

SQL hints are an important means of optimizing the database: simply put, you add human-provided hints to the SQL statement to guide how it is executed.

5.8.1 USE INDEX

Add use index after the table name in the query to give MySQL a list of indexes you would like it to consider, so that it no longer considers other available indexes.

create index idx_seller_name on tb_seller(name);
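A sketch of the hint itself (using the index just created):

explain select * from tb_seller use index(idx_seller_name) where name = 'Xiaomi Technology';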

5.8.2 IGNORE INDEX

If you simply want MySQL to ignore one or more indexes, you can use ignore Index as a hint.

 explain select * from tb_seller ignore index(idx_seller_name) where name = 'Xiaomi Technology';

5.8.3 FORCE INDEX

To force MySQL to use a specific index, use force Index as the hint in the query.

create index idx_seller_address on tb_seller(address);
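A sketch contrasting the default behavior with the hint (as noted earlier, when most addresses are 'Beijing' MySQL may skip the index unless forced):

-- the optimizer may still choose a full table scan
explain select * from tb_seller where address = 'Beijing';

-- force index makes MySQL use the specified index
explain select * from tb_seller force index(idx_seller_address) where address = 'Beijing';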