Locating slow SQL

Check the slow query log to find slow queries that have already finished executing.

Run show processlist to see slow queries that are currently executing.

Analyzing slow queries with explain

Explain shows the execution plan MySQL will use for a SQL statement: whether it uses a join, whether it uses an index, how many rows it estimates it will scan, and so on. It helps us choose better indexes and write better SQL. Usage: put explain in front of the query statement.

explain select * from t1 where b=100;

What the columns mean

  • id: the identifier of the query
  • select_type: the query type; shows whether the row is a simple or complex query
  • table: the table the row refers to
  • partitions: the partitions the query will match; shown only when partitioning is used, NULL for non-partitioned tables
  • type: the access type, i.e. how MySQL looks up rows in the table
  • possible_keys: the indexes the optimizer could use
  • key: the index actually chosen
  • key_len: the length of the chosen index; often used to work out how many columns of a composite index are used
  • ref: the columns or constants compared against the index
  • rows: the estimated number of rows to scan; for InnoDB this is an estimate and not necessarily accurate
  • filtered: the estimated percentage of rows remaining after the conditions are applied
  • Extra: additional information
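
For example, running explain on the earlier query, assuming an index idx_b exists on t1(b), might print something like this (the values are illustrative and depend on your data and MySQL version):

explain select * from t1 where b=100;
-- id: 1, select_type: SIMPLE, table: t1, partitions: NULL, type: ref,
-- possible_keys: idx_b, key: idx_b, key_len: 5, ref: const,
-- rows: 1, filtered: 100.00, Extra: NULL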

Reasons an index is not used

1. Function operation

select * from t1 where date(c) = '2019-05-21';   -- function on the indexed column, cannot use the index
select * from t1 where c >= '2019-05-21 00:00:00' and c <= '2019-05-21 23:59:59';   -- rewritten to use the index

When a function is applied to an indexed field, the optimizer gives up on the index.

2. Implicit conversion

Suppose a is a varchar column. Then

select * from t1 where a=1000;

forces MySQL to cast a to a number, making the condition equivalent to

select * from t1 where cast(a as signed int) = 1000;

so the index on a cannot be used. Writing the literal as a string avoids the implicit conversion:

select * from t1 where a='1000';

3. Fuzzy query

select * from t1 where a like '%1111%';   -- leading wildcard, cannot use the index
select * from t1 where a like '1111%';    -- prefix match, can use the index

If only a middle fragment of the value is known and a fuzzy match is unavoidable, a dedicated search server such as Elasticsearch is recommended.

4. Range query

select * from t1 where b>=1 and b<=2000;   -- range too large, the index may not be used
-- split into smaller ranges:
select * from t1 where b>=1 and b<=1000;
select * from t1 where b>=1001 and b<=2000;

The optimizer decides whether to use an index based on the proportion of rows retrieved, the table size, the I/O block size, and so on. If a single query touches too large a share of the data, the optimizer will not use the index.

5. Calculation operations

select * from t1 where b-1 = 1000;   -- expression on the indexed column, cannot use the index
select * from t1 where b = 1000+1;   -- rewritten so the column stands alone

Arithmetic on an indexed field prevents the index from being used.

In general, do such calculations in application code rather than in MySQL. If calculating in MySQL is unavoidable, put the calculation on the right-hand side of the comparison, away from the column.

Conclusion

  • Avoid implicit type conversions
  • A like pattern must not start with %
  • A range query should not cover too large a share of the data
  • Avoid calculations and functions on condition fields

Optimize data import

For bulk imports, it is recommended that a single INSERT statement insert multiple rows of data.

Most of the time in a bulk import is spent on client/server communication, so merging rows into fewer INSERT statements reduces the round trips, and committing several statements together also reduces the number of flushes to disk.

Testing suggests the following ways to speed up batch data import:

  • Insert multiple rows of values in a single INSERT;
  • Turn off autocommit and commit several INSERT statements in one transaction (both combined in the sketch below);
  • Set innodb_flush_log_at_trx_commit and sync_binlog to 0 (data may be lost on a crash in this configuration).
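
A minimal sketch of the first two techniques together; the columns of t1 (a varchar, b int, c datetime) are assumed from the earlier examples:

set autocommit = 0;
insert into t1 (a, b, c) values
  ('1001', 1, '2019-05-21 10:00:00'),
  ('1002', 2, '2019-05-21 10:00:01'),
  ('1003', 3, '2019-05-21 10:00:02');
insert into t1 (a, b, c) values
  ('1004', 4, '2019-05-21 10:00:03'),
  ('1005', 5, '2019-05-21 10:00:04');
commit;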

Make order by and Group by queries faster

order by

Filesort in MySQL does not necessarily sort in a disk file; it may also sort in memory, depending on the amount of data to sort and the sort_buffer_size setting.

Order by optimization techniques

Add a suitable index

  • Add an index on the sort field. In explain's Extra column, Using index means the sort used the index; Using filesort means it did not.
  • To optimize a sort over multiple fields with an index, the order of the sort fields must match the column order of a composite index. So a multi-field sort can be optimized by adding a composite index across those sort fields.
  • Optimize an equality query followed by a sort: for statements that filter on an equality condition and then sort, add a composite index on the condition field followed by the sort field (see the sketch after this list).
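
A hedged sketch of the last point; the index name, and the assumption that a is the condition field and b the sort field, are illustrative:

alter table t1 add index idx_a_b (a, b);
-- equality filter on a, then sort on b: both are served by the composite index
explain select * from t1 where a='1000' order by b;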

Remove unnecessary return fields

  • When the query returns all columns, the optimizer may refuse to sort via the index: scanning the whole index and then looking up the columns it does not cover costs more than scanning the whole table, so the optimizer abandons the index. Returning only the fields you actually need makes an index sort more likely.

Modify the parameters

  • max_length_for_sort_data: if sorting is inefficient, raise max_length_for_sort_data appropriately so the optimizer prefers full-field sorting. Do not set it too high, or CPU utilization may drop while disk I/O climbs.
  • sort_buffer_size: raise sort_buffer_size appropriately so the sort can be done in memory. Do not set it too large, or the database server may start swapping.
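
A sketch of per-session tuning; the values are placeholders, not recommendations:

set session sort_buffer_size = 4 * 1024 * 1024;   -- 4 MB sort buffer
set session max_length_for_sort_data = 4096;      -- favor full-field sorting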

Several cases where index sort cannot be used

  • A range query on one field combined with a sort on another
  • Mixing ASC and DESC in the order by prevents use of the index
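
For instance, assuming the composite index idx_a_b (a, b) from the earlier sketch:

select * from t1 where a > '1000' order by b;   -- range on a, sort on b: filesort
select * from t1 order by a asc, b desc;        -- mixed directions: filesort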

group by

By default, group by sorts on the grouped field, so its optimization is basically the same as for order by. If the goal is grouping rather than sorting, you can add order by null to skip the sort.
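
For example:

select b, count(*) from t1 group by b order by null;   -- group without the implicit sort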

Paging query

Paging queries sorted by an auto-increment, continuous primary key

select * from t1 limit 99000,2;

This scans 99002 rows and throws the first 99000 away. When the primary key is auto-increment and continuous, it can be rewritten as:

select * from t1 where id > 99000 limit 2;

This rewriting must satisfy the following two conditions:

  • The primary key is auto-incrementing and continuous
  • The result is sorted by the primary key

Paging queries sorted by a non-primary-key field

select * from t1 order by a limit 99000,2;

Scanning the entire index on a and then looking up the rows it does not cover costs more than scanning the full table, so the optimizer abandons the index.

The optimization idea: let the sort walk the index on a but fetch only primary key values, then join back to the table on the primary key to retrieve just the rows that are needed:

select f.* from t1 f inner join (select id from t1 order by a limit 99000,2) g on f.id = g.id;

count

When count() is given a column name, e.g. count(a) where a is a column, rows in which that column is NULL are not counted.

count(*) counts every row, whether or not it contains NULL values.

For the MyISAM engine, count(*) is very fast when there is no where clause and no other columns are retrieved, because MyISAM stores the table's total row count on disk.

InnoDB does not keep a row count for the table, because concurrent transactions may each see a different number of rows at the same moment. So count(*) has to be computed on the fly, which is much slower than in MyISAM.

Why InnoDB count(*) is slower than MyISAM

  • MyISAM maintains the table's total row count on disk and, for a count(*) with no where clause, returns it directly
  • InnoDB has to traverse an ordinary (secondary) index tree and count the rows to compute the total

We saw earlier that count(*) counts rows regardless of NULLs, and the 1 in count(1) is a constant expression that is never NULL, so every row is counted. Hence count(1) and count(*) give the same result.

count(*) optimization methods:

  • show table status: returns instantly, but the row count is only an estimate;
  • Use Redis as a counter: fast, and more accurate than show table status, but the count may drift under concurrency;
  • Maintain a separate InnoDB count table: fast, and transactions keep the count exact; this is the recommended approach (see the sketch below).
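
A minimal sketch of the count-table approach; the table name t1_count and t1's columns are assumed for illustration:

create table t1_count (cnt int not null) engine = InnoDB;
insert into t1_count values (0);

-- update the counter in the same transaction as the data change,
-- so the count stays consistent with t1:
start transaction;
insert into t1 (a, b, c) values ('1006', 6, now());
update t1_count set cnt = cnt + 1;
commit;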