This article summarizes 19 optimization solutions for Mysql, which are based on the “mysql-index-btree type”. I hope it helps.
A, the EXPLAIN
To do MySQL optimization, we should use EXPLAIN to view the SQL execution plan.
Here’s a simple example, annotating (1,2,3,4,5) the data we want to focus on
- Type column, connection type. A good SQL statement must be at least range level. Prevent the occurrence of all levels
- Key column, the name of the index used. If no index is selected, the value is NULL. Mandatory indexes can be used
- Key_len column, index length
- Rows, the number of rows scanned. This value is an estimate
- Extra column, details. Note that common unfriendly values are: Using filesort, Using temporary
2. SQL statements should not contain too many values IN
MySQL optimizes IN to store all constants IN an array that is sorted. However, if the number is large, the consumption is also relatively large. Select id from table_name where num in(1,2,3); Or use connections instead.
The SELECT statement must specify the name of the field
SELECT * adds a lot of unnecessary consumption (CPU, IO, memory, network bandwidth); Increased the possibility of using overridden indexes; When the table structure changes, the front break also needs to be updated. Therefore, the field name is required to be followed directly by select.
4. Use limit 1 when only one data item is required
This is to make the type column in EXPLAIN const
5, if the sort field does not use index, as little sort as possible
6. If the other fields in the constraint have no index, use or as little as possible
If one of the fields on both sides of or is not an index field and the other conditions are not index fields, the query does not remove the index. Many times it is better to use union all or union(if necessary) instead of “or”
B. Union all. C. union all
The main difference between union and Union all is that the former requires the combination of result sets and then carries out the unique filtering operation, which involves sorting, increases a large number of CPU operations, and increases resource consumption and delay. Of course, the premise of union all is that the two result sets have no duplicate data.
ORDER BY RAND()
select id from `table_name`
order by rand() limit 1000;
The above SQL statement can be optimized to
select id from `table_name` t1 join (select rand() * (select max(id)
from `table_name`) asnid) t2 on t1.id > t2.nid limit 1000;
Distinguish between in and exists, not in and not exists
Select * from table A
Where id in (select id from table B)
The SQL statement above is equivalent to
Select * from table A
Select * from tabletop where tabletop b.id = tabletop a.id
The distinction between in and EXISTS mainly results in a change in driver order (which is key to performance changes). In the case of exists, the outer layer table is the driver table and is accessed first. In, the subquery is executed first. So IN is suitable for the situation where the outside is large and the inside is small; EXISTS applies to a situation where the outer surface is small and the inner surface is large.
For not in and NOT EXISTS, it is recommended to use not EXISTS, not in May have logic problems, not in is not just a matter of efficiency. How to efficiently write a SQL statement that replaces not EXISTS?
The original SQL statement
The select colname… From A table
Where a.id not in (select B. id from B表)
Efficient SQL statements
The select colname… From table A Left join table B on
where a.id = b.id where b.id is null
The extracted result set is shown in the figure below, with data in table A not in Table B
Ten, use reasonable paging method to improve the efficiency of paging
select id,name
from table_name limit 866613, 20
When you do paging with the SQL statement above, you may find that using limit paging queries directly gets slower and slower as the amount of table data increases.
The optimization method is as follows: You can take the ID of the maximum number of rows on the previous page, and then limit the start of the next page based on that maximum ID. For example, in this column, the maximum ID of the previous page is 866612. SQL can be written as follows:
select id,name from table_name
where id> 866612 limit 20
Eleven, section query
In some user selection pages, the time range selected by some users may be too large, resulting in slow query. The main reason is that too many lines are scanned. This time can be through the program, segmented query, circular traversal, the results of the combined processing for display.
The following SQL statement can be used when the number of rows scanned exceeds millions
Avoid null values for fields in the WHERE clause
The determination of NULL causes the engine to abandon the index and perform a full table scan instead.
Fuzzy query with % prefix is not recommended
For example, LIKE “%name” or “%name%”, this query will cause index invalidation and perform full table scan. But you can use LIKE “name%”.
How to query %name%
As shown in the figure below, although the secret field was indexed, it was not used in explain results
So how to solve this problem, answer: using full-text indexes
Select id,fnum, FDST from table_name where user_name like ‘%zhangsan%’; . Such statements, ordinary index is not sufficient to meet the query requirements. Fortunately, in MySQL, there are full-text indexes to help us.
SQL syntax for creating a full-text index is:
ALTER TABLE `table_name` ADD FULLTEXT
INDEX `idx_user_name` (`user_name`);
SQL statements that use full-text indexes are:
select id,fnum,fdst from table_name
where match(user_name) against(‘zhangsan’ in booleanmode);
Note: Before you need to create a full-text index, contact your DBA to confirm that it can be created. It is also important to note that the query statement is written differently from ordinary indexes
Avoid expression manipulation of fields in where clauses
Such as
select user_id,user_project from table_name
where age*2=36;
SQL > alter index (); this will cause the engine to abandon the index
select user_id,user_project from table_name
where age=36/2;
Avoid implicit type conversions
The type conversion occurs when the type of the column column in the WHERE clause is inconsistent with the type of the parameter passed in. It is recommended to determine the type of the parameter in the WHERE clause first
For union indexes, follow the leftmost prefix rule
For example, the index contains the fields ID,name,school, you can use the id field directly, or you can use the order ID,name, but name; No school can use this index. Therefore, when creating a federated index, you must pay attention to the order of the index fields, and put the commonly used query fields first
If necessary, you can use force index to force a query on an index
Sometimes the MySQL optimizer takes the index it sees fit to retrieve SQL statements, but it may not use the index we want. Force Index is used to force the optimizer to use our index.
Note the range query statement
For a federated index, if there is a range query, such as between,>,<, etc., the following index field will be invalidated.
19. About JOIN optimization
- The LEFT JOIN A table is the driver table
- INNER JOIN MySQL automatically finds out which table has less data and uses it to drive the table
- The RIGHT JOIN B table is the driver table
Note: MySQL does not have full join
select * from A left join B on B.name = A.name
where B.name is null
union all
select * from B;
Use inner join instead of left Join
There are at least two tables participating in the joint query, which are generally of different sizes. If the join mode is inner JOIN, MySQL will automatically select the small table as the driver table without other filtering conditions. However, left JOIN follows the principle of left drive right in the selection of the driver table, that is, the left join table is called the driver table.
Rational use of indexes
The index field of the driven table serves as the limit field of ON.
Use small tables to drive large tables
From the schematic diagram can intuitively see that if you can reduce the drive table, reduce the number of cycles in the nested loop, to reduce the total IO and CPU operation times.
Use opportunely STRAIGHT_JOIN
Inner join is selected by mysql as the driver table, but some special cases need to select another table as the driver table, such as group by, order by, etc. “Using filesort”, “Using temporary”. By STRAIGHT_JOIN to enforce the join order, the table name on the left of STRAIGHT_JOIN would be the driven table and the table name on the right would be the driven table. One prerequisite for using STRAIGHT_JOIN was that the query was an inner join, that is, an inner join. STRAIGHT_JOIN is not recommended for other links because it could result in inaccurate query results.
This method can sometimes reduce the time by three times.
Here is only a list of the above optimization scheme, of course, there are other optimization methods, we can try to explore, thank you for your attention.