When we write SQL statements, we tend to focus on the results of SQL execution, but whether we really pay attention to the efficiency of SQL execution, whether we pay attention to the specification of SQL writing?
The following dry goods sharing is summarized in the actual development process, I hope to help you!
1. Limit paging optimization
Limit efficiency is very low when the offset is very large.
SELECT id FROM A LIMIT 1000,10
SELECT ID FROM A LIMIT 9000010 slow
Solution a:
select id from A order by id limit90000, 10;Copy the code
If we use order by together. Very quickly, 0.04 seconds is OK. Because the primary key id is used as the index! Of course, whether or not you can use an index depends on your business logic, but be careful when using it for paging purposes!
Scheme 2
select id from A order by id between 90000 and 90010;
Copy the code
2. Use limit 1 and top 1 to obtain a row
Some business logic makes a query operation (especially when taking the maximum amount based on a certain field DESC). You can terminate [database index] by using limit 1 or top 1 to continue scanning the entire table or index.
counter-examples
SELECT id FROM A LIKE 'abc%'
Copy the code
Is case
SELECT id FROM A LIKE 'abc%' limit 1
Copy the code
3. Do not use select * from table under any circumstances, replace “*” with a list of specific fields, and do not return fields that are not needed to avoid a full scan!
counter-examples
SELECT * FROM A
Copy the code
Is case
SELECT id FROM A
Copy the code
4. Batch insert optimization
counter-examples
INSERT into person(name,age) values('A',24)
INSERT into person(name,age) values('B',24)
INSERT into person(name,age) values('C'24),Copy the code
Is case
INSERT into person(name,age) values('A', 24), ('B', 24), ('C', 24).Copy the code
SQL statement optimization mainly lies in the correct use of the index, and we often make a mistake in the development of the table is to complete the scan, one affect performance, and to consume time!
5. Optimization of like statement
counter-examples
SELECT id FROM A WHERE name like '%abc%'
Copy the code
Because ABC is preceded by “%”, this query must follow the full table query. Do not prefix the keyword with % unless necessary (fuzzy queries need to include ABC)
Is case
SELECT id FROM A WHERE name like 'abc%'
Copy the code
The instance
Mysql version: 5.7.26
select nick_name from member where nick_name like Xiao Ming '% %'
Copy the code
Like ‘% xiaoming %’ does not use index!
select nick_name from member where nick_name like 'xiaoming %'
Copy the code
Select * from index like’ xiaoming %’
6. Where clause uses an optimization of OR
It is often better to replace “or” with union all or union. If the OR keyword is used in the WHERE clause, the index will be discarded.
counter-examples
SELECT id FROM A WHERE num = 10 or num = 20
Copy the code
Is case
SELECT id FROM A WHERE num = 10 union all SELECT id FROM A WHERE num=20
Copy the code
7. The where clause uses IS NULL or IS NOT NULL optimizations
counter-examples
SELECT id FROM A WHERE num IS NULL
Copy the code
If you use IS NULL or IS NOT NULL in the WHERE clause, the index will be discarded and a full table query will be performed.
Is case
Set the default value of num to 0 to ensure that num does not have a null value in the table. When using IS NULL, SQL IS used very frequently in actual service scenarios. Therefore, we should avoid full table scan
SELECT id FROM A WHERE num=0
Copy the code
8. Optimize expression operations on fields in the WHERE clause
Do not perform functions, arithmetic operations, or other expression operations to the left of the “=” in the WHERE clause, or the system may not use the index properly.
- 1
SELECT id FROM A WHERE datediff(day,createdate,'2019-11-30') = 0Copy the code
Optimization for
SELECT id FROM A WHERE createdate>='2019-11-30' and createdate<'2019-12-1'
Copy the code
- 2
SELECT id FROM A WHERE year(addate) <2020
Copy the code
Optimization for
SELECT id FROM A where addate<'2020-01-01'
Copy the code
9. Sort index problems
Mysql queries only use an index, so if the WHERE clause already uses an index, the order BY column does not use an index. So the database default sort can meet the requirements of the situation do not use sort operation;
Try not to include more than one column sort, and create composite indexes for those columns if necessary.
10. Try to replace union all with union
The main difference between union and Union all is that the former requires the combination of two (or more) result sets and then the unique filtering operation, which involves sorting, increasing a large number of CPU operations, and increasing resource consumption and delay. So use union all instead of union when we can confirm that duplicate result sets are impossible or don’t care about duplicate result sets
Inner join and left join, right join, subquery
- Inner join left/ rightJoin left/ rightJoin left/rightjoin
SELECT A.id,A.name,B.id,B.name FROM A LEFT JOIN B ON A.id =B.id;
SELECT A.id,A.name,B.id,B.name FROM A RIGHT JOIN ON B A.id= B.id;
SELECT A.id,A.name,B.id,B.name FROM A INNER JOIN ON A.id =B.id;
Copy the code
It has been proved that inner join performance is faster, because inner join is equivalent and may return fewer rows. But remember that some statements implicitly use equivalent concatenations, such as:
SELECT A.id,A.name,B.id,B.name FROM A,B WHERE A.id = B.id;
Use inner Join as much as possible
- Second: sub-query performance is slower than external join performance, try to use external join to replace the sub-query.
counter-examples
If the outer table is A large table, we can expect query performance to be worse than this.
Select* from A where exists (select * from B where id>=3000 and A.uuid=B.uuid);
Copy the code
Execution time: about 2s
Is case
Select* from A inner join B ON A.uuid=B.uuid whereb.uuid>=3000; This statement executes the test in less than a second;Copy the code
Execution time: less than 1s
- Third: When using JOIN, you should use small results to drive large results
Left JOIN The left table results should be as small as possible, if there are conditions should be placed on the left side of the first processing, right join the same reverse. Such as:
counter-examples
Select * from A left join B A.id=B.ref_id where A.id>10
Copy the code
Is case
select * from (select * from A wehre id >10) T1 left join B on T1.id=B.ref_id;
Copy the code
12. Exist & in optimization
SELECT * from A WHERE id in ( SELECT id from B )
Copy the code
SELECT * from A WHERE id EXISTS ( SELECT 1 from A.id= B.id )
Copy the code
Analysis:
In is traversing the comparison in memory
Exist needs to query the database, so when the data volume of B is large, the efficiency of EXISTS is better than that of in**
In () is executed only once, caching all ids in table B, and then checking whether the IDS in table A are equal to those in table B. If the ids are equal, the records in table A are added to the result set until all records in table A are traversed.
The flow principle of an In operation is similar to the following code
List resultSet={};
Array A=(select * from A);
Array B=(select id from B);
for(int i=0; i<A.length; i++) {for(int j=0; j<B.length; j++) {if(A[i].id==B[j].id) {
resultSet.add(A[i]);
break; }}}return resultSet;
Copy the code
As you can see, it is not appropriate to use in() when table B is large, because table B will be traversed all at once
For example, if table A has 10,000 records and table B has 1,000,000 records, it is possible to traverse 10,000 x 1,000,000 records at most, which is inefficient.
Another example: Table A has 10000 records and Table B has 100 records, so it is possible to traverse 10000*100 times at most, which greatly reduces the number of traverse times and greatly improves the efficiency.
Conclusion: In () is suitable for the case where the data of table B is smaller than that of table A
Exist () will execute A.length() times, and the execution process code is as follows
List resultSet={};
Array A=(select * from A);
for(int i=0; i<A.length; i++) {if(exists(A[I].id) {// Execute select 1 from BwhereResultset.add (A[I]); }}return resultSet;
Copy the code
Use exists() when table B is larger than table A because it does not have as many traversals and only needs to perform one more query.
For example, if there are 10,000 records in table A and 1,000,000 records in table B, exists() performs 10,000 executions to determine whether the ids in table A are equal to those in table B.
For example, if there are 10000 entries in table A and 100000000 entries in table B, then exists() should be executed 10000 times because it only executes A.length. The more data in table B, the better exists() is.
If exists(A) has 10000 entries and exists(B) has 100 entries, it is better to use in() to perform 10000*100 times, because in() does the comparison in memory, and exists() needs to query the database.
We all know that querying a database consumes more performance and memory is faster.
Conclusion: Exists () applies to the case where table B has larger data than table A