30 Tips for Writing high-quality SQL

Select * from ‘select ’; select from ‘select’;

Select * from employee; Select id, name from employee;Copy the code

Reason:

Select only the required fields to save resources and network overhead. When a select * query is performed, it is likely that the overwrite index will not be used, resulting in a backtable query.

2. If you know that the query result has only one record or only a maximum/minimum record, you are advised to use limit 1

Suppose you have the Employee table, and you want to find a person named Jay.

CREATE TABLE `employee` ( `id` int(11) NOT NULL, `name` varchar(255) DEFAULT NULL, `age` int(11) DEFAULT NULL, `date` datetime DEFAULT NULL, `sex` int(1) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Select id, name from Employee where name='jay' LIMIT 1;Copy the code

Reason:

Add limit 1, as long as the corresponding record is found, no further scanning, efficiency will be greatly improved. Of course, if name is a unique index, there is no need to add limit 1, because the limit exists mainly to prevent a full table scan and thus improve performance. If a statement itself can predict that a full table scan is not required, there is no difference in performance.

3. Try to avoid using OR to join conditions in where clauses

Select * from user where userId = userId; select * from user where userId = userId;  CREATE TABLE `user` ( `id` int(11) NOT NULL AUTO_INCREMENT, `userId` int(11) NOT NULL, `age` int(11) NOT NULL, `name` varchar(255) NOT NULL, PRIMARY KEY (`id`), KEY `idx_userId` (`userId`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; If you want to query a user whose userID is 1 or age is 18, you can easily have the following SQL counterexample: Select * from user where userid=1 or age =18 Union all select * from user where userid=1 union all select * from user where age = 18  select * from user where userid=1 select * from user where age = 18Copy the code

Reason:

Using or may invalidate the index, leading to a full table scan.

In the case of or+ age with no index, suppose it goes through the index of userId, but when it goes to the age query condition, it also has to do a full table scan, which is a three-step process: full table scan + index scan + merge. If it started with the full table scan, it will just do one scan. Mysql has an optimizer, and it seems reasonable that the index might fail in the case of an OR condition for efficiency and cost.

4. Optimize limit paging

When we do daily paging requirements, we usually use limit, but when the offset is very large, query efficiency becomes low.

Select id, name, age from employee limit 10000,10 Select id, name from Employee where ID >10000 LIMIT 10. Order by + index select id, name from employee order by id limit 10000,10Copy the code

Reason:

When the offset is maximum, the query is less efficient, because Mysql does not skip the offset directly. Instead, Mysql returns the offset + the number of items to be fetched, and then discard the previous offset. If you use optimization solution 1 and return the last maximum query record (offset), you can skip the offset and improve efficiency. The second scheme uses the order by+ index, which can also improve the query efficiency. Plan three, it is recommended to discuss with the business, there is no need to look up so after the page. Because most users don’t turn too many pages.

5. Optimize your like statements

In everyday development, it’s easy to think of like when you’re using fuzzy keyword queries, but like is likely to invalidate your index.

Select userId, name from user where userId like '%123'; Select userId, name from user where userId like '123%';Copy the code

Reason:

Put % in front, without indexing, as follows:If you put % after the keyword, it will still be indexed. As follows:

6. Use the WHERE condition to limit the data to be queried to avoid returning redundant rows

Suppose the business scenario is to query whether a user is a member. Have seen the old implementation code like this…

List<Long> userIds = sqlMap. QueryList ("select userId from user where isVip=1"); boolean isVip = userIds.contains(userId); Is:  Long userId = sqlMap.queryObject("select userId from user where userId='userId' and isVip='1' ") boolean isVip = UserId! =null;Copy the code

Reason:

Need what data, to look up what data, avoid returning unnecessary data, save overhead.

7. Avoid using mysql’s built-in functions on index columns

Business requirements: Query users who logged in in the last seven days (assuming loginTime is indexed)

Select userId,loginTime from loginuser where Date_ADD(loginTime,Interval 7 DAY) >=now(); Explain select userId,loginTime from loginuser WHERE loginTime >= Date_ADD(NOW(), interval-7 DAY);Copy the code

Reason:

Mysql built-in function is used on index column, index is invalidImg If the index column does not have built-in functions, the index will still go.

8. Try to avoid expression operations on fields in the WHERE clause. This will cause the system to abandon the use of indexes and perform a full table scan

Select * from user where age-1 =10; Select * from user where age =11;Copy the code

Reason:

Although age is indexed, the index is lost because of the operation on it…

9, Inner Join, left join, right join, Inner join preferentially, if left join, the left table result as small as possible

Inner join: When a join query is performed on two tables, only the perfectly matched result sets in the two tables are retained
Left Join returns all rows in the left table, even if there are no matching records in the right table.
Right join returns all rows in the right table, even if there are no matching records in the left table.

On the premise that both meet THE REQUIREMENTS of SQL, Inner Join is recommended to be used first. If left join is used, the data results of the left table should be as small as possible, and if there is a condition, it should be placed on the left.

Select * from tab1 t1 left join tab2 t2 on t1.size = t2.size WHERE t1.id>2; select * from tab1 t1 left join tab2 t2 on t1.size = t2.size WHERE t1.id>2; Select * from (select * from tab1 where id >2) t1 left join tab2 t2 on t1.size = t2.size;Copy the code

Reason:

If the inner Join is equivalent, perhaps fewer rows will be returned, so the performance will be relatively better. Similarly, the left join is used, and the results of the left table data are as small as possible, and the conditions are processed on the left, which means that the number of rows returned is likely to be small.

10, try to avoid using in the WHERE clause! = or <> operators, otherwise the engine will abandon the index and proceed to a full table scan.

Select age,name from user where age <>18; Select age,name from user where age <18; select age,name from user where age >18;Copy the code

Reason:

Use! = and <> are likely to invalidate indexes

11. When using a federated index, pay attention to the order of index columns. Generally, the left-most matching principle is followed.

Table structure :(idx_userid_age, userId first, age second)

CREATE TABLE `user` ( `id` int(11) NOT NULL AUTO_INCREMENT, `userId` int(11) NOT NULL, `age` int(11) DEFAULT NULL, `name` varchar(255) NOT NULL, PRIMARY KEY (`id`), KEY `idx_userid_age` (`userId`,`age`) USING BTREE ) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8; Select * from user where age = 10; Select * from user where userid=10 and age =10; Select * from user where userid =10;Copy the code

Reason:

When we create a joint index, such as (k1,k2,k3), we create indexes (k1), (k1,k2), and (k1,k2,k3). This is the leftmost matching principle. A federated index does not satisfy the leftmost rule and will generally fail, but this is a matter of concern to the Mysql optimizer.

12. To optimize the query, consider creating indexes on the columns involved in where and Order BY, and try to avoid full table scans.

Select * from user where address =' sz 'order by age; Alter table user add index idx_address_age (address,age)Copy the code

13. If there is too much data inserted, consider batch inserts.

For (User u :list){INSERT into User (name,age) values(#name#,#age#)} // Insert 500 batches at a time, Separator =","> insert into user(name,age) values <foreach collection="list" item="item" index="index" separator=","> (#{item.name},#{item.age}) </foreach>Copy the code

Reason:

For example, if you need to move 10,000 bricks to the roof and you have an elevator, the elevator can put an appropriate amount of bricks (up to 500), you can choose to transport one brick at a time, or 500 at a time, which time do you think is the biggest consumption?

14. Use overwrite indexes when appropriate.

Overwriting indexes allows your SQL statements to get all the data they need without going back to the table.

Select * from user where userid like '%123%'; select * from user where userid like '%123%'; For example: //id is a primary key, so it is a normal index, that is, overwrite index. select id,name from user where userid like '%123%';Copy the code

Use the distinct keyword with caution

The DISTINCT keyword is used to filter duplicate records to return the records that are not duplicate. When used to query a single field or few fields, it optimizes the query. However, when used in many fields, it will greatly reduce the query efficiency.

SELECT DISTINCT * from user; Example: select DISTINCT name from user;Copy the code

Reason:

Statements with DISTINCT have higher CPU time and usage than statements without DISTINCT. If distinct is used when querying many fields, the database engine will compare the data and filter out duplicate data. However, this comparison and filtering process will consume system resources and CPU time.

16. Delete redundant and duplicate indexes

KEY 'idx_userId' (' userId ') KEY 'idx_userId_age' (' userId ', 'age') // Delete index userId, because index (A, B) is equivalent to index (A) and (A, B) KEY idx_userId_age (' userId ', 'age')Copy the code

Reason:

Duplicate indexes need to be maintained and the optimizer needs to consider them individually when optimizing queries, which can affect performance.

17. Optimize your modify/delete statements for large amounts of data.

Do not modify or delete too much data at the same time, because the CPU usage is too high and other people’s access to the database is affected.

Counterexample: // delete 100,000 or 1 million + at a time? delete from user where id <100000; For (User User: list) {delete from User; 500 delete user WHERE id<500 delete product WHERE id>=500 and id<1000;Copy the code

Reason:

Delete too much data at once, you may have an error with Lock wait timeout exceed, so batch operations are recommended.

18. Consider using the default value in the WHERE clause instead of NULL.

Select * from user where age is not null; Select * from user where age>0; select * from user where age>0; (Use index age)Copy the code

Reason:

Is not null or is not NULL will not be used, it depends on the mysql version and query cost. If the mysql optimizer finds that it costs more to index than not to index, it will definitely abandon the index. These conditions! =, >is NULL, is not NULL is not NULL is not null is not null

If you replace the null value with the default value, it will be possible to walk off the index in many cases, and the meaning will be relatively clear.

19, do not have more than 5 table joins

The more tables you have, the more time and cost it takes to compile.
It is more readable to break the join table into smaller executions.
If you have to join many tables to get the data, it’s bad design.

20. Rational utilization of exist & in

(In is traversing A nested traversing B, exist is traversing B nested traversing B)

If table A represents the employee table of an enterprise, table B represents the department table, query all employees of all departments, it is easy to have SQL:

select * from A where deptId in (select deptId from B); Writing this is equivalent to:

Query department table B first

select deptId from B

Then query the employees of A by deptId

select * from A where A.deptId = B.deptId

This can be abstracted as a loop like this:

List<> resultSet ; for(int i=0; i<B.length; i++) { for(int j=0; j<A.length; j++) { if(A[i].id==B[j].id) { resultSet.add(A[i]); break; }}}}}}}}}

select * from A where exists (select 1 from B where A.deptId = B.deptId); The data obtained from the primary query is checked in subqueries, and the results (true or false) are used to determine whether the results of the primary query should be retained.

So, writing this is equivalent to:

Select * from (select * from (select * from (select * from))

Select * from B where a.deptid = B.deptid; select * from B where A.deptid = B.deptid;

Similarly, it can be abstracted as a loop like this:

List<> resultSet ; for(int i=0; i<A.length; i++) { for(int j=0; j<B.length; j++) { if(A[i].deptId==B[j].deptId) { resultSet.add(A[i]); break; }}} Database is the most laborious with the program link release. Suppose you link twice, you do a million queries of the data set each time, and then you go, and you only do it twice; Instead, it creates millions of links, requests for link release and repeats, and the system becomes overwhelmed. The mysql optimization principle is that small tables drive large tables and small data sets drive large data sets for better performance.

Therefore, we should choose the one with small outermost loop, that is, if the amount of data of B is less than that of A, it is suitable to use in; if the amount of data of B is greater than that of A, it is suitable to choose Exist.

21, Try to replace union with union all

If there are no duplicate records in the search results, union all is recommended to replace union.

Select * from user where userid=1 union select * from user where age = 10  select * from user where userid=1 union all select * from user where age = 10Copy the code

Reason:

If you use a union, you try to merge the retrieved results regardless of whether they are duplicated, and then sort them before printing the final results. If it is known that the retrieval results are not duplicated, use union all instead of union to improve efficiency.

Index should not be too much, generally within 5.

More indexes is not always better, and while indexes improve the efficiency of queries, they also reduce the efficiency of inserts and updates.
An index may be rebuilt during insert or Update, so be careful when creating an index, depending on the situation.
The number of indexes in a table should not exceed 5. If there are too many indexes, it is necessary to consider whether some indexes do not exist.

23. Use numeric fields as far as possible. If only numeric fields are used, do not design characters as far as possible

Example: 'king_id' int(11) NOT NULL COMMENT 'guardian Id'. Example:' king_id 'int(11) NOT NULL COMMENT' guardian Id'Copy the code

Reason:

Compared to numeric fields, character types reduce query and join performance and increase storage overhead.

24. Indexes are not suitable for fields with large amounts of duplicate data, such as database fields such as gender.

Because the SQL optimizer optimizes queries based on the amount of data in the table, if there is a lot of duplicate data in the index column, the Mysql query optimizer will probably abandon the index because it will be cheaper not to index.

25. Avoid returning too much data to the client.

Suppose the business requirement is that a user requests to see the live streaming data they have watched in the last year.

Select * from LivingInfo WHERE watchId =useId and watchTime >= Date_sub(now(),Interval 1 Y) select * from LivingInfo where watchId =useId and watchTime >= Date_sub(now(),Interval 1 Y) Select * FROM LivingInfo WHERE watchId =useId and watchTime>= Date_sub(now(),Interval 1 Y) LIMIT offset, PageSize // If the front page is used, you can query the first 200 records, because the average user should not go down too many pages,  select * from LivingInfo where watchId =useId and watchTime>= Date_sub(now(),Interval 1 Y) limit 200 ;Copy the code

26. When joining multiple tables in an SQL statement, use the table alias and prefix each column with the alias to make the semantics clearer.

Select * from A inner join B on A.eptid = B.eptid; Is example: select memeber. Name, deptment. The deptName from A member inner join B deptment on member. The deptId = deptment. DeptId;Copy the code

27. Use varchar/ nvARCHar instead of char/ nCHAR whenever possible.

Counterexample: 'deptName' char(100) DEFAULT NULL COMMENT 'department name' DEFAULT NULL COMMENT 'Department name'Copy the code

Reason:

Because first of all the variable length field storage space is small, can save storage space.
Secondly, for queries, it is more efficient to search in a relatively small field.

28. To improve the efficiency of the group by statement, you can filter out unnecessary records before executing the statement.

Select job, avg (salary) from employee group by job having job ='president' or job =' managent' Select job, avg (salary) from employee where job ='president' or job =' managent' group by job;Copy the code

29. When the field type is a string, enclose it in quotation marks; otherwise, the index is invalid

Select * from user where userid =123; Select * from user where userid ='123';Copy the code

Reason:

Why is the first statement not indexed without single quotes? This is because without single quotes, it’s a comparison between a string and a number, they don’t match, and MySQL will implicitly convert them to a floating point and then compare them.

Use Explain to analyze your PLAN for SQL

Try to make a habit of writing SQL on a daily basis. Use explain to analyze the SQL you have written, especially in the area of indexing.

explain select * from user where userid =10086 or age =18;
Copy the code

Gold nugget link: juejin.cn/post/684490… Link: mp.weixin.qq.com/s?__biz=Mzg…

30 Tips for Writing high-quality SQL

Select * from ‘select *’; select * from ‘select’;

2. If you know that the query result has only one record or only a maximum/minimum record, you are advised to use limit 1

3. Try to avoid using OR to join conditions in where clauses

4. Optimize limit paging

5. Optimize your like statements

6. Use the WHERE condition to limit the data to be queried to avoid returning redundant rows

7. Avoid using mysql’s built-in functions on index columns

8. Try to avoid expression operations on fields in the WHERE clause. This will cause the system to abandon the use of indexes and perform a full table scan

9, Inner Join, left join, right join, Inner join preferentially, if left join, the left table result as small as possible

10, try to avoid using in the WHERE clause! = or <> operators, otherwise the engine will abandon the index and proceed to a full table scan.

11. When using a federated index, pay attention to the order of index columns. Generally, the left-most matching principle is followed.

12. To optimize the query, consider creating indexes on the columns involved in where and Order BY, and try to avoid full table scans.

13. If there is too much data inserted, consider batch inserts.

14. Use overwrite indexes when appropriate.

Use the distinct keyword with caution

16. Delete redundant and duplicate indexes

17. Optimize your modify/delete statements for large amounts of data.

18. Consider using the default value in the WHERE clause instead of NULL.

19, do not have more than 5 table joins

20. Rational utilization of exist & in

21, Try to replace union with union all

Index should not be too much, generally within 5.

23. Use numeric fields as far as possible. If only numeric fields are used, do not design characters as far as possible

24. Indexes are not suitable for fields with large amounts of duplicate data, such as database fields such as gender.

25. Avoid returning too much data to the client.

26. When joining multiple tables in an SQL statement, use the table alias and prefix each column with the alias to make the semantics clearer.

27. Use varchar/ nvARCHar instead of char/ nCHAR whenever possible.

28. To improve the efficiency of the group by statement, you can filter out unnecessary records before executing the statement.

29. When the field type is a string, enclose it in quotation marks; otherwise, the index is invalid

Use Explain to analyze your PLAN for SQL

Related Posts

STM32 microcontroller startup file startup_STM32F10X_hD. s details

The 2021-07-18 algorithm problem

View the Linux version information

Select * from ‘select ’; select from ‘select’;