SQL statement performance optimization
1. To optimize the query, avoid full table scan as far as possible, and first consider creating indexes on the columns involved in WHERE and ORDER BY.
2. Avoid null values in the WHERE clause. Null is the default value when creating a table, but most of the time use NOT NULL, or use a special value such as 0 or -1 as an implicit value.
3. Avoid using it in where clauses! = or <> operators, MySQL only uses indexes for the following operators: <, <=, =, >, >=, BETWEEN, IN, and sometimes LIKE.
4, avoid using OR in the WHERE clause to join conditions. Otherwise, the engine will abandon the use of index and perform full table scan. select id from t where num=10 union all select id from t where num=20
Select id from t where num between 1 and 3. Select id from t where num between 1 and 3
Select id from t where name like ‘% ABC %’ or select id from t where name like ‘% ABC’ Select id from t where name like ‘ABC %
7. If you use parameters in the WHERE clause, it will also cause a full table scan.
8. Expression manipulation of fields in the WHERE clause should be avoided, and functional manipulation of fields in the WHERE clause should be avoided
9, many times it is a good choice to use exists instead of in: select num from a where num in(select num from b). Select num from a where exists(select 1 from b where num=a.num)
10. While indexes can improve the efficiency of select operations, they can also reduce the efficiency of insert and update operations. The number of indexes in a table should not exceed 6. If there are too many, you should consider whether it is necessary to build indexes on infrequently used columns.
11. Avoid updating clustered index columns as much as possible, because the sequence of clustered index columns is the physical storage sequence of table records. Changes in clustered index columns will result in the adjustment of the sequence of records in the whole table, which will consume considerable resources. If your application system needs to update clustered index data columns frequently, you need to consider whether to build the clustered index as a clustered index.
12. Use numeric fields as far as possible. If fields only contain numeric information, do not design them as characters, which will reduce query and connection performance and increase storage overhead.
13. Use varchar/nvarchar instead of char/nchar whenever possible, because first of all, the storage space of a longer field is small, and second of all, it is obviously more efficient to search within a relatively small field.
14, it is best not to use “return all: select from t”, replace “*” with a list of specific fields, do not return any fields that are not needed.
15. Try to avoid returning a large amount of data to the client. If the amount of data is too large, consider whether the corresponding demand is reasonable.
16. Use table aliases: When joining multiple tables in an SQL statement, use the Alias of the table and prefix the Alias to each Column. This reduces parsing time and reduces syntax errors caused by Column ambiguities.
Temporary table (tempDB, tempDB, tempDB); temporary table (tempDB, tempDB); It also greatly reduces the blocking of “shared lock” and “update lock” in program execution, reducing the blocking and improving the concurrency performance.
18, Some SQL query statements should be added with NOLock. Read and write blocks each other. To improve concurrent performance, add NOLock to some queries. There are three principles for using NOLock. Nolock cannot be added to the query results used for Insert, Delete, or modify. The queried table frequently splits pages. Use nolock with caution. Temporary tables can also be used to save “data anaesthesia”, which is similar to the function of Oracle undo tablespace. If temporary tables can improve concurrent performance, do not use NOLock.
19. Common simplification rules are as follows: Do not have more than five table joins, and consider using temporary tables or table variables to hold intermediate results. Use less subqueries and do not nest views too deeply. Generally, it is advisable to nest no more than 2 views.
20. Pre-calculate the results to be queried and place them in the table. Select the results when querying. This was the most important tool before SQL7.0. For example, the calculation of hospital admission fees.
21. Statements with OR can be split into multiple queries, and multiple queries can be joined by UNION. Their speed depends only on whether an index is used or not; if a query requires a federated index, it is more efficient to use UNION all. Multiple OR sentences do not use the index, rewrite into the form of UNION and then try to match with the index. A key question is whether to use indexes.
22. IN the list of denominations after IN, put the values that occur most frequently at the front and the values that occur least at the back to reduce the number of judgments.
23. Try to put the data processing work on the server to reduce the network overhead, such as using stored procedures. Stored procedures are SQL statements that have been compiled, optimized, organized into an execution plan, stored in a database, and are a collection of control-flow languages, which are certainly fast. Dynamic SQL that executes repeatedly can use temporary stored procedures that are placed in Tempdb (temporary tables).
24, when the server memory is large enough, the number of configuration threads = the maximum number of connections +5, so as to maximize efficiency; Otherwise, the number of threads configured < the maximum number of connections enable the THREAD pool of SQL SERVER. If the number of threads is still equal to the maximum number of connections +5, the SERVER performance will be severely damaged.
25, query association and write order
select a.personMemberID, * From chineseresume a, personMember B where personMemberID = b.ferenceId and a.personmemberId = 'JCNPRH39681' (a = b ,B = 'id') select a.memberid, * from chineseresume a, personMember B where a.pagermemberId = b.pagerenceId and a.pagermemberId = 'JCNPRH39681' and B.reeferenceid = 'JCNPRH39681' (A = B,B = 'number', A = 'id') select a.memberid, * from chineseresume a,personmember B where b.ferenceId = 'JCNPRH39681' and a.personmemberID = 'JCNPRH39681' (b = 'Number', A = 'number')Copy the code
26. Try to use exists instead of select count(1) to determine the existence of a record. The count function is used only for all rows in a table and count(1) is more efficient than count(*).
27. Use >= instead of >.
28. Index usage specifications: Index creation should be combined with the application. It is recommended that large OLTP tables have no more than 6 indexes; If possible, use index fields as query criteria, especially clustered indexes. If necessary, use index index_name to force an index to be specified. Avoid table Scan when querying large tables. Create indexes if necessary. When an index field is used as a condition, if the index is a joint index, the first field in the index must be used as a condition to ensure that the system can use the index. Otherwise, the index will not be used. Pay attention to index maintenance, periodically rebuilding the index, and recompiling stored procedures.
29, The columns in the following SQL condition statements are properly indexed but execute very slowly:
SELECT * FROM record WHERE amount/30< 1000 (11 seconds) SELECT * FROM record WHERE amount/30< 1000 (11 seconds) SELECT * FROM record WHERE substrINg(card_no,1,4)= '5378' (13 seconds) SELECT * FROM record WHERE amount/30< 1000 (11 seconds) SELECT * FROM record WHERE convert(char(10),date,112)= '19991201' (10 秒)Copy the code
Analysis:
The result of any operation on a column in the WHERE clause is computed column by column at SQL runtime, so it has to do a table search without using the index above that column; If these results are available at query compilation time, they can be optimized by the SQL optimizer to use indexes and avoid table searches, so rewrite the SQL as follows:
SELECT * FROM record WHERE card_no like '5378%' (< 1 SEC) SELECT * FROM record WHERE amount< 1000*30 (< 1 SEC) SELECT * FROM record WHERE amount< 1000*30 (< 1 SEC) SELECT * FROM record WHERE card_no like '5378%' (< 1 SEC) SELECT * FROM record WHERE amount< 1000*30 (< 1 SEC Record WHERE date= '1999/12/01' (< 1 SEC)Copy the code
30, when there is a batch of inserts or updates, use batch inserts or batch updates, not one record to update!
31, in all stored procedures, can use SQL statements, I will never use a loop to achieve!
(For example: listing every day in the last month, I would use Connect by to recursively query, not loop from the first day of the last month)
32, select the most efficient table name order (only valid in rules-based optimizer) :
Oracle’s parser processes the table names in the FROM clause FROM right to left. The last table in the FROM clause is processed first. In the case of multiple tables in the FROM clause, you must select the table with the least number of entries as the base table. If more than three tables join queries, you need to select the intersection table, which is referenced by other tables, as the base table.
33. Improve the efficiency of GROUP BY statements BY filtering out unwanted records before GROUP BY. The following two queries return the same result, but the second is significantly faster.
SELECT AVG(SAL) FROM EMP GROUP BY JOB HAVING JOB = 'PRESIDENT' OR JOB = 'MANAGER' SELECT AVG(SAL) FROM EMP WHERE JOB = 'PRESIDENT' OR JOB = 'MANAGER' GROUP BY JOBCopy the code
SQL statements are executed in uppercase because Oracle always parses SQL statements and converts lowercase letters to uppercase letters.
35, the use of alias, alias is a large database application skills, that is, table names, column names in the query with a letter alias, query speed is 1.5 times faster than the establishment of a link table.
36. Avoid deadlocks and always access the same table in the same order in your stored procedures and triggers. Transactions should be shortened as much as possible and the amount of data involved in a transaction should be minimized; Never wait for user input in a transaction.
37. Avoid using temporary tables unless you need them. Use table variables instead. Most of the time (99%), table variables reside in memory and are therefore faster than temporary tables, which reside in the TempDb database, so operations on temporary tables need to communicate across databases and are naturally slower.
(38) “It is best not to use triggers. Firing a trigger is a resource-consuming process in itself. Do not use triggers if you can implement them using constraints. Do not use the same trigger for different trigger events (Insert, Update, and Delete); Do not use transactional code in triggers.
39, index creation rule:
The primary and foreign keys of a table must have indexes.
Tables with more than 300 data should have indexes;
Tables that are frequently joined with other tables should have indexes on join fields.
Fields that appear frequently in the Where clause, especially in large tables, should be indexed;
Indexes should be built on highly selective fields;
Indexes should be built on small fields, not large text fields or even long fields.
The establishment of composite index needs careful analysis, and single field index should be considered as far as possible.
Select the main column fields in the composite index correctly, usually with good selectivity.
Do several fields of a composite index often appear together as AND in the Where clause? Are there few or no single-field queries? If so, you can create a composite index; Otherwise consider a single-field index;
If the composite index contains fields that often appear alone in the Where clause, it is decomposed into multiple single-field indexes;
If the composite index contains more than three fields, think carefully about the need to reduce the number of composite fields;
If there are both single-field indexes and compound indexes on these fields, you can delete the compound indexes.
Do not create too many indexes for tables with frequent data operations.
Delete unnecessary indexes to avoid negative impact on the execution plan.
Each index created on a table increases storage overhead, and indexes also increase processing overhead for insert, delete, and update operations. In addition, too many composite indexes, in the case of a single field index, are generally worthless; Conversely, it degrades performance when data is added and deleted, especially for frequently updated tables.
Try not to index a field in the database that contains a large number of duplicate values.
40, mysql query optimization summary: Use slow query logs to find slow queries, use execution plans to determine if queries are working properly, and always test your queries to see if they are working at their best. Performance changes over time. Avoid using count(*) on the entire table, which might lock the entire table and keep the query consistent so that similar queries can use the query cache later
, use GROUP BY instead of DISTINCT when appropriate, use indexed columns in WHERE, GROUP BY, and ORDER BY clauses, keep indexes simple and do not include the same column in multiple indexes, MySQL can sometimes USE the wrong INDEX, in which case USE INDEX, check that SQL_MODE=STRICT is used. For INDEX columns with less than 5 records, USE LIMIT instead of OR.
To avoid using INSERT ON DUPLICATE KEY or INSERT IGNORE before UPDATE, do not use UPDATE, do not use MAX, use index fields and ORDER BY clause, LIMIT M, In restarting MySQL, remember to warm up your database to ensure that your data is in memory and queries are fast. Consider persistent connections rather than multiple connections to reduce overhead. Including using load on the server, sometimes a simple query can affect other queries, and when the load increases on your server, use SHOW PROCESSLIST to see slow and problematic queries, all suspicious queries tested in the mirrored data generated in the development environment.
41, MySQL backup process:
Backup is performed from the secondary replication server. Stop replication during a backup to avoid inconsistencies in data dependencies and foreign key constraints. Stop MySQL completely and backup from the database file.
If MySQL dump is used for backup, back up binary log files as well – make sure the replication is not interrupted. Don’t trust LVM snapshots, which are likely to produce data inconsistencies that will cause you trouble in the future. To make single-table recovery easier, export data on a per-table basis – if the data is isolated from other tables.
Use -opt when using mysqldump. Check and optimize tables before backing up. To make imports faster, temporarily disable foreign key constraints during imports.
To facilitate faster import, temporarily disable uniqueness detection during import. Calculate the size of the database, tables, and indexes after each backup to better monitor the growth of the data size.
Automatic scheduling scripts monitor replication instances for errors and delays. Perform regular backups.
42. The query buffer does not automatically handle whitespace, so you should minimize the use of whitespace when writing SQL statements, especially at the beginning and end of the SQL (because the query buffer does not automatically intercept the beginning and end of the SQL whitespace).
Is it convenient for member to use mid as standard for sub-table query? In general business requirements are basically based on username query, normal should be username to hash the module to divide the table. The partition function of mysql is transparent to code.
It seems unreasonable to implement it at the code level.
44) We should create an ID for each table in the database as its primary key, preferably an INT (UNSIGNED is recommended), and set AUTO_INCREMENT flag on the table.
45, SET SET NOCOUNT ON at the beginning of all stored procedures and triggers and SET NOCOUNT OFF at the end.
There is no need to send a DONE_IN_PROC message to the client after each statement of the stored procedure and trigger is executed.
46, MySQL queries can be enabled with high-speed query caching. This is one of the most effective Mysql optimizations to improve database performance. When the same query is executed multiple times, it is much faster to extract data from the cache and return data directly from the database.
The EXPLAIN SELECT query is used to track the results of the view
Use the EXPLAIN keyword to let you know how MySQL handles your SQL statements. This can help you analyze performance bottlenecks in your query or table structure. The EXPLAIN query results will also tell you how your index primary keys are utilized, how your tables are searched and sorted… Wait, wait, wait.
48. LIMIT 1 is used when there is only one row of data
Sometimes when you query a table, you already know that the result will only have one result, but because you might need to fetch the cursor, or you might check the number of records returned. In this case, adding LIMIT 1 can increase performance. In this case, the MySQL database engine will stop searching after it finds a piece of data, rather than continuing to search for the next piece of data that matches the record.
49, Select the appropriate storage engine for the table:
Myisam: The application is mainly read and insert operations, only a small amount of update and delete, and the integrity of the transaction, concurrency is not very high requirements.
Innodb: Transaction processing and data consistency under concurrent conditions. In addition to inserts and queries, it includes a lot of updates and deletions. Innodb effectively reduces locks caused by deletes and updates. For transacts-enabled InnoDB-type tables, the main reason for speed is that AUTOCOMMIT is turned on by default, and the program does not explicitly call BEGIN to start a transaction, resulting in auto-commit for every insert, which severely affects speed. You can call BEGIN before executing the SQL, and multiple SQL statements form one thing (even if AutoCOMMIT is turned on), which greatly improves performance.
50. Optimize the data type of the table and select the appropriate data type:
Rule of thumb: Smaller is usually better, simple is good, all fields should have default values and try to avoid null.
For example, database tables are designed to take up disk space using smaller integer types whenever possible.(MediumInt is better than int.)
For example, the time fields: datetime and TIMESTAMP, datetime takes up 8 bytes, while timestamp takes up 4 bytes, which is only half, while timestamp represents a range of 1970-2037 suitable for updating time
MySQL supports large data volumes well, but in general, the smaller the tables in the database, the faster the queries executed on them.
Therefore, when creating a table, we can make the width of the fields in the table as small as possible for better performance. For example,
When defining the zip code field, setting it to CHAR(255) obviously adds unnecessary space to the database,
Even using the VARCHAR type is redundant, because CHAR(6) does the job just fine. And again, if I may,
We should use MEDIUMINT instead of BIGIN to define integer fields.
You should try to set the field to NOT NULL so that the database does NOT have to compare NULL values during future queries.
For some text fields, such as “province” or “gender,” we can define them as ENUM types. Because in MySQL, ENUM types are treated as numeric data,
Numerical data can be processed much faster than textual data. In this way, we can improve the performance of the database.
51, string data type: char, vARCHar, text Select difference
52. Any operation on a column will result in a table scan, which includes database functions, computed expressions, and so on. Move the operation to the right of the equals sign whenever possible.