This article covers 52 SQL statement performance optimization strategies.

1. To optimize a query, try to avoid full table scans: first consider adding indexes on the columns used in WHERE and ORDER BY clauses.

2. Try to avoid checking columns for NULL in the WHERE clause, which can force the engine to abandon the index and scan the whole table. NULL is the default when a table is created, but most of the time you should declare columns NOT NULL and use a special value such as 0 or -1 as the default.
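A sketch of this tip (table and column names are hypothetical):

```sql
-- A NULL check in WHERE can defeat the index, as this tip warns:
SELECT id FROM t WHERE num IS NULL;

-- Preferred: declare the column NOT NULL with a sentinel default ...
CREATE TABLE t (id INT PRIMARY KEY, num INT NOT NULL DEFAULT 0);

-- ... and query against the sentinel value instead:
SELECT id FROM t WHERE num = 0;
```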

3. Avoid using the != or <> operators in the WHERE clause. MySQL uses indexes only for the following operators: <, <=, =, >, >=, BETWEEN, IN, and in some cases LIKE.

4. Avoid joining conditions with OR in the WHERE clause; otherwise the engine may abandon the index and perform a full table scan. Instead of `SELECT id FROM t WHERE num = 10 OR num = 20`, write `SELECT id FROM t WHERE num = 10 UNION ALL SELECT id FROM t WHERE num = 20`.

5. Use IN and NOT IN with caution, as they too can lead to full table scans. For a continuous range of values, use BETWEEN instead of IN: `SELECT id FROM t WHERE num BETWEEN 1 AND 3` rather than `SELECT id FROM t WHERE num IN (1, 2, 3)`.

6. A leading wildcard in LIKE defeats the index: `SELECT id FROM t WHERE name LIKE '%abc%'` and `SELECT id FROM t WHERE name LIKE '%abc'` cause full table scans, whereas `SELECT id FROM t WHERE name LIKE 'abc%'` can use an index.

7. Using local variables as parameters in the WHERE clause can also cause a full table scan, because the optimizer must choose an access plan at compile time, before the variable's value is known.

8. Avoid applying expressions or functions to columns in the WHERE clause; either prevents the engine from using the index on that column.
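A minimal sketch of both cases, assuming a hypothetical table t with indexes on num and name:

```sql
-- Expression on the column: the index on num cannot be used.
SELECT id FROM t WHERE num / 2 = 100;
-- Equivalent, index-friendly form: operate on the constant instead.
SELECT id FROM t WHERE num = 100 * 2;

-- Function on the column: the index on name cannot be used.
SELECT id FROM t WHERE SUBSTRING(name, 1, 3) = 'abc';
-- Equivalent, index-friendly form:
SELECT id FROM t WHERE name LIKE 'abc%';
```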

9. Replacing IN with EXISTS is often a good choice: instead of `SELECT num FROM a WHERE num IN (SELECT num FROM b)`, write `SELECT num FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.num = a.num)`.

10. While indexes can improve the efficiency of select operations, they can also reduce the efficiency of inserts and updates. The number of indexes in a table should not exceed 6. If there are too many, you should consider whether it is necessary to build indexes on infrequently used columns.

11. Avoid updating clustered index columns wherever possible, because the order of the clustered index columns is the physical storage order of the table's rows. Any change to these columns forces rows to be physically reordered, which consumes considerable resources. If the application needs to update clustered index columns frequently, consider whether the index should be rebuilt as a nonclustered index instead.

12. Use numeric fields wherever possible. Fields that contain only numeric information should not be designed as character types, which reduces query and join performance and increases storage overhead.

13. Use VARCHAR/NVARCHAR instead of CHAR/NCHAR whenever possible: variable-length fields take less storage space, and searches within a smaller field are noticeably more efficient.

14. Never use `SELECT * FROM t`; replace the `*` with an explicit list of the columns you need, and do not return any column the application does not use.

15. Try to avoid returning large amounts of data to the client. If the data volume is very large, consider whether the requirement itself is reasonable.

16. Use table aliases: when joining multiple tables in a SQL statement, give each table an alias and prefix every column with its table's alias. This reduces parsing time and avoids syntax errors caused by ambiguous column names.
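A short sketch of aliasing (table and column names are hypothetical):

```sql
-- Each table gets a one-letter alias, and every column is qualified,
-- so 'id' and 'name' are never ambiguous between the two tables.
SELECT e.id, e.name, d.name AS dept_name
FROM employee e, department d
WHERE e.dept_id = d.id;
```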

17. Use “temporary table” to temporarily save intermediate results:

The key way temporary tables simplify SQL statements is by holding intermediate results, but the advantages go beyond that: with intermediate results spooled to a temporary table, subsequent queries run against tempdb, which avoids scanning the main table many times and greatly reduces the blocking between readers' shared locks and writers' update locks, improving concurrency.
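A SQL Server-style sketch of this technique (table and column names are illustrative):

```sql
-- Spool the intermediate result into a temporary table in tempdb ...
SELECT customer_id, SUM(amount) AS total
INTO #totals
FROM orders
GROUP BY customer_id;

-- ... then later queries read #totals instead of re-scanning orders.
SELECT customer_id FROM #totals WHERE total > 1000;
DROP TABLE #totals;
```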

18. Some SQL Server queries can be given the NOLOCK hint. Reads and writes block each other; adding NOLOCK to certain queries lets writes proceed concurrently, improving concurrency, at the cost of possibly reading uncommitted dirty data.

There are three rules for using NOLOCK:

  • NOLOCK must not be applied to queries whose results feed an INSERT, UPDATE, or DELETE.
  • If the queried table frequently splits pages, use NOLOCK with caution.
  • A temporary table can likewise preserve a "before image" of the data, acting much like Oracle's undo (rollback) segments. If a temporary table can improve concurrency, prefer it to NOLOCK.
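A minimal SQL Server sketch of the hint itself (the table name is hypothetical):

```sql
-- Dirty read allowed: the query no longer takes shared locks, so
-- concurrent writers are not blocked, but it may see uncommitted data.
SELECT COUNT(*) FROM orders WITH (NOLOCK) WHERE status = 'open';
```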

19. Common simplification rules are as follows:

Do not join more than five tables; consider using temporary tables or table variables to hold intermediate results. Use subqueries sparingly, and do not nest views too deeply; generally, no more than two levels of nesting is advisable.

20. Pre-compute results that will be queried repeatedly and store them in the table, then simply SELECT them at query time. This was the most important optimization technique before SQL Server 7.0, for example for calculating hospital charges.

21. A statement containing OR can be split into several queries joined with UNION. Their speed depends only on whether indexes are used: when a query needs a composite index, UNION ALL is more efficient. Multiple OR conditions may prevent index use, so rewrite them as a UNION and try to match each branch to an index. The key question is always whether indexes are used.

22. In the value list after IN, put the most frequently occurring values first and the least frequent last, to reduce the number of comparisons.

23. Try to move data-processing work onto the server to reduce network overhead, for example by using stored procedures.

Stored procedures are SQL statements that have been compiled, optimized, and organized into an execution plan stored in the database, and they are collections of control-flow statements, so of course they are fast. Dynamic SQL that executes repeatedly can use temporary stored procedures, which, like temporary tables, are placed in tempdb.

24. When server memory is large enough, configure the number of threads as the maximum number of connections + 5; this is the most efficient setting. Otherwise, configure fewer threads than the maximum number of connections and enable SQL Server's thread pool; if the thread count still equals max connections + 5, server performance will be severely degraded.

25. How a query's join conditions are written affects the plan:

```sql
select a.personMemberID, *
from chineseresume a, personmember b
where personMemberID = b.referenceid
  and a.personMemberID = 'JCNPRH39681'   -- (A = B, B = 'id')

select a.personMemberID, *
from chineseresume a, personmember b
where a.personMemberID = b.referenceid
  and a.personMemberID = 'JCNPRH39681'
  and b.referenceid = 'JCNPRH39681'      -- (A = B, B = 'number', A = 'id')

select a.personMemberID, *
from chineseresume a, personmember b
where b.referenceid = 'JCNPRH39681'
  and a.personMemberID = 'JCNPRH39681'   -- (B = 'number', A = 'number')
```

26. Use EXISTS rather than SELECT COUNT(1) to check whether a record exists. Use the COUNT function only when you need the total number of rows in a table, and COUNT(1) is more efficient than COUNT(*).

27. Use ">=" instead of ">" where possible. For example, with a hypothetical indexed column deptno, `WHERE deptno >= 4` is more efficient than `WHERE deptno > 3`: with ">=" the DBMS jumps directly to the first record equal to 4, while with ">" it first locates the records equal to 3 and then scans forward.

28. Specification for the use of indexes:

  • Index creation should be designed together with the application; it is recommended that a large OLTP table have no more than six indexes.
  • Use indexed fields as query conditions whenever possible, especially the clustered index; if necessary, force a specific index with index=index_name.
  • Avoid table scans when querying large tables; create indexes where necessary.
  • When an indexed field is used as a condition and the index is a composite index, the first field of that index must appear in the condition, or the system will not use the index.
  • Pay attention to index maintenance: periodically rebuild indexes and recompile stored procedures.

29. The columns in the following SQL conditions are properly indexed, yet the queries execute very slowly:

```sql
SELECT * FROM record WHERE substring(card_no, 1, 4) = '5378'           -- (13 seconds)
SELECT * FROM record WHERE amount / 30 < 1000                          -- (11 seconds)
SELECT * FROM record WHERE convert(char(10), date, 112) = '19991201'   -- (10 seconds)
```

Analysis:

The result of any operation on a column in the WHERE clause is computed at SQL run time, so the query is forced into a table search without using the index on that column.

If these results are available at query compilation time, they can be optimized by the SQL optimizer to use indexes and avoid table searches, so rewrite the SQL as follows:

```sql
SELECT * FROM record WHERE card_no LIKE '5378%'     -- (< 1 second)
SELECT * FROM record WHERE amount < 1000 * 30       -- (< 1 second)
SELECT * FROM record WHERE date = '1999/12/01'      -- (< 1 second)
```

30. When there is a batch of inserts or updates to perform, use a batch INSERT or batch UPDATE; never update records one at a time.

31. In all stored procedures, use set-based SQL statements wherever possible; never use a loop where a single statement will do.

For example, to list every day of the last month, I would use CONNECT BY to generate the dates recursively rather than looping from the first day of the month to the last.
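A sketch of the CONNECT BY approach in Oracle (the exact expressions are one possible formulation, not taken from the article):

```sql
-- Generate every day of the previous month in a single query,
-- instead of looping day by day.
SELECT TRUNC(ADD_MONTHS(SYSDATE, -1), 'MM') + LEVEL - 1 AS day
FROM dual
CONNECT BY LEVEL <= TO_NUMBER(TO_CHAR(LAST_DAY(ADD_MONTHS(SYSDATE, -1)), 'DD'));
```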

32. Choose the most efficient table name order (valid only in Oracle's rule-based optimizer):

Oracle's parser processes the table names in the FROM clause from right to left, so the last table in the FROM clause is processed first. When the FROM clause contains multiple tables, choose the table with the fewest rows as the base table.

If three or more tables are joined, choose the intersection table (the table referenced by the other tables) as the base table.

33. Improve the efficiency of GROUP BY statements by filtering out unwanted records before grouping. The following two queries return the same result, but the second is significantly faster.

Inefficient:

```sql
SELECT JOB, AVG(SAL) FROM EMP GROUP BY JOB HAVING JOB = 'PRESIDENT' OR JOB = 'MANAGER'
```

High efficiency:

```sql
SELECT AVG(SAL) FROM EMP WHERE JOB = 'PRESIDENT' OR JOB = 'MANAGER' GROUP BY JOB
```

34. Write SQL statements in uppercase, because Oracle always parses SQL statements by converting lowercase letters to uppercase before executing them.

35. Use aliases: aliasing is a useful skill in large database applications. Replace table and column names in queries with single-letter aliases; queries run about 1.5 times faster than with the full names of the joined tables.

36. To avoid deadlocks: always access tables in the same order in stored procedures and triggers; keep transactions as short as possible and minimize the amount of data a transaction touches; never wait for user input inside a transaction.

37. Avoid using temporary tables unless you really need them; use table variables instead. Most of the time (99%), table variables reside in memory and are therefore faster. Temporary tables reside in the tempdb database, so operations on them require cross-database communication and are naturally slower.
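A SQL Server sketch of the table-variable alternative (names are hypothetical):

```sql
-- A table variable instead of a #temp table for a small intermediate set.
DECLARE @sales TABLE (id INT PRIMARY KEY, name VARCHAR(50));

INSERT INTO @sales (id, name)
SELECT id, name FROM employee WHERE dept = 'sales';

SELECT * FROM @sales;
```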

38. It’s best not to use triggers:

  • Firing a trigger and executing its event is itself a resource-consuming process;
  • Do not use triggers if you can implement them using constraints.
  • Do not use the same trigger for different trigger events (Insert, Update, and Delete);
  • Do not use transactional code in triggers.

39. Index creation rules:

  • The primary and foreign keys of a table must have indexes.
  • Tables with more than 300 rows of data should have indexes;
  • Tables that are frequently joined with other tables should have indexes on join fields.
  • Fields that appear frequently in the Where clause, especially in large tables, should be indexed;
  • Indexes should be built on highly selective fields;
  • Indexes should be built on small fields, not large text fields or even long fields.
  • The establishment of composite index needs careful analysis, and single field index should be considered as far as possible.
  • Select the main column fields in the composite index correctly, usually with good selectivity.
  • Do the fields of a prospective composite index often appear together, ANDed in the WHERE clause? Are single-field queries on them rare or absent? If so, create a composite index; otherwise, consider single-field indexes;
  • If the composite index contains fields that often appear alone in the Where clause, it is decomposed into multiple single-field indexes;
  • If the composite index contains more than three fields, think carefully about the need to reduce the number of composite fields;
  • If there are both single-field indexes and compound indexes on these fields, you can delete the compound indexes.
  • Do not create too many indexes for tables with frequent data operations.
  • Delete unnecessary indexes to avoid negative impact on the execution plan.
  • Every index created on a table increases storage overhead, and indexes add processing overhead to insert, delete, and update operations. Moreover, when single-field indexes already exist, extra composite indexes are generally of little value and can even degrade performance when data is added or deleted, especially for frequently updated tables.
  • Try not to index a field in the database that contains a large number of duplicate values.

40. Summary of MySQL query optimization:

Use slow query logs to find slow queries, use execution plans to determine if queries are working properly, and always test your queries to see if they are working at their best.

Performance changes over time. Avoid COUNT(*) on an entire table, which may lock the whole table. Keep queries consistent so that later similar queries can hit the query cache. Use GROUP BY instead of DISTINCT where appropriate. Use indexed columns in WHERE, GROUP BY, and ORDER BY clauses. Keep indexes simple, and do not include the same column in multiple indexes.

MySQL can sometimes choose the wrong index; in that case, force one with USE INDEX. Check for problems introduced by running with SQL_MODE=STRICT. For indexed fields that match fewer than 5 records, use LIMIT rather than OR.

To avoid doing a SELECT before an UPDATE, use INSERT ... ON DUPLICATE KEY UPDATE or INSERT IGNORE instead of implementing the upsert with UPDATE. Do not use MAX; use an indexed field with an ORDER BY clause instead. LIMIT M, N can actually slow a query down in some cases, so use it sparingly. Use UNION instead of subqueries in the WHERE clause. Remember to warm up the database so that data is in memory and queries run fast. Consider persistent connections rather than many short-lived connections to reduce overhead.
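A sketch of the upsert pattern mentioned above (MySQL syntax; the table is hypothetical):

```sql
-- Insert a new row, or bump the counter if the key already exists,
-- without a preliminary SELECT.
CREATE TABLE page_hits (url VARCHAR(255) PRIMARY KEY, hits INT NOT NULL);

INSERT INTO page_hits (url, hits) VALUES ('/home', 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;
```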

Benchmark queries, including under load on the server: sometimes a simple query can affect other queries. When the load on the server rises, use SHOW PROCESSLIST to find slow and problematic queries, and test every suspect query in a development environment against mirrored production data.

41. Rules for MySQL backups:

  • Backup from the secondary replication server.
  • Stop replication during backup to avoid inconsistencies in data dependencies and foreign key constraints;
  • Stop MySQL completely, backup from database file;
  • If MySQL dump is used for backup, please back up binary log files as well – make sure the replication is not interrupted;
  • Don’t trust LVM snapshots, which are likely to produce data inconsistencies that will cause you trouble in the future;
  • To make single-table recovery easier, export data on a per-table basis if the data is isolated from other tables.
  • When using mysqldump, use the --opt flag;
  • Check and optimize tables prior to backup;
  • To make imports faster, temporarily disable foreign key constraints during the import;
  • For faster import, the uniqueness detection is temporarily disabled during import.
  • Calculate the size of the database, tables, and indexes after each backup to better monitor data size growth;
  • Automatic scheduling scripts monitor replication instances for errors and delays;
  • Perform regular backups.

42. The query cache does not automatically normalize whitespace, so when writing SQL statements, minimize the use of extra whitespace, especially at the beginning and end of the SQL (the query cache does not automatically trim leading and trailing whitespace).

43. Should the member table be sharded on mid for convenient querying? In typical business requirements, username is almost always the query key, so normally the table should be sharded by hashing username and taking the modulus.

As for sharding itself, MySQL's partitioning feature does exactly this and is transparent to application code; implementing it at the code level seems unreasonable.

44. Every table in the database should have an ID column as its primary key, preferably of type INT (UNSIGNED is recommended) with the AUTO_INCREMENT flag set.

45. Put SET NOCOUNT ON at the start of all stored procedures and triggers, and SET NOCOUNT OFF at the end. There is no need to send a DONE_IN_PROC message to the client after each statement in a stored procedure or trigger executes.

46. MySQL queries can take advantage of the query cache. This is one of the most effective ways to improve database performance: when the same query is executed repeatedly, returning the result from the cache is much faster than fetching the data from the database again.
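A sketch for inspecting the cache, assuming MySQL 5.x (the query cache was removed in MySQL 8.0):

```sql
-- Check whether the query cache is enabled and how large it is.
SHOW VARIABLES LIKE 'query_cache%';
-- See hit and insert statistics.
SHOW STATUS LIKE 'Qcache%';
```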

47. Use EXPLAIN SELECT to trace how a query executes:

The EXPLAIN keyword shows you how MySQL processes your SQL statement. This helps you analyze performance bottlenecks in a query or table structure. The EXPLAIN output also tells you how your indexes and primary key are used and how your tables are searched and sorted.
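A minimal sketch, using a hypothetical table t with an index on num:

```sql
EXPLAIN SELECT id FROM t WHERE num = 10;
-- In the output, the key column shows which index (if any) is chosen,
-- rows estimates how many rows MySQL expects to examine, and Extra
-- flags conditions such as 'Using filesort' or 'Using temporary'.
```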

48. Use LIMIT 1 when only one row is required:

Sometimes when you query a table you already know there will be at most one matching row, but the engine may still fetch a cursor or check the number of records returned.

In this case, adding LIMIT 1 can increase performance. In this way, the MySQL database engine will stop searching after it finds one data item, rather than continuing to search for the next data item that matches the record.
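For example (hypothetical table t):

```sql
-- The engine stops scanning as soon as the first match is found.
SELECT id FROM t WHERE name = 'abc' LIMIT 1;
```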

49. Select an appropriate storage engine for each table:

  • MyISAM: suited to applications dominated by reads and inserts, with only a few updates and deletes, and where the requirements on transaction integrity and concurrency are not very high.
  • InnoDB: suited to transaction processing and concurrent workloads that require data consistency, with many updates and deletes in addition to inserts and queries. InnoDB effectively reduces the locking caused by deletes and updates. For transaction-enabled InnoDB tables, the main cause of slowness is that AUTOCOMMIT is on by default and programs do not explicitly call BEGIN to start a transaction, so every insert is auto-committed, which severely hurts speed. Call BEGIN before executing the SQL and group multiple statements into one transaction (even with AUTOCOMMIT on); this greatly improves performance.
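A sketch of the explicit-transaction advice for InnoDB (the table is hypothetical):

```sql
-- Group the inserts into one transaction instead of one
-- autocommit round trip per statement.
BEGIN;
INSERT INTO orders (user_id, amount) VALUES (1, 9.99);
INSERT INTO orders (user_id, amount) VALUES (2, 4.50);
COMMIT;
```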

50. Optimize table data types; choose the appropriate type for each column:

Rule: smaller is usually better; simple is good; every field should have a default value; try to avoid NULL.

For example: design database tables to use the smallest integer type that fits, to save disk space (MEDIUMINT is better than INT where the range allows).

Another example, the time fields DATETIME and TIMESTAMP: DATETIME occupies 8 bytes while TIMESTAMP occupies only 4, half as much, and TIMESTAMP covers the range 1970 to 2037, which is suitable for recording update times.

MySQL supports large data volumes well, but in general, the smaller the tables in the database, the faster the queries executed on them.

Therefore, when creating a table, we can make the width of the fields in the table as small as possible for better performance.

For example, when defining the zip code field, setting it to CHAR(255) would obviously add unnecessary space to the database. Even using the VARCHAR type is redundant, because CHAR(6) does the job just fine.

Also, where possible, we should use MEDIUMINT rather than BIGINT to define integer fields. We should also set fields to NOT NULL wherever we can, so the database does not have to compare NULL values in future queries.

For some text fields, such as “province” or “gender,” we can define them as ENUM types. Because in MySQL, ENUM types are treated as numeric data, and numeric data is processed much faster than text types. In this way, we can improve the performance of the database.
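A sketch of the ENUM advice (MySQL; the column values are illustrative):

```sql
CREATE TABLE member (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  -- Stored internally as a number, so comparisons are fast.
  gender ENUM('male', 'female') NOT NULL DEFAULT 'male',
  province ENUM('Beijing', 'Shanghai', 'Guangdong') NOT NULL
);
```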

51. Choose appropriately among the string data types CHAR, VARCHAR, and TEXT: CHAR for short fixed-length strings, VARCHAR for variable-length strings, and TEXT for large blocks of text.

52. Any operation applied to a column causes a table scan, including database functions and computed expressions. When querying, move the operation to the right-hand side of the equals sign whenever possible.