Common MySQL optimization solutions

Optimization is about responding to requests more quickly while keeping the system correct. Some performance problems, such as slow queries, can have serious consequences once they accumulate to a certain extent or when the number of concurrent requests rises rapidly, ranging from a busy service to an unusable application. They threaten us like a time bomb about to detonate. Therefore, strict checks are needed before a project is launched to ensure that MySQL runs in its optimal state.

What are the optimization solutions for MySQL?

Common MySQL optimization methods fall into three levels: SQL and index optimization, database structure optimization, and system hardware optimization. Each of these directions contains several smaller optimization points. Let's take a look at the details below.

Considerations for optimization:

  • Use data, not guesswork
  • Avoid premature optimization
  • Avoid over-optimization
  • Deep understanding of the business
  • Performance tuning is a constant battle
  • Select appropriate metrics, test cases, and test environments

Optimization goals

  • Reduce the number of I/O operations

I/O is the most common bottleneck in a database, which follows from what a database does: in most database operations, more than 90% of the time is spent on I/O. Reducing the number of I/O operations is therefore the first priority in SQL optimization, and also the most visibly effective one.

  • Reduce CPU computing

Besides I/O bottlenecks, SQL optimization also has to consider CPU usage. ORDER BY, GROUP BY, and DISTINCT are all CPU-intensive (these operations mostly compare in-memory data on the CPU). Once I/O optimization has reached a certain stage, reducing CPU computation becomes an important goal of SQL optimization.

Optimization scheme 1: SQL and index optimization

  • Use the correct index

The index is one of the most important concepts in a database and one of the most effective ways to improve its performance. It exists to make data queries more efficient, just like the table of contents of a dictionary, through which the relevant content can be found quickly. We should use primary key queries rather than other index queries whenever possible, since primary key queries do not trigger a lookup back to the table, which saves time and improves query performance.


If we do not add indexes, a query triggers a full table scan, so a lot of data is read and query efficiency is very low. To improve query performance, add indexes to the most frequently queried fields.
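The difference between a full table scan and an index lookup can be seen in a query plan. The sketch below is only an illustration: it uses SQLite from the Python standard library as a stand-in for MySQL so that it runs without a server, and the table, column, and index names are made up. In MySQL the same check would be done with EXPLAIN, whose output looks different.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"name{i}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the plan description reported by SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

# Without an index on email, every row has to be examined.
plan_without_index = plan(query, args)

# After adding an index on the queried column, the lookup goes through it.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = plan(query, args)

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The first plan reports a scan of the whole table, while the second one names idx_users_email, which is the effect the paragraph above describes.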

  • Points to note when writing SQL

    • Before MySQL 5.0, avoid OR queries and use UNION or subqueries instead. MySQL 5.0 introduced index merge, which in simple terms merges the intersection or union of the result sets of multi-condition queries such as OR or AND queries, so that OR no longer defeats the index. Still, use OR sparingly if the other columns in the condition are not indexed.
    • Avoid the != and <> operators in WHERE conditions, because they can cause the query engine to abandon the index and perform a full table scan.
    • Use prefix indexes appropriately. MySQL supports prefix indexes, which index only the leading part of a string. The longer the indexed value, the more disk space it takes and the fewer index entries fit in one data page, so index lookups take longer and queries slow down. An appropriate prefix index reduces the space used and improves query efficiency.
    • Try to avoid using SELECT * and instead query for the required fields to increase speed and reduce bandwidth pressure on network traffic.
    • For JOIN optimization, prefer a JOIN over a subquery where possible. A subquery is a nested query that may create a temporary table, and creating and destroying temporary tables consumes system resources and time. A JOIN typically avoids that temporary table, so it tends to perform better.
    • Let the small table drive the large table: if table B holds less data than table A, the query should read table B first and then table A.
    • Do not perform arithmetic or other expression operations on columns in conditions, as this may prevent the query engine from using the index and hurt query efficiency.
    • Adding redundant fields can eliminate many joins. Joins across multiple tables are slow, so adding redundant fields where appropriate reduces multi-table joins; this is a space-for-time trade-off.
    • Avoid implicit type conversions, which occur when the type of a column in the WHERE clause does not match the type of the parameter passed in.
    • Use UNION ALL instead of UNION wherever possible. The main difference is that UNION combines two (or more) result sets and then removes duplicates, which involves sorting, adds a lot of CPU work, and increases resource consumption and latency. So use UNION ALL when duplicates are impossible or do not matter.
    • Sort as little as possible. Sorting consumes a lot of CPU, so reducing it can significantly improve SQL response time when the cache hit ratio is high and I/O capacity is sufficient. This matters most when the sort column is not indexed.
    • SQL statements should not contain too many values in IN. MySQL optimizes IN by storing all of its constants in a sorted array, but with a large number of values the cost is still considerable.
    • When only one row is needed, add LIMIT 1 so that the scan can stop at the first match. (A const value in the type column of EXPLAIN comes from a lookup on a primary key or unique index, not from LIMIT 1 itself.)
    • Distinguish between IN and EXISTS, and between NOT IN and NOT EXISTS.
    • Use proper pagination to increase pagination efficiency.
    • Segmented query
    • Avoid NULL checks on fields in the WHERE clause.
    • Avoid fuzzy queries with a leading % (for example LIKE '%abc'), since they cannot use an ordinary index. If such matching is required, a full-text index is recommended.
    • For composite indexes, follow the leftmost prefix rule. For example, with a composite index on the three columns (a, b, c), conditions on (a), (a, b), (a, b, c), or (a, c) can use the index (for (a, c), only the a part is used, not c). The order in which the conditions are written does not matter, so (c, b, a) and (c, a) work as well. Conditions on (b), (c), (b, c), or (c, b) alone cannot use the index.
    • If necessary, you can use FORCE INDEX to force a query to use a particular index.
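Two of the points above, UNION ALL versus UNION and efficient pagination, can be sketched in a few lines. This is a minimal illustration rather than a recommendation for any particular schema: SQLite from the Python standard library stands in for MySQL, all table names are hypothetical, and the "seek" style of pagination shown here (filtering on the last id seen instead of using a growing OFFSET) is one common approach that the list above does not name explicitly.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); all tables are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_2023 (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE orders_2024 (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany("INSERT INTO orders_2023 VALUES (?, ?)", [(1, "ann"), (2, "bob")])
conn.executemany("INSERT INTO orders_2024 VALUES (?, ?)", [(3, "ann"), (4, "cid")])

# UNION removes duplicates (extra work); UNION ALL just concatenates.
union_rows = conn.execute(
    "SELECT customer FROM orders_2023 UNION SELECT customer FROM orders_2024"
).fetchall()
union_all_rows = conn.execute(
    "SELECT customer FROM orders_2023 UNION ALL SELECT customer FROM orders_2024"
).fetchall()
print(len(union_rows), len(union_all_rows))

conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i, f"event-{i}") for i in range(1, 101)],
)


def page_after(last_id, size):
    """Fetch the next page by seeking past the last id already seen."""
    return conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, size),
    ).fetchall()


first_page = page_after(0, 10)
second_page = page_after(first_page[-1][0], 10)

# The OFFSET form returns the same rows but has to skip the earlier ones.
offset_page = conn.execute(
    "SELECT id, payload FROM events ORDER BY id LIMIT 10 OFFSET 10"
).fetchall()
print(second_page == offset_page)
```

With UNION ALL the duplicate customer is kept, so it is only safe when duplicates are impossible or acceptable. The seek form of pagination relies on an indexed, monotonically ordered column such as the primary key.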

All of the above optimizations can be verified with EXPLAIN, paying particular attention to the type field of its output.
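As a rough sketch of this kind of verification, the following checks the leftmost prefix rule and the effect of arithmetic on an indexed column by inspecting query plans. It assumes SQLite (Python standard library) as a stand-in for MySQL, so the planner and the plan output differ from MySQL's EXPLAIN; the table t, its columns, and the helper uses_index are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); table and index are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER, note TEXT)"
)
conn.execute("CREATE INDEX idx_abc ON t (a, b, c)")
conn.executemany(
    "INSERT INTO t (a, b, c, note) VALUES (?, ?, ?, ?)",
    [(i % 10, i % 7, i % 5, "x") for i in range(500)],
)


def uses_index(where):
    """Report whether the plan for the given WHERE clause mentions idx_abc."""
    rows = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM t WHERE " + where).fetchall()
    return any("idx_abc" in row[-1] for row in rows)


conditions = [
    "a = 1",                    # leftmost column
    "a = 1 AND b = 2",          # leftmost two columns
    "a = 1 AND b = 2 AND c = 3",
    "c = 3 AND a = 1",          # written out of order, still starts from a
    "b = 2",                    # skips the leftmost column
    "b = 2 AND c = 3",          # skips the leftmost column
    "a + 1 = 2",                # arithmetic on the indexed column
]
results = {where: uses_index(where) for where in conditions}

for where, used in results.items():
    print(f"{where:28} index used: {used}")
```

Conditions that include the leftmost column a can go through the composite index, while conditions on b or c alone, or an expression wrapped around a, fall back to scanning the table.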

Optimization scheme 2: database structure optimization

① Minimum data length

Generally speaking, the smaller the table, the faster the query. Therefore, to improve efficiency, table fields should be defined as small as possible. For example, an 18-character ID number can be defined as CHAR(18) rather than VARCHAR(18).

② Use the simplest data type

Do not use VARCHAR where INT will do, because INT is more efficient than VARCHAR.

③ Define text types as little as possible

Queries on TEXT columns are inefficient. If a field must be defined as TEXT, move it into a separate sub-table and fetch it with a join only when it is needed, which keeps queries on the main table efficient.

④ Select an appropriate storage engine

Storage engine types: MyISAM, InnoDB, MEMORY, MERGE, TokuDB, CSV, Archive, etc.

MyISAM: This storage engine is well suited to applications that mainly read and insert, with few updates and deletes, and that do not have high requirements for transaction integrity and concurrency. MyISAM is one of the most commonly used storage engines in Web, data warehouse, and other application environments.

InnoDB: Designed for transaction processing applications, with support for foreign keys. InnoDB is a good choice for applications with high requirements for transaction integrity and for data consistency under concurrency, and whose data operations include many updates and deletes in addition to inserts and queries. InnoDB not only effectively reduces locking caused by deletes and updates, but also ensures complete commit and rollback of transactions. It is a good choice for systems that require high data accuracy, such as billing or accounting systems.

MEMORY: Keeps all data in RAM, providing extremely fast access in environments where records and similar data need to be located quickly. Its drawbacks are the limit on table size, since tables that are too large cannot be held in memory, and the need to ensure that table data can be recovered after the database terminates unexpectedly. MEMORY tables are typically used for small tables that are updated infrequently, to get quick access to results.

MERGE: MERGE tables logically group a series of identical MyISAM tables and refer to them as one object. Their advantage is that they break the size limit of a single MyISAM table and, by distributing the underlying tables across multiple disks, improve access efficiency. This is ideal for VLDB environments such as data warehouses.

TokuDB: A third-party storage engine for MySQL and MariaDB with high write performance and a high compression rate; it supports transaction processing and most online DDL operations. It uses a Fractal Tree index to ensure efficient insertion; offers excellent compression, nearly 10 times higher than InnoDB; supports hot schema changes, that is, online index creation and DDL operations such as adding or deleting columns; provides a bulk loader for loading large amounts of data quickly; provides a technique for eliminating master-slave replication delay; and supports ACID and MVCC.

⑤ Appropriate table-splitting and database-splitting strategy

Table splitting and database splitting are also often referred to as vertical partitioning (splitting tables) and horizontal partitioning (splitting databases).

Table splitting means that when a table has many fields, we can split the large table into multiple sub-tables, keeping the frequently used main information in the main table and the rest in sub-tables. Most queries can then be answered from the main table alone, which has fewer fields, and this effectively improves query efficiency.
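A minimal sketch of such a split, assuming a hypothetical article table whose large body text is moved to a sub-table. SQLite from the Python standard library is used as a stand-in for MySQL so the example runs on its own; the schema is invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); the schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Main table: small, frequently queried fields.
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author TEXT
    );
    -- Sub-table: the large, rarely needed text field.
    CREATE TABLE article_body (
        article_id INTEGER PRIMARY KEY REFERENCES article (id),
        body TEXT
    );
    """
)
conn.execute("INSERT INTO article VALUES (1, 'Index basics', 'ann')")
conn.execute("INSERT INTO article_body VALUES (1, ?)", ("long text " * 100,))

# List pages touch only the narrow main table.
listing = conn.execute("SELECT id, title, author FROM article ORDER BY id").fetchall()

# The detail page joins in the sub-table only when the body is needed.
detail = conn.execute(
    "SELECT a.title, b.body FROM article AS a "
    "JOIN article_body AS b ON b.article_id = a.id WHERE a.id = ?",
    (1,),
).fetchone()

print(listing)
print(detail[0], len(detail[1]))
```

The trade-off is one extra join on the less frequent detail query in exchange for a narrower main table on the frequent listing query.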

Database splitting means dividing one database into several. For example, we split a database into multiple databases, one of which handles writes and modifications while the others synchronize data from that master and serve client queries. In this way, the read and write load of a single database is distributed across multiple databases, improving the overall efficiency of the database.
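On the application side, this kind of read/write separation needs something that decides where each statement goes. The following is a minimal sketch under stated assumptions: the ReadWriteRouter class and the server names are hypothetical, and the strings stand in for real connections. It treats anything that is not a plain SELECT as a write, and rotates reads across the replicas.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas in round-robin order; None if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql):
        """Pick a target based on the first keyword of the statement."""
        keyword = sql.lstrip().split(None, 1)[0].upper()
        # Anything other than a plain SELECT is sent to the primary to be safe.
        if keyword != "SELECT" or self._replicas is None:
            return self.primary
        return next(self._replicas)


router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
statements = [
    "INSERT INTO users (name) VALUES ('ann')",
    "SELECT * FROM users",
    "SELECT * FROM users WHERE id = 1",
    "UPDATE users SET name = 'bob' WHERE id = 1",
]
targets = [router.target_for(sql) for sql in statements]
print(targets)
```

Because replicas synchronize from the primary with some delay, a read that must see a write made just before it may still need to go to the primary; the sketch does not handle that case.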

Optimization scheme 3: system hardware optimization

The hardware requirements of MySQL are mainly reflected in three aspects: disk, network, and memory.

(1) Disk

Disk seek capability (disk I/O). Take a high-speed SCSI disk at 7200 RPM as an example. Such a disk completes 7200 rotations per minute, that is, 120 rotations per second, so each random access waits about 4 ms on average for the platter to rotate into position. This is determined by physical characteristics and cannot be changed. Use disks with high read/write performance, such as solid-state drives, to reduce I/O time and improve the overall efficiency of MySQL. You can also use multiple small disks instead of one large disk: since the rotational speed of each disk is fixed, multiple small disks work like multiple disks running in parallel.
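As a rough check on what 7200 RPM implies, the arithmetic can be written out. The rotation figures follow directly from the RPM value; the average seek time used for the throughput estimate is an assumed, illustrative number, not a measurement of any particular drive.

```python
# Rotational figures follow directly from the spindle speed.
rpm = 7200
rotations_per_second = rpm / 60                   # 120 rotations per second
full_rotation_ms = 1000 / rotations_per_second    # about 8.33 ms per rotation
avg_rotational_latency_ms = full_rotation_ms / 2  # about 4.17 ms, half a rotation

# Assumed average seek time, for illustration only.
assumed_avg_seek_ms = 8.5

# Each random access pays roughly one seek plus the rotational latency.
random_access_ms = assumed_avg_seek_ms + avg_rotational_latency_ms
estimated_random_iops = 1000 / random_access_ms

print(f"rotations per second:       {rotations_per_second:.0f}")
print(f"average rotational latency: {avg_rotational_latency_ms:.2f} ms")
print(f"estimated random IOPS:      {estimated_random_iops:.0f}")
```

Under these assumptions a single spinning disk handles on the order of tens to about a hundred random accesses per second, which is why solid-state drives or several disks in parallel make such a difference.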

(2) Network

Low network latency and sufficient bandwidth are basic conditions for MySQL to run normally. If conditions permit, multiple network adapters can be installed to improve the efficiency of the MySQL server during network peaks. Enable skip-name-resolve wherever possible to avoid unnecessary problems caused by DNS resolution. Check the ping packet loss rate of the network. You can also tune network parameters in /etc/sysctl.conf to improve performance.

(3) Memory

The more memory the MySQL server has, the more data can be stored and cached in it; since memory access is very fast, this improves the operating efficiency of MySQL.

Knowledge extension:

Slow query: Use the slow query log to find slow SQL statements, then use EXPLAIN to inspect their execution plans. Finally, analyze and locate the root cause of the problem and handle it (as described in the previous sections).

Slow query log: MySQL can be configured to record slow queries. Any SQL statement whose execution time exceeds the value of long_query_time is written to the log. The slow query log is enabled by setting “slow_query_log=1”, which can be done in two ways:

  • Run “SET GLOBAL slow_query_log=1” on the MySQL command line. This setting is lost when the MySQL service restarts.
  • Add slow_query_log=1 to the MySQL configuration file (for example my.cnf). This setting takes effect only after the MySQL server is restarted.

In addition, you can set slow_query_log_file=/tmp/mysql_slow.log to configure where the slow query log is stored. Note that the slow query log may adversely affect MySQL performance, so use it with caution in a production environment.
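Once the log is being written, the first analysis step is usually to find the slowest statements in it. The sketch below parses the "# Query_time: ..." header lines that MySQL writes for each entry and ranks the statements by query time. The sample log text is invented for illustration, and the parser is a simplified assumption that ignores details a real log may contain.

```python
import re

# Matches the per-entry statistics line of the MySQL slow query log.
HEADER = re.compile(
    r"# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)"
    r"\s+Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)

# Made-up sample entries, for illustration only.
SAMPLE_LOG = """\
# Time: 2024-01-01T10:00:00.000000Z
# User@Host: app[app] @ localhost []  Id:     8
# Query_time: 2.500000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 100000
SET timestamp=1704103200;
SELECT * FROM orders WHERE customer = 'ann';
# Time: 2024-01-01T10:00:05.000000Z
# User@Host: app[app] @ localhost []  Id:     9
# Query_time: 11.250000  Lock_time: 0.000200 Rows_sent: 50  Rows_examined: 2000000
SET timestamp=1704103205;
SELECT * FROM events WHERE payload LIKE '%error%';
"""


def parse_slow_log(text):
    """Collect query time, rows examined, and SQL text for each log entry."""
    entries, current = [], None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {
                "query_time": float(match.group("query_time")),
                "rows_examined": int(match.group("rows_examined")),
                "sql": "",
            }
            entries.append(current)
        elif line.startswith("#"):
            current = None  # other comment lines start the next entry's header
        elif current is not None and not line.startswith("SET timestamp="):
            current["sql"] = (current["sql"] + " " + line.strip()).strip()
    return entries


entries = parse_slow_log(SAMPLE_LOG)
slowest = max(entries, key=lambda entry: entry["query_time"])
print(slowest["query_time"], slowest["rows_examined"], slowest["sql"])
```

The statements found this way are the ones to run through EXPLAIN next.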
