On the one hand, database optimization is to find out the bottleneck of the system and improve the overall performance of MySQL database. On the other hand, reasonable structure design and parameter adjustment are needed to improve the corresponding speed of users. At the same time, it is necessary to save system resources as much as possible so that the system can provide greater load.

1. Optimize the overview chart

2. Optimization The author divides the optimization into two categories, soft optimization and hard optimization. Soft optimization is generally to operate the database, while hard optimization is to operate the server hardware and parameter setting.

2.1 soft optimization

2.1.1 Query Statement Optimization

1. First we can analyze the execution information of a query using EXPLAIN or DESCRIBE(abbreviated :DESC).

2. Example:

DESC SELECT * FROM user

Display:



Information such as index and number of data read from query data is displayed.

2.1.2 Optimizing subqueries

In MySQL, try to use joins instead of subqueries. Because subqueries need nested queries, nested queries will create a temporary table, the establishment and deletion of temporary tables will have a large system overhead, and join queries will not create temporary tables, so the efficiency is higher than nested subqueries.

2.1.3 Using indexes

Index is one of the most important methods to improve the speed of database query, about the index can be referred to the author <MySQL database index > article, introduction is more detailed, here records the use of index three precautions:

  • The LIKE keyword matches strings starting with ‘%’ and does not use indexes.
  • Both fields of the OR keyword must be indexed for the query to use the index.
  • To use a multi-column index, the leftmost match must be met.

2.1.4 breakdown

For a table with many fields, if some fields are used less frequently, you should separate them out to form a new table.

2.1.5 intermediate table

Intermediate tables can be created for tables that will be queried in large numbers to reduce join time during queries.

2.1.6 Adding redundant fields is similar to creating intermediate tables. Adding redundancy is also to reduce join queries.

2.1.7 Analyze table, check table and optimize table

An analysis table is used to analyze the distribution of keywords in the table, a check table is used to check whether there are errors in the table, and an optimization table is used to eliminate the waste of table space caused by deletion or update.

ANALYZE TABLE: use the ANALYZE keyword, such as ANALYZE TABLE user;

  • Op: indicates the operation to be performed.
  • Msg_type: information type, have the status, the info, note, warning, error.
  • Msg_text: displays information. – CHECK TABLE: uses the CHECK keyword, such as CHECK TABLE user [option]
  • Option only applies to MyISAM and has five values:
  • QUICK: Do not scan rows and check for faulty connections.
  • FAST: checks only tables that are not closed properly.
  • CHANGED: Checks only tables that have CHANGED since the last check and tables that have not been closed properly.
  • MEDIUM: Scans lines to verify that the deleted connection is valid, or calculates a one-line keyword checksum.
  • EXTENDED: the most comprehensive inspection, the comprehensive search keywords in each row. – OPTIMIZE TABLE: use OPTIMIZE keywords, such as OPTIMIZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE the user;
  • LOCAL | NO_WRITE_TO_BINLOG is said not to a log., OPTIMIZE the TABLE only to VARCHAR, BLOB and TEXT effectively, through OPTIMIZE TABLE statement can eliminate file fragments, in the process of execution will add a read-only lock.

2.2 hard to optimize

2.2.1 Three sets of hardware

– configuration more core and high frequency of CPU, multi-core can perform multiple threads. Configuration with large memory, enhance memory, can improve the buffering capacity, thus can reduce disk I/O time, thus improve the response speed. – configuration high-speed disk or reasonable distribution disk: high-speed disk improve I/O, distribution disk can improve the ability of parallel operation.

2.2.2 Optimizing Database Parameters

Optimizing database parameters can improve resource utilization and thus improve MySQL server performance. MySQL service configuration parameters are in my.cnf or my.ini.

  • Key_buffer_size: indicates the index buffer size
  • Table_cache: indicates the number of open tables
  • Query_cache_size and query_cache_type: the former is the query buffer size, and the latter is the switch of the preceding parameters. 0 means no buffer,1 means buffer, but SQL_NO_CACHE means no buffer, and 2 means cache is explicitly used in the query Flush only use buffer, namely SQL_CACHE.
  • Sort_buffer_size: sort buffer

2.2.3 Sub-database sub-table

The first problem is that system performance may deteriorate at peak times because of high database load. On the other hand, what if stress crashes your database? In this case, you have to split the system into multiple libraries, deployed on multiple database services, and then serve as the master library to handle the write requests. Each master library then mounts at least one slave library, which hosts the read requests.

2.2.4 Cache Cluster

If you have more and more users, then you can keep adding machines, say at the system level, and you can handle more concurrent requests. Then, if the write concurrency at the database level becomes higher and higher, expand the database server, which can support the expansion of the machine through the partition table. If the read concurrency at the database level becomes higher and higher, expand and add more slave libraries. However, there is a big problem: the database itself is not used to carry high concurrency requests, so generally speaking, the database single machine carries concurrency in thousands of orders of magnitude per second, and the database using machines are relatively high configuration, relatively expensive machines, high cost. If you simply keep adding machines, that’s not true. Caching is a common part of high concurrency architectures, and caching systems are designed to accommodate high concurrency. Therefore, the concurrency of a single machine is in the tens of thousands per second, or even hundreds of thousands per second. The carrying capacity of high concurrency is one or two orders of magnitude higher than that of a database system. So you can introduce a cache cluster based on the business nature of the system for those requests that are written less and read more. Specifically, you write a copy of data to the cache cluster while writing to the database, and then use the cache cluster to handle most of the read requests. This way, by caching the cluster, higher concurrency can be carried with fewer machine resources.

Conclusion A complete and complex architecture of highly concurrent systems must contain a variety of complex self-developed infrastructure systems. All kinds of sophisticated architectural design. So a short article is at best a primer, but that’s about the idea of database optimization.