Zero. Why optimize
- The throughput bottleneck of the system often appears in the database access speed
- As the application runs, more and more data is stored in the database, and processing times are correspondingly slower
- Data is stored on disk and read and write speeds are not comparable to memory
A, observe
MySQL optimization ≠SQL statement optimization, which is important to understand, although most of the time we are tuning SQL statements.
However, MySQL optimizations start with observations, and can sometimes take minutes, hours, or more than a day.
The obvious purpose of this is to help us locate the problem. For example, the load of MySQL will suddenly jump at a fixed time node, or a request for a few pages will produce a relatively obvious delay, or even affect the subsequent requests.
There are various means of observation, ali Cloud has a powerful RDS console, can also build a set of monitoring platform, the most is to temporarily run a shell script, collect MySQL running status. The most famous show status command lists at least 200 indicators, so there are trade-offs. Some of the most popular metrics are the current connection count and maximum connection count, the number of threads currently running, the number of slow queries, and so on.
Second, the analysis
Further analysis of the observation results will form different solutions.
It is possible that the cache fails at a certain time node, resulting in a surge of MySQL load. You can try to spread the cache fails at a certain time node evenly over 24 hours a day, or refresh the cache at a time when there is less traffic.
There may be a potential problem with the SQL statement, and in some cases there may be a performance problem. You can use show Full ProcessList to locate the library, or you can turn on slow queries to directly locate the problem SQL statement and use Explain to analyze the statement execution plan. Adding an index may solve the problem, or join too many tables and need to split the query, or the size of a single table may be too large and need to split the table.
May be the machine itself performance problems, the so-called “clever housewife without rice”, this time to consider expansion.
Third, solve
Most of the solutions have been mentioned in the analysis phase, and the final conclusion is:
1, the introduction of caching, of course, this is a double-edged sword, to use the appropriate, or need a certain skill. On the one hand, the MySQL internal cache mechanism provides a variety of cache parameters, such as the cache of the query result set, the cache of the result set sorting, can be adjusted according to the actual situation. On the other hand is the cache outside MySQL, such as Redis+MySQL architecture, open Hibernate (Mybatis) cache function. Caching was introduced to ease MySQL’s query burden, but it must be balanced between performance stability and data timeliness.
2. SQL statements have performance issues. This happens from time to time, usually without a complete benchmark test before going live, rather than a simple functional test. When the amount of data accumulated to a certain extent, SQL performance problems burst out in a concentrated manner. So, after writing SQL, get in the habit of explaining to nip potential performance problems in the bud. Of course, we also want to avoid “over-optimization”, we want to anticipate whether a table will be read many times or updated many times, whether the data volume will explode or grow very slowly. When to use an IN query, when to use an EXISTS predicate, whether to simplify some of the table data before joining, and whether to create an index correctly.
3. If necessary, the machine can be expanded. Of course, the overall architecture of the system can also be optimized to build MySQL cluster, which can greatly improve reliability and availability.
4. Supplement: SQL paradigm
1NF
Each component must be an indivisible data item.
Features:
- There is a primary key and the primary key cannot be empty.
- Fields cannot be divided again.
2NF
On the basis of normal form one, and every non-main attribute complete function depends on the code.
Features:
- Satisfy the first paradigm.
- Each non-primary attribute in the table must be completely dependent on the table code.
- Second normal form violations occur only when the main code in a table consists of two or more attributes.
3NF
On the basis of satisfying the second normal form, each non-primary attribute is neither partially dependent on code nor transitively dependent on code.
Features:
- Satisfy the second normal form.
- Non-primary attributes cannot be passed depending on the code.
** BCNF**
On the basis of satisfying the third normal form, one part of the primary key is not allowed to be determined by another part or other parts.
Features:
- Satisfy the third normal form.
- All non-primary attributes are fully functional dependent on each code.
- All primary attributes are fully function dependent on each code that does not contain them.
- No attribute is completely functional depending on any set of attributes of the fly code.
The above is the framework of MySQL optimization thinking.