Most of the time, the “performance problem” of our program is actually the problem of the “lump” of code we wrote, the Coding problem of ourselves, and the problem of using the DML statement of Mysql. Here is my summary of the points to be aware of when using MySQL DML statements.
forselect *
Always be cautious
In most cases, you don’t need select *. When this is used, it prevents the optimizer from performing optimizations such as index override scans and adds additional I/O, memory, and CPU consumption. Of course, the use of SELECT * is not all bad. When used properly, select * can simplify development and increase the reusability of the same code.
Whether too many additional records are scanned
Sometimes you find queries that read thousands of rows but return only a few or few results. You can optimize them by:
- Let’s see if we can change the structure. For example, use summary tables
- See if the way to get the data results is the best, to get the road strength is the shortest.
- Using an override index, you put all the required columns in the index to reduce the number of steps required to retrieve data from the corresponding row in the table.
Shard some SQL statements
Traditional Internet system, emphasizes the network connection as far as possible, the data layer as far as possible in a connection do as much as possible, to prevent building links for many times, but the idea is not applicable for MySQL, MySQL from design to connect and disconnect all very lightweight, on general server can support more than 100000 queries per second.
So for some scenarios, you can “divide and conquer” a large query, break it up into smaller queries, and then put it together again. For example:
- This becomes paging for full data queries. If there are tens of millions of pieces of data in a table, select all at once, it will not work. You can take a little bit at a time, and you divide the pressure at a time.
- When deleting a large amount of old data, do not clean up a large statement at once, recommended
Ten thousand at a time
. If there is a big statement of one-time complete, may need a lock of large amounts of data, with a large number of transaction log, let the Mysql stood there, to avoid this kind of circumstance happening, best one-time delete data from ten thousand or so, every time finish delete pause for a moment then operation, scatter disposable pressure on the server.
Note: Although Mysql is very lightweight to set up connections, this does not mean that you can query and then concatenate each loop, which is still very slow and is usually the point of SQL optimization at work.
carefuljoin
operation
This is a no-no. Many companies’ Internet products eliminate the join operation, and replace it with two single table query operations, first fetching the data ID from one table and then using the WHERE in query from the other table. The reasons are as follows:
- Make application caches (Redis, memcache, etc.) more efficient. For example, if a partial ID is found in the first table, a WHERE in statement can be omitted if the cache is hit.
- It is easier to deal with business development, split the database, and achieve high performance and high expansion.
- After the ID in where in is sorted in ascending order, the query efficiency is more efficient than the random association of JOIN
- Reduce redundant queries. Two queries at the application layer means that the application needs to query a record only once, while using join may require repeated scanning to access part of the data.
- Single-table queries can reduce lock contention.
If you must use, you can use the following ways to optimize:
- Make sure that
ON
orusing
Clause has an index on the columns in the - Make sure that any
group by
andorder by
The expression in refers to only columns in a table.
In high performance scenarios, do not use it in queriesA temporary table
MySQL ‘temporary’ indicates that there is no index, and using a temporary table is usually associated with low performance. Therefore, in performance demanding scenarios, it is best not to use operations with a temporary table:
- On an unindexed field
group by
Operation. UNION
The query.- A subquery in a query statement.
- Part of the
order by
Operation, for exampledistinct
Functions andorder by
Used together anddistinct
andorder by
Same field. In some casesgroup by
andorder by
The fields are different.
You can use explain to check whether temporary tables are used or not. If Using TEMPORARY appears, you need to pay attention to the results of Extra columns.
count()
Function optimization
The count() function has one caveat: it does not count NULL fields! Therefore, you cannot specify a column in the query result to count the number of rows in the query result. Count (xx column) is not good.
If you want to count the result set, use count(*), and you’ll get good performance.
Try not to use subqueries
Try not to use subqueries and use associations instead
Optimize paginglimit
Usually when we’re paging, we’re using limit 50, 10 statements. Select xx, XXX from test_table limit 100000020, 20. After 100000020 pieces of data are scanned, 20 pieces of data are returned. At this time, we can use the following two ways to optimize:
usingbetween and
And the primary key index
Select * from test_table where id between XXXXX and XXXX where id between XXXXX and XXXX select * from test_table where id between XXXXX and XXXX where id between XXXXX and XXXX
Using autoincrement primary key indexes,order by
addlimit
, do not use offset
The problem with limit and offset, which is really the offset problem, causes MySQL to scan a lot of rows that are not needed and then discard them. If you use a label to record the location of the last fetched data, then the next scan can start directly from the bookmark location, thus avoiding the use of offset.
For example, the above query can be changed to:
Select * from test_table order by id desc limit 20; select * from test_table order by id desc limit 20;
Select * from test_table where id < ‘test_table’ order by ID desc LIMIT 20 select * from test_table where id < ‘test_table’ order by ID desc LIMIT 20 select * from test_table where id < ‘test_table’ limit 20
Familiarity and flexibilityexplain
Here is the entire process of mysql executing a query. You can see the red part of the diagram for explain.
Explain shows a lot of fields and content, which is often hard to remember. To use it, check out the following diagram: Explain diagram