Most of the time, the “performance problem” of our program is actually the problem of the “lump” of code we wrote, the Coding problem of ourselves, and the problem of using the DML statement of Mysql. Here is my summary of the points to be aware of when using MySQL DML statements.

forselect *Always be cautious

In most cases, you don’t need select *. When this is used, it prevents the optimizer from performing optimizations such as index override scans and adds additional I/O, memory, and CPU consumption. Of course, the use of SELECT * is not all bad. When used properly, select * can simplify development and increase the reusability of the same code.

Whether too many additional records are scanned

Sometimes you find queries that read thousands of rows but return only a few or few results. You can optimize them by:

  • Let’s see if we can change the structure. For example, use summary tables
  • See if the way to get the data results is the best, to get the road strength is the shortest.
  • Using an override index, you put all the required columns in the index to reduce the number of steps required to retrieve data from the corresponding row in the table.

Shard some SQL statements

Traditional Internet system, emphasizes the network connection as far as possible, the data layer as far as possible in a connection do as much as possible, to prevent building links for many times, but the idea is not applicable for MySQL, MySQL from design to connect and disconnect all very lightweight, on general server can support more than 100000 queries per second.

So for some scenarios, you can “divide and conquer” a large query, break it up into smaller queries, and then put it together again. For example:

  • This becomes paging for full data queries. If there are tens of millions of pieces of data in a table, select all at once, it will not work. You can take a little bit at a time, and you divide the pressure at a time.
  • When deleting a large amount of old data, do not clean up a large statement at once, recommendedTen thousand at a time. If there is a big statement of one-time complete, may need a lock of large amounts of data, with a large number of transaction log, let the Mysql stood there, to avoid this kind of circumstance happening, best one-time delete data from ten thousand or so, every time finish delete pause for a moment then operation, scatter disposable pressure on the server.

Note: Although Mysql is very lightweight to set up connections, this does not mean that you can query and then concatenate each loop, which is still very slow and is usually the point of SQL optimization at work.

carefuljoinoperation

This is a no-no. Many companies’ Internet products eliminate the join operation, and replace it with two single table query operations, first fetching the data ID from one table and then using the WHERE in query from the other table. The reasons are as follows:

  • Make application caches (Redis, memcache, etc.) more efficient. For example, if a partial ID is found in the first table, a WHERE in statement can be omitted if the cache is hit.
  • It is easier to deal with business development, split the database, and achieve high performance and high expansion.
  • After the ID in where in is sorted in ascending order, the query efficiency is more efficient than the random association of JOIN
  • Reduce redundant queries. Two queries at the application layer means that the application needs to query a record only once, while using join may require repeated scanning to access part of the data.
  • Single-table queries can reduce lock contention.

If you must use, you can use the following ways to optimize:

  • Make sure thatONorusingClause has an index on the columns in the
  • Make sure that anygroup byandorder byThe expression in refers to only columns in a table.

In high performance scenarios, do not use it in queriesA temporary table

MySQL ‘temporary’ indicates that there is no index, and using a temporary table is usually associated with low performance. Therefore, in performance demanding scenarios, it is best not to use operations with a temporary table:

  • On an unindexed fieldgroup byOperation.
  • UNIONThe query.
  • A subquery in a query statement.
  • Part of theorder byOperation, for exampledistinctFunctions andorder byUsed together anddistinctandorder bySame field. In some casesgroup byandorder byThe fields are different.

You can use explain to check whether temporary tables are used or not. If Using TEMPORARY appears, you need to pay attention to the results of Extra columns.

count()Function optimization

The count() function has one caveat: it does not count NULL fields! Therefore, you cannot specify a column in the query result to count the number of rows in the query result. Count (xx column) is not good.

If you want to count the result set, use count(*), and you’ll get good performance.

Try not to use subqueries

Try not to use subqueries and use associations instead

Optimize paginglimit

Usually when we’re paging, we’re using limit 50, 10 statements. Select xx, XXX from test_table limit 100000020, 20. After 100000020 pieces of data are scanned, 20 pieces of data are returned. At this time, we can use the following two ways to optimize:

usingbetween andAnd the primary key index

Select * from test_table where id between XXXXX and XXXX where id between XXXXX and XXXX select * from test_table where id between XXXXX and XXXX where id between XXXXX and XXXX

Using autoincrement primary key indexes,order byaddlimit, do not use offset

The problem with limit and offset, which is really the offset problem, causes MySQL to scan a lot of rows that are not needed and then discard them. If you use a label to record the location of the last fetched data, then the next scan can start directly from the bookmark location, thus avoiding the use of offset.

For example, the above query can be changed to:

Select * from test_table order by id desc limit 20; select * from test_table order by id desc limit 20;

Select * from test_table where id < ‘test_table’ order by ID desc LIMIT 20 select * from test_table where id < ‘test_table’ order by ID desc LIMIT 20 select * from test_table where id < ‘test_table’ limit 20

Familiarity and flexibilityexplain

Here is the entire process of mysql executing a query. You can see the red part of the diagram for explain.

Explain shows a lot of fields and content, which is often hard to remember. To use it, check out the following diagram: Explain diagram

For more exciting content, please follow my wechat official accountInternet Technology NestOr add wechat to discuss and communicate: