
This article summarizes common operations and some unreasonable ones encountered in daily work, together with useful information collected while analyzing slow queries.

Index-related

Index cardinality

Cardinality is the number of different values that a data column contains. For example, if a data column contains values 1, 3, 7, 4, 7, 3, its cardinality is 4.

An index works best when its cardinality is high relative to the number of rows in the table (that is, the column contains many different values and few duplicates).

If a column contains many different ages, an index will quickly narrow down the rows. If a column records gender (only "M" and "F" values), an index is of little use: when the two values occur about equally often, a search for either value returns roughly half the rows.

In such cases it is better not to use an index at all, because when the query optimizer finds that a value appears in a high percentage of the table's rows, it generally ignores the index and performs a full table scan. The usual threshold is around 30%.
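
A quick way to gauge this before adding an index is to compare distinct-value counts with the row count, or to look at the cardinality MySQL has already estimated. A minimal sketch, assuming a hypothetical user_info table with age and gender columns:

-- Selectivity of candidate columns: the closer to 1, the more useful an index will be
SELECT COUNT(DISTINCT age)    / COUNT(*) AS age_selectivity,
       COUNT(DISTINCT gender) / COUNT(*) AS gender_selectivity
FROM user_info;

-- Cardinality estimated by MySQL for existing indexes
SHOW INDEX FROM user_info;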

Causes of index failure

There are several reasons why an index may fail to be used:

  • Performing operations on the index column, including +, -, *, /, !, <>, %, or LIKE with a leading wildcard ('%...').

  • Type mismatch: for example, the column is VARCHAR but the WHERE condition uses a number, forcing an implicit conversion.

  • Applying an internal function to an indexed column; in that case you should build a function-based index. For example, in select * from template t where ROUND(t.logicdb_id) = 1, the index should be created on ROUND(t.logicdb_id).

    MySQL 8.0 supports functional indexes; 5.7 supports generated (virtual) columns; before that, the only option was to create an extra ROUND(t.logicdb_id) column and maintain it yourself (see the sketch after this list).

  • If the condition contains an OR, the index is not used even if some of the columns are indexed (which is why OR is discouraged). If you must use OR and want indexes to apply, every column in the OR condition needs an index.

  • If the column type is a string, make sure the value in the condition is quoted, otherwise the index is not used.

  • A B-tree index is not used for IS NULL but is used for IS NOT NULL, while a bitmap index is used for both IS NULL and IS NOT NULL.

  • Composite indexes follow the leftmost-prefix principle.
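
For the function case above, here is a minimal sketch, assuming a hypothetical template table with a logicdb_id column; the functional-index syntax requires MySQL 8.0.13 or later, and the generated-column workaround applies to 5.7:

-- MySQL 8.0: a functional index lets WHERE ROUND(logicdb_id) = 1 use the index
ALTER TABLE template ADD INDEX idx_round_logicdb_id ((ROUND(logicdb_id)));

-- MySQL 5.7: a virtual generated column plus an ordinary index on it
ALTER TABLE template
  ADD COLUMN logicdb_id_round INT AS (ROUND(logicdb_id)) VIRTUAL,
  ADD INDEX idx_logicdb_id_round (logicdb_id_round);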

Index creation

Note the following points when creating an index:

  • Most importantly, index the columns used by the queries the business runs most frequently.

  • Try to choose highly selective columns as indexes. Selectivity is COUNT(DISTINCT col)/COUNT(*), i.e. the proportion of distinct values; the closer it is to 1, the fewer rows need to be scanned.

  • For values that are unique in the business, it is best to create a unique key: it guarantees data correctness and also greatly improves lookup efficiency (see the sketch after this list).
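
A minimal sketch of these points, assuming a hypothetical orders table where listings are usually filtered by user and creation time and order_no is unique at the business level:

-- Composite index for the most frequent query pattern (leftmost-prefix rule applies)
ALTER TABLE orders ADD INDEX idx_user_created (user_id, create_time);

-- Unique key for a business-unique value: guarantees correctness and speeds up lookups
ALTER TABLE orders ADD UNIQUE KEY uk_order_no (order_no);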

Useful information in EXPLAIN

Basic usage

The basic usage of EXPLAIN is as follows:

  • Prefix your SQL with desc or explain.

  • It is also useful to run explain extended on your SQL and then show warnings to see the statement that is actually executed; quite often, differently written statements turn out to be the same after the optimizer rewrites them (see the example below).
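
A minimal example, assuming a hypothetical user_info table; the EXTENDED keyword is needed on MySQL 5.6, while 5.7 and later show the extended information with plain EXPLAIN:

EXPLAIN EXTENDED SELECT * FROM user_info WHERE user_id = 1;
-- The optimizer's rewritten statement appears in the warning message
SHOW WARNINGS;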

Features that improve performance

EXPLAIN surfaces the following features that improve performance:

  • Covering index (index covering): all the data the query needs can be read from the index itself, so there is no need to go back to the table; the EXTRA column shows Using index.

  • Index Condition Pushdown (ICP): originally, an index was only an access path; the rows the storage engine fetched through the index (after going back to the table) were passed up to the MySQL Server layer, where the WHERE condition was applied.

    Since version 5.6, with ICP enabled, MySQL Server pushes down the parts of the WHERE condition that can be evaluated with the index columns to the storage engine layer.

    EXTRA shows Using index condition. It helps to keep in mind that the MySQL architecture is split into a Server layer and a storage engine layer.

  • Index Merge: scan several indexes separately with their own conditions and then merge the results (intersection or union).

    If the conditions are combined with AND rather than OR, consider creating a composite index instead. In EXPLAIN, type shows index_merge and EXTRA shows the specific merge algorithm and the indexes used (see the sketches below).
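
Hedged sketches of the three cases, assuming a hypothetical user_info table with a composite index idx_name_age (name, age) and separate single-column indexes on user_id and phone; whether the optimizer actually picks these plans depends on the data:

-- Covering index: both selected columns live in idx_name_age, Extra shows "Using index"
EXPLAIN SELECT name, age FROM user_info WHERE name = 'Tom';

-- ICP: the age filter is pushed into the storage engine, Extra shows "Using index condition"
EXPLAIN SELECT * FROM user_info WHERE name LIKE 'To%' AND age = 20;

-- Index merge: OR over two separately indexed columns, type shows index_merge
EXPLAIN SELECT * FROM user_info WHERE user_id = 1 OR phone = '13800000000';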

The Extra field

Common values of the Extra field:

  • Using filesort: MySQL sorts the data with an external sort instead of reading rows in the order of an index in the table.

    A sort that cannot be satisfied by an index is called a "filesort" in MySQL.

  • Using temporary: a temporary table is used to hold intermediate results; MySQL uses a temporary table when sorting query results. Common in ORDER BY and GROUP BY queries.

  • Using index: the SELECT uses a covering index, avoiding access to the table's data rows; efficiency is good.

  • Impossible WHERE: The value of the where clause is always false and cannot be used to retrieve any tuples.

  • Select tables optimized away: MIN/MAX optimization based on an index without a GROUP BY clause, or COUNT(*) for the MyISAM storage engine, does not need to wait for the execution stage; the calculation is completed while the query execution plan is generated.

  • Distinct: Optimizes the DISTINCT operation to stop searching for the same value once the first matching tuple is found.

Using filesort and Using temporary are very performance-consuming.

Either of them can appear when the ORDER BY or GROUP BY columns have no index.

If the result of a GROUP BY does not need to be sorted, append ORDER BY NULL to avoid the filesort, for example:
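
A minimal illustration, assuming a hypothetical orders table with no index on user_id; note that MySQL 8.0 no longer sorts GROUP BY results implicitly, so this mainly matters on 5.x:

-- May show "Using temporary; Using filesort" in Extra
EXPLAIN SELECT user_id, COUNT(*) FROM orders GROUP BY user_id;

-- Same grouping without the implicit sort: the filesort disappears
EXPLAIN SELECT user_id, COUNT(*) FROM orders GROUP BY user_id ORDER BY NULL;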

The type field

Common values of the type field, roughly from best to worst (examples follow the list):

  • System: The table has only one row of records (equal to the system table). This is a special case of const type.

  • Const: used when a primary key or unique index is compared with a constant. Because at most one row matches, it is very fast. If you place the primary key in the WHERE clause, MySQL can convert the query into a constant.

  • Eq_ref: Unique index scan, for each index key, only one record in the table matches it. Common for primary key or unique index scans.

  • Ref: a non-unique index scan that returns all rows matching a single value. It is also essentially an index access that returns all rows that match a single value, however it may find multiple rows that match the criteria, so it should be a mixture of lookup and scan.

  • Range: retrieves only rows in a given range, using an index to select them. The key column shows which index is used; this typically appears for queries that use BETWEEN, <, >, IN, etc. in the WHERE clause.

    A range scan is better than a full table scan because it only needs to start at one point of the index and stop at another, without scanning the whole index.

  • Index: Full index Scan, the difference between index and ALL is that index only traverses the index tree, which is usually faster than ALL because index files are usually smaller than data files.

    That is, both ALL and index are read from the full table, but index is read from the index, while ALL is read from the hard disk.

  • All: Full Table Scan: traverses all tables to obtain matching rows.
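
Hedged examples of queries that typically produce each access type, assuming a hypothetical user_info table (primary key id, ordinary index on name, no index on age) and an orders table whose user_id references it; the optimizer may still choose differently depending on statistics:

-- const: primary key (or unique index) compared with a constant
EXPLAIN SELECT * FROM user_info WHERE id = 1;

-- eq_ref: at most one row of user_info matches each row of orders, via its primary key
EXPLAIN SELECT * FROM orders o JOIN user_info u ON u.id = o.user_id;

-- ref: equality lookup on a non-unique index
EXPLAIN SELECT * FROM user_info WHERE name = 'Tom';

-- range: bounded scan on an index
EXPLAIN SELECT * FROM user_info WHERE id BETWEEN 10 AND 20;

-- index: full scan of an index tree only (here the index on name)
EXPLAIN SELECT name FROM user_info;

-- ALL: full table scan (age is assumed not to be indexed)
EXPLAIN SELECT * FROM user_info WHERE age = 20;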

Field type and encoding

MySQL returns the length of a string

CHARACTER_LENGTH (the same as CHAR_LENGTH) returns the number of characters, while LENGTH returns the number of bytes; under utf8, a Chinese character takes three bytes.
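
A quick check (the byte counts assume a utf8 or utf8mb4 connection character set):

SELECT CHAR_LENGTH('MySQL'), LENGTH('MySQL');     -- 5, 5: ASCII characters are one byte each
SELECT CHAR_LENGTH('数据库'), LENGTH('数据库');   -- 3, 9: each CJK character is three bytes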

Calculating the prefix length for an index on VARCHAR and similar columns

SELECT COUNT(DISTINCT LEFT(test, 5)) / COUNT(*) FROM table; the closer the result is to 1, the better the prefix length.
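
For instance, to pick a prefix length for a hypothetical VARCHAR column test on my_table, compare the selectivity of a few lengths and then create the prefix index:

SELECT COUNT(DISTINCT LEFT(test, 5))  / COUNT(*) AS sel_5,
       COUNT(DISTINCT LEFT(test, 10)) / COUNT(*) AS sel_10,
       COUNT(DISTINCT test)           / COUNT(*) AS sel_full
FROM my_table;

-- Create the prefix index once a length gets close enough to the full column's selectivity
ALTER TABLE my_table ADD INDEX idx_test (test(10));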

MySQL utf8

MySQL's utf8 charset uses at most 3 bytes per character and cannot store emoji; utf8mb4 must be used instead. The client character set also needs to be set to utf8mb4 in the MySQL configuration file.

JDBC connection strings do not support characterEncoding=utf8mb4; the best approach is to specify an initialization SQL statement in the connection pool.

For example, with the HikariCP connection pool (other pools are similar): spring.datasource.hikari.connection-init-sql=SET NAMES utf8mb4. Otherwise SET NAMES utf8mb4 would have to be executed before every SQL statement.
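
A sketch of the server-side part, assuming a hypothetical database app_db and table user_info; the connection pool's init statement is the same SET NAMES:

-- Run by the pool on every new connection (connection-init-sql)
SET NAMES utf8mb4;

-- Convert existing objects so that emoji can actually be stored
ALTER DATABASE app_db CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE user_info CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;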

MySQL collations

The _bin and _general_ci collations are the ones generally used:

  • utf8_general_ci is case-insensitive; ci is short for case insensitive.

  • utf8_general_cs would be case-sensitive, but MySQL does not support the *_general_cs collations; utf8_bin is used instead.

  • utf8_bin stores each character of the string as binary data and is case-sensitive.

So what is the difference between utf8_general_cs and utf8_bin, given that both are case-sensitive?

  • cs stands for case sensitive; bin means binary, i.e. comparison by binary encoding.

  • Under utf8_general_cs, some Western European and Latin characters are not distinguished; an accented letter can compare equal to its unaccented form, which is sometimes not what you want, hence utf8_bin.

  • utf8_bin operates on the binary encoding of the characters; any difference in encoding makes values unequal, so under utf8_bin, A <> a (see the comparison below).
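
The difference can be checked directly by forcing a collation in a comparison; this assumes the connection character set is utf8 (use the utf8mb4_* collation names under utf8mb4):

SELECT 'a' = 'A' COLLATE utf8_general_ci;  -- 1: case-insensitive
SELECT 'a' = 'A' COLLATE utf8_bin;         -- 0: binary comparison, case-sensitive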

Initialization command

In SQLyog, the encoding for the initial connection can be specified with the initialization command in the connection settings (for example, SET NAMES utf8mb4).

SQL Statement Summary

Often used but easy to forget

Common SQL statements that are easy to forget:

  • Do not insert if there is a primary key or unique key conflict: insert ignore into.

  • Update if there is a primary key or unique key conflict (note that this still bumps the auto-increment counter): INSERT INTO room_remarks(room_id, room_remarks) VALUES(1, 'SDF') ON DUPLICATE KEY UPDATE room_remarks = '234'.

  • Delete and re-insert if there is a primary key or unique key conflict: REPLACE INTO room_remarks(room_id, room_remarks) VALUES(1, 'SDF').

  • Back up a table: CREATE TABLE user_info_bak SELECT * FROM user_info.

  • Replicate the TABLE structure: CREATE TABLE user_v2 LIKE user.

  • Import FROM the query statement: INSERT INTO user_v2 SELECT * FROM user or INSERT INTO user_v2(ID,num) SELECT ID,num FROM user.

  • Multi-table update: UPDATE user a, room b SET a.num = a.num + 1 WHERE a.room_id = b.id.

  • Multi-table delete: DELETE user FROM user, black WHERE user.id = black.id.

Lock-related

Lock-related statements (rarely used; listed here for understanding):

  • Share lock: select ID from tb_test where ID = 1 lock in share mode.

  • Exclusive lock: select id from tb_test where id = 1 for update.

When optimizing, use

When optimizing, use:

  • Run the following command to force an index: select * from table force index(idx_user) limit 2.

  • Disable an index: select * from table ignore index(idx_user) limit 2.

  • Disable the query cache (to remove the cache's effect during testing): select SQL_NO_CACHE * from table limit 2.

Check the status

View status:

  • Check character set: SHOW VARIABLES LIKE 'character_set%'

  • To check collation rules: SHOW VARIABLES LIKE 'collation%'

SQL programming note

Note:

  • WHERE clauses are parsed from right to left; try to filter with WHERE rather than HAVING.

  • Use the deferred-join technique to optimize deep pagination such as LIMIT 1000010, 10 and avoid unnecessary back-to-table lookups (see the sketch after this list).

  • Distinct statements are very performance-wasting and can be optimized with group by.

  • Try not to join more than three tables.
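
A minimal sketch of the deferred join, assuming a hypothetical user_info table with primary key id:

-- Naive deep page: reads and discards 1,000,010 full rows
SELECT * FROM user_info ORDER BY id LIMIT 1000010, 10;

-- Deferred join: page over the primary key inside the index first, then fetch only 10 full rows
SELECT u.*
FROM user_info u
JOIN (SELECT id FROM user_info ORDER BY id LIMIT 1000010, 10) t ON u.id = t.id;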

Pitfalls

The pitfalls are summarized as follows:

  • If the table has an auto-increment column, TRUNCATE resets the auto-increment counter, so the next insert starts from 1 again; be careful in scenarios where the auto-increment column is used as a business ID.

  • Aggregate functions automatically skip NULL values: for example, if column a is of type INT and all of its values are NULL, SUM(a) returns NULL rather than 0.

  • You cannot test for NULL with = or <>: such a comparison evaluates to UNKNOWN, which always behaves as false. Use a IS NULL (or a IS NOT NULL) instead (see the examples after this list).
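
Hedged examples of the last two pitfalls, assuming a hypothetical table t with a nullable INT column a:

-- If every a is NULL, SUM(a) is NULL; wrap it if the business expects 0
SELECT IFNULL(SUM(a), 0) FROM t;

-- Comparing with NULL via = yields UNKNOWN, so this matches no rows
SELECT * FROM t WHERE a = NULL;

-- The correct way to test for NULL
SELECT * FROM t WHERE a IS NULL;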

Online modification of a table with tens of millions of rows

If the MySQL table structure is modified, the table will be locked and service requests will be blocked.

MySQL introduced online DDL in 5.6, but it still locks the table in some cases, so the pt tools (Percona Toolkit) are generally used.

For example, to add an index to a table:

pt-online-schema-change --user='root' --host='localhost' --ask-pass --alter "add index idx_user_id(room_id,create_time)" D=fission_show_room_v2,t=room_favorite_info --execute

Slow Query logs

Sometimes, when an online request times out, you should check the slow query log. Slow query analysis is easy: find the location of the slow query log file and analyze it with mysqldumpslow.

To check the slow query log configuration, you can run SQL commands to view the related variables, for example:
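
The following show whether the slow query log is enabled, where the file is, and the time threshold (these are standard MySQL variable names):

SHOW VARIABLES LIKE 'slow_query_log';       -- ON / OFF
SHOW VARIABLES LIKE 'slow_query_log_file';  -- location of the log file
SHOW VARIABLES LIKE 'long_query_time';      -- threshold in seconds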

The mysqldumpslow tool is quite simple; the main arguments I use are as follows:

  • -t: limits the number of output lines. I usually take the first ten lines.

  • -s: what to sort by; the default is at (average query time). I also often use c (query count), because a statement that is executed very frequently is worth optimizing even if each run is not slow, and t (query time) to find which statements are especially slow.

  • -v: displays detailed information.

Example: mysqldumpslow -v -s t -t 10 mysql_slow.log.2018-11-20-0500.

View SQL processes and kill processes

If you execute an SQL operation and it does not return, you can query the process list to see how it is actually performing.

If the SQL is time-consuming, you can kill the process with the KILL command. Viewing the process list also shows the current state of SQL execution; if the database load is high, the list may contain a large number of processes stuck and running for a long time.

The command is as follows:

SHOW PROCESSLIST;

-- Kill a process
KILL 183665;

If you are using SQLyog, there is also a graphical page in the menu bar → Tools → Display → Process list.

On the process list page, you can right-click a process to kill it.

Some database performance considerations

When optimizing the company's slow query log, the most common problem is simply a forgotten index; problems like this are easily solved by adding one. But there are a few cases that cannot be solved simply by adding an index:

The business code queries the database in a loop

Consider a scenario: fetching a user's fan list, ten entries per page. The SQL for this is very simple, and the paged query itself performs well.

However, in some cases developers take a list of IDs and then query the information for each ID in a loop; a large number of IDs puts a lot of pressure on the database and performs poorly. A better pattern is sketched below.
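
A hedged sketch of the difference, assuming a hypothetical user_info table with primary key id:

-- Anti-pattern: the application loops and issues one query per id
SELECT * FROM user_info WHERE id = 1001;
SELECT * FROM user_info WHERE id = 1002;
-- ... repeated once per fan

-- Better: fetch the whole page of ids in a single round trip
SELECT * FROM user_info WHERE id IN (1001, 1002, 1003 /* ... */);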

Statistical SQL

In many cases the business has leaderboards, and the company was found to compute them directly in the database in many places. Aggregate calculations on large tables often take more than five seconds, and these SQL statements are generally very long and difficult to optimize.

For scenarios like this, if the business allows it (for example, the consistency requirement is not strict, or the statistics only need to be produced periodically), the statistics can be computed separately, for example on a replica. I also recommend Redis caching for this type of workload.

Deep pagination

Deep-paging statements such as LIMIT 40000,1000 were found in the slow query log.

But looking at the related business code, normal business logic does not issue such requests, so it was most likely a malicious user hammering the interface. It is best to add validation to the interface during development to intercept these malicious requests.

Author: Chen Fangzhi

Reference: https://www.cnblogs.com/chenfangzhi
