MySQL index invalidation
preface
This note is learned from the advanced MySQL database in Silicon Valley
DQL, DML, DDL statements are not included in this course. If necessary, please refer to the video mysql_Basics
My mysql version is
Server version: 5.5.62-log MySQL Community Server (GPL)
Course introduction
This course is aimed at Java developers, not fully suited to operations and database engineers
1. Introduction of mysql architecture
2. Index optimization analysis
3. Query interception analysis
Mysql Logical Architecture
Mysql is a bit different from other databases in that its architecture can be used and used well in many different scenarios
This is mainly reflected in the architecture of the storage engine
Plug-in storage engine architecture separates query processing from other system tasks and data storage and extraction
In this architecture, you can select an appropriate storage engine based on service requirements
This section describes the architecture level and functions of mysql
1. Connection layer
At the top layer are some client and connection services, including local SOCK communication and most TCP/IP-like communication implemented based on client/server tools
Mainly complete some similar connection processing, authorization and authentication, and related security schemes
On this layer, the concept of thread pools is introduced to provide threads for clients that are securely accessed through authentication.
SSL – based secure links can also be implemented at this layer. The server also validates that it has operational permissions for each client that is securely connected
2. Service layer
The second layer architecture mainly completes the core service functions, such as SQL interface, and completes the cache query, SQL analysis and optimization and the execution of part of the built-in functions. All cross-storage engine functions are also realized in this layer, such as procedures and functions.
In this layer, the server parses the query and creates the corresponding internal parse tree, and performs the corresponding optimization such as determining the order of the query table, whether to utilize the index, and finally generates the corresponding execution operations.
In the case of a SELECT statement, the server also queries the internal cache. If the cache space is large enough, it can greatly improve system performance in an environment where a large number of read operations are handled.
3. Engine layer
Storage engine layer, storage engine is really responsible for the storage and extraction of data in MySQL, server through API communication with storage engine. Different storage engines have different functions, so you can select them according to your actual needs.
The MyISAM and InnoDB engines that are used a lot will be discussed later
4. Storage layer
The data storage layer stores data on file systems running on raw devices and interacts with storage engines.
Introduction to Storage Engines
The storage engine is responsible for storing and extracting data from mysql
We use SQL command to view
mysql> show engines;
mysql> show variables like '%storage_engine%'
Copy the code
The show engines command writes out the current engine and a brief description of it
The show variables like ‘%storage_engine%’ command shows innoDB as the default engine
You can see that the most common **InnoDB engine in our current version is the default **, which is described as
Supports transactions, row-level locking, and foreign keys
Namely: support for transactions, row locks, and foreign keys
Contrast MyISAM with InnoDB
Compare the item | MyISAM | InnoDB |
---|---|---|
The main foreign key | Does not support | support |
The transaction | Does not support | support |
Line table lock | Table locks. Operating on even one record locks the entire table,Not suitable for high concurrency! | Row locks. Lock one row, not the other rows,Suitable for highly concurrent operations |
The cache | Only indexes are cached, not real data | Caching not only indexes but also real data requires a lot of memory, and memory size has a decisive impact on performance |
Table space | small | big |
concerns | Performance, | The transaction, |
The default installation | Y | Y |
Talk about SQL
Execution order
For the SQL we developers write, yes
select distinct
<select_list>
from
<left_table> <join_type>
join <right_table> on <join_condition>
where
<where_condition>
group by
<group_by_list>
having
<having_condition>
order by
<order_by_condition>
limit
<limit_number>
Copy the code
For mysql services, parsing starts from
from
<left_table>
on <join_condition>
<join_type> join <right_table>
where
<where_condition>
group by
<group_by_list>
having
<having_condition>
select
distinct
<select_list>
order by
<order_by_condition>
limit
<limit_number>
Copy the code
7. Join theory
-
== inner join == : inner is used to look up intersection, that is, common data in multiple tables
-
== left join == : Left join overwrites the left table, and can check the left table and intersection, or exclude the data existing in both the left and right tables
-
== right join == : Right join overwrites the right table. You can check the right table and the intersection or exclude the data that exists in both the left and right tables
-
Select * from outer select * from outer select * from outer select * from outer select * from outer select * from outer
-
Note: MySQL does not support FULL OUTER JOIN, but we can union the results of left and right joins
-
The MySQL UNION operator is used to join the results of more than two SELECT statements into a result set
Multiple SELECT statements delete duplicate data (you can use UNION ALL to query duplicate data)
SELECT name FROM name UNION [ALL] SELECT name FROM name ORDER BY name;
-
Talk about the index
What is an index?
The official definition of an index is: an index is a data structure that helps mysql obtain data efficiently
Can be understood as: a kind of ordered efficient lookup data structure
Whether innoDB engine in mysql is B tree or B+ tree is controversial on the Internet. Let’s check the mysql development manual
As you can see, it’s mentioned in the development manual
Everyone has seen a B-tree and knows that the entries in the root page point to the leaf pages. (I indicate those pointers with vertical ‘|’ bars in the drawing.) But sometimes people miss the detail that leaf pages can also point to each other (I indicate those pointers with a horizontal two-way pointer ‘<–>’ in the drawing). This feature allows InnoDB to navigate from leaf to leaf without having to back up to the root level. This is a sophistication which you won’t find in the classic B-tree, which is why InnoDB should perhaps be called a B+-tree instead.
Everyone has seen a B tree and knows that entries in the root page point to leaf pages. Vertical (I was in a diagram “|” said the pointer.) But sometimes people overlook the fact that pages can also point to each other (I’ve represented these Pointers in the graph as horizontal bi-directional Pointers “<–>”). This feature allows InnoDB to navigate from one leaf to another without backing up to the root level. This is a complexity not found in a classic B-tree, which is why InnoDB should be called a B+ tree.
Mysql indexes also use B+ trees
Mysql > create index BTREE (BTREE)
The purpose of an index is to improve the efficiency of queries. It can be likened to a catalogued book or dictionary
If there is no index, a full table scan will be performed, which is very inefficient when the data volume is large!
Advantages and disadvantages of indexes
advantages
- Fast query, improve retrieval efficiency, reduce I/O costs
- Sorting data reduces the cost of sorting data and CPU consumption
disadvantages
- An index is also a file. The table holds primary keys and index fields and points to records in the entity table, so index columns also take up space
- Although indexing greatly improves retrieval efficiency, it doesReduce the efficiency of updating tables
- When updating a table, mysql not only saves the data, but also updates the index file with each update
- Indexes can be a factor in improving efficiency, but if mysql has a large number of tables, you need to spend time researching indexes or optimizing queries that best meet your business needs
Indexes are usually too large to be stored in memory, so they are often stored on disk as index files
Since indexing is about sorting and finding, it affects where lookup and order by sorting in SQL
The index classification
Single value index
- An index contains only one column
- There can be multiple single-valued indexes in a table
This is applicable when a single column query is in high demand
The composite index
A composite index is a multi-column index created by combining multiple columns. It follows the left-most prefix rule when used
Left-most prefix rule: Indexes are used only when the first field of a composite index is used in a query condition
The only index
The value of the indexed column must be unique, allowing null values
Define the index as UNIQUE with the keyword UNIQUE
The primary key index
A primary key index is a special unique index, and a null value ** is not allowed
The basic grammar
create
create [unique] index Indicates the index nameonTable name (column name (length));alter tableThe name of the tableaddIndex [index name] (column name (length))Copy the code
1,PRIMARYKEY (primary KEY index)ALTER TABLE table_name ADD PRIMARY KEY ( column )
2,UNIQUE(Unique index)ALTER TABLE table_name ADD UNIQUE (column)
3INDEX(common INDEX)ALTER TABLE table_name ADD INDEX index_name ( column )
4, FULLTEXT(FULLTEXT index)ALTER TABLE table_name ADD FULLTEXT ( column )
5, multi-column indexALTER TABLE table_name ADD INDEX index_name ( column1, column2, column3 )
Copy the code
delete
drop index [indexName] on tableName;
Copy the code
To view
show index from tableName
Copy the code
Index structure
BTree index
B tree here
B + tree here
The light blue block in the background is what we call a disk block
You can see that each disk block contains several data items (dark blue blocks) and Pointers (yellow blocks),
For example, disk block 1 contains item 17 and 35, and Pointers P1, P2, and P3
B tree features: Each node has data and Pointers
- The node ordering
- A node can hold more than one meta, and the meta is sorted
- Each node stores keys and data
B+ tree features: Data only exists in leaf nodes
-
It has the characteristics of a B tree
-
All Pointers on non-leaf nodes are index Pointers. Leaf nodes store all elements and indexes in a sorted order
-
There are Pointers between the leaf nodes
- Sequential access Pointers were added, with each leaf node adding a pointer to adjacent leaf nodes
-
Only leaf nodes store data, and leaf nodes contain all indexes and data for the tree
In the B + tree
Real data exists in the leaf nodes, namely 5,8,9,10,15,18,20,26,27,28,30,33,35,38,50,56,60,63,65,73,79,80,85,88,90,96,99
Non-leaf nodes do not store real data, but only the data items that guide the search direction. For example, 5, 28 and 65 do not really exist in the data table
A Hash index
Because of the structure of the hash index, it is directly used in each query, unlike b-tree. Therefore, hash indexes are more efficient than B-tree indexes. However, hash also has the following disadvantages:
- Since it stores hash values, only <=> and in operations are supported
- The hash index cannot be sorted by manipulating the index. This is because the hash value calculated during storage is not necessarily the same as the stored one. Therefore, the hash index cannot be sorted
- There is no way to index parts in a composite
- A full table scan cannot be avoided because the memory table supports hash indexes with non-unique values, that is, different index keys that may have the same hash value
- Hash indexes become inefficient when there are many identical hash values
Full – text index
Full text is only supported by myISam in mysql, and only char, vARCHar, and text data types are supported in mysql
Full-text is used to replace like “%***%”
R – Tree indexes
R-tree is rarely used in mysql. It supports only the geometry data type. Only myISAM, BDB, InnoDB, NDB, and Archive support this type of storage engine.
Compared with B-tree, R-tree has the advantage of range search.
Whether an index should be created
Which situations need to be created
- The primary key automatically creates a unique index, the default name is
PRIMARY
- Fields frequently used as query criteria
- A foreign key field associated with another table as a query condition
- Single key index/compound index? Compound indexes are recommended for high concurrency
- A sorted field in a query that is accessed by index to speed sorting
- Select * from group by; select * from group by; select * from group by
Which cases do not need to be created
-
The table has too few entries, which may cause index invalidation
-
If a field is frequently updated, the index is updated at the same time, consuming performance and reducing efficiency
-
Fields not used in the WHERE condition
-
A watch that is often added, deleted, or modified
-
With many duplicate columns of data, indexing is not very effective
-
If A table has 100,000 rows, A field A has only T and F values, and the probability of distribution of each value is about 50%, then the index of such A table A field will not improve the query speed of the database.
Index selectivity refers to the ratio of the number of different values in the index column to the number of records in the table. If there are 2000 records in a table with 1980 different values in the index column, the index selectivity is 1980/2000=0.99
The closer the selectivity of an index is to 1, the more efficient the index will be
-
Performance optimization analysis: THE SQL performance is slow
Server hardware performance bottlenecks: TOP,free,iostat, and vmstat to view the system performance status
- CPU bottleneck:
- CPU saturation occurs when data is loaded into memory or read from disk
- IO bottleneck:
- Disk I/O bottlenecks occur when much more data is loaded than there is memory
Use IP addresses instead of domain names for database paths to avoid DNS resolution problems
Explain the use of
What is explain?
Using the Explain keyword, you can simulate the mysql statement optimizer to execute optimized SQL statements to see how mysql is optimized to analyze performance bottlenecks in SQL statements or table structures
Id, type, key, Rows Extra
Explain how to use?
Explain + SQL statement
explain select * from t_user;
Copy the code
explain select * from t_user where uid=1;
Copy the code
The read order of the table
It is not written in the same order, because mysql does overwrite sort (similar to Java instructions reordering).
Table Specifies the name of the table
Id is the execution level
- Execute from top to bottom with the same ID
- A table with a larger ID is executed earlier
Derived2
indicates that the table is a virtual table derived from the table whose id is 2.
The type of data read operation
Select_type is the query type
Complex queries used to distinguish common queries, federated queries, subqueries, and so on
- simple
- A simple SELECT query, with no subqueries or unions
- primary
- If the query contains any complex subparts, the outermost query is marked as PRIMARY
- subquery
- A subquery followed by select or WHERE
- derived
- Temporary virtual tables DERIVED from subqueries contained in the FROM list will be marked as DERIVED
- union
- If the second select appears after the union, it is marked as union
- If UNION is included in the subquery of the FROM clause, the outer select is marked as DERIVED
- union result
- Select to get the result from the UNION table
Type is the access type
Shows which type is used for the query
From best to worst
System >const>eq_ref>ref>range>index>ALL
In general, make sure the query is at least at the range level and preferably at ref
- system
- The table has only one row (equal to the system table), which is a const column
- const
- **const is used to compare a primary key with a unique cluster index.
- Because only one row of data is matched, MySQL can quickly convert the query to a constant if the primary key is placed in the WHERE list
- eq_ref
- A unique index scan in the case of a multi-table query in which only one record in the table matches each index key
- Common for federated queries with primary key or unique index scans
- ref
- A non-unique index scan that returns all rows matching a single value
- It is also essentially an index access that returns all rows that match a single value
- It may find multiple rows that match the criteria, so it should be a mixture of lookup and scanning
- range
- Only rows of a given range are retrieved, using an index to select rows. The key column shows which index is used
- Where statement between, <, >, in, etc
- This range scan index scan is better than a full index scan because it only needs to start at one point in the index and end at another point without scanning the whole index (if the index is added)
- index
- Full Index Scan: The Index type is different from ALL. Only the Index tree is traversed
- This is usually faster than ALL because index files are usually smaller than data files
- That is, both all and Index are read from the full table, but Index is read from the Index, while all is read from the hard disk
- ALL
- Walk through the table to find matching rows
Which indexes can be used and which indexes are actually used
Possible_keys displays possible references to one or more indexes in the table
If there is an index on the field involved in the query, the index will be listed, but the query may not be actually referenced!
Keys indicates the actual index used. If null, no index is used
If an overridden index is used in the query, it only appears in the key list
If an index contains (or overwrites) the values of all the fields that need to be queried, it is called a ‘overwrite index’. That is, you only need to scan the index without returning to the table
EXPLAIN select * from T_user where uname=’yyf’ and uid=2
Where uid is the default primary key index and uname is the created normal index
Possible_keys is PRIMARY and index_user_uname, indicating two possible indexes
Keys = PRIMARY
Key_len indicates the number of bytes used in the index
This column is used to calculate the length of the index used in the query. With no loss of accuracy, the shorter the length, the better
Key_ len displays the value for the maximum possible length of the index field, not the actual length used
That is, key_ len is computed from the table definition, not retrieved from the table
References between tables
Ref shows which column of the index is used
A const, if possible
Which columns or constants are used to find values on index columns
How many rows per table are queried by the optimizer
Rows roughly estimates the number of rows to read to find the desired record based on table statistics and index selection
In the case of a full table scan, this value is large and inefficient
Extra contains additional information that is important but not suitable for display in other columns
- Using filesort
- Note mysql uses an external index sort for data, rather than reading data in the order of the indexes in the table.
- Filesort uses the corresponding sorting algorithm to sort the obtained data in memory
- A sort operation in MySQL that cannot be done using an index is called “file sort”
- There are actually two algorithms for filesort implementation in MySQL
- Double sort: the sort field and the row pointer information that can directly locate the row data are retrieved according to the corresponding conditions, and then sorted in sort buffer
- Single-way sort: Sort all the fields that meet the criteria in sort buffer at one time.
- Note mysql uses an external index sort for data, rather than reading data in the order of the indexes in the table.
- Using temporary
- Temporary tables are used to hold intermediate results. MySQL uses temporary tables when sorting query results.
- USING index
- The corresponding select operation using a Covering Index, avoiding access to the table row, efficiency is good
- Using WHERE indicates that the index is used to perform a key lookup.
- If using WHERE is not present at the same time, the index is used to read data rather than perform lookup actions.
- Using where
- Indicates where filtering is used
- using join buffer
- Connection caching is used
- impossible where
- The value of the WHERE clause is always false and cannot be used to get any tuples
- select tables optimized away
- Optimize MIN/MAX operations based on indexes in the absence of the GROUPBY clause
- Or for MyISAM storage engine optimization COUNT(*) operations, you don’t have to wait until the execution phase to perform the calculation, the optimization is done during the query execution plan generation phase.
- distinct
- Optimize the distinct operation to stop finding the same value once the first matching tuple is found
Mysql stored procedure create loop
Here we write a mysql loop insert data for testing
Because mysql can’t do a while loop like Java, you need to create and call a stored procedure
A stored procedure is a code snippet that executes when semicolons are encountered during mysql execution. How do you change the semicolons and keep the code executing? Delimiter
Delimiter tells the mysql interpreter that the command is finished and ready to execute.
By default, delimiter is a semicolon; Execute when a semicolon is encountered.
The // symbol tells mysql to wait until it encounters //
delimiter //# define the identifier as a double slashdrop procedure if existstest; Delete the test stored procedure if it existscreate procedureTest () # create a stored procedure with no parameters named testbegin
declare i int default 0; Declare the variable and assign the value while I< 1000Do # ends the loop when I is greater than10Break out of the while loopinsert into test(username,balance) values("while",i); Select * from testset i = i + 1; # loop once, I plus oneendwhile; # End the while loopend
//# end definition statement delimiter; # change back to the original identifiercalltest(); Call the stored procedureselect * from test order by id desc limit 0.1000; Select * from testCopy the code
We create a table and insert 100,000 pieces of data into the table for testing
CREATE TABLE `t_index` (
`id` int unsigned zerofill NOT NULL AUTO_INCREMENT,
`uid` VARCHAR(255) not NULL,
`username` varchar(255) not NULL,
`password` VARCHAR(255) not null,
`balance` int unsigned zerofill not NULL,
`views` int not null.PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
delimiter //
drop procedure if exists test;
create procedure test()
begin
declare i int default 0;
while i < 100000 do
insert into t_index(uid,username,`password`,balance,views)
values (i,"while","while",i,i);
set i = i + 1;
end while;
end
//
delimiter ;
call test();
select * from t_index order by id desc limit 0.1000;
Copy the code
Select query statement poorly written
What are the possible causes of long SQL execution times and long wait times?
So let’s analyze it
Unneeded records were queried
- Select * from table_name; select * from table_name; select * from table_name;
- Select * from table where LIMIT 1 = 1 and LIMIT 1 = 1
Return all columns for multiple table associations
- The association of multiple tables, such as some meaningless columns, can also affect performance.
Always pull out all columns
- Queries such as SELECT *, which fetch all columns, do not necessarily achieve index overwrite optimizations, but trigger various back-table queries, incurring additional IO, memory, and CPU consumption on the server.
Query the same data repeatedly
- For this type of query that returns the same result every time, in fact, it is enough to check once, save the result in Redis such cache, relieve MySQL pressure.
Small tables drive large tables
That is, operate on the small dataset first, and then operate on the large dataset
How to choose in and exists?
Check the internal table in exists
Therefore, in is preferred if a subset of in is small, and exists is used conversely
- The exists:
- Put the data of the master query into the sub-query for conditional verification, and determine whether the data result of the master query can be retained according to the verification result (TRUE or FALSE)
- Since only TRUE/FALSE is returned, either select *, select 1, or anything else in the subquery will work, and fields will be ignored when executed
- In:
- The in query is executed only once. It caches all the queried data and then checks if the queried fields are present in the cache. If so, it adds the queried data to the result set until all the queried results are iterated
Select * from B where B = 1select * from A where id in (select id fromB) select * from Bselect * from A where exists (select 1 from B where B.id = A.id)
Copy the code
Our principle is to operate small first, so when B is smaller than A, we should operate table B first
B = in B = in B = in
When the data volume of A is smaller than that of B, we should operate table A first
If exists exists, select A from the exists table
Order by sort FileSort
Order by sentence, try to use index sort, namely index sort, avoid using filesort sort, namely filesort
Sort on index columns as much as possible (default ASC sort)
Keep in mind that the order by index is strictly in order
There are two conditions for sorting using index:
- The order by statement uses the left-front index principle
- The combination of WHERE clause and order by clause condition columns satisfies the left-front index principle
When will filesort come?
-
Not indexed
-
Do not follow the leftmost prefix rule
-
Inconsistencies follow the default ASC order (that is, one uses ASC and the other uses DESC) and the index is invalid
- Of course, as long as all are ASC or all are DESC, it will not be invalid
Do the reverse of filesort
There are actually two algorithms for filesort implementation in MySQL
- Double sort (before Mysql4.1) :
- Scan the disk twice and finally get the data. It is time-consuming and inefficient
- Take the row pointer to the data and the ORDER BY column, sort them, then scan the sorted list and reread the corresponding data output from the list according to the values in the table
- Read the sort fields from disk, sort them in sort_buffer, and fetch other fields from disk
- Single-way sort (after mysql4.1) :
- Sort_buffer is to take out all the fields that meet the condition at one time, sort them according to the order BY column in sort_buffer, and then scan the sorted fields for output
- It is faster and avoids a second read of the data. And it turns random IO into sequential IO, but it uses more space because it saves each row inIn the memorysorting
- If the problem is not solved once, it is necessary to perform IO operations in batches to retrieve data, which becomes inefficient
- Increase the sort_buffer_size parameter to increase the buffer size
- Increases the max_LENGTH_FOR_sort_data parameter
- If the problem is not solved once, it is necessary to perform IO operations in batches to retrieve data, which becomes inefficient
Increased order by speed
Use indexes, of course, and try not to invalidate them
It is best not to use SELECT *, which results in a query with too much data
- When total field size less than max_length_for_sort_data, Query and sort field is not a TEXT | BLOB, will use the improved algorithm, single channel sorting, or with the old algorithm, multi-channel sorting, So you can increase the max_LENGTH_FOR_sort_data parameter
- Both algorithms may exceed the size of sort_buffer. After that, TMP files will be created for merge sorting, resulting in multiple I/ OS. However, single-path sorting is more risky, so the sort_buffer_SIZE parameter should be increased
Just a quick summary
create index test on test(a,b,c);
Create federated indexes for fields A, B, and C
Follow the leftmost rule: indexes can be used
- order by a
- order by a,b
- order by a,b,c
- order by a DESC , b DESC , c DESC
- where a = const ORDER BY b , c
- where a=const and b=const ORDER BY c
- Where a=const and b>=const ORDER BY b, c
Index of the failure
- ORDER BY a DESC, B ASC, c DESC
- ORDER BY b, c
- Where a=const ORDER BY c
- Where a=const ORDER BY b, d
- where a in (…) ORDER BY b, c
Group by is similar to Order BY
The essence of group by is to sort and then group the index according to the left-most prefix principle
Where takes precedence over HAVING, so don’t write in HAVING if you can write in WHERE
Seven conditions of index failure
Index failure is when an index is created but not used
Take a look at the blog “Index failures and precautions” for a more comprehensive summary
The following are possible index failures
Index invalid formula:The number of models is fastest by air
- Mode: fuzzy query LIKE starts with %
- Type: data type error
- Number: Use internal functions for index fields
- NULL: Index column is NULL
- Operation: The index column performs four operations
- Most: The composite index does not start the search at the left of the index column
- Fast: Full table lookup is expected to be faster than indexing
1.Separate references to non-first position index columns in a composite index
Compound indexes follow the left-most prefix principle
That is, the index is used only when the first field of the composite index is used in the query condition.
Therefore, the order of index columns in a composite index is critical. An index cannot be used unless the search starts in the leftmost column of the index.
If there is INDEX(a,b,c), the INDEX can be used when the condition is A or A,b or a,b,c, but the INDEX will not be used when the condition is B, C.
Therefore, when creating a composite index, we should place the columns most commonly used as constraints at the far left, descending in order
It’s actually going to look it up in the index file. Type is index
2,Perform operations on indexed columns
Operations include (+, -, *, /,! , < > and! =, %, or, in, exist, etc.), causing index invalidity
Or causes index invalidation: use union
Note that OR also invalidates indexes, so either perform the operation in business or use the union in mysql
Union is used to combine the results from multiple SELECT statements into a result set
When using union, mysql deletes duplicate records from the result set
When using union all, mysql returns all records and is more efficient than using union
select * from t_index where uid ='921930193014' or balance = 499010Change # ↑ to ↓, useunion
select * from t_index where uid ='921930193014'
UNION
select * from t_index where balance = 499010
Copy the code
Select * from test where id-1=9;
Select * from test where id=10;
Attention!!!!!
SQL > not in, not exists, <>! = Field indexes that do not run range are invalid
<, >, <=, >=, this is judged according to the actual query data. If the overall scan speed is faster than the index speed, the index is not moved
The range query field of type range and the prime following it are invalid
What are range type query criteria and their following field indexes invalid?
Let’s look at an example
T_index table has 140W data structure as follows
CREATE TABLE `t_index` (
`id` int(10) unsigned zerofill NOT NULL AUTO_INCREMENT,
`uid` varchar(255) NOT NULL,
`username` varchar(255) NOT NULL,
`password` varchar(255) NOT NULL,
`balance` int(10) unsigned zerofill NOT NULL,
`views` int(11) NOT NULL.PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1400001 DEFAULT CHARSET=utf8;
Copy the code
CREATE index index_tindex_username_balance_views on t_index(username,balance,views)
EXPLAIN
select * from t_index where username = 'while' and balance > = 499008 ORDER BY views DESC limit 1
Copy the code
SQL > create index username,balance,views3
As can be seen, although 3 field index is used, the query type is still range level, and extra is using filesort external file sorting, not using views index sorting, of course, the efficiency will be lower
Type becomes range, which is tolerable. Using filesort in Extra is still unacceptable.
But we already have the index, why not use index sort?
This is because the way the BTree index works,
To sort the username
If the same username is encountered, then balance is sorted. If the same balance is encountered, then views is sorted
When the balance field is in the middle of the joint index,
The balance > 499008 condition is a range value.
MySQL cannot use the index to retrieve the following views, that is, the range query field and its prime citation are invalid
So how do we solve this problem? Remove the index of the range condition field, i.e., subtract the balance index column
Add only the indexes of username and views
DROP index index_tindex_username_balance_views on t_index
CREATE index index_tindex_username_views on t_index(username,views)
EXPLAIN
select * from t_index where username = 'while' and balance > = 499008 ORDER BY views DESC limit 1
Copy the code
You can see that type is changed to ref, ref displays const constants, and extra loses the using filesort
The optimizer optimizes based on index order
Of course, the mysql optimizer will be optimized according to the index order
(mysql does not fully obey the conditional order of SQL statements, and executes the optimal order according to the index to achieve the optimal result)
Let’s look at a table that has fields A, B, C, d
And create multiple column indexes for ABCD simultaneously
CREATE TABLE `abcd` (
`id` int(10) unsigned zerofill NOT NULL AUTO_INCREMENT,
`a` varchar(255) NOT NULL,
`b` varchar(255) NOT NULL,
`c` varchar(255) NOT NULL,
`d` varchar(255) NOT NULL.PRIMARY KEY (`id`),
KEY `i` (`a`,`b`,`c`,`d`)
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8; Perform several more inserts to test the datainsert into abcd(a,b,c,d) values('a'.'b'.'c'.'d');
CREATE index i on abcd(a,b,c,d)
Copy the code
Let’s take a look at the execution of the following four SQL statements
1. Normal order A-B-C-D
a='a' and b='b' and c='c' and d='d'
EXPLAIN
select * from abcd where a='a' and b='b' and c='c' and d='d'
Copy the code
If type is ref, the index is normally used, and ref displays 4 const values, meaning that all 4 of them are indexed
D, C, B, A
d='d' and c='c' and b='b' and a='a'
EXPLAIN
select * from abcd where d='d' and c='c' and b='b' and a='a'
Copy the code
It is the same as the normal order because the optimizer adjusts the order to a- B – C – D consistent with the index
A, b, C, d, 3
If we put d in front of C and use the range index, will c’s index be invalidated?
The result: no
EXPLAIN
select * from abcd where a='a' and b='b' and d> ='d' and c='c'
Copy the code
We can see that ref has three more const, because ref shows which column of the index is used, so we can get
Where a=’a’ and b=’b’ and c=’c’ and d>=’d’
A, B, c, d, c, d, c, d
A, b, C, d, 4
If the order is still a-B-C-D, then C uses the range query, and D uses the index?
Mysql > select * from ‘c’; mysql > select ‘d’; mysql > select * from ‘c’
So there should only be 2 const ref’s left, a and b, so let’s see what happens
EXPLAIN
select * from abcd where a='a' and b='b' and c> ='c' and d='d'
Copy the code
C > ref (c, c); c > ref (c, c); d > ref (c, c)
5, a, B, D and c
Go straight to code
EXPLAIN
select * from abcd where a='a' and b='b' and d='d' ORDER BY c
Copy the code
How many indexes are used at this point? Two, three or four?
The answer is three
Strictly speaking, the c index is used to sort rather than find, so there is no count
(If c is not indexed, it should display using filesort, see Example 6)
And this const is the constant of a and b
6. A — B — C — D
EXPLAIN
select * from abcd where a='a' and b='b' ORDER BY d
Copy the code
C is broken in the middle, so of course D doesn’t go to the index, so only A and B go to the index
So using Filesort is displayed in Extra
7, A, b, C, D
Where a=’a’ ORDER BY B,c uses several indexes.
Three. A, B and C are used
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
How many indexes are used in the following SQL?
1, only A is used, c and B are not used, because they do not conform to the order, the order is strictly followed in order by
So Using Filesort
EXPLAIN
select * from abcd where a='a' ORDER BY c,b
Copy the code
3,Apply internal functions to indexes
In this case, a function-based index should be established
select * from template t where ROUND(t.logicdb_id) = 1;
The index should be ROUND(t.logicdb_id).
4,Type error
For example, if the field type is VARCHAR, the WHERE condition is number
Example: The template_id field is of vARCHAR type.
Select * from template t where t.template_id = 1
Select * from template t where t.template_id = ‘1’
It is worth noting that the invalidation condition for this type error does not mean that the index is invalidated by type
For example, the uid of t_user is of type int and the id of student is of type varchar
Examine whether the following two statements use indexes
# failure EXPLAINselect * from student where id = 1;
Copy the code
# primary key index EXPLAINselect * from t_user where uid = '1';
Copy the code
Another word:
If the vARCHAR query condition is a digit, the vARCHAR will become a string and truncate digits for query
Int implicitly converts a number to a string, so the index is used
Varchar in mysql is not used to compare equals, but instead is a string interception
Select * from ‘where id= 1’ where id= 1
It is ok to
Select * from ‘123 ‘where id=’123’;
It’s also possible
Where id = ‘xx123’ where id = ‘xx123
Where uid = ‘1’ where uid = ‘1’ where uid = ‘1’ where uid = ‘1’
5,If MySQL expects a full table scan to be faster than an index, do not use the index
6,Fuzzy query with like starts with %, index invalid
Like “%aaa%” does not use index
Like “aaa%” can use the index
7,Index columns are not restricted to NULL
Columns that contain NULL values will not be included in the index
Any column in a composite index that contains a NULL value is invalid for the composite index
Too many associated queries JOIN: Index from the table
It could be a flaw in the developer’s design when writing SQL
It could also be the result of a forced need in the business
Take left Join left join as an example
In a left join, the left table accesses all the data, so we should index the right table
Select * from a left join b on a.id = b.id; select * from a left join B on A.id = B.id; select * from a left join B on A.id = B.id
Right join The same method is used to index the condition columns of the left table
The same is true for multi-table queries
- Minimize the total number of NestedLoop loops in Join statements: always drive large result sets with small ones
- The inner loop of NestedLoop is optimized first
- Ensure that Join condition fields on the driven table in the Join statement are indexed
- If you cannot ensure that the Join condition fields of the driven table are indexed and the memory resources are sufficient, do not be too generous with the JoinBuffer Settings
So the conclusion is: build indexes from tables
Slow SQL
Here we have a rough guide, if you need to students can go to in-depth understanding
Slow query open and capture
By default, slow query logging is disabled for the MySQL database. You need to set this parameter manually.
Of course, it is generally not recommended to enable this parameter unless it is necessary for tuning, because slow query logging can have a significant or minor performance impact.
Slow query logs can be written to a file
Enable the slow query log and set the threshold. For example, the SQL that exceeds 10 seconds is slow and captured
View slow query logs
You can view whether the log file is enabled and the log file path
show VARIABLES like '%slow_query_log%';
Copy the code
Check the slow SQL threshold time. The default time is 10 seconds
Mysql determines that the threshold is triggered when the time is greater than long_query_time, not greater than or equal to it
show VARIABLES like '%long_query_time%'
Copy the code
Enable slow query configuration
# Enable slow query logsset global show_query_log = 1; # Change the file pathset global slow_query_log_file ="If-slow. log" # change the time thresholdset global long_query_time = 10
Copy the code
After this function is enabled, it takes effect only for the current database and becomes invalid after mysql is restarted
If the threshold is changed, you need to re-establish a query session to query the new threshold
If you want to make it permanent, you can add it to the my.ini configuration file
- Enabling Slow Query
- Specify the log file and its path
- Slow SQL query time threshold
slow-query-log=1
slow_query_log_file="IF-slow.log"
long_query_time=10
Copy the code
Explain + slow SQL analysis
We set the threshold to 3, which is set global long_QUERy_time = 3
Then execute select sleep(4) to make the SQL sleep for 4 seconds and log it as slow SQL
Then open log to view
# Time: 211118 12:07:44
# User@Host: root[root] @ [127. 0. 01.]
# Query_time: 4003336. Lock_time: 0000000. Rows_sent: 1 Rows_examined: 0
use test1;
SET timestamp=1637208464;
select SLEEP(4);
Copy the code
You can see a lot of data
- The time when the record was generated
Time
- SQL execution time
Query_time
- Lock time
Lock_time
- The row record returned
Rows_sent
- Number of rows scanned
Rows_examined
- Table TEST1 is used
use test1;
- SQL statement executed
select SLEEP(4);
Install perl
To execute mysqlDumpslow.pl (which is a Perl program), download the Perl compiler.
Download it at pan.baidu.com/s/1i3GLKAp
Is activePerl_5.16.2.3010812913.msi, step by step after installation, add bin environment variable path
Go to the bin directory of mysql and run the command
C:\Program Files\MySQL\MySQL Server 5.5\bin>perl mysqldumpslow.pl --help
Copy the code
If the parameters can be displayed, the installation succeeds
C:\Program Files\MySQL\MySQL Server 5.5\bin>perl mysqldumpslow.pl --help
Usage: mysqldumpslow [ OPTS... ] [ LOGS... ]
Parse and summarize the MySQL slow query log. Options are
--verbose verbose
--debug debug
--help write this text to standard output
-v verbose
-d debug
-s ORDER what to sort by (al, at, ar, c, l, r, t), 'at' is default
al: average lock time
ar: average rows sent
at: average query time
c: count
l: lock time
r: rows sent
t: query time
-r reverse the sort order (largest last instead of first)
-t NUM just show the top n queries
-a don't abstract all numbers to N and strings to 'S' -n NUM abstract numbers with at least n digits within names -g PATTERN grep: only consider stmts that include this string -h HOSTNAME hostname of db server for *-slow.log filename (can be wildcard), default is '*', i.e. match all -i NAME name of server instance (if using mysql.server startup script) -l don't subtract lock time from total time
Copy the code
-s, sort, c,t, L,r, and AC,at,al,ar are query times respectively. Time, lock time, returns record sort. Plus a is in reverse order.
– t, top n. To keep up with the numbers is to figure out the top number of minus g. And regular expressions.
Run commands to view logs
mysqldumpslow.pl -r -s c -a -t >C:\ProgramData\MySQL\MySQL Server 5.5\data\IF-slow.log
Copy the code
End and spend
Because this article mainly talks about index failure, so the other chapters of this article are briefly skipped, this article is over, thank you for watching!