Author: Wind has no trace – Tang
Scenario
The database is MySQL 5.6. Here is a brief description of the scenario.
The course table:
create table Course(
c_id int PRIMARY KEY,
name varchar(10)
)
100 rows.
The student table:
create table Student(
s_id int PRIMARY KEY,
name varchar(10)
)
70,000 rows.
The student score table, SC:
CREATE table SC(
sc_id int PRIMARY KEY,
s_id int,
c_id int,
score int
)
700,000 rows.
Query goal: find the students who scored 100 on the Chinese exam.
The query:
select s.* from Student s where s.s_id in (select s_id from SC sc where sc.c_id = 0 and sc.score = 100 )
Execution time: 30248.271s
Why is it so slow? First, check the query plan:
EXPLAIN select s.* from Student s where s.s_id in (select s_id from SC sc where sc.c_id = 0 and sc.score = 100 )
The plan shows that no index is used: type is ALL for every table. The first thing that comes to mind is to build an index, and the columns to index are naturally the ones in the WHERE condition.
Start by indexing c_id and score on the SC table:
CREATE index sc_c_id_index on SC(c_id);
CREATE index sc_score_index on SC(score);
Running the query again takes 1.054s.
That is roughly 30,000 times faster. The index cuts the query time dramatically, so building indexes is clearly worthwhile.
It is easy to forget to build indexes: with a small amount of data you do not notice the difference at all. This optimization feels great.
But 1s is still too long. Can it be optimized further? Take a closer look at the execution plan. The statement after MySQL's optimization is:
SELECT
`YSB`.`s`.`s_id` AS `s_id`,
`YSB`.`s`.`name` AS `name`
FROM
`YSB`.`Student` `s`
WHERE
<in_optimizer> (
`YSB`.`s`.`s_id`, <exists> (
SELECT
1
FROM
`YSB`.`SC` `sc`
WHERE
(
(`YSB`.`sc`.`c_id` = 0)
AND (`YSB`.`sc`.`score` = 100)
AND (
<cache> (`YSB`.`s`.`s_id`) = `YSB`.`sc`.`s_id`
)
)
)
)
How can you see the optimized statement? MySQL can print it from the command window.
Note that the plan contains type = ALL.
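For readers on MySQL 5.6, one way to obtain the rewritten statement is sketched below. It assumes the commands in question were EXPLAIN EXTENDED followed by SHOW WARNINGS, which the text itself does not reproduce. That pairing is standard in 5.6; EXPLAIN EXTENDED was deprecated in 5.7 and removed in 8.0, where plain EXPLAIN already records the rewritten query.

```sql
-- Ask the optimizer for the plan and keep the rewritten query as a note
EXPLAIN EXTENDED
SELECT s.*
FROM Student s
WHERE s.s_id IN (
    SELECT s_id FROM SC sc WHERE sc.c_id = 0 AND sc.score = 100
);

-- The Message column of the resulting note holds the optimized statement
SHOW WARNINGS;
```

SHOW WARNINGS has to run in the same session, immediately after the EXPLAIN, because the note is replaced by the next statement.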
I had assumed this SQL would run the subquery first:
select s_id from SC sc where sc.c_id = 0 and sc.score = 100
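Time: 0.001s
This returns a short list of s_id values. The outer query would then run against that list:
select s.* from Student s where s.s_id in (7, 29, 5000)
Time: 0.001s
That would be very fast. But MySQL does not run the inner query first. It rewrites the SQL into an EXISTS clause, and the plan shows a DEPENDENT SUBQUERY: MySQL runs the outer query first and then evaluates the inner query for each outer row.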
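The rewrite can be pictured as the correlated EXISTS query below. This is a hand-written equivalent for illustration, not output copied from MySQL; it mirrors the structure of the optimized statement shown earlier.

```sql
-- Hand-written equivalent of the optimizer's IN-to-EXISTS rewrite.
-- The inner SELECT references s.s_id, so it is correlated with the outer
-- row and has to be re-evaluated as the Student table is scanned.
SELECT s.*
FROM Student s
WHERE EXISTS (
    SELECT 1
    FROM SC sc
    WHERE sc.c_id = 0
      AND sc.score = 100
      AND sc.s_id = s.s_id
);
```

Seen this way, the cost depends on the number of Student rows multiplied by the cost of each probe into SC, which is why the missing index on SC hurt so much.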
What about a join query instead?
SELECT s.* from
Student s
INNER JOIN SC sc
on sc.s_id = s.s_id
where sc.c_id=0 and sc.score=100
To analyze the join query from scratch, temporarily drop the indexes sc_c_id_index and sc_score_index.
Execution time: 0.057s
Efficiency has improved. Look at the execution plan:
The plan involves a join between the two tables, so my guess was that SC needs an index on s_id:
CREATE index sc_s_id_index on SC(s_id);
show index from SC
Run the join query again.
Time: 1.076s. Unexpectedly, it got slower. Why? Look at the execution plan:
The optimized query statement is as follows:
SELECT
`YSB`.`s`.`s_id` AS `s_id`,
`YSB`.`s`.`name` AS `name`
FROM
`YSB`.`Student` `s`
JOIN `YSB`.`SC` `sc`
WHERE
(
(
`YSB`.`sc`.`s_id` = `YSB`.`s`.`s_id`
)
AND (`YSB`.`sc`.`score` = 100)
AND (`YSB`.`sc`.`c_id` = 0)
)
It looks as if the join is performed first and the WHERE filter afterwards.
Back to the previous execution plan:
There, the WHERE filter was applied first and the join second, so the execution plan is not fixed.
Normally the join is performed before the WHERE filter. In our case, though, joining first would send 700,000 rows into the join, so filtering with WHERE first is the sensible approach.
To rule out MySQL's own query optimization, I wrote the optimized SQL by hand:
SELECT
s.*
FROM
(
SELECT
*
FROM
SC sc
WHERE
sc.c_id = 0
AND sc.score = 100
) t
INNER JOIN Student s ON t.s_id = s.s_id
That is, filter the SC table first and then join. Execution time: 0.054s
This is about the same as the time before the s_id index was created.
Filtering SC first and then joining is much more efficient. The remaining problem is that filtering SC still scans the whole table, so the relevant indexes clearly need to be created:
CREATE index sc_c_id_index on SC(c_id);
CREATE index sc_score_index on SC(score);
Then run the query:
SELECT
s.*
FROM
(
SELECT
*
FROM
SC sc
WHERE
sc.c_id = 0
AND sc.score = 100
) t
INNER JOIN Student s ON t.s_id = s.s_id
Execution time: 0.001s. That is a solid result, about 50 times faster.
Execution plan:
SC is filtered through the index first, and then the tables are joined.
Now run the earlier join query again:
SELECT s.* from
Student s
INNER JOIN SC sc
on sc.s_id = s.s_id
where sc.c_id=0 and sc.score=100
Execution time: 0.001s
Execution plan:
Here MySQL optimized the query itself: it applied the WHERE filter first and then the join, and both steps used indexes.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
(I am a gorgeous dividing line)
Recently we re-imported some production data, and testing showed that the SQL optimized a few days earlier had become slow again.
What changed: the SC table grew to 3 million rows, and the scores are more widely spread.
A quick recap of the indexes:
show index from SC
Run the SQL:
SELECT s.* from
Student s
INNER JOIN SC sc
on sc.s_id = s.s_id
where sc.c_id=81 and sc.score=84
Execution time: 0.061s, which is a bit slow.
Execution plan:
The plan uses an index merge intersect: the rows retrieved through the two indexes are intersected. Next, look at the selectivity of the score and c_id columns.
Taken one column at a time, selectivity is poor: c_id = 81 matches 70,001 rows in SC, and score = 84 matches 39,425 rows.
Combined, c_id = 81 and score = 84 match only 897 rows. The two columns together are far more selective, so a composite index should make the query more efficient.
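Counts like these can be gathered directly. The queries below are a sketch of how one might measure them. The ratio query at the end is an added illustration (distinct values divided by total rows, with values closer to 1 meaning more selective) and is not from the original test.

```sql
-- Rows matched by each condition alone, then by both together
SELECT COUNT(*) FROM SC WHERE c_id = 81;
SELECT COUNT(*) FROM SC WHERE score = 84;
SELECT COUNT(*) FROM SC WHERE c_id = 81 AND score = 84;

-- Rough selectivity per column and for the pair
SELECT
    COUNT(DISTINCT c_id) / COUNT(*)        AS c_id_selectivity,
    COUNT(DISTINCT score) / COUNT(*)       AS score_selectivity,
    COUNT(DISTINCT c_id, score) / COUNT(*) AS pair_selectivity
FROM SC;
```

If the pair's ratio is much higher than either single column's, that supports replacing the two single-column indexes with one composite index.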
From another angle, the table holds 3 million rows and will keep growing. Index storage for that is not small, and as the data grows the indexes can no longer all fit in memory and have to be read from disk.
Therefore it makes sense to build a multi-column composite index based on the actual business needs. Let's try it:
alter table SC drop index sc_c_id_index;
alter table SC drop index sc_score_index;
create index sc_c_id_score_index on SC(c_id,score)
Running the query above now takes 0.007s, which is acceptable.
Execution plan:
That concludes the optimization of this statement for now.
Conclusion:
- MySQL nested subqueries really are inefficient.
- They can be rewritten as join queries.
- When joining tables, filter with the WHERE condition first and then join.
- Build appropriate indexes, and build multi-column composite indexes where necessary.
- Learn to analyze SQL execution plans. MySQL rewrites SQL, so analyzing the execution plan is important.
Index optimization
The section above covered how to optimize subqueries and how to build indexes. When several columns needed indexing, each column got its own single-column index.
It later turned out that a composite index is actually more efficient, especially with large data volumes and low selectivity on each single column.
Single-column indexes
The query is:
select * from user_test_copy where sex = 2 and type = 2 and age = 10
Indexes:
CREATE index user_test_index_sex on user_test_copy(sex);
CREATE index user_test_index_type on user_test_copy(type);
CREATE index user_test_index_age on user_test_copy(age);
The sex, type and age columns each have their own index. With 3 million rows, the query takes 0.415s.
Execution plan:
The plan shows type = index_merge.
This is MySQL's optimization for multiple single-column indexes: it intersects the result sets from each index.
Multi-column index
We can create a multi-column index on these three columns. The table was copied (as user_test) to make testing easier.
create index user_test_index_sex_type_age on user_test(sex,type,age);
Query:
select * from user_test where sex = 2 and type = 2 and age = 10
Execution time: 0.032s, more than 10 times faster. The more selective the multi-column index, the bigger the improvement.
Execution plan:
Leftmost prefix
Multi-column indexes also follow the leftmost-prefix rule. The following statements can all use the index:
select * from user_test where sex = 2;
select * from user_test where sex = 2 and type = 2;
select * from user_test where sex = 2 and age = 10;
That is, the first column of the index, sex, must appear in the WHERE condition.
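For contrast, here is a sketch of conditions that skip the leading column, assuming user_test_index_sex_type_age is the only relevant index on user_test. These are illustrative queries, not ones timed in the original test.

```sql
-- sex is missing, so the (sex, type, age) index cannot be used to seek rows
SELECT * FROM user_test WHERE type = 2 AND age = 10;
SELECT * FROM user_test WHERE age = 10;

-- sex is present but type is skipped: only the sex part of the index
-- narrows the search, and the age condition is checked afterwards
SELECT * FROM user_test WHERE sex = 2 AND age = 10;
```

Prefixing each statement with EXPLAIN and comparing the key and key_len columns is a quick way to see how much of the index is actually used.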
Covering index
When every queried column is contained in the index, MySQL can build the result set without fetching other columns from disk and return the index data directly. For example:
select sex,type,age from user_test where sex = 2 and type = 2 and age = 10
Execution time: 0.003s, much faster than fetching all columns.
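One way to confirm that a query is covered, assuming the user_test_index_sex_type_age index exists: in MySQL's EXPLAIN output, the Extra column reports Using index when the rows are served from the index alone.

```sql
-- All selected columns are part of the (sex, type, age) index
EXPLAIN
SELECT sex, type, age
FROM user_test
WHERE sex = 2 AND type = 2 AND age = 10;

-- Selecting every column generally forces a lookup of the full row as well
EXPLAIN
SELECT *
FROM user_test
WHERE sex = 2 AND type = 2 AND age = 10;
```

Comparing the Extra column of the two plans shows whether the narrower select list is what makes the difference.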
Sorting
select * from user_test where sex = 2 and type = 2 ORDER BY user_name
Time: 0.139s
Creating an index on the sort column improves sorting efficiency:
create index user_name_index on user_test(user_name)
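A possible alternative, not part of the original test: because the query filters on sex and type with equality and then sorts by user_name, a composite index that ends with the sort column may let MySQL read the matching rows already in order. The index name below is hypothetical.

```sql
-- Hypothetical index: equality columns first, sort column last
CREATE INDEX user_test_index_sex_type_name ON user_test (sex, type, user_name);

-- If the optimizer picks this index and it satisfies the sort,
-- the Extra column should no longer show "Using filesort"
EXPLAIN
SELECT * FROM user_test
WHERE sex = 2 AND type = 2
ORDER BY user_name;
```

Whether this beats the single-column user_name_index depends on the data, so it is worth comparing both plans and timings before keeping either index.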
Finally, here is a summary of SQL tuning points, to be explored in more depth later:
- Define column types as numeric where possible, and keep them as short as possible: primary keys, foreign keys, type fields and so on.
- Create single-column indexes.
- Create multi-column composite indexes as needed.
- When a single column still leaves a lot of rows after filtering, that is, when the column has low selectivity, its index is inefficient.
- If several columns are indexed together, their combined selectivity is much higher and efficiency improves significantly.
- Create covering indexes based on the business scenario: query only the columns the business needs, and if an index covers them, query efficiency improves greatly.
- Columns used to join tables need indexes, which greatly improves join efficiency.
- Columns in WHERE conditions need indexes.
- Columns used for sorting need indexes.
- Columns used for grouping need indexes.
- Do not apply operations or functions to columns in WHERE conditions, so that the index is not invalidated.
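The last point can be illustrated with a pair of equivalent conditions. This is a sketch that assumes an index on age, such as user_test_index_age from the single-column test.

```sql
-- The column is wrapped in arithmetic, so MySQL cannot seek on the age index
SELECT * FROM user_test_copy WHERE age + 1 = 11;

-- Same rows, but the bare column lets the index be used
SELECT * FROM user_test_copy WHERE age = 10;
```

The general habit is to move any arithmetic or function call to the constant side of the comparison and leave the indexed column bare.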
END
Source: www.cnblogs.com/tangyanbo/p…
The copyright of this article belongs to the author