Visited me
For example, it has been nearly two months since the last update. As promised, why hasn’t the technical article been updated?
Recently, I was too busy, so I was suddenly asked to do a refactoring project. It is easy to say, translating code, translating PHP logic from other teams into Java code.
However, due to time constraints and organizational structure adjustment, few students are familiar with the original business, so they can only be familiar with the business and generally translate codes.
The PHP code of the original project (no language discussion here, just code style issues) had no design to speak of and the same logic was repeated all over the place. It was a load of garbage.
I worked a lot of overtime and stayed up late, and there was no time to export.
Let’s talk about a simple SELECT statement.
The select tour
Set up a simple user table, table structure and test data as follows
create table user (
id int primary key,
name varchar(64)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
insert into user values (1, "zhangsan");
insert into user values (2, "lisi");
insert into user values (3, "wangwu");
Copy the code
For example, for a simple user table, if you want to query information about a user whose name = Zhangsan, what is the process?
select * from user where name = 'zhangsan';
Copy the code
The basic architecture of MSYQL
Speaking of the query process, we also need to introduce the basic architecture of mysql, the overall architecture is as follows:
It is mainly divided into Server layer and storage engine layer.
The Server layer includes connectors, query caches, profilers, optimizers, actuators, and so on.
Step 1: Connect
If you want to query a statement, you first need to connect to the Mysql server. This is done by the connector. The main tasks include establishing connections, obtaining permissions, and maintaining and managing connections.
mysql -hip -pport -uusername -p
Copy the code
Here the connection is a connection between the client and the server, is a way of TCP connections, also go through the process of the three-way handshake, so to establish a connection is not a simple thing, usually set a reasonable connection time, within the scope of the time, even without any query action or to keep the client connected to the database server.
This time is controlled by the wait_timeout parameter, and the default is 8 hours. It’s in s. You can run the following command to query the parameters
show variables like 'wait_timeout';
Copy the code
Step 2: Cache
An optional step.
When the query request arrives, the query cache is checked to see if the statement has been executed before. Previously executed statements and their results may be cached directly in memory as key-value pairs.
Key is the query statement and value is the query result. If your query can find the key directly in the cache, the value will be returned directly to the client.
However, as soon as any update operation is done to the table, the cache is invalidated. What are the general criteria for evaluating cache quality? Cache hit ratio, such as Redis, is especially bad if the cache hit ratio is set to be low.
Therefore, in general, it is not recommended to use caches, which are laborious to cache data, rarely used, and have no effect on performance.
How do YOU determine if the database has caching enabled?
show variables like 'query_cache_type';
Copy the code
My current mysql version is 5.7.22, and by default, mysql has query caching turned off.
Step 3: Profiler
So, ON the right side of the parser, I wrote parse + lexical.
What is lexical analysis?
Basically, translate the statement into a language that the machine can read, and figure out what the strings are and what they stand for.
For example, select indicates query, update indicates update, identify string “user” as “table name user”, and identify string “ID” as “column ID”.
So what is parsing?
Check whether the SQL statement you entered meets the MySQL syntax based on the syntax rules, just like the Syntax in English.
Still remember grade one English, when for a simple statement, such as
I am a boy.
Copy the code
If you write
I is a boy.
Copy the code
The teacher will tell you that your grammar is wrong.
Mysql will also alert you if you enter a form instead of a ‘from’.
select * form user;
Copy the code
The next time you see a similar prompt, don’t look. It’s probably because you didn’t write your SQL correctly.
Step 4: Optimizer
Judging by the name, this step is Mysql telling the server how to execute statements more quickly. Determine which index to use if there are multiple indexes, or determine the order in which tables are joined if a statement has multiple table joins.
After the optimizer phase is complete, the execution plan of the statement is determined, and then the executor phase is entered.
Step 5: Actuator
Judging by the name, this step is a soulless tool man role, which is pure execution.
Let’s go back to the original statement
select * from user where name = 'zhangsan';
Copy the code
-
Call InnoDB engine interface to get the first row of this table, determine whether name is zhangsan, if not, skip, if yes, this row will be stored in the result set;
-
Call the engine interface to fetch “next row” and repeat the same logic until the last row of the table is fetched.
-
The executor returns the recordset of all rows that meet the criteria in the above traversal as a result set to the client.
Since there is no index added to name, all rows are scanned across the table, line by line, which we can see through the slow query log
So you can see the whole process, three rows are scanned, one of which satisfies the requirement.
What if name has an index?
Let’s add a new index here
ALTER TABLE `user` ADD INDEX name_index ( `name` )
Copy the code
-
InnoDB engine to name index tree query name = zhangsan, find the corresponding primary key value, get id=1, take the primary key ID to the primary key index query id=1 row.
-
Repeat the same logic until you reach the last row of the table.
-
The executor returns the recordset of all rows that meet the criteria in the above traversal as a result set to the client.
You can see that when you add the index, you’re actually just scanning one row.
The interesting thing I found in my tests is that the query time with the index is actually longer than the query time without the index. Doesn’t indexing speed up queries?
In fact, this is mainly because of the small amount of data, and after adding the name field index, you need to look up a row in the primary key index tree (this process is called back table), which is definitely slower than selecting a row from three rows.
I enable the slow query log and set the threshold of the slow query log to 0s for easy observation and recording.
conclusion
Here is the main introduction to the daily development of the most used select query process, in fact, mainly to understand the entire mysql architecture.
In fact, this concept is not unfamiliar to most people, but may be for index, primary key index, ordinary index, back table and other concepts are not very familiar with, here is just a brief introduction, and then with you to discuss in depth.