I’ve been on the top of mountains and I’ve been in valleys, and I’ve learned a lot from both. —– Gem Knight – Tariq
How well do you know mysql?
Mysql is a relational database management system developed by mysql AB in Sweden and supports plug-in storage engines. Mysql is a relational database management system developed by mysql AB in Sweden. But how many people can correctly say what is the flow of an SQL execution in mysql? With this question in mind, enter the subject of this article.
MySQL has become the most popular open source database in the past due to its high performance, low cost and good reliability, so it is widely used in small and medium-sized websites on the Internet. As MySQL matures, it is increasingly being used on more large-scale sites and applications, such as Wikipedia, Google and Facebook. The “M” in LAMP, a popular open-source software portfolio, refers to MySQL.
This is some basic information about mysql, which we all know, I don’t have to introduce too much, let’s directly analyze the mysql execution process.
Mysql execution process
In order to understand the execution flow of mysql, we need to understand a few important components.
There are six components. Let’s take a look
- Connector: Connect to the mysql server for permission verification
- Cache: Saves the results of the last query to improve performance
- Analyzer: lexical and grammatical analysis
- Optimizer: Optimize your query as appropriate
- Actuator: Operates the storage engine and reads and writes data
- Storage engine: Stores data
These are some of the most important components of mysql. Do you know how these components relate to each other or how mysql uses these components to perform a common query? Let me draw a picture to show you.
So the complete flow of mysql to perform a query is as follows: == the client establishes the connection through the connector, and performs permission verification for this operation. After passing the operation, it will first go to the cache, query according to THE SQL as the key, and return the query directly. Otherwise, it will go to the analyzer, after the analyzer analyzes and parses the SQL statement, a syntax that mysql can understand is obtained, and then enter the optimizer. Mysql optimizes your query according to the conditions, and after passing through the executor, this actually starts to go to the storage engine to query the data, and finally returns the data to the client, incidentally writing to the cache (not necessarily) ==.
Let’s take a look at what each component is responsible for.
Before the body begins, let’s create a test table, which the user will show in the demo below
CREATE TABLE `sys_user` (
`id` bigint(20) NOT NULL,
`username` varchar(255) NOT NULL,
`age` smallint(6) NOT NULL,
`sex` tinyint(1) DEFAULT NULL,
`mail` varchar(128) DEFAULT NULL.PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Copy the code
insert into sys_user values (1.'zhangsan'.18.1.'[email protected]');
Copy the code
The connector
As can be seen from the above figure, mysql is basically divided into two components, one is the server layer, the other is the storage engine, why server and storage engine are separated separately? As you can imagine from the diagram above, mysql’s storage engine can change with the scene, it can be MYISAM, InnoDB, or Memory, so it is designed to be plug-in.
Now let’s look at the connector, which is located in the Server layer and is mainly used for connection and permission verification. After verification, mysql returns the permissions of the current logon user for subsequent operations.
How do we connect? If you are using a third-party tool like Navicat, you can simply enter the specified parameters
If the connection is made using a command, the command is as follows:
mysql -Hip address-P port-U a username-P passwordCopy the code
Instead of writing the password directly after -p, you can write it as follows:
mysql -h127. 0. 01. -P3306 -uroot -p
Copy the code
If the local service is connected, the IP address and port number can be omitted
mysql -uroot -p
Copy the code
Press Enter to enter the password:
Enter the correct password to establish a connection
Connection is through the TCP connection, need through the handshake to establish a connection, after finished, the server will start to verify your identity information and by return the current user has the permission information, otherwise, an exception is thrown, the authentication information for you input the user name, password, IP, etc., if the user name or password mistake, The server will throw an error code and disconnect with the following error message:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
Copy the code
The root in ‘root’@’localhost’ indicates your user name, and localhost indicates the IP address of the current client, that is, the client currently shaking hands with the server.
If you create a user that is only locally connected, what happens when an external client forces a connection? Let’s create a testLocal user for local logins only
Try local login
It’s easy to connect locally, but what if we switch to another IP client? Please have a look at
Obviously, the connection fails, the wrong information with the user name or password error, so the students must be able to distinguish when connect the mysql user name password mistake or IP restrictions, many colleagues are on past me this pit, always thought that is his own password there is a problem, changed many times or not, no IP direction lenovo.
If the connection is authenticated using SSL, the x. 509 certificate can be used to authenticate the connection. When will the changed permission take effect?
Time when the permission change takes effect
When mysql starts, the contents of all permission tables are loaded into memory.
When the server detects that the authorization table changes, the following changes will occur to connected clients:
- Table and column permissions take effect the next time the client requests them.
- The database permission changes to the next USEdb_nameThe command takes effect.
- The global permission change and password change take effect the next time the client connects.
If you modify the authorization table with GRANT, REVOKE, or SET PASSWORD, the server will notice and immediately reload the authorization table.
If you modify the authorization table manually (using INSERT, UPDATE, DELETE, etc.), you should execute mysqladmin flush- PRIVILEGES or mysqladmin reload to tell the server to reload the authorization table, otherwise your changes will not take effect. Unless you restart the server.
If you change the authorization table directly but forget to reload it, restart the server and your change will take effect. This may confuse you as to why your changes don’t change at all!
Permission tables are typically found in mysql libraries.
User: records the information about user accounts that are allowed to connect to the server. The permissions are global.
Db: records the operation rights of each account on each database.
Table_priv: records data table-level operation permissions.
Columns_priv: indicates the permission to record data column-level operations.
If there is no further action after the connection is completed, mysql will set the connection state to idle
sqlshow processlist;
Copy the code
- Id: unique id of a thread.
- User: the User who starts this thread.
- Host: records the IP address and port of the client that sends requests.
- Db: in which database the command is currently executed. If NULL is not specified, for example, the SQL: SQLshow processList I just queried did not specify a database, so NULL is displayed.
- Command: Command that the thread is executing at the moment.
- Time: indicates the Time when the thread is in the current state.
- State: indicates the thread status, which corresponds to Command.
- Info: records the SQL statement executed by the thread. The first 100 characters are displayed by default. To view all of them, run show Full processList.
Let’s take a look at the total values of Command
-
Binlog Dump
This is the thread on the replication source that is used to send binary log content to the replica.
-
Change user
The thread is performing a change user operation.
-
Close stmt
The thread is closing the prepared statement.
-
Connect
The copy is connected to its source.
-
Connect Out
The replica is connecting to its source.
-
Create DB
The thread is performing a create database operation.
-
Daemon
This thread is inside the server, not the thread that serves the client connection.
-
Debug
The thread is generating debugging information.
-
Delayed insert
The thread is a deferred insert handler.
-
Drop DB
The thread is performing a database drop operation.
-
Error
-
Execute
The thread is executing the prepared statement.
-
Fetch
The thread is getting results from executing prepared statements.
-
Field List
The thread is retrieving information for table columns.
-
Init DB
The thread is selecting the default database.
-
Kill
This thread is killing another thread.
-
Long Data
The result of executing the prepared statement is that the thread is retrieving long data.
-
Ping
The thread is processing a server ping request.
-
Prepare
The thread is preparing a prepared statement.
-
Processlist
This thread is generating information about the server thread.
-
Query
The thread is executing a statement.
-
Quit
The thread is terminating.
-
Refresh
This thread is either flushing tables, logs, or caches, or resetting state variables or copying server information.
-
Register Slave
The thread is registering the replica server.
-
Reset stmt
The thread is resetting the prepared statement.
-
Set option
The thread is setting or resetting client statement execution options.
-
Shutdown
The thread is shutting down the server.
-
Sleep
The thread is waiting for a client to send it a new statement.
-
Statistics
The thread is generating server status information.
-
Time
Never used.
Back to the main point, the client is set to Sleep if it has not been used for a while, but if it has not been used for a long time, the server automatically disconnects the connection. The default duration is 8 hours, but this can be set with the wait_timeout parameter.
Lost Connection to MySQL Server during Query: Lost Connection to MySQL Server during Query: Lost Connection to MySQL Server during Query
Connection is divided into two types, one is long connection, short connection, a connection is established, the client sends the request, if have been in the same connection, then this is the long connection, if each request a connection that is short connection, we know that the connection will have user information check permissions validation, more troublesome, so it is recommended to use long connection, However, long connections are not perfect, there are definitely disadvantages, too many long connections will lead to mysql memory occupation too much, resulting in memory strain, extreme cases may lead to memory leak (OOM), so how do we solve this problem?
- Periodically clear long connections.
- Reinitialize the connection resource by executing mysql_reset_connection, but mysql version 5.7 or above is required.
The cache
If yes, the result of the query will be stored in the cache in the form of key-value. In this case, the next query will directly hit the cache and return the corresponding data. If the current key (SQL statement) does not exist in the cache, the next stage, the parser, is entered.
Mysql determines whether the cache hit is simple. Mysql stores the cache in a reference table and applies the hash value to the cache. Mysql does not parse the SQL syntax in advance when determining whether the cache is hit. Instead, it directly uses THE SQL statement and the basic information (protocol) of the client to perform the hash algorithm. Note: When you write SQL, you need to write exactly the same SQL as the last one executed. Whitespace, comments, encoding, or any other difference can result in different hash results that will not hit the cache. Therefore, you need to maintain a uniform coding specification when operating.
In addition to this, there are many other situations that can cause the data of the query to be unable to be cached, such as aggregate functions, custom functions, stored, user variables, temporary tables, system tables in the mysql library, and permission tables.
Mysql cache can improve query performance, but it can also cause additional consumption in other aspects, as follows:
- You must check whether the CACHE is hit before querying. For the SQL that is not in the cache, there is a cached query.
- When the first query or data in the table is modified, the current query needs to write the results to the cache, incurs additional system overhead.
- Mysql will clear all data about the current cache during write operations. If the data in the cache is too large or the SQL statements in the cache are too many, the system may consume a lot of data.
Therefore, the benefits of the cache can improve the efficiency of the query, drawbacks may bring additional system consumption to the system, especially in InnoDB transactions, so in the use of time need to be careful, not to query efficiency two blind use of cache, improper use, may be counterproductive.
How does mysql enable caching? Simply set query_cache_type to DEMAND. This will cause mysql to use caching. This is obviously not recommended, so there is another way to specify on DEMAND. SQL_CACHE specifies the cache to be used in the SQL statement where the cache is required.
select SQL_CACHE * from sys_user where id = 1;
Copy the code
Mysql has removed the cache module directly from mysql8.0, meaning that all queries from mysql8.0 are not directed to the cache, but directly to disk. Two: the return of the query cache will verify the permission information directly. If there is no permission, even if the cache is used, it cannot be queried.
Analyzer
Mysql will go to the next step of the process after not hitting the cache, but it does not directly enter the parser, but needs to first query SQL into an execution plan, after this execution plan and storage engine interaction, which includes ** : parser, preprocessor, optimizer, executor **.
After the execution plan is generated, mysql will parse the SQL keyword and generate a corresponding “parse tree”. In this process, the mysql parser will parse and verify the SQL using the syntax rules. The first step is to perform lexical analysis, mysql does not execute the SQL statement you wrote. Instead, you parse your SQL statements into statements that mysql can execute.
Before you generate the parse tree, you also need to verify that your SQL statement is written properly and that it meets the syntax of mysql. If your SQL statement is entered incorrectly, the program will throw an exception to end the query.
select * rom sys_user here id = 1;
Copy the code
We deliberately wrote from as ROM, where ctrip here. If we execute this SQL, what exception will be thrown? Please look at:
Error near the ROM is a grammar mistakes, this shows that when performing syntax analysis, if there is a grammar question, no matter how many problems, the first detected when they returned directly, not then downwards analysis, so don’t think this is the only when we see an error in question, be sure to check carefully, otherwise the production environment, are likely to commit serious.
Mysql > select * from table where name and alias are different; mysql > select * from table where name and alias are different;
select * from sys_user1 here id = 1
> 1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'id = 1' at line 1
>Time:0.007s
Copy the code
SQL > alter table name (id=1); SQL > alter table name (id=1); SQL > alter table name (id=1)
select * from sys_user1 where id = 1
> 1142 - SELECT command denied to user 'testlocal'@'113.xxx.xxx.xxx' for table 'sys_user1'
>Time:0.006s
Copy the code
SQL > verify table name, field name, etc.
The next preprocessing step also validates permissions, which is generally fast unless the permissions configuration is quite complex.
The optimizer
Once the parser is complete and the syntax tree is legal, the optimizer comes in and converts the syntax tree into an execution plan. Mysql official is also very for the sake of our developers, set up this optimizer, in the development process, we want to query a SQL statement, there are many ways to perform, such as whether to go to the index, which index to go, associated query which table as the main table, etc., these are variable, The optimizer’s job is to find what it thinks is the best execution plan based on the SQL statements written by the programmer.
As an example, look at the following SQL:
select id,user_id,username from math m join english e using(user_id)
where m.score = 80 and e.score = 90
Copy the code
I’m not going to create a table here, but I’m going to write an SQL statement that you can understand, and I’m going to explain what the optimizer is. Mysql > select * from student where score = 80 and score = 90 in English;
- Where user_id = 80 in math, user_id = 90 in English, user_id = 80 in Math, user_id = 90 in English
- Select user_id from English where user_id = 90 and then from Math where user_id = 80.
If there are five students with a score of 80 in math and only one student with a score of 90 in English, would the second method be much more efficient?
This is what the mysql optimizer does. It will find the most efficient path to execute according to its unique algorithm. This raises a question: Is the SQL optimized by the optimizer always the best execution plan? The answer is no. What would make the execution plan generated by the optimizer worse? The following seven points are for reference in High Performance mysql.
- Incomplete or inaccurate statistics, such as InnoDB’s MVCC multi-version concurrency control, can result in inaccurate statistics of table rows.
- The cost of the execution plan is not the same as the actual execution cost. In this case, even if the statistical information is accurate, the execution plan given by the optimizer may not be optimal. For example, there may be times when a certain implementation of Jiahua may need to read more pages or data, but its actual cost may be very small. Why? The reason is simple, if the pages are read in order or the pages (data) are already loaded into memory, then the cost of access is much less than the estimated cost of the execution plan, mysql does not know which data is in memory and which data is in disk, so the number of I/OS is unknown.
- Mysql’s optimality may not be what you think it is, you might want as short a execution time as possible, but mysql just chooses the optimal execution plan based on the cost model, and sometimes that’s not the fastest way to execute. So what we see here is that choosing an execution plan based on the cost of the execution plan is not a perfect model.
- Mysql never considers other queries that are executed concurrently, which can affect the speed of the current query.
- Mysql is not optimized entirely based on cost. Sometimes it gives some fixed rules, such as the match() clause for full-text indexes, so full-text indexes are used when full-text indexes exist, even though other indexes and where conditions can be used much faster than full-text indexes. But mysql also chooses to use full-text indexes.
- Mysql does not consider the cost of operations that are not under its control, such as the cost of stored procedures or user-defined functions.
- The optimizer is sometimes unable to estimate all possible execution plans, so it may miss the actual optimal execution plan.
The optimizer of mysql is a very complex component, and the algorithm has many optimization strategies. It will select an execution plan that the optimizer thinks is optimal through its own optimization strategy. Although there are many optimization strategies, they can be roughly divided into two types: static optimization and dynamic optimization.
Static optimization: The analytic tree can be directly analyzed and optimized. The optimizer can convert a WHERE condition to an equivalent form by an algebraic transformation that is independent of the value of the condition, and the static optimization works even if the value of the WHERE condition changes. Think of it as “compile-time optimization.”
Dynamic optimization: It depends on the context of the query. Dynamic optimization can be influenced by many factors, such as the number of rows corresponding to the index, the value in the WHERE condition, and so on, which can cause the optimizer to re-optimize when executing the SQL. Think of it as “runtime optimization.”
So whether the optimizer chooses static optimization or dynamic optimization depends on the SQL statement. Static optimization only needs to be done once, while dynamic optimization needs to be reevaluated each time it is executed.
actuator
Analyzer will we write the SQL statement parsing a syntax tree, the optimizer will syntax tree turned it thinks the most optimal an execution plan, this time really to read data will be needed, the actuator is the executor of the work is in front of the preparation, analyzer, the optimizer to mysql know in what ways, if any, do you want to do, the most simple and effective, The actuators take these prepared solutions and implement them.
Suppose we now need a total number of zhangsan names, write a SQL statement:
select * from sys_user where username = 'zhangsan';
Copy the code
This is a common query the where condition, but the actuator before executing queries need to do one thing, and it is very important thing: authentication, actuators according to the library name, table name, type of operation, and so on to see if the current user has permission to operate, if it is found that the current user does not have the permissions, then directly terminal execution, directly to the end.
After permission verification is passed, the executor will open to enter the storage engine and open the data table for data reading. The executor also operates through the API provided by the storage engine.
Select * from sys_user where username = ‘zhangsan’; select * from sys_user where username = ‘zhangsan’; Until the entire table is read (if the USERNAME field is not indexed), the executor returns the result set to the client.
This is the general execution process of the executor. What is the interface through which the executor calls the storage engine? This is commonly called the “Handler API”. Each table in the query has an example handler, which is created in the optimizer. Not all operations are done by “handler”, such as table locks.
Mysql repeats each operation in the plan until the executor has queried all the data. Therefore, the executor is not as complex as the analyzer or optimizer. Its main function is to execute.
The storage engine
Data in MySQL is stored in files (or memory) using various techniques. Each of these technologies uses different storage mechanisms, indexing techniques, locking levels, and ultimately offers a wide range of different capabilities and capabilities. By choosing different technologies, you can gain additional speed or functionality that can improve the overall functionality of your application.
Commonly used storage engines are as follows:
MyISAM: has high insert and query speed, but does not support transactions
InnoDB: transactionally enabled storage engine, set as default after mysql5.5.
BDB: An alternative to transactional databases, with support for other transaction features such as COMMIT and ROLLBACK
Memory: A memory-based storage engine that stores all data in Memory for efficient query, insertion, and deletion. It is a space-for-time concept, but the restart of the service will cause data loss.
Merge: Merges a portion of MyISAM tables into a whole, suitable for big data storage.
There are many different storage engines, so mysql has set it to plug and plug structure, improve the overall flexibility of mysql.
conclusion
What are the basic components of mysql?
- The connector
- Cache (before 8.0, not including 8.0)
- analyzer
- The optimizer
- actuator
- The storage engine
What is the execution flow of mysql?
Client — — — — > connector — — — — — — — — > > cache analyzer — — — — — — — — > > optimizer actuators — — — — — — — — > > storage engine to the client
This article just introduces mysql a query statement of the entire execution process, looking at the content is very much, in fact, as long as seriously to think, is nothing more than a few components between the call, each component has its own use, connector is responsible for login verification, authorization verification; Cache to improve query efficiency; The parser parses mysql statements into syntax trees, and then performs preprocessing to determine permissions, etc. The optimizer allows mysql to generate an execution plan that it thinks is optimal; The executor takes the execution plan given by the optimizer and calls the “Handler API” of the storage engine to read the data, and finally returns it to the client. The whole article looks like each component is quite complicated, and there is no need to memorize it. Try to understand why it appears and you will understand.
Code word is not easy, but also ask you to look at the master to give a attention, so that I can continue to code word power.