preface
When we use navicat, mysql Workbench and other mysql clients to execute a SQL statement, we can get the corresponding results. Such as:
So what happens to this process?
An SQL execution is an Rpc call
Mysql is a client-side, server-side architecture. Most apps we use are actually made up of two parts, one is a client application and the other is a server application.
Take our commonly used wechat and QQ for example. The client installed in our mobile phone runs the server in the machine room. We usually send messages are actually using client side and server side to deal with. Like when you text your girlfriend:
-
The message is wrapped by the client, adding sender and receiver information, and then transmitted from your wechat client to the wechat server;
-
The wechat server obtains its sender and receiver from the message, and sends the message to your girlfriend’s wechat client according to the recipient information of the message. Your girlfriend’s wechat client will show that you sent him a message.
The process of using mysql is the same as this one. Its server program works directly with the data we store, and then many clients can connect to this server program and send requests to add, delete, change, and review, and then the server responds to these requests to manipulate the data it maintains.
The main process is as follows:
Therefore, an SQL execution is an RPC call. I will also share RPC related things when I have time later to communicate and learn together!
How does the server handle client requests
No matter which client is used or how the server process communicates, the result is that the client process sends a piece of text to the server process (MySQL statement), and the server process processes it and then sends a piece of text to the client process (processing result). The main process is as follows:
As can be seen from the figure, the server program roughly needs to process the query request from the client through three parts, namely connection management, parsing optimization and execution, and storage engine. Connection management, parsing optimization, and execution are often divided into mysql’s Server layer.
Connection management
A client process can connect to a server process using one of several methods: TCP/IP, named pipes or shared memory, or Unix domain sockets.
Pooling techniques are also used for connection management:
Whenever there is a client process to connect to the server, the server process will create a thread to handle with this client interactions, when the client out of disconnected with the server, the server does not immediately dropped the interact with the client thread to destroy, but to cache it up and on the other when a new client to connect again, Assign the cached thread to the new client. This has the effect of creating and destroying threads less frequently, thereby saving overhead.
The query cache
Because the tables are frequently updated and the query cache invalidates frequently, the benefits of the query cache often outweigh the disadvantages. In MySQL 8.0, the entire query cache function was removed directly.
Syntax parsing
If the query cache is not hit, then it is time to enter the formal query phase. Because the request sent by the client program is just a piece of text, so the MySQL server program should first analyze the text to determine whether the syntax of the request is correct, and then extract the table to be queried from the text and various query conditions into some data structures used by the MySQL server.
Query optimization
After parsing, MySQL knows what you need to do. Before execution can begin, it needs to go through query optimization. Optimization refers to deciding which index to use when there are multiple indexes in the table. Or determine the order in which tables are joined when a statement has multiple joins. We can use EXPLAIN statements to see the execution plan for a statement.
Most of the logic of optimization is cost-based optimization
The cost of executing a query in MySQL consists of two aspects:
-
I/O Cost: The time spent loading from disk to memory is called I/O cost.
-
CPU cost: The time spent reading and checking whether records meet the appropriate search criteria, sorting result sets, and so on is called CPU cost.
For the InnoDB storage engine,mysql specifies that the cost of reading a page is 1.0 by default, and the cost of reading and checking if a record meets the search criteria is 0.2 by default.
Taking a single table query as an example, the costing steps are as follows:
-
Find out all possible indexes based on the search criteria
-
Calculate the cost of a full table scan
-
Calculates the cost of executing queries using different indexes
-
Compare the costs of various implementation options and find the one with the lowest cost.
The storage engine
-
Until the server program completes the query optimization, there is no real access to the data table, MySQL server data storage and extraction operations are encapsulated in a module called the storage engine. We know that a table is made up of rows of records, but this is only a logical concept. The storage engine is responsible for how records are represented physically, how data is read from the table, and how data is written to specific physical storage. To achieve different functions, MySQL provides a variety of storage engines. The specific storage structure of tables managed by different storage engines may be different, and the access algorithm adopted may also be different.
-
For ease of management, most people classify connection management, query caching, syntax parsing, query optimization and other functions that do not involve real data storage as functions of MySQL Server.
-
The function of accessing real data is divided into the function of the storage engine.
actuator
An executor is a storage engine that provides a unified calling interface (storage engine API) to the MySQL Server layer. It contains dozens of low-level functions such as “read index first”, “read index next”, “insert record”, and so on.
For example, when executing a query SQL, you should first judge whether you have the permission to execute the query on this table, if not, it will return no permission error;
In our example, where the id field has no index, the execution flow of the actuator looks like this:
-
Call InnoDB engine interface to fetch the first row of the table, check whether the ID value is 10, if not skip, if yes, save the row in the result set;
-
Call the engine interface to fetch “next row” and repeat the same logic until the last row of the table is fetched.
-
The executor returns the recordset of all rows that meet the criteria in the above traversal as a result set to the client.
The same logic is performed for indexed tables. The first call is the “fetch the first row that meets the condition” interface, and then the “next row that meets the condition” interface is iterated, all of which are already defined in the engine.
However, redolog, undolog, and binlog operations are also involved for insert, delete, and modify SQL statements. We’ll talk about that later.
So just call the API (executor) provided by the underlying storage engine according to the generated execution plan and return the data to the client. An SQL statement is executed.