background

We all use Mysql as a database to store and query frequently used data. When you enter a line like SELECT * FROM table_name WHERE id=26, Mysql will output the information you want if it executes correctly.

So, what happens to Mysql between the time you type this line and the time it displays the information you want? This article will talk about it briefly.

Mysql basic architecture diagram

Let’s take a look at the overall architecture of Mysql.

I’m going to take an example of a specific SQL statement and describe in detail what happens from the time you type the statement on the client to the time it returns the information you want.

The client

The so-called client is the terminal we use to log in and operate Mysql. We all operate on the client side of Mysql, whether it is to enter information to connect to the database, or to enter SQL to query a table, or to receive the query information returned to us by Mysql, these are all done on the client side.

The connector

User Information Verification

If you want to use Mysql database on a client, the first step is to connect to the database you want to use.

Mysql -h$IP -p $port -u$username -p

The client will then ask us for a password. After that, if the information we entered is all right, we will enter the Mysql operation interface.

If we enter something wrong, we will receive an error message from the client. For example, when we enter the password incorrectly, we will receive an error message like “Access denied for user”.

So, what exactly does this intermediate connector do?

First, the connector will take the IP and port we entered and do the classic TCP handshake. If the handshake fails, there will be no follow-up and the corresponding error message will be returned.

If the handshake is successful, the user name and password will be verified. If the authentication fails, the corresponding error message will be returned.

Obtaining User Rights

If the user name and password are all right, the connector will then fetch the permission table and read the corresponding permission data for the user. All subsequent operations performed by the user are based on the user permissions read at this time.

There are four permission tables: user, DB, tables_priv, and columns_priv.

When user rights are authenticated and assigned, the user rights are assigned in the order of user, DB, tables_priv, and columns_priv. That is, check the global permission table of user. If the corresponding permission of user is Y, the user has the permission of Y on all databases and the remaining three tables are not checked. If it is N, check the database corresponding to the user in the DB table and obtain the permission of Y in the DB. If N is in the DB, check the table corresponding to the database in tables_priv and obtain the permission Y in the table. If N, check columns_priv for specific columns.

This means that when we change a user’s user permissions, it doesn’t affect that user until the next time that user logs in (creating a new connection).

Connection and wait timeout

We can show ProcessList to see all the current user connections and their behavior.

However, if Command displays a state of Sleep, the user is currently in a waiting state. If you wait for more than a period of time, the connector is automatically disconnected.

This timeout is controlled by the wait_timeout variable, which can be viewed by running show global variables like ‘wait_timeout’.

Long connection and short connection

The so-called long connection refers to the continuous operation of users using the same connection, the connection is established in a long period of time.

The so-called short connection, that is, the user does several operations to disconnect, and then the next operation to connect.

The advantage of a long connection is that it can save a lot of time to establish a connection during continuous operation. However, the temporary data to be stored in the long connection is stored in the connection object, which accumulates for a long time. As a result, the system memory overflow occurs.

The advantages and disadvantages of the short connection are the opposite of the long connection. Although there is no need to worry about the problem of memory overflow, the short connection consumes a lot of time and the overall operation efficiency is low.

cache

The next step in the connector connection is the cache query for the cache, which may be used if we need to perform frequent queries against a static table that is not updated frequently.

The cache uses the storage form of key-value. The key value stores the query statement and the value stores the corresponding result.

Note that once the table is updated, the cache for that table is cleared. So there are not many usage scenarios.

So there is currently less use of the cache. We can check whether the cache is enabled by using query_cache_type.

analyzer

Suppose we do not use the cache, or the SQL statement is not hit through the cache.

After the connector makes the connection, we enter a query, such as SELECT host FROM mysql.user LIMIT 1.

What profilers do is “lexing” and “parsing” the statements you type.

The so-called “lexical analysis” is to judge each word you type. For example, the analyzer will first determine that the first word you type is “SELECT”, the second word you type is “host”, etc.

Grammatical analysis is followed by lexical analysis, where you use the words to determine whether they are grammatical or not.

If the syntax is correct, it will proceed smoothly and return the corresponding information.

The optimizer

After the parser is finished, if there is a syntax problem, an error message is returned and no further operation is performed.

If the syntax is correct, then the optimizer part of the work is done. The optimizer, as its name implies, optimizes the execution of this statement.

For example, when a statement queries a table, the table may have multiple indexes. Which index is used to make the statement run most efficiently? This is what the optimizer does.

Select * from T1 join t2 on t1.ID=1 and t2.ID=2;

Select * from t2 where ID=2; select * from T1 where ID=1;

Select * from t1 where ID=1; select * from T2 where ID=1; select * from T1 where ID=1;

The two order of execution can lead to a difference in execution efficiency. How to choose the order of execution will improve the execution efficiency, which is also what the optimizer does.

actuator

After the above steps are complete, it is the executor’s turn to execute the specific statement.

For example, select * from mysql.tables_priv

Before executing a specific statement, the executor verifies the operation permission of the table. If the verification fails, an error message is displayed indicating that the operation permission is incorrect. As follows:

Precheck cannot verify permissions on tables that are involved at runtime, such as when triggers are used. Therefore, there is also a run-time permission validation in the executor.

If the verification succeeds, the statement continues to be executed using the interface of the storage engine corresponding to the table. Finally, the result of successful execution is returned to the client.

conclusion

In simple terms, a SQL statement executed in Mysql goes through four steps (including connecting to Mysql) : join, analyze, optimize, and execute. Each step is executed precisely and an error is returned to the client if a problem is found. Only if each step is performed correctly will you eventually get the result of your query or operation on the client side.