Logical architecture of MySQL server

MySQL’s storage engine architecture separates query processing from data storage/extraction. MySQL logical architecture

The first layer is responsible for connection management, authorization, security, and so on.

Each client connection corresponds to a thread on the server. A thread pool is maintained on the server to avoid creating and destroying a thread for each connection. When the client connects to the MySQL server, the server authenticates it. You can use the user name and password or an SSL certificate for authentication. After the login is authenticated, the server verifies whether the client has the permission to perform certain queries.

The second layer is responsible for parsing queries (compiling SQL)

And optimize it (such as adjust the table read order, select the appropriate index, etc.). For the SELECT statement, the server checks the query cache before parsing the query. If the query result can be found in the cache, the server directly returns the query result without performing query parsing or optimization. Stored procedures, triggers, views, and so on are implemented in this layer.

The third layer is the storage engine

The storage engine is responsible for storing data in MySQL, extracting data, starting a transaction, and so on. Storage engines communicate with the upper layer through apis that mask differences between storage engines and make those differences transparent to the upper layer query process. The storage engine does not parse SQL.

Concurrency control

Concurrency control problems arise whenever there are multiple queries that need to modify data at the same time. There is a classic way to solve this problem. Read-write locks are divided into read locks and write locks, also known as shared locks and exclusive locks.

Read locks are shared and complementary blocking; Write keys are exclusive. A write lock blocks other write and read locks.

Lock strategy

The problem is that locking consumes resources, and frequent locking and unlocking will affect system performance. The so-called locking strategy is to strike a balance between the cost of locking and security. Here are two of the most important locking strategies.

Table locks

Table key The most basic and least expensive lock policy in MySQL. A table lock locks the entire table and requires a write lock before a user can write a table (insert, update, delete).

Row-level locks

Row-level locking maximizes concurrent processing (while incurring the maximum lock overhead). Row-level locking is only implemented at the storage engine layer, not at the MySQL server layer.

The transaction

A transaction is a set of atomic SQL queries, or a single unit of work.

It is not enough to talk about the concept of transactions unless it passes a rigorous ACID test. ACID stands for atomicity, consistency, Isolation and durability.

  • atomic

All operations in the entire transaction either all succeed or all fail.

  • consistency

Databases always move from one consistent state to another.

  • Isolation,

Changes made by one transaction are not visible to other transactions until they are finally committed.

  • persistence

Once a transaction commits, its changes are permanently stored in the database.

Isolation level

There are four isolation levels defined in the SQL standard, each of which specifies what changes are made within a transaction and which are visible and which are not.

Here are four isolation levels:

  • READ UNCOMMITTED

At this level, changes in a transaction, even if they are not committed, are visible to other transactions, and transactions can Read uncommitted transactions, which are called Dirty reads.

  • READ COMMITTED

The changes made by a transaction are not visible to other transactions from the start until the commit. This is also called nonrepeatable read, because the same query may get different results twice.

  • REPEATABLE READ(REPEATABLE READ)

Repeatable reads are MySQL’s default transaction isolation level, and while it fixes dirty reads, it doesn’t fix another Phantom Read problem.

The so-called phantom reading refers to that when one thing reads records in a certain range, another transaction inserts new records in this range. When the previous transaction reads records in this range again, phantom rows will be generated.

  • SERIALIZABLE

SERIALIZABLE is the highest isolation level and avoids the previous phantom read problem by forcing transactions to be executed serially. To put it simply, SERIALIZABLE locks every row read, which can cause a lot of timeouts and lock contention.

A deadlock

A deadlock is a vicious cycle in which two or more transactions occupy the same resource and request to lock the resource occupied by the other.

Did the two cars on the left cause a deadlock? No! Did the four cars cause a deadlock? Is!!!

Multi-version concurrency control

Multi-version concurrency control (MVVC), can avoid locking operation, therefore lower overhead. MVVC is implemented by saving a snapshot of data at a certain point in time. There are typically optimistic concurrency control and pessimistic concurrency control.

InnoDB’s MVVC is implemented by storing two hidden columns at the end of each row. These two columns store the creation time and expiration (or deletion) time of the row respectively, not the specific time, but the system version number.

MVVC works only at the REPETABLE READ and READ COMMITED isolation levels. The other two isolation levels are incompatible with MVVC because READ UNCOMMITED always reads the latest row, not the row that matches the current transaction version. SERIALIZABLE locks all rows that are read.

The storage engine

  • InnoDB is the default transactional engine for MySQL.
  • The MyISAM storage engine is MySQL5.1 and earlier, but MyISAM does not support transaction and row-level locking, and there is no doubt that it cannot be safely recovered after a crash.