Friends, let's dig into some good stuff together

1. The underlying data structure and algorithm of the MySQL index

We often think of adding indexes when it comes to SQL optimization. So what exactly is an index?

An index is a data structure that helps MySQL retrieve and sort data efficiently.

As we know, MySQL can use a B+ tree or a hash table to maintain an index. The hash structure is not commonly used: although hashing can route to the corresponding data quickly, it is very weak at sorting and range lookups.

To some extent, searching a tree structure is binary search, so the query time complexity is low. However, trees face a series of problems, such as degradation, and as data grows the tree gets taller and taller, the number of disk I/Os grows, and performance keeps dropping: query time becomes more and more unpredictable.

For example:

  1. Binary search tree: in extreme cases it degrades into a linked list, and the query time complexity degrades with it. And the tree keeps getting taller as data grows.

  2. Red-black tree: a self-balancing binary tree. Rotations and color changes solve the degradation-into-a-linked-list problem, but to some extent the problem of the tree growing ever taller remains unsolved.

  3. B-tree: effectively an evolution of the binary tree into a multiway tree. A single node stores many keys, so the height of the tree is kept under control. However, a B-tree node also stores the full row data for each key, so we can imagine that the index takes a huge amount of memory, and MySQL's memory is limited. In most cases a single table has more than one or two indexes; if every index tree kept its own copy of the row data, that would be a huge overhead for MySQL and a waste of resources. On top of that, sorting on an index field would require a series of recursive traversals across the B-tree, and performance would suffer.

  4. B+ tree: an evolution of the B-tree in which non-leaf nodes store only keys and only the leaves store data, so all of the data in the tree actually lives in the leaves, and the B+ tree maintains pointers linking the leaves together. These leaf pointers solve most of our problems with sorting on index fields.

At this point, you should understand that “an index is an ordered data structure.”

PS: A single node of the MySQL index tree is a 16KB page by default; that 16KB is one disk block. One disk block can therefore store a large number of index entries, which keeps the height of the tree under control. Combined with the operating system's read-ahead, every read of the index tree actually reads a whole disk block, loading the block's data into memory to operate on.
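You can check the page size yourself; a quick sketch, assuming MySQL 5.7+ with InnoDB (where 16KB is the default):

```sql
-- The InnoDB page size: 16384 bytes (16KB) by default.
SELECT @@innodb_page_size;
SHOW VARIABLES LIKE 'innodb_page_size';
```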

Now that we know the index's data structure, let's see how MySQL uses the index tree to perform fast queries.

This involves MySQL's storage engines, such as InnoDB, MyISAM, and Memory. Let's talk about the common ones, InnoDB and MyISAM.

PS: The data structures of the indexes are maintained in the /data folder of the MySQL installation directory.

  1. MyISAM: the index tree and the data file are separate (non-clustered). A leaf in the index tree holds only the disk address of the corresponding row in the data file, and the row is then fetched through that address.

  2. InnoDB: there are two cases: the primary key index tree (or unique index tree), which is clustered, and the normal secondary index tree, which is non-clustered. A leaf of the primary key index tree holds the data of the corresponding row, while a leaf of a secondary index tree holds the primary key id of the corresponding row. With that primary key id, a lookup back into the primary key index tree (a "back to table" query) finds the full row.

PS: Can you guess how MySQL's composite (joint) index compares entries in the tree? For example, if (name, age) forms a composite index, the index order is decided first by comparing name (essentially by its character code values, e.g., ASCII). If name alone already decides the order, age is not compared at all; age is compared only when the name values are equal.
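A minimal sketch of this comparison rule, with an illustrative table and index (the names are assumptions, not from the original):

```sql
-- Illustrative table with a composite index on (name, age).
CREATE TABLE employee (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(50),
  age  INT,
  KEY idx_name_age (name, age)  -- entries ordered by name first, then by age
);

-- Uses idx_name_age: the leading column name pins the order, age breaks ties.
EXPLAIN SELECT * FROM employee WHERE name = 'Tom' AND age = 30;

-- Typically cannot use the index: without name, the age values are not in
-- one global order inside the index (the leftmost-prefix rule).
EXPLAIN SELECT * FROM employee WHERE age = 30;
```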

2. SQL execution plan analysis

Execution plan analysis plays an important role in SQL tuning. With EXPLAIN + our own SQL, we can get that SQL's execution plan, as follows:
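For instance, a minimal example against the illustrative employee table sketched earlier:

```sql
-- The output includes columns such as id, select_type, table, type,
-- possible_keys, key, rows, and Extra.
EXPLAIN SELECT * FROM employee WHERE name = 'Tom';
```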

Let’s examine some of the more important columns in the execution plan

  1. id column: the sequence number of each select; there are as many ids as there are selects, and the ids increase in the order the selects appear. A larger id means higher execution priority; rows with the same id execute from top to bottom; an id of NULL marks a result set (such as the result of a union) rather than a select.

  2. table column: the table that this row's select accesses

  3. type column: the access type. From best to worst: system > const > eq_ref > ref > range > index > ALL

  4. key column: the index actually used

  5. rows column: the number of rows MySQL internally estimates it will read or scan

  6. Extra column: this column shows extra information. The important values are (see the sketch after this list):

    A. Using index: a covering index is used (a covering index means every queried column is present in the index tree, so no lookup back to the table is needed).

    B. Using where: a WHERE clause is used to filter the results, and the queried columns are not fully covered by the index.

    C. Using index condition: the queried columns are not completely covered by the index, and the where condition is a range on a leading column of the index (known as index condition pushdown).

    D. Using temporary: MySQL needs to create a temporary table to process the query. This usually calls for optimization, and the first thing to consider is an index.

    E. Using filesort: sorting is done on ordinary fields instead of via an index. If the data set is small, the sort runs in memory; otherwise it runs on disk. Here, too, consider optimizing with an index.
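As promised above, a hedged sketch of queries that typically produce these Extra values, again using the illustrative employee table (actual output depends on data volume and optimizer statistics):

```sql
-- Covering index: only columns present in idx_name_age are selected,
-- so Extra typically shows "Using index".
EXPLAIN SELECT name, age FROM employee WHERE name = 'Tom';

-- Filtering on a non-indexed expression while selecting * typically
-- shows "Using where".
EXPLAIN SELECT * FROM employee WHERE id > 1 AND age = 30;

-- Sorting by a column the optimizer cannot sort via an index typically
-- shows "Using filesort".
EXPLAIN SELECT * FROM employee ORDER BY age;
```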

3. How does a SQL statement execute in MySQL?

To understand this, we first need to understand MySQL's internal structure, as follows:

The SQL execution process is as follows:

  1. MySQL acts as a server; our program, as a client, maintains a long-lived connection to MySQL over TCP.

  2. MySQL uses our SQL text as a key to look up the query cache (the cache uses an LRU eviction algorithm) and checks for a hit. On a hit the data is returned directly; on a miss the process continues. (Note that the query cache was removed entirely in MySQL 8.0.)

  3. MySQL's parser (written in C) checks that our SQL syntax is correct.

  4. Once the syntax checks out, the internal optimizer performs some optimization on our SQL, including cost calculation (this is why a SQL statement we think should use an index sometimes shows up in the execution plan without one), and finally generates the execution plan. Of course, we can also use FORCE INDEX(…) to force a particular index choice, as sketched after this list.

  5. After optimization, the plan goes to MySQL's internal executor, which calls the storage engine for the table, such as InnoDB, MyISAM, or Memory.

  6. The storage engine finds the corresponding data in the corresponding index tree according to the optimizer's result, and maintains the index tree at the same time.
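As mentioned in step 4, a hedged sketch of forcing an index choice (the index name comes from the earlier illustrative table):

```sql
-- Ask the optimizer to use idx_name_age even if it would prefer a full scan.
EXPLAIN SELECT * FROM employee FORCE INDEX (idx_name_age)
WHERE name > 'a';
```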

4. MySQL locks and transaction isolation levels

Our database generally executes multiple transactions concurrently, and those transactions may concurrently insert, delete, update, and read the same batch of data. This can lead to the problems we call dirty writes, dirty reads, non-repeatable reads, and phantom reads. In essence these are all database multi-transaction concurrency problems, and to solve them the database designed the transaction isolation mechanism, the lock mechanism, and the MVCC multi-version concurrency control mechanism.

First, let's look at transactions and their ACID properties.

So what is a transaction? A transaction is a set of SQL statements that either all execute successfully or all fail.

ACID properties of transactions:

  • Atomicity: a transaction is an atomic unit of work; either all of its changes to the data are made, or none are.
  • Consistency: data must be in a consistent state at both the start and the end of a transaction. All relevant data rules must be applied to the transaction's changes to preserve data integrity.
  • Isolation: the database provides a mechanism ensuring that transactions execute in an "independent" environment unaffected by concurrent operations. Intermediate states inside a transaction are invisible to the outside world, and vice versa.
  • Durability: once a transaction completes, its changes to the data are permanent, even in the event of system failure.

Problems with concurrent transactions:

  • Dirty write: multiple transactions operate on the same row, and the last update overwrites updates made by the others, so their updates are lost. This problem is called a dirty write (a lost update).
  • Dirty read: transaction A reads data that transaction B has modified but not yet committed, and acts on it. If B then rolls back, the data A read is invalid, which violates consistency.
  • Non-repeatable read: the same query inside transaction A returns different results at different times, which violates isolation.
  • Phantom read: transaction A reads new rows committed by transaction B, which violates isolation.

So how does MySQL solve these problems? Through transaction isolation levels, of which there are four:

  • Read uncommitted: transaction A can read data that transaction B has not yet committed.

  • Read committed: transaction A can only read data that transaction B has committed.

  • Repeatable read: within the current transaction, the same query in transaction A always reads the same data.

  • Serializable: equivalent to a transaction-level lock; other transactions can only proceed after transaction A commits.

Here is the standard matrix of which problems can occur at each of the four isolation levels:
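| Isolation level | Dirty read | Non-repeatable read | Phantom read |
| --- | --- | --- | --- |
| Read uncommitted | possible | possible | possible |
| Read committed | prevented | possible | possible |
| Repeatable read | prevented | prevented | possible (largely avoided in InnoDB via MVCC and gap locks) |
| Serializable | prevented | prevented | prevented |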

Check the current isolation level: show variables like '%isolation%';

Set the transaction isolation level: set transaction_isolation = 'REPEATABLE-READ';

MySQL's default transaction isolation level is repeatable read. If Spring does not set an isolation level, MySQL's default is used; if Spring does set one, Spring's setting wins.

Before going into the details of these four isolation levels, let's take a look at MySQL locking, which is a big prerequisite.

Lock classification:

  • In terms of performance, there are optimistic locks (implemented via version comparison) and pessimistic locks
  • By the type of operation performed on the data, locks divide into read and write locks (both pessimistic):
    • Read lock (shared, or S lock): multiple reads of the same piece of data can proceed simultaneously without affecting each other
    • Write lock (exclusive, or X lock): blocks other write locks and read locks until the current write operation completes
  • By the granularity of the data operation, locks divide into table locks and row locks

Row locks are the most common in our development work, so let's take them as the example. A row lock locks one row of data per operation: if one session opens a transaction and updates a row without committing, another session's update of the same record will block, while updates to different records will not (as sketched below). Row locks have high overhead and are slow to acquire; deadlocks can occur; but the locking granularity is the smallest, the probability of lock conflicts is the lowest, and concurrency is the highest.
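A minimal two-session sketch of this blocking behavior, assuming a table account(id primary key, name, balance), the same table used in the later examples:

```sql
-- Session 1:
BEGIN;
UPDATE account SET balance = balance - 100 WHERE id = 1;  -- X-locks row id=1

-- Session 2 (another connection):
UPDATE account SET balance = balance + 100 WHERE id = 1;  -- blocks on the row lock
UPDATE account SET balance = balance + 100 WHERE id = 2;  -- different row: proceeds

-- Session 1:
COMMIT;  -- releases the lock; session 2's blocked update can now finish
```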

There are two major differences between InnoDB and MyISAM:

  • InnoDB supports transactions
  • InnoDB supports row-level locking

Now that you have an understanding of row locking, let’s demonstrate the four isolation levels:

1. Read uncommitted

First set the isolation level to read uncommitted: set transaction_isolation = 'READ-UNCOMMITTED';, then start two sessions, as shown below:

When transaction A can read data that transaction B has not committed, dirty writes, dirty reads, non-repeatable reads, and phantom reads can all occur.
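A hedged sketch of the dirty read above, with illustrative data in the account table:

```sql
-- Session A:
SET transaction_isolation = 'READ-UNCOMMITTED';
BEGIN;
SELECT balance FROM account WHERE id = 1;  -- suppose this returns 400

-- Session B:
BEGIN;
UPDATE account SET balance = 800 WHERE id = 1;  -- modified but NOT committed

-- Session A:
SELECT balance FROM account WHERE id = 1;  -- returns 800: a dirty read
-- If session B now executes ROLLBACK, the 800 that session A read never existed.
```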

2. Read committed

Set the isolation level to read committed: set transaction_isolation = 'READ-COMMITTED';, then start two sessions, as shown below:

Transaction A can only read the committed data of transaction B.

3. Repeatable read

Set the isolation level to repeatable read: set transaction_isolation = 'REPEATABLE-READ';, then start two sessions, as shown below:

Following the sequence marked in the figure above, from 1 to 6: at step 6 the query indeed does not see transaction B's new row, so there is no phantom read. However, if you then execute steps 7 and 8, at step 8 a phantom read appears.

Since MySQL's default isolation level is repeatable read, phantom reads can still occur there. How does MySQL address them? Through the MVCC mechanism, which will be covered below.

4. Serializable

Set the isolation level to serializable: set transaction_isolation = 'SERIALIZABLE';, then start two sessions, as shown below:

At the serializable isolation level, the rows queried by transaction A are locked, and transaction B's update blocks on the row lock until transaction A commits.

PS: The default level is repeatable read. Is there any way to solve the phantom read problem there? Gap locks can solve phantom reads in some cases.

A gap lock locks the gap between two values.

For example, execute update account set name = 'zhuge' where id > 8 and id < 18; (assuming the neighboring id values in the table are 3, 10, and 20). Other sessions then cannot insert or modify any data in the locked range, rows and gaps included: no row with id in (3, 20] can be inserted or modified (note that 20 itself is included). Gap locks take effect only at the repeatable read isolation level.
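A two-session sketch of that gap lock, under the stated assumption that the neighboring id values are 3, 10, and 20:

```sql
-- Session 1 (repeatable read):
BEGIN;
UPDATE account SET name = 'zhuge' WHERE id > 8 AND id < 18;  -- locks (3, 20]

-- Session 2:
INSERT INTO account (id, name) VALUES (15, 'x');  -- blocks: 15 lies in a locked gap
UPDATE account SET name = 'y' WHERE id = 10;      -- blocks: row 10 is in the range
INSERT INTO account (id, name) VALUES (30, 'z');  -- proceeds: 30 is outside (3, 20]

-- Session 1:
COMMIT;  -- releases the locks
```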

5. In-depth understanding of MVCC

MVCC is the multi-version concurrency control mechanism; MySQL implements MVCC at the read committed and repeatable read isolation levels.

MVCC's biggest advantage: reads take no locks, so reads and writes do not conflict. In read-heavy, write-light applications this non-conflict between reads and writes is very important and greatly increases the system's concurrency.

The MVCC mechanism is realized through the ReadView and the undo_log version chain, so that different transactions, following the version-chain comparison rules, can read different versions of the same data.

Let’s look at the undo_log version chain:

The undo_log version chain: when a row has been modified by multiple transactions, MySQL keeps the undo rollback log of each previous version of the row and uses two hidden fields, trx_id and roll_pointer, to chain them into a history version chain.

It can be understood as a singly linked list whose head is the newest version of the data and whose tail is the oldest.

With that said, let's look at the ReadView. The difference between read committed and repeatable read lies in their ReadView generation strategies: at read committed every query in the transaction generates a fresh ReadView, whereas at repeatable read it does not.

So what exactly is a ReadView?

When executing a query, MySQL maintains a ReadView: a list of the ids of transactions that were active (started but not yet committed) at that moment. For example, if the ReadView is [30, 60], then all transactions with ids below 30 have committed, and transactions with ids above 60 started after the current ReadView was generated.

For example, at the read committed isolation level:

Suppose a transaction with id 100 updates the row with id 1, setting name to Xiaoming2, but has not committed yet. The version chain is then:

At this point another transaction issues a select for the record with id 1, and the ReadView it generates is just [100]. It walks the version chain: it first finds the latest version, whose trx_id is 100 (the record with name Xiaoming2). That id is in the active list, so this version cannot be accessed. The pointer is followed to the next version, with name Xiaoming1; its trx_id is 60, less than the smallest id in the list, so it can be accessed, and the result returned is Xiaoming1.

Now we commit the transaction with id 100, then start a new transaction with id 110 that modifies the row with id 1 again, and leave it uncommitted.

The version chain is now:

At this point, the earlier select transaction queries the record with id 1 again.

Now comes the crucial point

At the read committed isolation level, the ReadView is regenerated, and the active transaction list becomes [110].

Following the rules above, we walk the version chain comparing trx_id, and the visible result is Xiaoming2 (committed by transaction 100).

At the repeatable read isolation level, the ReadView stays the same as the one generated at the first select, with the same list [100]. So the select still returns Xiaoming1: the second select matches the first, which is exactly why it is called repeatable read!

That is, a transaction at the committed read isolation level generates a separate ReadView at the start of each query, whereas a transaction at the repeatable read isolation level generates a ReadView at the first read, and subsequent reads reuse the previous ReadView.

This is MySQL's MVCC: through the version chain it achieves multiple versions and concurrent read-write and write-read access, and the different behaviors of the different isolation levels are implemented through their ReadView generation strategies.
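A two-session sketch contrasting the two ReadView strategies (data illustrative):

```sql
-- Session A (isolation level REPEATABLE-READ):
BEGIN;
SELECT name FROM account WHERE id = 1;  -- say 'Xiaoming1'; the ReadView is built here

-- Session B:
UPDATE account SET name = 'Xiaoming2' WHERE id = 1;
COMMIT;

-- Session A:
SELECT name FROM account WHERE id = 1;  -- still 'Xiaoming1': the ReadView is reused
COMMIT;

-- Under READ-COMMITTED, session A's second select would build a fresh
-- ReadView and see 'Xiaoming2'.
```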

6. InnoDB's core BufferPool caching mechanism

Why doesn't MySQL just update the data on disk directly, instead of designing a BufferPool mechanism for executing SQL?

  • Updating the data in disk files directly performs quite poorly, because each request would do random reads and writes against the disk.
  • Because random disk I/O is so slow, updating disk files directly could never let the database withstand high concurrency. MySQL's mechanism may look complicated, but it guarantees that each update request updates the in-memory BufferPool and then writes the log files sequentially, while also ensuring data consistency under all kinds of failures. Updating memory is very fast, and sequentially appending to log files on disk is also very fast, far faster than random disk I/O. It is this mechanism that lets a MySQL database on a well-configured machine withstand several thousand read and write requests per second.

Let's take a look at the flowchart of the BufferPool caching mechanism:

  1. Suppose our SQL is an update statement. MySQL actually queries before it updates: after the MySQL optimizer produces the final execution plan, the executor first goes to disk to find the corresponding data. Because data is stored on disk in pages, a full page of data is loaded into the BufferPool, helped along by the operating system's prefetching.
  2. The pre-update value of the data is written to the undo_log, to make rollback possible.
  3. After the undo_log write succeeds, the data is updated in the BufferPool cache.
  4. After the BufferPool update succeeds, the latest data is written into the redo_log_buffer.
  5. When the transaction commits, the data in the redo_log_buffer is flushed into the redo_log file, so that the BufferPool's contents can be recovered from it after a crash.
  6. While step 5 is in progress, a thread asynchronously writes the bin_log into the bin_log file.
  7. After the bin_log write succeeds, a commit mark is written to the redo_log file, which guarantees consistency between the redo_log and the bin_log after the transaction commits.
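Some related knobs worth knowing, shown as a hedged reference (defaults and behavior vary by MySQL version):

```sql
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';        -- size of the BufferPool
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit'; -- how the redo log is flushed at commit
SHOW VARIABLES LIKE 'sync_binlog';                    -- how often the binlog is fsynced
```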