Mysql column – MVCC mechanism

preface

MVCC mechanism is an important mechanism for mysql to solve the transaction problem. Through this mechanism, mysql solves the problem about the transaction: dirty write, dirty read, repeat read problem, but the default cannot repeat read situation will still appear magic read problem.

Summary:

  1. Undo log version chain and read View implementation
  2. How do Undo Log and Read View solve common transaction problems
  3. A brief introduction to exclusive and shared locks

Mysql > alter table transaction

There are two types of problems with transactions:

  • When multiple transactions are executed concurrently, one row of data in the cache page may be updated at the same time. How to handle this conflict? Do I need to lock it?
  • A transaction may update a row of data, and another transaction may query the row of data.

Implementation of undo log transaction rollback

Basic introduction:

Trx_id is the transaction ID of the last update. Roll_pointer refers to the undo log generated before you update the transaction. We talked about undo log before and I’m not going to go into it too much.

Undo log Version chain

Before introducing the UNDO log rollback mechanism, we need to understand the structure of the undo log version chain. It is obvious that this mechanism is introduced to pave the way for the MVCC mechanism

What is?

The undo log version chain is linked by a linked list. When multiple transactions are committed for the same record, to ensure that the transaction can be rolled back normally, multiple transaction versions are connected through roll_pointer and TXd_id, and the values of multiple versions are connected at the same time.

For example, the undo log version chain looks like this:

  1. First, transaction A writes A piece of data with the value OF A. Transaction A’s ID is 50, so the undo log chain also stores trx_id
  2. Transaction B updates the row with a value B, which updates trx_id to 58, and roll_point points to the row with trx_id 50
  3. In the same way, if transaction C updates the data, a linked list node will appear and the transaction C will point to the record with trx_ID 69

So far, an undo log multi-version control chain has been implemented, and its structure is as follows:

What does this design do?

The purpose of this design is to ensure that if multiple transactions commit and need to be rolled back, the same transaction can only read the value committed earlier than the current version, and cannot see the value committed later.

Read view

What is?

Read View is a key component in the implementation of MVCC mechanism, which is implemented by mysql based on the undo log multi-version chain. When a transaction is started, a default Read View table is generated for the current transaction. All parameters are determined when the transaction is started. It is destroyed at the end of the transaction.

A read view contains the following four basic fields: m_IDS, max_trx_ID and min_trx_id.

  • One is m_IDS, which tells you what transactions are running in MySQL that have not yet been committed.
  • One is min_trx_id, which is the smallest value in m_IDS.
  • Max_trx_id = max_trx_id = max_trx_id = max_trx_id
  • One is creator_trx_id, which is the ID of your transaction

Note that max_trx_id is the next transaction ID to be generated, which is designed to facilitate the determination of the transaction ID

According to the simple logical understanding, its storage structure is as follows. The meanings of specific parameters will be explained in the following paragraphs based on the actual case. This is only used to display data storage for understanding.

Check and read steps

Now that we know the basic structure diagram above, let’s look at how to read and retrieve an undo log rollback operation. To better understand this, we need to add some simulated operations according to the diagram above to explain:

If transaction A needs to read data, and transaction B needs to update data, and transaction A has id 50 and transaction B 58, the design of the read View will end up with the following situation:

Without any restrictions, there is a dirty read situation, where a transaction may read an uncommitted value.

But actually it’s not like that, according to the above the undo log chain is introduced, matters need to query A value but in the process of query was suddenly stuck A foot transaction B put this value is updated, the need to generate an undo log record, update and let its value for the transaction value B filed, so the following actually structure will be like, Here the reader can pause for A moment to consider how thing A is to be read and judged:

Key: transaction A’s id is 50, but trx_id is 58, so transaction A’s ID is newer than its record ID. The value of min_trx_id (50) is less than the value of trx_id (58). The value of min_trx_id (50) is less than the value of trx_id (58). The value of min_trx_id (50) is less than the value of trx_id (58). I look at the m_IDS list and see that there are other transactions interfering.

Therefore, transaction A knows that this data is not changed by transaction A, so it needs to find the next data according to roll_point (it can be understand that this data is the snapshot of transaction A’s operation) and also check whether its trx_id is greater than min_trx_id. By comparison, it finds that it is equal to it (both 50). Therefore, it can be determined that this data is the data modified by transaction A. His final judgment structure is shown below:

Using the above steps, we can see how undo log version chain and Read View work together to implement MVCC: When each transaction is started, a read View linked list is maintained. When multiple transactions operate on a row at the same time, a linked list chain of records is constructed according to the Undo log version chain. Through the coordination of these two structures, basic MVCC operations are realized.

Solve the problem of dirty reading, phantom reading and repeated reading

With the Read View and undo log chains above, let’s look at how to solve the problem of dirty, phantom, and duplicate reads. In fact, it’s the same thing. Here’s the key part:

Dirty read:

If transaction A reads A value that has not been committed, it will find the value committed by transaction A according to the undo log chain. Then, as long as transaction A reads the value that it saw when it started the transaction, there is no problem.

Unrepeatable read:

Repeat reads are similar to dirty reads, but with A slight devious twist. If transaction A needs to read A value that can be modified by transaction B, there are two cases:

1. Read the value after transaction B changes, 2. Read the value after transaction B changes but does not commit transaction.Copy the code

For the second point, we can use undo log to retrieve the value read by transaction A and perform operations, so that the operation result of transaction A is not dirty read. Transaction B is easier to understand because transaction B has already committed the transaction. When transaction A queries again, A new Read View is generated, and transaction B is no longer in m_IDS. However, the m_IDS transaction does not have this unknown “tamper”, so it can be assumed that the transaction has been modified and the operation is not affected.

Phantom reads:

After transaction B commits the transaction, transaction A finds that it cannot read the data whose ID is greater than the current max_trx_id. In other words, it can only read the data whose ID is less than the max_trx_id. In this way, mysql avoids the illusion problem.

How can multiple transactions avoid dirty writes?

With the MVCC mechanism in mind, let’s look at how to avoid dirty writes when executing multiple transactions. That is, how to avoid one transaction reading an uncommitted item:

Here we directly according to a diagram to explain, when the first transaction to visit, at this point the transaction will create a lock, which contains its own trx_id and wait state, then lock associated with this row data, in a lock are in memory to complete the operation at the same time, because the operation data in the buffer rather than a disk file. When the second transaction comes, it finds that the data is locked, so it has no choice but to wait. At this time, a lock is generated and the wait state is set to true, indicating that it is also waiting. The key step is: ** transaction A will unlock the data and look to see if the second transaction has also locked the data, so it will release the lock and wake up B to work. ** This double-lock design guarantees access to a row of data in the case of multi-threaded access.

The transaction problems addressed by the transaction isolation level

Before moving on, let’s review how transaction isolation levels address transaction issues.

Isolation level Dirty read Unrepeatable read Phantom read
Read uncommitted RU Square root Square root Square root
Read Committed RC x Square root Square root
Repeatable read RR x x Square root
Serializable x x x

Exclusive locks and shared locks

With transactions out of the way, let’s talk briefly about the characteristics of exclusive and shared locks.

When multiple transactions running mysql to join is an exclusive lock, but because of using the Mvcc mechanism, so they are divided into the read and write lock lock, the priority of the write lock is higher than that of read lock, but the mysql by Mvcc realized the separation of the read-write lock operation, which is a data update, read the data does not affect the other affairs, This also ensures that mutex blocks transactions.

Shared and exclusive locks are mutually exclusive, and you can only add one of them. Finally, let’s look at the mutual exclusion of shared and exclusive locks

Perform a query operation is to lock how to do?

Select * from table lock in share mode select * from table lock in share mode lock in share mode

The lock mentioned above is a feature of row lock. When multiple transactions concurrently update data, exclusive lock is added to row level. This is called row lock

So if you update the data, the following happens:

  • The first is an MVCC-based transaction isolation mechanism
  • The second type is exclusive and shared locks based on special syntax

Note that DLL statements and add, delete, and modify operations are mutually exclusive

Locking rules for row and table locks

How to add table lock

The following two statements are usually used to add a table lock, but in fact this lock operation

LOCK TABLES XXX [READ: add table shared LOCK]

LOCK TABLES XXX WRITE: This is a table level exclusive LOCK

Intent locks

About the content of intent lock, what we need to know is that the intent exclusive lock will be carried out when adding, deleting and modifying, and the intent shared lock will be added when querying. What is intent lock? Assuming that transaction A has acquired an exclusive lock for A row but has not committed it yet, when transaction B wants to acquire A table lock, it must confirm that no exclusive lock exists for each row of the table. Obviously, the efficiency will be very low. After the introduction of intentional lock, the efficiency will be greatly improved:

  1. If transaction A acquires an exclusive lock on A row, there are actually two types of locks on the table: an exclusive lock on A row and an intentional exclusive lock on the table.
  2. If transaction B tries to lock at the table level and is blocked by the previous intent lock, it does not need to check the individual page or row locks before locking the table, but only the intent locks on the table.

Here is a summary of the entire lock:

conclusion

Repeatable Read (RR) isolation level is repeatable Read (REPEATable Read) based on the undo log multiversion chain and ReadView. Avoid dirty write, dirty read, unrepeatable read these three problems, but can not avoid magic read problems.

In subsequent installments we cover the issue of isolation levels for transactions and how to avoid dirty writes. Finally, we cover the rules and content of locking and a brief understanding of what intentional locking is.

Write in the last

The above is a brief understanding of MVCC, as long as you have a deep understanding of the undo log and Read View components of the mechanism of the MVCC mechanism will be quickly understood.