• preface
  • concept
  • implementation
  • conclusion
  • Refer to the link

preface

I learned about the concept of multi-version concurrency control (MVCC) while reading High Performance MySQL. However, this concept is explained in less than two pages…

Therefore, I went to the Internet to find some relevant information, and found that MVCC should be a very important function in MySQL, so I want to sum up ( ̄)  ̄)

concept

MVCC multi-version concurrency control, as the name suggests, is a mechanism used to implement concurrency control, and in databases, concurrency control is often for transactions, so you should have some understanding of transactions before understanding MVCC.

Of course, the concept of transactions itself is very basic, so I won’t go into too much detail here.

Returning to the concept of MVCC, it is described in the book High Performance MySQL as follows:

MVCC can be considered a variant of row-level locking, but it avoids locking in many cases and is therefore less expensive. Implementation mechanisms vary, but most implement non-blocking reads, and writes lock only the necessary rows.

In other words:

  • MVCC achieves the effect of row-level locking while avoiding the occurrence of locking operations
  • MVCC implements non-blocking reads, and writes lock only necessary lines

Meanwhile, MVCC only works at two isolation levels: RC(READ COMMITTED) and REPEATABLE READ (RR).

implementation

From the concept of MVCC, we know that it only works at the two isolation levels RC and RR, which have some characteristics:

  • RC – Reads committed data to avoid dirty reads, but when other transactions modify and commit a row, the changed data may be read, resulting in unrepeatable reads
  • RR – Repeatable read. In the same transaction, the data read should be the same each time the data is not modified

MVCC implements RC and RR features by saving a snapshot of data at a point in time. To implement this, we need to know:

  • What are the committed transactions
  • The transaction version number of the row record
  • The historical version of the row record

In InnoDB, we can use the hidden column of the row record to know the transaction ID that created the row record, use undo log to get the historical version of the row record, and use Read View to get the transaction that committed when the read view was created:

  • DATA_TRX_ID: There are three hidden columns for InnoDB row records. One of the hidden columns is DATA_TRX_ID, which indicates the transaction ID of the row record. When a transaction changes a row, the DATA_TRX_ID of the new record is the TRANSACTION ID.

  • DATA_ROLL_PTR: DATA_ROLL_PTR is another of the three hidden columns of row records. In InnoDB, when modifying a row record, a new record is created and the transaction ID is set. At the same time, the old record is saved in undo log. The DATA_ROLL_PTR of the new record points to the record in the undo log, and the DATA_ROLL_PTR pointer of the old record also exists in the undo log.

    In this way, it forms a linked list of all the historical records of the record. While the UNDO record still exists, the historical version of the corresponding record can be constructed.

  • READ VIEW: The READ VIEW is created before the SQL statement is executed.

    • low_limit_id– createread viewHas not yet been submittedTransaction ofThe biggestThe transaction ID
    • up_limit_id– createread viewHas not yet been submittedTransaction ofThe minimumThe transaction ID
    • trx_ids– createread viewHas not yet been submittedTransaction list of

    With read View, you can divide all transactions into three groups:

    • trx_id < up_limit_id– createread viewIs a transaction that has been committed
    • up_limit_id <= trx_id <= low_limit_id– createread viewIs a normally executed transaction
    • trx_id > low_limit_id– createread viewIs not yet created

    At this point, we can use the read View to determine the visibility of the row:

    1. When the record ofDATA_TRX_IDLess thanread vewup_limit_id“Indicates that the record is being createdread viewIt was submitted before, and the record is visible
    2. If the recordedDATA_TRX_IDAnd the transaction creatorTRX_IDWhen the same, the record is visible
    3. When the record ofDATA_TRX_IDIs greater thanread vewup_limit_id“, the record is being createdread viewThe record is not visible for subsequent new transaction modifications committed
    4. If the record corresponds toDATA_TRX_IDread viewtrx_idsInside, then the record is also invisible

    When the read View determines that the row record is not visible to the current transaction, DATA_ROLL_PTR finds the previous data record from undo log and determines again until the data is empty or visible.

  • RC && RR: At the RC level, we only need to know the list of committed transactions each time the SELECT statement creates a read view, thus achieving the requirement to read committed transactions.

    At the RR level, the Read View can only be created before the transaction begins, and any data committed after the transaction is created is not visible to the current transaction.

conclusion

Overall, MVCC is not too difficult to understand, and the implementation is clever enough to satisfy both isolation levels by using different READ VIEW creation strategies for both isolation levels RC and RR.

However, when comparing the description on the Internet and in the book, I found that there are some differences, which may be the reason why the book is too old. After all, the book is 13 years old.

Due to the preparation of spring recruitment, the blog has been off for more than a month, and I feel my hands are raw again. During the process of completing this blog, I am not adapted to all kinds of <(_ _)>

Refer to the link