This is the fifth day of my participation in the November Gwen Challenge. Check out the details: The last Gwen Challenge 2021

Transaction isolation level

If, when a transaction on a visit to a data queue and other issues should be after the transaction is committed, other transactions can continue to access the data, this kind of treatment effect on performance, namely wants to keep the transaction isolation, and want to server performance when dealing with multiple transactions accessing the same data is high, part of the isolation and take performance.

Problems encountered with concurrent execution of transactions

Problems that may occur when transactions accessing the same data are not guaranteed serial execution: \

  • Dirty write: A Dirty write occurs if a transaction modifies data that has been modified by another uncommitted transaction

image.png

Session A and Session B each start A transaction. The transaction in Session B is updated first, and then the transaction in Session A updates and commits the transaction. If the transaction in Session B is rolled back later, the update in Session A will no longer exist. This phenomenon is called dirty writing. \

  • Dirty Read: If a transaction reads data that has been modified by another uncommitted transaction, a Dirty Read has occurred

image.png

Session A and Session B each start A transaction. The transaction in Session B first updates the column to XXX, and then the transaction in Session A queries the record. If it reads the column value XXX, then the transaction in Session B rolls back. A transaction in Session A reads data that does not exist. This is called A dirty read. \

  • Non-repeatable Read: If a transaction can only Read the data modified by another committed transaction, and the transaction can query the latest value after the data is modified and committed by other transactions, it means that a non-repeatable Read has occurred.

image.png

There are several implicit transactions committed in Session B (committed at the end of the statement) that change the column value. If the transaction in Session A can see the latest value after each transaction committed, this phenomenon is also considered unrepeatable. \

  • Phantom: A Phantom occurs when a transaction queries records based on some criteria and then another transaction inserts records that meet those criteria, and when the original transaction queries the criteria again, the records inserted by the other transaction are read as well.

\

image.png

Session A commits an implicit transaction that inserts A new record into the table. Session B commits an implicit transaction that inserts A new record into the table. Transactions in Session A then query the table according to the same conditions, and the resulting set contains the records newly inserted by transactions in Session B. This phenomenon is also called phantom reading. Phantom reading emphasizes that when a transaction reads records for several times according to a certain condition, it reads the records that it did not read before and cannot read the records that it has read before. In fact, this is equivalent to the phenomenon of unrepeatable reads for each record.

The four isolation levels in the SQL standard

Problems in concurrent transaction execution are listed in order of severity as follows:

Dirty write > Dirty read > Unrepeatable read > Phantom readCopy the code

Develop SQL standards in which there are four isolation levels:

  • READ UNCOMMITTED: Indicates that the READ is not committed
  • READ COMMITTED: Indicates that the READ is COMMITTED
  • REPEATABLE READ: REPEATABLE READ
  • SERIALIZABLE: SERIALIZABLE

image.png

  • At the READ UNCOMMITTED isolation level, dirty reads, unrepeatable reads, and phantom reads may occur
  • At the READ COMMITTED isolation level, unrepeatable reads and magic READ problems can occur, but dirty reads cannot occur
  • REPEATABLE READ isolation level, phantom READ problems may occur, but dirty READ and unrepeatable READ problems cannot occur
  • At the SERIALIZABLE isolation level, problems are not possible.
  • Regardless of the isolation level, dirty writes are not allowed.

Concurrency Control for multi-version Concurrency Control

For tables using the InnoDB storage engine, its clustered index record contains two necessary hidden columns

  • Trx_id: Each time a transaction changes a cluster index record, the transaction ID of that transaction is assigned to the trx_ID hidden column
  • Roll_pointer: Every time a change is made to a clustered index record, the old version number is written to the undo log. The hidden column is then used as a pointer to find information about the record before the change

Assuming that the transaction ID of the inserted record =80, the record now looks like this: \

Suppose the next two transactions of 100 and 200 UPDATE the record as follows: \

image.png

InnoDB uses locks to ensure that no dirty writes occur, that is, the first transaction updates a record and the next transaction updates the record until the first transaction commits and the lock is released.

An undo log is logged every time a change is made to the record, and each undo log has a roll_pointer attribute (the undo log for INSERT does not have this attribute because there is no earlier version of the record). You can concatenate these undo logs into a linked list as follows: \

After each update to the record, the old value will be put into an undo log. As the number of updates increases, all versions of the record will be connected into a linked list by the roll_pointer attribute, which is called the version chain. The first node of the version chain is the latest value of the current record. Each version also contains the transaction ID for which the version was generated





Four, ReadView

For transactions using the Read Uncommited isolation level, since records modified by uncommitted transactions can be Read, it is good to Read the latest version of the record;

For transactions with Serializable isolation level, records are accessed by locking;

Repeatable Read (Read Commited) and Repeatable Read (Read Commited) (Read Commited) (Read Commited) (Read Commited) (Read Commited) To determine which version in the version chain is visible to the current transaction, we propose the concept of ReadView, which contains four important contents: \

  • M_ids: represents a list of transaction ids of the read and write transactions currently active in the system at the time the ReadView was generated

  • Min_trx_id: Indicates the smallest transaction ID of the active read/write transaction in the system at the time the ReadView was generated, that is, the minimum value in m_IDS

  • Max_trx_id: indicates the ID value in the system that should be assigned to the next transaction when the ReadView is generated

    Note that max_trx_id is not the maximum value in M_IDS, and the transaction ID is incrementally assigned. M_ids = 1 and 2, min_trx_id = 1, max_trx_id = 4, min_trx_id = 4

  • Creator_trx_id: indicates the transaction ID of the transaction that generated this ReadView

    The transaction ID is assigned only when changes are made to a table record (Insert, Delete, Update), otherwise the transaction ID value defaults to 0 in a read-only transaction.

With ReadView, when accessing a record, you need to follow the following steps to determine whether a version of the record is visible: \

  • If the value of the accessed version’s trx_id attribute is the same as the creator_trx_id value in ReadView, it means that the current transaction is accessing its own modified record, so the version can be accessed by the current transaction
  • If the value of the accessed version’s trx_id attribute is less than the value of min_trx_id in ReadView, the transaction that generated the version was committed before the current transaction generated the ReadView, so the version can be accessed as a transaction.
  • If the value of the access version’s trx_id attribute is greater than or equal to the value of max_trx_id in ReadView, the transaction that generated the version was started after the current transaction generated the ReadView, so the version cannot be accessed by the current transaction.
  • If the value of the trx_id attribute of the accessed version is between min_trx_id and max_trx_id of the ReadView, then you need to check whether the trx_id attribute value is in the m_IDS list. If so, the transaction that generated the ReadView was active when it was created. The version is not accessible; If not, the transaction that generated the version of the ReadView when it was created has been committed and the version can be accessed.

One very big difference between the Read Commited and Repeatable Read isolation levels in Mysql is when a ReadView is generated.

READ COMMITTED – A ReadView is generated before each READ

There are two transaction ids 100 and 200 executing in the system

# Transaction 100
BEGIN;

UPDATE table SET name = 'xx1' WHERE id = 1;

UPDATE table SET name = 'xx2' WHERE id = 1;


# Transaction 200
BEGIN;

UPDATE table SET name = 'xx3' WHERE id = 1;

UPDATE table SET name = 'xx4' WHERE id = 1;
Copy the code

If a Transaction is now executed with READ COMMITTED isolation level: Transaction 100, 200 is not COMMITTED

SELECT * FROM table WHERE id = 1; # get column name with value 'XXX'Copy the code

Select1 is executed as follows:

  • A ReadView is generated when the SELECT statement is executed. The m_IDS list of the ReadView is [100, 200], min_trx_id is 100,max_trx_id is 201, and creator_trx_id is 0
  • Select visible records from the version chain, the contents of the latest version of the column name is xx2, the trx_id value of this version is 100, so it does not meet the visibility requirements, according to roll_pointer jump to the next version.
  • Trx_id = 100, trx_id = 100, trx_id = 100, trx_id = 100
  • The trx_id value of this version is 80, which is less than the min_trx_id value of 100 in ReadView. Therefore, this version is qualified and the final version returned to the user is the record of this column name= XXX

Commit the transaction where id=100

SELECT * FROM table WHERE id = 1; # name = 'xx2'Copy the code

Select2 is executed as follows:

  • When the select statement is executed, a ReadView is generated with the contents of the m_IDS list of the ReadView as [200] (the transaction id=100 has been committed, so it will not be available when the snapshot is created again), min_trx_id is 200, The value of max_trx_id is 201 and creator_trx_id is 0
  • Select visible records from the version chain, the latest version of the column name content is xx4, trx_id value of this version is 200, so it does not meet the visibility requirements, according to roll_pointer jump to the next version
  • Select * from m_IDS where trx_id = 200, trx_id = 200, trx_id = 200
  • The name of the next version column is xx2, and the trx_id value of this version is 100, which is less than the min_trx_id value of 200 in ReadView, so this version is qualified, and the last version returned is the name of this record xx2

Similarly, if the transaction ID =200 is also committed, the result is xx4 when the record of the table id=1 value is queried again.

REPEATABLE READ- Generates a ReadView when reading data for the first time
# Transaction 100
BEGIN;

UPDATE table SET name = 'xx1' WHERE id = 1;

UPDATE table SET name = 'xx2' WHERE id = 1;


# Transaction 200
BEGIN;

UPDATE table SET name = 'xx3' WHERE id = 1;

UPDATE table SET name = 'xx4' WHERE id = 1;
Copy the code

REPEATABLE READ Isolation level: Transaction 100, 200 not committed

SELECT * FROM table WHERE id = 1; # get column name with value 'XXX'Copy the code

Select1 is executed as follows:

  • A ReadView is generated when the SELECT statement is executed. The m_IDS list of the ReadView is [100, 200], min_trx_id is 100,max_trx_id is 201, and creator_trx_id is 0
  • Select visible records from the version chain, the contents of the latest version of the column name is xx2, the trx_id value of this version is 100, so it does not meet the visibility requirements, according to roll_pointer jump to the next version.
  • Trx_id = 100, trx_id = 100, trx_id = 100, trx_id = 100
  • The trx_id value of this version is 80, which is less than the min_trx_id value of 100 in ReadView. Therefore, this version is qualified and the final version returned to the user is the record of this column name= XXX

Commit the transaction where id=100

SELECT * FROM table WHERE id = 1; # get column name with value 'XXX'Copy the code

Select2 is executed as follows:

  • Transaction isolation level is REPEATABLE READ, ReadView was generated when select1 was executed, so ReadView will be used directly, m_IDS list of ReadView is [100,200]. The value of min_trx_id is 100, max_trx_id is 201, and creator_trx_id is 0
  • Select visible records from the version list, the latest version of the column name content is xx4, trx_id value of this version is 200, so it does not meet the visibility requirements, according to roll_pointer jump to the next version
  • Select * from m_IDS where trx_id = 200; select * from m_IDS where trx_id = 200
  • Similarly, the version of the next column does not meet the requirements, so skip to the next version
  • The next version of column name has the content XXX, and the trx_id value of this version is 80, which is less than the min_trx_id value of 100 in Read, so this version is valid. Finally returns the record with the column value XXX

Five, the summary

MVCC refers to a process in which transactions with Read Committed and Repeatable Read isolation levels can access the version chain of records when performing common SELECT operations. It enables read-write and read-read operations of different transactions to be executed concurrently, thus improving system performance.

Read Committed, Repeatable Read The difference between the isolation levels is: Read Committed generates a ReadView before each normal SELECT operation, while Repeatable Read generates a ReadView before the first normal SELECT operation. This ReadView is reused for subsequent query operations.