Hashtag: “We are all little frogs” public account article


preparation

For the story to run smoothly, we need to create a table:

CREATE TABLE t (
    id INT PRIMARY KEY,
    c VARCHAR(100)
) Engine=InnoDB CHARSET=utf8;
Copy the code

Then insert a row of data into the table:

INSERT INTO t VALUES(1, 'liu bei');
Copy the code

Now the table looks like this:

mysql> SELECT * FROM t; + - + -- -- -- -- -- -- -- -- + | | | id c + + -- -- -- -- -- -- -- -- -- + | | | liu bei 1 + - + -- -- -- -- -- -- -- -- + 1 rowin set (0.01 sec)
Copy the code

Isolation level

MySQL is a server/client architecture software. For a server, there can be several clients connected to it. After each client is connected to the server, it can be called a Session. We can enter various statements in different sessions at the same time and these statements can be processed as part of a transaction. Different sessions can send requests at the same time, meaning that the server may be processing multiple transactions at the same time, which can result in different transactions accessing the same record at the same time. As mentioned earlier, one of the properties of transactions is called isolation. In theory, when a transaction accesses a data, other transactions should queue up, and when the transaction commits, other transactions can continue to access the data. However, this has a significant impact on performance, so the database designer proposes various isolation levels to maximize the system’s ability to process transactions concurrently, but this is achieved at the expense of some isolation.

READ UNCOMMITTED

If a transaction reads data that was modified by another UNCOMMITTED transaction, the isolation level is called READ UNCOMMITTED.

As shown in the figure above, Session A and Session B each start A transaction. The transaction in Session B first updates column C of the record with ID 1 to ‘Guan Yu’, and then the transaction in Session A queries the record with ID 1. Then the query result is’ Guan Yu ‘under the isolation level of uncommitted reads. That is, a transaction reads a record modified by another uncommitted transaction. However, if the transaction in Session B is later rolled back, then the transaction in Session A reads data that does not exist. This phenomenon is called dirty reads, and it looks like this:

Dirty reads violate real-world business meaning, so READ UNCOMMITTED is a very unsafe level of isolation.

READ COMMITTED

If a transaction can only READ data that was modified by another COMMITTED transaction, and the transaction can be queried to the latest value every time the data is modified and COMMITTED, then this isolation level is said to be READ COMMITTED, as shown in the following figure:

As can be seen from the figure, at step 4, the result of the transaction query in Session A is only ‘liu’ because the transaction in Session B has not been committed, while at step 6, the result of the transaction query in Session B is’ Liu ‘because the transaction in Session B has been committed.

A transaction in committed read isolation will read the latest value of the data as soon as other transactions change the value and commit it later, for example:

We committed several implicit transactions in Session B that changed the value of column C of the record with id 1. After each transaction committed, transactions in Session A could see the latest value. This phenomenon is also known as unrepeatable reading.

REPEATABLE READ

In some business scenario, a transaction can only be read another has submitted the affairs of the modified data, but the first read after a record, even if the other affairs changed the value of the record and submit the transaction record, then read this reading is still the first read value, rather than read different data every time. This isolation level is called REPEATABLE READ, as shown in the figure below:

As can be seen from the figure, when the transaction in Session A reads the record with ID 1 for the first time, the value of column C is’ Liu Bei ‘. After that, although multiple transactions are implicitly committed in Session B and each transaction changes the record, the value of column C read by the transaction in Session A is still ‘Liu Bei’. Is the same value as the first read.

SERIALIZABLE

The SERIALIZABLE isolation level allows read-read, read-write, and write-read concurrent operations on the same record. If read-write and write-read concurrent operations are not allowed, the SERIALIZABLE isolation level can be used, as shown in the following diagram:

As shown in the figure, when the transaction in Session B updates the record with id 1, the subsequent transaction in Session A attempts to access the record, and the transaction in Session A cannot obtain the query result until the transaction in Session B commits.

Version of the chain

For tables using InnoDB storage engine, there are two necessary hidden columns in the clustered index record (row_id is not necessary, we create tables with primary keys or non-null unique keys that do not contain row_ID columns) :

  • Trx_id: Each time a change is made to a cluster index record, the corresponding transaction ID is assigned to the trx_ID hidden column.

  • Roll_ pointer: Every time a change is made to a clustered index record, the old version is written to the Undo log. The hidden column is then used as a pointer to the previous record.

For example, our table T now contains only one record:

mysql> SELECT * FROM t; + - + -- -- -- -- -- -- -- -- + | | | id c + + -- -- -- -- -- -- -- -- -- + | | | liu bei 1 + - + -- -- -- -- -- -- -- -- + 1 rowin set (0.01 sec)
Copy the code

Suppose the transaction ID of the inserted record is 80, then the diagram for the record at this point looks like this:

Suppose two transactions with id 100 and ID 200 UPDATE this record as follows:

Tip: Can I cross-update the same record in two transactions? The first transaction updates a record and then locks it. The second transaction updates a record and then waits for the first transaction to commit and releases the lock. This article is not about locks, but more details about locks will come later.

An undo log is logged every time a change is made to the record, and each undo log has a roll_pointer attribute (the undo log for INSERT does not have this attribute because there is no earlier version of the record). You can concatenate these undo logs into a linked list. So now it looks like this:

After each update to the record, the old value will be put into an undo log. As the number of updates increases, all versions of the record will be connected into a linked list by the roll_pointer attribute. This list is called the version chain, and the first node of the version chain is the latest value of the current record. In addition, each version also contains the transaction ID for which the version was generated, which is important information that we will use later.

ReadView

For transactions using the READ UNCOMMITTED isolation level, the latest version of the record is READ directly, and for transactions using the SERIALIZABLE isolation level, the record is accessed by locking. For transactions that use READ COMMITTED and REPEATABLE READ isolation levels, we need to use the version chain above. The core problem is to determine which version in the version chain is visible to the current transaction. So the architect of InnoDB came up with the concept of ReadView. This ReadView contains the active read and write transactions in the system and puts their transaction ids in a list. We named this list m_IDS. In this way, when accessing a record, you only need to perform the following steps to determine whether a version of the record is visible:

  • If the value of the accessed version’s trx_ID attribute is less than the minimum transaction ID in the M_IDS list, it indicates that the transaction that generated the version was committed before the ReadView was generated, so the version can be accessed by the current transaction.

  • If the value of the accessed version’s trx_ID attribute is greater than the maximum transaction ID in the M_IDS list, the transaction that generated the version was generated after the ReadView was generated, so the version cannot be accessed by the current transaction.

  • If the value of the trx_id attribute is between the maximum transaction ID in m_IDS and the minimum transaction ID in m_IDS, then it is necessary to check whether the value of trx_id attribute is in m_IDS. If it is, then the transaction generated by ReadView is still active, and this version cannot be accessed. If not, the transaction that generated the version of the ReadView when it was created has been committed and the version can be accessed.

If a particular version of the data for the current transaction is not visible, then along the chain to find the next version version of the data, continue to follow the steps above to determine visibility, and so on, until the version in the chain the final version, if the final version is not visible, it means that this record is not visible to the transaction, the query results will not include the record.

One big difference between READ COMMITTED and REPEATABLE READ isolation levels in MySQL is when they generate readViews.

READ COMMITTED – A ReadView is generated before each READ

For example, the system has two transactions with id 100 and id 200 executing:

# Transaction 100
BEGIN;

UPDATE t SET c = 'guan yu' WHERE id = 1;

UPDATE t SET c = 'zhang fei' WHERE id = 1;
Copy the code
# Transaction 200
BEGIN;

# update some other table records.Copy the code

Tip: Only the first time a record is actually modified during a transaction (such as with INSERT, DELETE, or UPDATE statements) is a single transaction ID assigned, which is incremented.

At this point, the version linked list for the record with ID 1 in table T looks like this:

Suppose a transaction now starts executing with the READ COMMITTED isolation level:

# Use READ COMMITTED isolation level transactions
BEGIN;

Transaction 100, 200 not committed
SELECT * FROM t WHERE id = 1; # the value of column C is' Liu Bei '.
Copy the code

SELECT1 is executed as follows:

  • A ReadView is created when the SELECT statement is executed, and the contents of the m_IDS list of ReadView are [100, 200].

  • Select visible records from the version chain, as can be seen from the figure, the content of the latest version of column C is’ Zhang Fly ‘, the version of trx_id is 100, so it does not meet the visibility requirements, according to roll_pointer jump to the next version.

  • Trx_id = 100, m_IDS = 100, trx_id = 100, trx_id = 100, m_IDS = 100, trx_id = 100

  • The contents of the next version of column C are ‘Liu Bei’. The trx_id value of this version is 80, which is less than the smallest transaction id100 in the m_IDS list, so this version is eligible and the last version returned to the user is the record of the column C with ‘Liu Bei’.

After that, we commit the transaction with transaction ID 100, like this:

# Transaction 100
BEGIN;

UPDATE t SET c = 'guan yu' WHERE id = 1;

UPDATE t SET c = 'zhang fei' WHERE id = 1;

COMMIT;
Copy the code

Select * from t where id = 1 and id = 200; select * from t where id = 1 and id = 200;

# Transaction 200
BEGIN;

# update some other table records. UPDATE t SET c ='zhaoyun' WHERE id = 1;

UPDATE t SET c = 'Zhuge Liang' WHERE id = 1;
Copy the code

At the moment, the version chain of the record with ID 1 in table T looks like this:

Select * from the transaction where THE COMMITTED isolation level was used and continue to search for the record with ID 1 as follows:

# Use READ COMMITTED isolation level transactions
BEGIN;

# SELECT1: Transaction 100, Transaction 200 are not committed
SELECT * FROM t WHERE id = 1; # the value of column C is' Liu Bei '.

Transaction 100 is committed, Transaction 200 is not committed
SELECT * FROM t WHERE id = 1; # the value of column C is' Zhang Fei '
Copy the code

This SELECT2 execution is as follows:

  • When the SELECT statement is executed, a ReadView will be created, and the contents of the m_IDS list of ReadView will be [200].

  • Then select the visible record from the version chain, as can be seen from the figure, the content of column C of the latest version is’ Zhuge Liang ‘, the trx_id value of this version is 200, so it does not meet the visibility requirements, according to roll_pointer jump to the next version.

  • Trx_id = 200, trx_id = 200, trx_id = 200, trx_id = 200

  • The contents of the next version of column C are ‘Zhang Fei’. The trx_id value of this version is 100, which is smaller than the smallest transaction id200 in the m_IDS list, so this version is valid and the final version returned to the user is the record of the column C with ‘Zhang Fei’.

Select * from t where ID 1 and ID 200 are COMMITTED, and then select * from T where ID 1 is COMMITTED. To summarize, transactions with READ COMMITTED isolation level generate a separate ReadView at the start of each query.

REPEATABLE READGenerate a ReadView on the first read

For transactions with REPEATABLE READ isolation, only a ReadView is generated when the query statement is executed for the first time and subsequent queries are not generated repeatedly. So let’s use an example to see what happens.

For example, the system has two transactions with id 100 and id 200 executing:

# Transaction 100
BEGIN;

UPDATE t SET c = 'guan yu' WHERE id = 1;

UPDATE t SET c = 'zhang fei' WHERE id = 1;
Copy the code
# Transaction 200
BEGIN;

# update some other table records.Copy the code

At this point, the version linked list for the record with ID 1 in table T looks like this:

Suppose a transaction with REPEATABLE READ isolation level is now executed:

# Use REPEATABLE READ to isolate transactions
BEGIN;

Transaction 100, 200 not committed
SELECT * FROM t WHERE id = 1; # the value of column C is' Liu Bei '.
Copy the code

SELECT1 is executed as follows:

  • A ReadView is created when the SELECT statement is executed, and the contents of the m_IDS list of ReadView are [100, 200].

  • Select visible records from the version chain, as can be seen from the figure, the content of the latest version of column C is’ Zhang Fly ‘, the version of trx_id is 100, so it does not meet the visibility requirements, according to roll_pointer jump to the next version.

  • Trx_id = 100, m_IDS = 100, trx_id = 100, trx_id = 100, m_IDS = 100, trx_id = 100

  • The contents of the next version of column C are ‘Liu Bei’. The trx_id value of this version is 80, which is less than the smallest transaction id100 in the m_IDS list, so this version is eligible and the last version returned to the user is the record of the column C with ‘Liu Bei’.

After that, we commit the transaction with transaction ID 100, like this:

# Transaction 100
BEGIN;

UPDATE t SET c = 'guan yu' WHERE id = 1;

UPDATE t SET c = 'zhang fei' WHERE id = 1;

COMMIT;
Copy the code

Select * from t where id = 1 and id = 200; select * from t where id = 1 and id = 200;

# Transaction 200
BEGIN;

# update some other table records. UPDATE t SET c ='zhaoyun' WHERE id = 1;

UPDATE t SET c = 'Zhuge Liang' WHERE id = 1;
Copy the code

At the moment, the version chain of the record with ID 1 in table T looks like this:

REPEATABLE READ isolation level = 1; REPEATABLE READ isolation level = 1;

# Use REPEATABLE READ to isolate transactions
BEGIN;

# SELECT1: Transaction 100, Transaction 200 are not committed
SELECT * FROM t WHERE id = 1; # the value of column C is' Liu Bei '.

Transaction 100 is committed, Transaction 200 is not committed
SELECT * FROM t WHERE id = 1; The value of column C is still 'Liu Bei'.
Copy the code

This SELECT2 execution is as follows:

  • The m_IDS list in the ReadView is [100, 200].

  • Then select the visible record from the version chain, as can be seen from the figure, the content of column C of the latest version is’ Zhuge Liang ‘, the trx_id value of this version is 200, so it does not meet the visibility requirements, according to roll_pointer jump to the next version.

  • Trx_id = 200, trx_id = 200, trx_id = 200, trx_id = 200

  • The next version of column C contains’ Zhang Fei ‘, trx_id is 100, and m_IDS contains transaction ID 100, so this version is also not qualified, and the next version of column C contains’ Guan Yu ‘is also not qualified. Continue to the next version.

  • The next version of column c has the content of ‘liu bei’, the trx_id value of this version is 80,80 is less than the smallest transaction id100 in m_ids list, so this version is eligible, and the last version returned to the user is the record of the column c with ‘liu bei’.

The value of column C is’ liu ‘, which is the meaning of repeatable read. If we commit the record with transaction ID 200 and then search for the record with ID 1 in the transaction with REPEATABLE READ isolation level just now, the result will still be ‘liu’, the specific execution process can be analyzed by ourselves.

MVCC summary

Concurrency Control for Multi-version Concurrency Control Multi-version concurrency control (MULTI-version concurrency control) refers to the process of accessing the version chain of records when ordinary SEELCT operations are performed by transactions with READ COMMITTD and REPEATABLE READ isolation levels. In this way, read-write and read-read operations of different transactions can be executed concurrently, thus improving system performance. READ COMMITTD, REPEATABLE READ One big difference between the READ COMMITTD and REPEATABLE READ isolation levels is when a ReadView is generated. READ COMMITTD generates a ReadView before a normal SELECT operation. REPEATABLE READ generates a ReadView only before the first normal SELECT operation and repeats this ReadView for all subsequent query operations.

Small volumes

If you want to learn more about MySQL, you can find a link to how MySQL works: Finding ways to Understand MySQL. The content of the volume is mainly from the perspective of small white, using popular language to explain some core concepts about MySQL advanced, such as record, index, page, table space, query optimization, transaction and lock, a total of about 300,000 or 400,000 words, with hundreds of original illustrations. Mainly want to reduce the ordinary programmer learning MySQL advanced difficulty, let the learning curve a little smoother ~