preface

This is a common interview question, and you know what?

Reading this article, you will learn:

What isolation level is used for Internet project mysql?

The principle of MVCC

Does MVCC solve illusionary problems?

Concurrency Control We know that Concurrency Concurrency Control (MVCC) improves system performance when read/write operations of different things run concurrently. Let’s start with the problem of Concurrency.

Problems encountered with concurrent execution of transactions

The body of Part 2 begins here. Mysql is a C/S server architecture software, each client connection is a session, normally speaking, if the session queue execution, then there is no problem. But the performance is also poor. If you execute multiple sessions concurrently, you give up some isolation. There are generally the following isolation levels:

  • Dirty Write

  • If one transaction modifies data that has been modified by another uncommitted transaction, a dirty write has occurred

  • Dirty Read

  • If a transaction reads data that has been modified by another uncommitted transaction, a dirty read has occurred

  • Non-repeatable Read

  • If a transaction can only read data that has been modified by another committed transaction, and the transaction can query for the latest value each time the data is modified and committed by another transaction, then an unrepeatable read has occurred

  • Phantom

  • If one transaction first queries records based on certain conditions, and then another transaction inserts records that meet those conditions, the original transaction can read the records inserted by the other transaction when it queries according to those conditions, it means that a phantom read has occurred

The four isolation levels in the SQL standard

  • READ UNCOMMITTED: Indicates that the READ is not committed. (RU)

  • READ COMMITTED: Indicates that the READ is COMMITTED. (RC)

  • REPEATABLE READ: REPEATABLE READ. (RR)

  • SERIALIZABLE: SERIALIZABLE.

At the RU isolation level, dirty read, unrepeatable read, and phantom read problems may occur.

At the RC isolation level, unrepeatable and phantom read problems may occur, but dirty read problems cannot occur.

Phantom read problems may occur in RR isolation, but dirty read and unrepeatable read problems cannot occur.

None of the problems can occur at the SERIALIZABLE isolation level. Worst performance.

Must databases implement SQL standards?

The four isolation levels specified in the SQL standard are supported by different database vendors. For example, Oracle supports only READ COMMITTED and SERIALIZABLE isolation levels.

The default isolation level of MySQL is REPEATABLE READ (RR).

What isolation level is used for Internet project mysql?

Read Commited isolation levels are commonly used in Internet projects. Read UnCommitted and Serializable isolation levels are not used in projects for the following two reasons:

  • In Read UnCommitted, a transaction reads data when another transaction does not commit Read data.

  • Serializable: Locks are added to each read operation and snapshot reads fail. This isolation level is only available when mysql is equipped with distributed transaction functionality. Is a strong consistent transaction, poor performance! Internet distributed scheme, the use of the final consistency of the transaction solution!

Why select Read Commited as transaction isolation level?

  • In Internet projects, it’s basically understandable to read about other things that have been submitted.

  • In RR isolation level, a table is locked if a condition column does not match an index! At the RC isolation level, only rows are locked.

  • Because MySQL RR requires gap lock to solve phantom reading problems. The RC isolation level allows non-repeatable and phantom reads. So RC concurrency is generally better than RR concurrency. And the resulting deadlock is much more likely than RC!

  • At RC isolation levels, the semi-consistent feature increases the concurrency of UPDATE operations! “Semi-consistent reading: At the RC transaction isolation level, the Update statement can take advantage of the semi-consistent read feature, which makes an extra judgment. If the records matched by the WHERE condition do not conflict with the records in the transaction that currently holds the lock, the InnoDB lock will be released early, although this violates the two-phase locking protocol. But it can reduce lock conflicts and improve transaction concurrency, which is a good optimization behavior.”

    We’ll share about locks later

The principle of MVCC

Finally, MVCC, mysql is so widely used, and its good performance has a certain relationship, performance and MVCC. MVCC applies to RR and RC transaction isolation levels.

Version of the chain

To understand MVCC, you must know version chains.

For tables using InnoDB storage engine, there are two necessary hidden columns in the clustered index record (row_ID is not necessary, we create tables with primary keys or non-null UNIQUE keys that do not contain row_ID columns) :

  • Trx_id: Each time a transaction changes a cluster index record, the transaction ID of that transaction is assigned to the trx_ID hidden column.

  • Roll_ pointer: Every time a change is made to a clustered index record, the old version is written to the Undo log. The hidden column is then used as a pointer to the previous record.

An undo log is logged every time a change is made to the record, and each undo log has a roll_pointer attribute (the undo log for INSERT does not have this attribute because there is no earlier version of the record). You can concatenate these undo logs into a linked list. So now it looks like this:

ReadView data structure

Why do I think MVCC mainly works on the isolation levels of RC and RR?

For transactions that use the RU isolation level, since records modified by uncommitted transactions can be read, it is ok to simply read the latest version of the record.

For SERIALIZABLE isolation level, Innodb uses locking to access records.

The ReadView data structure has several important properties:

  • M_ids: represents a list of transaction ids of the read and write transactions currently active in the system at the time the ReadView was generated.

  • Min_trx_id: Indicates the smallest transaction ID of the active read/write transaction in the system at the time the ReadView was generated, that is, the minimum value in m_IDS.

  • Max_trx_id: indicates the id value in the system that should be assigned to the next transaction when the ReadView is generated.

  • Creator_trx_id: indicates the transaction ID of the transaction that generated this ReadView.

Check whether version data is visible:

  1. Trx_id = creator_trx_id; If the value of the accessed version’s trx_id attribute is the same as the creator_trx_id value in ReadView, it means that the current transaction is accessing its own modified record, so the version can be accessed by the current transaction.

  2. Trx_id < min_trx_id; If the value of the accessed version’s trx_id attribute is less than the value of min_trx_id in ReadView, the transaction that generated the version was committed before the current transaction generated the ReadView, so the version can be accessed by the current transaction.

  3. Trx_id > max_trx_id; If the value of the accessed version’s trx_id attribute is greater than or equal to the value of max_trx_id in ReadView, the transaction that generated the version was started after the current transaction generated the ReadView, so the version cannot be accessed by the current transaction.

  4. Min_trx_id < trx_id < max_trx_id; If the value of the trx_id attribute of the accessed version is between min_trx_id and max_trx_id of the ReadView, then you need to check whether the trx_id attribute value is in the m_IDS list. If so, the transaction that generated the ReadView was active when it was created. The version is not accessible; If not, the transaction that generated the version of the ReadView when it was created has been committed and the version can be accessed.

Differences between RR and RC isolation levels:

Readviews are generated at different times:

  • RC generates a ReadView before each normal SELECT operation, with new m_IDS, MIN_trx_ID, max_trx_id, and creator_trx_id attributes generated each time. So it’s not repeatable.

  • RR only generates a ReadView before the first normal SELECT operation, and this ReadView is reused for subsequent query operations. M_ids, MIN_trx_id, max_trx_id, and creator_trx_id are the same. So repeatable read is implemented.

MVCC does not solve phantom problems

MVCC does not solve the illusory problem, otherwise RR isolation level would be illusory. For example,

begin; Select * from users where user_id = 1; select * from users where user_id = 1; Select * from users where id=1; Update users set name='mysql' where id=1; Select * from users; select * from users; Select * from MVCC where id = 1;Copy the code

It can be seen that the data line is inconsistent before and after the detection, there is a phantom read. So MVCC alone can not solve the illusion problem, solve the illusion problem by gap locking. Next time we’ll talk about locks. So to solve the illusionary problem at RR level, we need to explicitly lock.

It can be seen that InnoDB uses MVCC to solve the problem of read and write conflicts, greatly improving the concurrency of the database. Do you understand MVCC like this?

I am Xiao Wang Ge, welcome to follow, like, leave a message, pat brick