Four major properties (ACID)

  1. Atomicity
  2. Consistency
  3. Isolation
  4. “Durability”

atomic

All operations in a transaction either complete or do not complete and do not end somewhere in between. If a transaction fails during execution, it will be rolled back to the state before the transaction began, as if the transaction had never been executed.

Simply put: a transaction either completes or fails

The principle of

InnoDB storage engine provides undo log(rollback log). In the Undo log, records that are contrary to the operation are saved. If the transaction fails, rollback will be performed. Example: Write two INSERTS, which will record the corresponding two DELETES.

persistence

After a transaction, changes to the data are permanent and will not be lost even if the system fails.

redo log

Data is stored in disk, and data needs to be read from disk every time, which is very inefficient. Therefore, innoDB provides cache. Every time data is written into the cache, data in the cache will be periodically refreshed on disk.

This raises the question, what if the server goes down after the data is written to the cache? Is the data lost? If real data were lost, no one would use mysql. Write to redo log before writing to cache. In the event of a downtime, data can be recovered from the redo log and the cache can be updated, ensuring data persistence.

consistency

The integrity of the database is not compromised before and after a transaction. This means that the data written must conform to all the preset rules, including the accuracy of the data, the concatenation of the data, and the ability of the subsequent database to do its predetermined work spontaneously.

Isolation,

The ability of a database to allow multiple concurrent transactions to read, write, and modify its data at the same time. Isolation prevents data inconsistencies due to cross-execution when multiple transactions are executed concurrently.

To put it simply: isolation pursues non-interference between transactions in concurrent situations.

Four levels of isolation

The lower the isolation level, the lower the system overhead, the higher the concurrency that can be supported, but the less isolation.

  1. Read uncommitted
  2. Read Committed
  3. Repeatable read
  4. Serializable

Understanding the four isolation levels starts with understanding dirty reads, unrepeatable reads, and phantom reads

  1. Dirty read: Uncommitted data can be read from other transactions (B) in the current transaction (A).
time A transaction Transaction B
T1 start start
T2 Select * from math where math score is 59
T3 Change zhang SAN’s math score to 60
T4 Select * from math where math score is 60
T5 Commit the transaction

We can clearly see that transaction B commits at time T5, while transaction A has read the contents of transaction B’s commit at time T4.

  1. Non-repeatable read: the same data is read twice in transaction A, and the two reads have different results.
time A transaction Transaction B
T1 start start
T2 Select * from math where math score is 59
T3 Change zhang SAN’s math score to 60
T4 Commit the transaction
T5 The math score for reading Zhang SAN is 60
As you can clearly see, transaction A gets different results when it reads The math score of John two times.

The difference between a non-repeatable read and a dirty read is that the data is read after being committed and that the data is read without being committed.

  1. Unreal: Query the database twice in transaction A for A certain condition, and the results of the two queries are different.
time A transaction Transaction B
T1 start start
T2 Select * from student where math > 60
T3 Insert Zhang SAN English score, score is 59
T4 Commit the transaction
T5 Select * from math, English, math where score < 60
In the previous two queries, the number of data retrieved is different.

Unreal read differs from unrepeatable read in the number of records queried, while unrepeatable read differs in the number of records queried.

The difference between unrepeatable read and phantom read can be commonly understood as: the former is that the data changes, while the latter is that the number of rows of the data changes.

Problems with each isolation level

Isolation Level Level The problem
Read uncommitted Dirty reads, unrepeatable reads, and phantom reads
Reading has been submitted Non-repeatable read and phantom read
Repeatable read Phantom read
serialization no problem

In practice, read uncommit causes a lot of problems in concurrency, and is used less often because of limited performance gains relative to other isolation levels. Serializability forces transactions to be serialized, and concurrency efficiency is very low. It is only used when data consistency requirements are very high and no concurrency can be accepted, so it is rarely used. Therefore, on most database systems, the default isolation level is read committed or repeatable read. Mysql defaults to repeatable read.

MVCC

Concurrency Control Is a multi-version Concurrency Control protocol that addresses dirty reads, unrepeatable reads, and magic reads.

The biggest advantage of MVCC is that the read is not locked, so there is no conflict between read and write, and the concurrency performance is good. InnoDB implements MVCC, multiple versions of data can coexist, mainly based on the following technologies and data structures:

  1. Hidden columns: Each row in InnoDB has a hidden column, which contains the transaction ID of the row, a pointer to undo log, and so on.

  2. Undo log-based version chain: The hidden column of each row contains a pointer to the Undo log, and each Undo log points to the undo log of an earlier version, thus forming a version chain.

  3. ReadView: MySQL can restore data to the specified version by hiding columns and version chains; But which version to restore to depends on ReadView. The so-called ReadView refers to that the transaction takes A snapshot of the whole transaction system (TRX_SYS) at A certain moment, and then compares the transaction ID in the read data with the TRX_SYS snapshot to determine whether the data is visible to the ReadView, that is, whether it is visible to transaction A.

The main contents of trX_SYS and the methods used to determine visibility are as follows:

  1. Low_limit_id: indicates the ID that should be assigned to the next transaction in the system when ReadView is generated. If the transaction ID of the data is greater than or equal to low_limit_ID, the transaction has not been executed and is not visible to the ReadView.
  2. Up_limit_id: indicates the minimum transaction ID among active read/write transactions in the system when the ReadView is generated. If the transaction ID of the data is less than up_limit_ID, the transaction has been completed and is visible to the ReadView.
  3. Rw_trx_ids: represents the transaction ids of the active read and write transactions in the system when the ReadView is generated. If the transaction ID of the data is between LOW_limit_ID and up_limit_ID, check whether the transaction ID is in the RW_trx_IDS. If yes, the transaction is still active when the ReadView is generated, so the data is not visible to the ReadView. If not, the transaction was committed when the ReadView was generated, so the data is visible to the ReadView. As follows:

When the ReadView is generated when the data is read, the transaction ID in the read data will be compared with the TRX_SYS snapshot to determine whether the data is visible to the ReadView, that is, whether the data is visible to transaction A.

reference

www.cnblogs.com/kismetv/p/1…