Disclaimer: The MySQL column learning series is basically my notes on the contents of the column “MySQL Practice 45” in geek time, and I will dig knowledge points on the basis of the column. Assault delete.

I am not a DBA, so please correct some mistakes, exchange with each other and make progress together!

☞ welfare resources | CSDN download service free of charge

MySQL column learning series

01 What does MySQL do when it executes a SELECT statement?

02 How can MySQL be restored to any second state?

If you have used a relational database, you probably know the concept of a transaction and know that a transaction has four basic properties of ACID: Atomicity, Consistency, Isolation, and Durability, today we focus on transaction Isolation.

What is a transaction?

Database transaction (referred to as transaction) is a logical unit in the execution of database management system. It consists of a limited sequence of database operations. — Wikipedia

The concept of transactions does not seem difficult, but there are a few points to note:

1. First, a transaction is a set of database operations that either all succeed or all fail.

2. In MySQL, transaction support is implemented at the engine level;

Not all engines support transactions. MyISAM does not support transactions. InnoDB does.


Today, we are going to focus on isolation. Isolation is when multiple users work on a database concurrently, and the database opens a different transaction for each user. These transactions do not interfere with each other and are isolated from each other.

Why isolation?

If transactions are not isolated from each other, the following problems may occur.

1, dirty reads

In simple terms, dirty reads are when a transaction reads uncommitted data from another transaction during processing.

This uncommitted data is called dirty data. Operations based on dirty data may be incorrect.

Remember when we talked about dirty Pages in the last section? This temporary processing of unsubmitted, is “dirty”.

For example,

However, if the transaction does not commit successfully, all operations will eventually be rolled back, and the editor will not see a penny. For example, if you give a penny to the small editor, the whole transaction takes two steps:

① Add a penny to the small make up account, then small make up saw, feel very pleased; ② Take a penny off your account;

sad

2. Do not read repeatedly

Non-repeatable read refers to that the same data is queried multiple times within a transaction scope but different results are obtained.

Between the two reads in the first transaction, the data read by the first transaction may not be the same because of changes made by the second transaction.

For example,

Following the last example, suppose you did give a penny to the editor. The editor happily went to cash out. When he checked, he found a penny more.

At the same time, before I have not successfully withdraw cash, xiaobian’s wife has advanced the money to go, xiaobian at this time to check the accounts again, found that there is no money.

Cry again

Confused about dirty and unrepeatable reads?

The difference between dirty reads and uncommitted reads is that one transaction reads uncommitted data from another transaction, while unrepeatable reads read committed data from another transaction.

In fact, in some cases, can not repeat reading is not a problem, for example, during the small series of cash withdrawal, a penny by his wife, this is not a problem!

Dirty reads can be avoided by setting isolation levels.

3, phantom read

Phantom read is a phenomenon that occurs when transactions are not executed independently.

For example, transaction T1 changes a data item of all rows in a table from “1” to “2”, and then transaction T2 inserts a row of data item “1” into the table and submits it to the database.

If a user of transaction T1 looks at the data he just modified, he will find that the data is still 1. In fact, this row was added from transaction T2, as if it were hallucinating, so this is hallucination happening.

For example,

In fact, the above explanation is already an example, but let me give you an example.

For example, xiaobian is ready to extract your reward of a penny, over extraction, then there are other enthusiastic users reward a penny, xiaobian a look, clearly has taken out, how there is a penny! ?

Xiaobian at this time think like a dream, I think it can also be called “dream reading”, ha ha.

Unreal and unrepeatable reads both read another transaction that has already been committed (dirty reads are different). The difference is that unrepeatable reads query for the same data item, whereas unreal reads are for a batch of data as a whole (such as the number of data).

The isolation level of the transaction

In order to solve the above problems, we need to set the isolation level, which is the rules by which transactions are isolated from each other, and to what extent transactions are isolated.

First, it is important to understand that the more isolated the transaction, the less efficient it will be.

ANSI/ISO SQL defines four standard isolation levels:

Serializable: the most expensive but reliable transaction isolation level.

Write will be added write lock, read will be added read lock. When a read-write lock conflict occurs, the last accessed transaction must wait for the previous transaction to complete before execution can continue.

Transactions are 100% isolated to avoid dirty reads, unrepeatable reads, and phantom reads.

② Repeatable Read (default level) : Reading the same range of data for many times will return the snapshot of the first query even if the data is updated and modified by other transactions. The data that a transaction sees during execution must be consistent.

However, if this transaction reads a record in a range and another transaction inserts a new record in that range, a phantom row will occur when the previous transaction reads the record in that range again. This is a phantom read.

Avoid dirty and unrepeatable reads. But illusions can occur.

③ Read COMMITTED: A transaction is guaranteed to be committed before it can be Read by another transaction. Another transaction cannot read uncommitted data from that transaction.

Dirty reads can be avoided, but may cause unrepeatable reads.

Read Committed is the default level for most databases, such as Sql Server and Oracle.

④ Read uncommitted: The lowest transaction isolation level, in which changes made by a transaction can be seen by other transactions before a transaction is committed.

There are no guarantees.

Isolation level

The following is a good example of the results that transaction B can read at each of the four transaction isolation levels.

The sample

Do you look at it or are you confused? Let’s do another example.

A, B two transactions, respectively made some operation, the operation process, look at the value of the variable under different isolation levels:

If the isolation level is serialized, it will be locked when transaction B performs “change 1 to 2”. Transaction B cannot continue execution until transaction A commits.

To summarize again

Read uncommitted: someone else’s transaction has not been committed, and I can read it in my transaction. Read Committed: someone else’s transaction has been committed so I can read it in my transaction. Repeatable read: someone else’s transaction has been committed and I don’t read it in my transaction. Serial: No one else can change data until my transaction is committed.

For each of the four isolation levels, parallel performance decreases and security improves.

In general, the higher the transaction isolation level is, the more data integrity and consistency can be guaranteed, but at the cost of low concurrent execution efficiency.


Implementation of isolation levels

The mechanism for transactions is concurrent versioning (MVCC) through read-views, which are created at different points in time for different transaction isolation levels.

  • Repeatable reads are a per-transaction rebuild read view that is used for the entire life of the transaction.

  • Read committed is created for each SQL create read view at the start of each SQL statement execution. The isolation scope is limited to this SQL statement only.

  • Read uncommitted is not created and returns the latest value on the record

  • The serialization isolation level uses locking directly to avoid parallel access.

A view can be regarded as a data copy. Each time a view is created, a copy of the current persistent data is created and the data is read directly from the copy to achieve data isolation.

Implementation of isolation levels

Every time we modify, we don’t operate directly on the row data.

For example, if we set row A with ID 3 to 10, instead of modifying the table directly, we add A new row.

At the same time, the data table actually has some hidden attributes, such as the transaction ID of each row, so each row of data may have multiple versions, each transaction that has modified it will have a row, and there will be associated undo log, indicating the original data of this operation, which can be used for rollback.

So why do it?

Because if we modify the data directly, other transactions will not use the original value, violating the consistency of the transaction.

So what does a transaction return when it reads a row?

Depending on the isolation level, Read Committed returns the value of the latest Committed transaction, so uncommitted transactions will not be Read. This is how the Read Committed implementation works.

If the value is Read Repeatable, only the commit value of transactions initiated earlier than the current transaction and the value of deleted transactions removed later than the current transaction can be returned. This is the MVCC approach.

undo log

The undo log stores the old version data. SQL > alter table id=2; alter table Name=’B’; alter table Name=’ B2′;

When an old transaction needs to read data, in order to read the old version of data, it needs to follow the undo chain to find records that meet its visibility. This can often be considered a time-consuming operation when the version chain is long.

If a value is changed from 1 to 2, 3, and 4 in order, the rollback log will have something like the following.

Roll back log

The current value is 4, but when querying this record, transactions started at different times will have different Read-views.

As you can see, in views A, B, and C, the value of this record is 1, 2, and 4, respectively. The same record can have multiple versions in the system, which is the database multi-version Concurrency Control (MVCC). For read-view A, to get to 1, the current value must be rolled back by all of the operations in the graph.

At the same time, you can see that even if there is another transaction changing 4 to 5, this transaction will not conflict with the corresponding transaction of read-view A, B, and C.


In addition, the undo log in the rollback segment is divided into: insert undo log and update undo log:

  • Insert undolog: Undolog generated when a transaction inserts a new record. It is only needed when a transaction is rolled back and can be discarded immediately after a transaction is committed. (Who needs visibility for just inserted data!!)

  • Update undo log: The undo log generated when a transaction deletes or updates a record. Not only is it required for transaction rollback, but it is also required for consistent reads, so it cannot be deleted. The purge thread removes the rollback log only if it is not involved in the snapshot used by the database.

When to delete?

Delete only when you don’t need to. That is, the system determines that the rollback logs will be deleted when no transactions need them anymore.

This is when there is no read-view older than the rollback log in the system.

Long transaction

A Long Transaction is a Transaction that takes a Long time to complete.

Long transactions are often encountered by database users and are very troublesome. Long transaction processing needs to be carried out properly. If it is not handled properly, the database may crash and bring unnecessary losses to users.

According to the above discussion, long transactions mean that there are old transaction views in the system.

Since these transactions can access any data in the database at any time, any undo logs that may be used in the database must be retained until the transaction commits, which can lead to a large storage footprint.

In MySQL 5.5 and earlier, rollback logs are stored in ibData files along with the data dictionary, and the file does not get smaller even if long transactions are eventually committed and rollback segments are cleaned.

In addition to the impact on rollback segments, long transactions also take up lock resources and can drag down entire libraries, which we’ll cover later when we talk about locking.

Therefore, we should try to avoid long transactions.

summary

This section focuses on the isolation levels of transactions, so keep a few of them in mind and learn how to implement them.

I feel things are a little messy. The author did not expand the MVCC stuff. I have limited ability, so I did not dig further. Follow-up, when the author involves relevant knowledge points, we will discuss again.

In addition, I have limited time, can not keep up with the column update speed, so interested or suggested to buy the column to learn:

Past wonderful

Share my Python learning resources

Python or Java?

Python captures the knowledge planet essence and generates an ebook

What is the last slash/in the url?

Teach you to read web pages in Python

Are you still using format to format strings?

Python to spin the Excel

↓↓↓ More than technology ↓↓↓

Add a drumstick to show my support!