Please go to DobbyKim’s Question of the Day for more questions
MySQL > MySQL > MySQL > MySQL > MySQL > MySQL > MySQL
What is a transaction
In computer terms, a Transaction is a unit of program execution that accesses and possibly updates various data items in a database.
Why do we need transactions?
Transaction is a solution to secure data operation. The control of transaction is actually to control the secure access and isolation of data.
Here’s a simple example:
In bank transfer, account A transfers its 1000 yuan to B, then the logic of business implementation is to reduce the balance of A by 1000 and then increase the balance of B by 1000. If there is an accident in this process, the process is interrupted. A has successfully deducted the money, but B has not had time to increase. It’s going to cause B to lose $1,000. So we must make the control to A account transfer service to cancel, it can ensure the correctness of the business, to complete this operation requires the transaction, will increase A account less money and B accounts on the same transaction, or execute completely successful, or fail, so as to ensure the safety of data.
MySQL is a multi-engine system, but not all engines support transactions. The MyISAM engine, for example, does not support transactions, which is one of the reasons why MyISAM was replaced by InnoDB.
Four features of MySQL transactions
MySQL transactions contain four major properties (ACID), Atomicity, Consistency, Isolation, Durability.
Atomicity
What is atomicity?
Atomicity refers to the fact that a transaction is the smallest and indivisible body of execution in an application. In other words, a transaction contains a series of operations that either all succeed or all fail. There is absolutely no partial success or partial failure.
Consistency
Consistency can be understood as adherence to heap data integrity constraints. These constraints may include primary key constraints, unique index constraints, foreign key constraints, and so on. The data is legal before and after the transaction and does not violate the integrity of any data.
For example: Take transfer, A and B add up to 5000 yuan, no matter how A and B transfer, several times, A and B add up to 5000 yuan forever.
Isolation
Isolation refers to the fact that when multiple users operate on the database in a concurrent manner, such as on the same table, the things that the database opens for each user cannot be disturbed or affected by other transactions, and transactions are isolated from each other.
They want to be permanent.
Permanence means that once a transaction is committed, its changes to the data in the database are permanent. Even if the database fails, the committed data will not be lost.
Isolation and isolation level
When multiple transactions are executed on the database at the same time, dirty reads, non-repeatable reads, phantom reads and other problems may occur. To solve these problems, the concept of MySQL “isolation level” is developed.
Before we get into MySQL’s “isolation level”, let’s take a look at dirty reads, phantom reads, and unrepeatable reads.
- Dirty read
Dirty read refers to that when transaction A is accessing data and modifying the data, but the modification has not been committed to the database, another transaction B also accesses the data and uses the data. As A result, transaction A rolls back, and transaction B reads dirty data.
Schematic diagram:
- Unrepeatable read
Non-repeatable reads refer to multiple reads of the same data within a transaction. For example, transaction B reads some data, and before transaction B finishes, another transaction A accesses the same data and modifies the data. Therefore, between the two reads of transaction B, the data read by transaction B may be different due to the modification of the data by transaction A, which is called unrepeatable read. For example, an editor reads the same document twice, but in between reads, the author overwrites the document. When the editor reads the document the second time, the document has changed. The original read cannot be repeated. This problem can be avoided if the editor can only read the document after the author has finished writing it.
Schematic diagram:
- Phantom read
Phantom reading is when the same query is executed multiple times throughout the transaction with different results.
For example, if transaction B updates all the records in the table, and transaction A inserts another record before committing, then when transaction B reads the database again, another record (the newly inserted record of transaction A) is not updated.
Schematic diagram:
The isolation level of the transaction
Before talking about the isolation level of a transaction, you need to know that the higher the isolation level of a transaction, the less efficient it becomes. So a lot of times, we need to find a balance between the two.
The transaction isolation levels of THE SQL standard include read uncommitted, Read Committed, Repeatable Read, and serializable.
-
Read uncommitted means that changes made by a transaction can be seen by other transactions before it is committed.
-
Read commit means that after a transaction commits, its changes are seen by other transactions.
-
Repeatable read means that the data seen during the execution of a transaction is always the same as the data seen at the start of the transaction. Of course, at the repeatable read isolation level, uncommitted changes are also invisible to other transactions.
-
Serialization, as the name implies, is that for the same row, “write” will be added “write lock”, “read” will be added “read lock”. When a read-write lock conflict occurs, the last accessed transaction must wait for the previous transaction to complete before execution can continue.
Example:
Table T has only one column, and one row has the value 1;
create table T(c int) engine=InnoDB;
insert into T(c) values(1);
Copy the code
Let’s see what transaction A returns at different isolation levels, namely the values of V1, V2, and V3 in the figure.
- If the isolation level is read uncommitted, the value of V1 is 2. Transaction B has not committed yet, but the result is already seen by A. This leads to dirty reads, so V2 and V3 are also 2.
- If the isolation level is “read Commit”, dirty reads can be avoided and V1 has a value of 1. However, after transaction B commits, the value V2 obtained by transaction A query is 2, which causes the original read to be unrepeatable, i.e., unrepeatable read. V3 also has a value of 2.
- If the isolation level is Repeatable, V1, V2 are 1, and V3 is 2. Because the isolation level is set to repeatable reads, the data seen during the transaction must be consistent, so the value of the V2 query is 1.
- If the isolation level is serialized, transaction B will be locked when it performs “change 1 to 2”. Transaction B cannot continue execution until transaction A commits. So from A’s point of view, V1, V2 is 1, V3 is 2.
In implementation, a view is created in the database and accessed based on the logical result of the view.
- RU (read uncommitted) isolation level returns the latest value on the record directly, without the concept of view
- At the RC isolation level, the MVCC view is created before each statement, so at the RC level, a transaction can see what another transaction has committed because it creates a new MVCC view with the latest data before each query
- In RR isolation, the MVCC view is created at the beginning of a transaction and is used until the end of the transaction.
- There is also no view under the Serializable isolation level, which isolates data access through locks.
It is worth mentioning that the default isolation level of Oracle database is RC, so if some applications are migrated from Oracle to MySQL, the isolation level of MySQL needs to be set to RC (read commit) to ensure the consistency of data isolation level.
Implementation of transaction isolation
In MySQL, each update record is recorded in the redo log, and a rollback record of the reverse change is recorded in the Undo log.
Suppose a value is changed from 1 to 2,3,4 in order. In the undo log, there is a record similar to the following:
The current value is 4, but when querying this record, transactions started at different times will have different Read-views. As you can see, in views A, B, and C, the value of this record is 1, 2, and 4, respectively. The same record can have multiple versions in the system, which is the database multi-version Concurrency Control (MVCC).
For read-view A, to get to 1, the current value must be rolled back by all of the operations in the graph. At the same time, you can see that even if there is another transaction changing 4 to 5, this transaction will not conflict with the corresponding transaction of read-view A, B, and C. Rollback logs cannot be kept forever, you may ask. When should they be deleted?
The answer is, delete when you don’t need to. That is, the system determines that the rollback logs will be deleted when no transactions need them anymore. When does it stop? This is when there is no read-view older than the rollback log in the system.
Why not use long transactions if possible?
The answer is: long transactions cause the undo log to remain undeleted.
Long transactions mean that there are old transaction views in the system. Since these transactions can access any data in the database at any time, any rollback records that may be used in the database must be retained until the transaction commits, which can lead to a large storage footprint.
How a transaction is started
MySQL can start transactions in the following ways:
- Displays the start transaction statement, BEGIN or Start transaction. The corresponding commit statements are COMMIT and rollback statements are rollback.
- Set autocommit = 0, this command will turn off autocommit for this thread. This means that if you only execute a SELECT statement, the transaction is started and will not commit automatically. The transaction persists until you actively execute a COMMIT or ROLLBACK statement, or disconnect.