The transaction

What is a transaction?

Before we know anything about a transaction, we need to understand: What is a transaction?

A transaction is a set of the smallest unit of execution that is not separable. That is, the data in a transaction must be either completely executed or not executed.

Transactions can only be used by storage engines in Mysql that support transactions, such as InnoDB.

Four characteristics of transactions


atomic

The atomicity of a transaction requires that a set of operations within a transaction be atomic and non-breakable, and if an interruption occurs, all changes are rolled back. That is, the transaction succeeds only if all operations are successful.

For example, A transfer operation is divided into two parts: the amount deducted from account A and the corresponding amount added to account B. The transaction is successful and the transfer is complete only if both parts of the operation succeed. If the first part succeeds and the second part fails, the whole transaction fails and the amount of account A does not decrease and account B does not increase.

consistency

Consistency refers to the movement of the database from one consistent state to another consistent state before and after a transaction is executed. That is, the consistency of the database is not compromised by the execution of a transaction.

For example turn money, if before turn money, the account total of A and B is 100 yuan, the account total of A and B after turn money also should be 100 yuan, won’t increase, also won’t decrease.

Isolation,

Isolation means that when multiple transactions are executed in parallel, they do not affect each other. That is, the operation of transaction A is transparent to the operation of transaction B.

For example, if account A transfers money to account B, the transfer from account C to account A cannot be affected.

persistence

Persistence means that the result of an operation after the execution of a transaction is visible to subsequent operations and does not become invisible when the server restarts.

Transaction implementation

atomic

Atomicity of transactions is implemented by redo logs. When to commit the transaction, will use to send the data, two-phase commit if in the process of submitting, server outage had an accident, when the server starts, will through the redo log to restore data in memory, if it is found that the data in the redo log has not fully submitted, will directly rolled back before the operation.

consistency

Mysql uses undo log to ensure data consistency. If the transaction fails during execution, or if an unknown error occurs, undo log is used to roll back previous operations. In the transaction isolation level, undo log is also used to implement Mysql MVCC mechanism to achieve repeatable read transaction isolation level.

Isolation,

If multiple transactions modify the same data, locks and MVCC will be used to ensure that the data modified by the current transaction has no impact on other transactions.

For example, if a transaction is reading data that is being deleted or modified by another transaction, Mysql will not wait for the lock to be released, but will read its snapshot version to retrieve the data.

persistence

The redo log is also used to implement persistence. When Mysql makes changes to data, it does not immediately dump the changes to disk. Instead, it writes redo logs to ensure that the changes are stored in the redo log.

Why are transactions needed?

Transactions are used when a set of operations is required to either succeed or fail at the same time. For example, in the transfer process, if the amount is deducted from A, but before the amount is added to B, the server breaks down unexpectedly, resulting in the interruption of operation. If the transaction is not used at this time, the amount of account A will be reduced, but the amount of account B will not increase, resulting in an error.

When using transactions, you can make use of the four features of transactions to ensure data consistency and integrity and avoid the generation of dirty data.

The isolation level of the transaction

Problems with concurrent transactions

Data update loss: If two transactions are unaware of each other or unaware of each other’s existence, the second transaction will overwrite the data updated by the first transaction during update, resulting in data loss.

Dirty read: A dirty read indicates that the data read for the second time is inconsistent with the data read for the first time. Because the read transaction is an uncommitted transaction, a rollback of the transaction will result in a dirty read if the second read is different from the first read.

Non-repeatable read: A non-repeatable read is a transaction in which the results of two queries are inconsistent. For example, if transaction A reads normal data for the first time, and transaction B modifies it at this time, the data detected by transaction A is the modified reference of transaction B, resulting in data inconsistency between the two queries.

Phantom read: Phantom read means that at the beginning of A transaction to query data, the query cannot be, but when A transaction to query data for the second time, because other transactions insert A new data, A transaction can query data, resulting in phantom read.

  • Read uncommitted

Read uncommitted means that one transaction can read data modified in another uncommitted transaction. For example, transaction A can read data modified in transaction B, but transaction B has not committed yet.

Because the B transaction has not yet committed, it causes dirty reads, phantom reads, and unrepeatable reads.

  • Reading has been submitted

Read committed indicates that a transaction can read only the data that has been committed by other transactions. Data that has not been committed cannot be queried. Dirty reads can be solved, but unrepeatable reads and phantom reads may occur.

  • Repeatable read

Repeatable read: Repeatable read can only read the submitted data, but cannot read the unsubmitted data. Then with the help of MVCC mechanism, add the version chain and modify the creation period of ReadView, to solve the situation of non-repeatable read, refer to MVCC mechanism.

  • serialization

One of the highest transaction isolation levels requires that all transactions must be executed serially, so concurrency is particularly inefficient.

Classification of transactions

Flat transaction

A flat transaction is the simplest transaction in which all operations are performed at one level, starting with BEGIN work and ending with rollback or COMMIT. So everything in between is either done or not done.

Flat transactions with savepoints

Flat transaction with savepoint refers to the concept of savepoint added to the basis of flat transaction. The concept of savepoint can be used when a transaction has a large amount of execution data and an error occurs when the transaction is executed to a certain part, and the cost of rollback is high if all of the rollback is accepted. If the execution can be rolled to any of the previously set savepoints, all operations between the rollback savepoint and the rollback command are rolled back. Other operations have no effect. When finally committed, only unrolled operations are committed.

Chain business

Linked transactions are more like a variation of flat transactions with savepoints. When a flat transaction with savepoints is executed, if the system crashes, all savepoints disappear and the transaction needs to be re-executed. Chain transactions treat savepoints more like a link location, passing necessary data to the next transaction when the previous transaction commits, and the end of the previous transaction and the beginning of the next transaction will execute atomically, more like in the same transaction. The pattern of chain transactions is shown below

The nested transaction

Nested transactions refer to a transaction that can contain sub-transactions, and sub-transactions can continue to contain sub-transactions, forming a transaction tree. When a child transaction is rolled back, only operations in the child transaction are rolled back. When the parent transaction is rolled back, the parent transaction and its children are rolled back. The child transaction is committed without actually committing, but only together when the parent transaction commits.

Distributed transaction

Distributed transactions are generally flat transactions that run in a distributed environment. In a distributed environment, operations on and data may be distributed on different nodes. If a transaction needs to be rolled back, data on both nodes needs to be rolled back at the same time.