MySQL transaction

Transaction basics —-ACID

A transaction is a logical processing unit consisting of a set of SQL statements with four properties, often referred to simply as the ACID property of the transaction.

  • A (Atomicity) : All operations in the whole transaction are either completed or not completed, and cannot be stopped at some intermediate stage. If a transaction fails during execution, it will be rolled back to the state before the transaction began, as if the transaction had never been executed.
  • C (Consistency) Consistency: The integrity constraint of the database is not broken before and after the transaction.
  • I (Isolation) Isolation: The execution of a transaction cannot be interfered by other transactions. That is, the operations and data used within a transaction are isolated from other concurrent transactions. Concurrent transactions cannot interfere with each other.
  • D (Durability) : Changes made to the database by the transaction persist in the database and do not get rolled back after the transaction completes.

Problems with concurrent transaction processing

  • Lost Update: When transaction A and transaction B select the same row and then Update the row based on the value originally selected, the Lost Update problem occurs because both transactions are unaware of each other’s existence
  • Dirty Reads: When transaction A Reads the data updated by transaction B and then transaction B rolls back the data, A Reads Dirty data
  • Non-repeatable Reads: When transaction A Reads the same data for many times, transaction B updates and commits the data during the process of reading the same data for many times, resulting in inconsistent results when transaction A Reads the same data for many times.
  • Phantom Reads: Phantom Reads are similar to unrepeatable Reads. It occurs when one transaction, A, reads A few rows of data, and then another concurrent transaction, B, inserts some data. In subsequent queries, transaction A will find more records that did not originally exist, as if an illusion occurred, so it is called phantom read.

The difference between phantom and unrepeatable reads:

  • The point of non-repeatable reads is to modify: in the same transaction, the data read for the first time is different from the data read for the second time under the same conditions. (Because other transactions committed changes in the middle)
  • The key point of magic reading is to add or delete: in the same transaction, under the same conditions, the first and second read records are not the same. (Because other transactions committed insert/delete in the middle)

Solutions to the problems associated with concurrent transactions:

  • “Update lost” is usually something that should be avoided entirely. However, preventing update loss cannot be solved by the database transaction controller alone, but requires the application to add necessary locks to the data to be updated. Therefore, preventing update loss should be the responsibility of the application.

  • “Dirty reads”, “unrepeatable reads”, and “phantom reads” are actually database read consistency problems, which must be solved by the database to provide certain transaction isolation mechanism:

Transaction isolation level

There are four isolation levels for database transactions, from lowest to highest

  • Read-uncommitted: The lowest isolation level that allows UNCOMMITTED data changes to be READ, potentially resulting in dirty, illusory, or unrepeatable reads.
  • Read-committed: Allows concurrent transactions to be READ, preventing dirty reads, but magic or unrepeatable reads can still occur.
  • REPEATABLE-READ:Multiple reads of the same field give the same result, unless the data was modified by the transaction itself.Dirty and unrepeatable reads can be prevented, but phantom reads can still occur.MySQL default transaction isolation level.
  • SERIALIZABLE: Highest isolation level, fully subject to ACID isolation level. All transactions are executed one by one so that interference between transactions is completely impossible. That is, this level prevents dirty reads, unrepeatable reads, and phantom reads.

It should be noted that the transaction isolation level and the concurrency of data access are antithetical, and the higher the transaction isolation level, the worse the concurrency. The appropriate transaction isolation level depends on the specific application. The default isolation level supported by MySQL InnoDB storage engine is REPEATABLE-READ.

Here are some things to note: Unlike the SQL standard, InnoDB storage engine uses the next-key Lock algorithm in REPEATABLE READ transaction isolation level, thus avoiding phantom reads, unlike other database systems such as SQL Server. So the default isolation level of InnoDB storage engine is REPEATABLE READ, which can fully guarantee the isolation requirements of transactions, that is, meet the SERIALIZABLE isolation level of SQL standard, and retain good concurrency performance.

MVCC multi-version concurrency control

Most of MySQL’s transactional storage engine implementations are not simple row-level locking. In order to improve concurrency, multi-version concurrency control (MVCC) is generally implemented at the same time, but the implementation mechanism is different.

MVCC can be considered a variant of row-level locking, but it avoids locking in many cases and is therefore less expensive. Implementation mechanisms vary, but most implement non-blocking reads, and writes lock only the necessary rows.

MVCC is implemented by saving a snapshot of the data at a point in time. This means that no matter how long it takes to execute, everything will see the same data.

MVCC implementation principle

The purpose of MVCC is multi-version concurrency control. The realization of MVCC in the database is to solve Read and write conflicts. Its realization principle is mainly based on the three implicit fields in the record, undo log and Read View. So let’s take a look at this three-point concept

Three implicit fields

DB_TRX_ID, DB_ROLL_PTR, and DB_ROW_ID are implicitly defined by the database

  • DB_TRX_ID: recently modified (modified/inserted) Transaction ID: Specifies the ID of the transaction in which the record was created or last modified
  • DB_ROLL_PTR: rollback pointer with undo log pointing to the previous version of this record
  • DB_ROW_ID: implicit increment ID (hide primary key)

The undo log

Undo logs are classified into two types:

  • insert undo log

The undo log, which represents the transaction generated when a new record is inserted, is only needed when a transaction is rolled back and can be discarded immediately after a transaction is committed

  • update undo log

The undo log generated when the transaction is update or DELETE; Not only when a transaction is rolled back, but also when a snapshot is read

Read View Read View

When a snapshot is Read, a snapshot of the current database system is generated. The ID of the current active transaction is recorded and maintained. (When each transaction is started, Each transaction is assigned an ID, which is incremented, so the latest transaction has a larger ID.)

What are current and snapshot reads?

Current read: it reads the latest version of the record. When reading, it locks the read record to prevent it from being modified by other concurrent transactions.

select lock in share mode , select for update; update; insert; Delete These operations are current reads

Snapshot read: The unlocked SELECT operation is snapshot read, that is, unlocked non-blocking read. The implementation of snapshot read is based on multi-version concurrency control. Since snapshot read is based on multi-version, snapshot read may not necessarily be the latest version of data, but may be the previous historical version

We can simply think of a Read View as having three global properties

  • The list:Use to maintain Read View to generate a list of transaction ids that are active on the system at any time
  • Up_limit_id:Minimum value of the current active transaction ID
  • Low_limit_id:The next transaction ID that has not been assigned by the system
  1. DB_TRX_ID < up_limit_id = DB_TRX_ID = up_limit_id = up_limit_id = up_limit_id = up_limit_id = up_limit_id = up_limit_id = up_limit_id = up_limit_id = up_limit_id
  2. DB_TRX_ID >= low_limit_id if the value is greater than or equal to DB_TRX_ID is not visible until the Read View is generated
  3. DB_TRX_ID specifies whether the transaction is active at the time the Read View is generated. If the transaction is active at the time the Read View is generated, the transaction has not been committed. The result of the modification is visible

MVCC only works at COMMITTED READ and REPEATABLE READ isolation levels.

What are the differences between RC and RR snapshot reads?

RC and RR use the MVCC mechanism. Why RC can avoid dirty read problems but RR can avoid duplicate read problems? Because the Read View is generated at different times!

  • RC: A new Read View is generated each time a snapshot Read is performed

  • RR: The Read View is generated when a snapshot Read is performed for the first time, and this version remains for the rest of the time

Locking mechanism

A lock is a mechanism by which a computer coordinates concurrent access to a resource by multiple processes or threads. In simple terms, database locking mechanism is a rule designed by the database to ensure the consistency of data and make all kinds of shared resources in order to be accessed concurrently.

Mysql > select * from user where id = 1;

According to the granularity of locks: row locks, table locks, page locks

By lock usage: shared locks, exclusive locks (an implementation of pessimistic locks)

There are also two kinds of locks on the mind: the pessimistic lock and the optimistic lock.

Row locks

Row-level locking is the most fine-grained lock in Mysql. It only locks the current row. Row-level locking can greatly reduce conflicts in database operations. Its locking particle size is the smallest, but the locking cost is also the largest. Deadlock situations can occur. Row-level locks are divided into shared locks and exclusive locks according to their usage.

Shared lock usage:

select … lock in share mode;

Exclusive lock usage:

select … for update

Table locks

Table level locks are the largest type of mysql locks, indicating that the current operation locks the entire table. The resource cost is lower than that of row locks, and deadlocks do not occur, but lock conflicts are likely to occur. Table locking is supported by most mysql engines. MyISAM and InnoDB both support table locking, but InnoDB uses row locking by default.

Shared lock usage:

LOCK TABLE table_name READ

Exclusive lock usage:

LOCK TABLE table_name WRITE

Unlock usage:

unlock tables;

Page locks

The overhead and locking time are between table and row locks. Deadlocks occur; The locking granularity is between table locks and row locks, and the concurrency is average.

Application: From the lock point of view, table-level lock is more suitable for the query based, only a small amount of data update according to the index conditions of the application, such as Web applications; Row-level locking is more suitable for applications with a large number of concurrent updates of a small amount of different data based on index conditions and concurrent queries.

Row locks Table locks Page locks
MyISAM Square root
BDB Square root Square root
InnoDB Square root Square root
Memory Square root

Optimistic and pessimistic locks

Both pessimistic lock and optimistic lock are concepts defined by people and can be considered as an idea.

  • Pessimistic locking: To put it simply, concurrency control is carried out from the database level. For example, row lock and table lock of InnoDB engine are pessimistic locks.
  • Optimistic locking: Performs concurrency control from the application level to the system level. Generally, there are few concurrent data conflicts by default. Therefore, it is common to add the version number field to the data table and verify the version number before updating the data each time

InnoDB row locks

InnoDB implements two types of row locks: shared lock and exclusive lock.

InnoDB also has two types of Intention Locks for internal use, both of which are table Locks, in order to allow row and table Locks to coexist and achieve multi-grained locking:

  • Intended shared lock (IS) : a transaction that intends to assign a shared lock to a row must acquire an IS lock on that table before assigning a shared lock to a row.
  • Intentional exclusive lock (IX) : a transaction that intends to lock a row exclusively must acquire an IX lock on the table before it can lock a row exclusively.

Intent locks are not mutually exclusive with row-level shared/exclusive locks!!

  • Record Locks

Locks on a single row record. Locks index entries (unique or primary key) to lock eligible rows. Other transactions cannot modify or delete locked entries;

  • Gap locks

When we retrieve data using a range condition rather than an equality condition and request a shared or exclusive lock, InnoDB locks the index entries of existing data records that meet the condition. Records whose key values are in the condition range but do not exist are called gaps.

InnoDB also locks this “gap”. This locking mechanism is called gap locking.

Gap locking is based on non-unique indexes and locks index records within a range. Gap Locking is based on the next-key Locking algorithm that will be mentioned below. Please bear in mind that gap Locking is used to lock an interval, not just every data in the interval.

  • Next-key locks

Next-key can be understood as a special kind of gap lock, or as a special algorithm. The illusion problem can be solved by temporary lock. A key lock exists on a non-unique index column of each data row. When a transaction holds a key lock on that data row, it will lock the data in a range of open and closed data. It is important to note that InnoDB row-level locking is index-based. Temporary locks are only associated with non-unique index columns. There are no temporary locks on unique index columns (including primary key columns).

When InnoDB performs UPDATE \ FOR UPDATE \ LOCK IN SHARE MODE on a row based on a non-unique index, InnoDB obtains a temporary key LOCK FOR the row and a gap LOCK FOR the next interval of the row.