Preface
This post is the first in the “Seeing is Believing” series. It covers the classic concurrency problems that can arise in a database, the main technique for dealing with them – locking – and the agreed rules that govern locking – locking protocols. It then briefly explains the relationship between database transaction isolation levels and locking protocols. The next few posts explore, hands-on, the implementation details of the InnoDB engine at each isolation level.
The posts in the series:
- 【Seeing is Believing】Database concurrency problems, locking protocols, and isolation levels
- 【Seeing is Believing】READ UNCOMMITTED && SERIALIZABLE
- 【Seeing is Believing】READ COMMITTED && MVCC
- 【Seeing is Believing】REPEATABLE READ && next-key Lock
The major database concurrency problems
① Lost Update
Two transactions, T1 and T2, read the same data and modify it concurrently; the result committed by T2 overwrites the result committed by T1, so T1’s change is lost (the second type of lost update).
There is also a special kind of lost update (the first type), shown below, in which one transaction’s rollback wipes out another transaction’s committed change. Because this type does not occur even at the READ UNCOMMITTED isolation level, it is not discussed further.
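To make the anomaly concrete, here is a minimal Python sketch (a thread-based simulation, not a real database) of the second type of lost update; the sleeps force the problematic interleaving:

```python
import threading
import time

balance = 100  # the shared "row" both transactions update

def transaction(delta, delay):
    global balance
    read_value = balance           # read the current value
    time.sleep(delay)              # the other transaction interleaves here
    balance = read_value + delta   # write back based on the stale read

# T1 subtracts 30, T2 subtracts 50; both read balance = 100 first
t1 = threading.Thread(target=transaction, args=(-30, 0.1))
t2 = threading.Thread(target=transaction, args=(-50, 0.2))
t1.start(); t2.start()
t1.join(); t2.join()

print(balance)  # 50, not the serial result 20: T1's update was lost
```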
② Non-repeatable Read
After transaction T1 reads some data, transaction T2 updates it, so that T1 cannot reproduce the result of its earlier read. There are three cases:
(1) After T1 reads a data item, T2 modifies it; when T1 reads the item again, it gets a value different from the one before.
(2) After T1 reads a set of records matching some condition, T2 deletes some of them; when T1 runs the same query again, some records have disappeared.
(3) After T1 reads a set of records matching some condition, T2 inserts new records; when T1 runs the same query again, extra records appear.
③ Phantom Read
Phantom reads are in fact a special case of non-repeatable reads: cases (2) and (3) above are also called phantom reads. Non-repeatable reads in the narrow sense arise from data being modified or updated; phantoms arise from rows being inserted or deleted.
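A quick thread-based simulation (an illustration, not a database) of case (3), the insert-driven phantom: T1’s two identical “queries” see different row sets:

```python
import threading
import time

rows = [10, 20, 30]  # the "table" T1 queries

def t1_repeat_query():
    first = list(rows)          # T1's first read of the matching rows
    time.sleep(0.2)             # T2 commits an insert in between
    second = list(rows)         # the same query again
    print(first, "->", second)  # [10, 20, 30] -> [10, 20, 30, 40]

def t2_insert():
    time.sleep(0.1)
    rows.append(40)             # T2 inserts a new matching row

a = threading.Thread(target=t1_repeat_query)
b = threading.Thread(target=t2_insert)
a.start(); b.start()
a.join(); b.join()
```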
④ Dirty Read
Transaction T1 modifies some data and writes it back to disk. Transaction T2 then reads the same data, after which T1 is rolled back for some reason and the data T1 modified is restored to its original value. The value T2 read is now inconsistent with the database: T2 has read “dirty” data.
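The same pattern in a minimal sketch (again a thread simulation standing in for two transactions): T2 observes a value that is later rolled back and thus never existed in any committed state:

```python
import threading
import time

balance = 100

def t1_update_then_rollback():
    global balance
    balance = 200        # T1's uncommitted change, already "written back"
    time.sleep(0.2)
    balance = 100        # T1 rolls back; the change never really happened

def t2_read():
    time.sleep(0.1)
    print("T2 read:", balance)   # 200: a value that was never committed

a = threading.Thread(target=t1_update_then_rollback)
b = threading.Thread(target=t2_read)
a.start(); b.start()
a.join(); b.join()
```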
The main technique for concurrency control: locking
Exclusive locks (X locks) are also called write locks. If transaction T places an X lock on data object A, only T may read and modify A; no other transaction may place a lock of any type on A until T releases its lock. This guarantees that no other transaction can read or modify A while T holds the X lock. Shared locks (S locks) are also called read locks. If transaction T places an S lock on data object A, T may read A but not modify it, and other transactions may place only S locks on A, not X locks, until T releases its S lock. This guarantees that other transactions can read A, but none can modify it, while T holds the S lock.
The compatibility matrix of exclusive and shared locks:

| T2 requests ↓ \ T1 holds → | X | S |
| --- | --- | --- |
| X | conflict | conflict |
| S | conflict | compatible |
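As a concrete illustration (a hypothetical threading-based sketch, not a real lock manager), here is a minimal shared/exclusive lock in Python that enforces this matrix: S requests wait only for an X holder, while X requests wait for everyone:

```python
import threading

class SXLock:
    """Minimal S/X lock: S-S is compatible, S-X and X-X conflict."""

    def __init__(self):
        self._cond = threading.Condition()
        self._s_holders = 0       # number of transactions holding S locks
        self._x_held = False      # whether some transaction holds the X lock

    def acquire_s(self):
        with self._cond:
            while self._x_held:               # S waits only for an X holder
                self._cond.wait()
            self._s_holders += 1

    def release_s(self):
        with self._cond:
            self._s_holders -= 1
            if self._s_holders == 0:
                self._cond.notify_all()

    def acquire_x(self):
        with self._cond:
            while self._x_held or self._s_holders:   # X waits for everyone
                self._cond.wait()
            self._x_held = True

    def release_x(self):
        with self._cond:
            self._x_held = False
            self._cond.notify_all()
```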
Locking protocols
When locking data objects with the two basic lock types, X locks and S locks, some rules also need to be agreed upon: when to request an X or S lock, how long to hold a lock, and when to release it. These rules are called locking protocols.
Level-1 locking protocol
Level-1 locking protocol: transaction T must place an X lock on data item A before modifying it, and hold the lock until the end of the transaction. The end of a transaction includes both normal termination (COMMIT) and abnormal termination (ROLLBACK). The level-1 protocol prevents lost updates. The lost-update problem in Figure 1 is solved by it: transaction T1 places an X lock on A before reading and modifying it; when T2 then requests an X lock on A, the request is refused, and T2 must wait until T1 releases its lock before it can obtain the X lock on A. T1’s update is therefore not lost.
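Revisiting the earlier lost-update sketch: if each transaction takes an exclusive lock before modifying the data and holds it until it “ends” (a plain threading.Lock standing in for the X lock), the update is no longer lost:

```python
import threading
import time

balance = 100
x_lock = threading.Lock()   # stands in for the X lock on the data item

def transaction(delta, delay):
    global balance
    with x_lock:                     # X lock before modifying...
        read_value = balance
        time.sleep(delay)
        balance = read_value + delta
    # ...released only at transaction end

t1 = threading.Thread(target=transaction, args=(-30, 0.1))
t2 = threading.Thread(target=transaction, args=(-50, 0.2))
t1.start(); t2.start()
t1.join(); t2.join()

print(balance)  # always 20: the serial result
```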
Level-2 locking protocol
Level-2 locking protocol: the level-1 protocol, plus: transaction T must place an S lock on data item A before reading it, and may release the S lock as soon as the read is done. Besides preventing lost updates, the level-2 protocol also prevents reading “dirty” data. The dirty-read problem in Figure 2 is solved by it: transaction T1 places an X lock on C before modifying it, changes its value, and writes it back to disk. When T2 then requests an S lock on C, it must wait because T1 holds an X lock on C. T1 is rolled back for some reason and C is restored to 100; T1 releases its X lock on C, T2 obtains the S lock and reads C = 100. The dirty read is avoided.
Level-3 locking protocol
Level-3 locking protocol: the level-1 protocol, plus: transaction T must place an S lock on data item A before reading it, and hold the lock until the end of the transaction. Besides preventing lost updates and dirty reads, the level-3 protocol also prevents non-repeatable reads. The non-repeatable-read problem in Figure 3 is solved by it: transaction T1 places S locks on A and B before reading them, so other transactions may place only S locks on A and B, not X locks; they can read A and B but cannot change them. When T2 requests an X lock on B, it must wait because T1 holds an S lock on B. T1 reads A and B again to verify its calculation; since no other transaction can modify them, the result is still 150, i.e., the read is repeatable. T1 then releases its S locks on A and B, and T2 obtains the X lock on B. The non-repeatable read is avoided.
Livelock and deadlock
Locking can give rise to livelocks and deadlocks.
Livelock
If transaction T1 holds a lock on data R, transaction T2 requests a lock on R and waits. Transaction T3 then also requests a lock on R. When T1 releases its lock, the system grants T3’s request first while T2 keeps waiting. T4 then requests a lock on R, and after T3 releases it the system grants T4’s request, and so on: T2 may end up waiting forever. That is a livelock.
The way to avoid livelock is a first-come-first-served policy. When several transactions request locks on the same data object, the lock subsystem queues them in the order the requests arrive; whenever the lock on the object is released, it is granted to the first transaction in the queue.
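A first-come-first-served queue is easy to sketch as a ticket lock (a hypothetical illustration, not how a real lock manager is structured): each requester takes a ticket and the lock is granted strictly in ticket order, so no waiter can be overtaken forever:

```python
import threading

class FifoLock:
    """Grants the lock strictly in request (ticket) order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0   # next ticket to hand out
        self._now_serving = 0   # ticket currently allowed to proceed

    def acquire(self):
        with self._cond:
            my_ticket = self._next_ticket
            self._next_ticket += 1
            while my_ticket != self._now_serving:
                self._cond.wait()

    def release(self):
        with self._cond:
            self._now_serving += 1
            self._cond.notify_all()
```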
Deadlock
If transaction T1 locks data R1 and transaction T2 locks data R2, and T1 then requests a lock on R2, T1 must wait because T2 already holds a lock on R2. T2 then requests a lock on R1 and must wait in turn, because T1 holds the lock on R1. Now T1 is waiting for T2 and T2 is waiting for T1; neither transaction can ever finish, forming a deadlock.
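The scenario translates directly into two threads taking two locks in opposite orders (note: this sketch deadlocks by design and never terminates if run):

```python
import threading
import time

r1, r2 = threading.Lock(), threading.Lock()

def txn1():
    with r1:            # T1 locks R1
        time.sleep(0.1)
        with r2:        # ...then waits forever for R2, held by T2
            pass

def txn2():
    with r2:            # T2 locks R2
        time.sleep(0.1)
        with r1:        # ...then waits forever for R1, held by T1
            pass

a = threading.Thread(target=txn1)
b = threading.Thread(target=txn2)
a.start(); b.start()
a.join(); b.join()      # never returns: T1 and T2 wait on each other
```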
Deadlock prevention:
① One-shot locking
The one-shot locking method requires each transaction to lock, in a single step, all the data it is going to use; otherwise it cannot proceed. For example, transaction T1 in the figure above locks R1 and R2 at once, so T1 can continue while T2 waits; when T1 releases its locks on R1 and R2, T2 proceeds. No deadlock can arise.
One-shot locking prevents deadlocks, but its drawback is obvious: locking data that will only be needed later inevitably widens the scope of locking and thus reduces the system’s concurrency.
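A sketch of the idea under these assumptions: the transaction tries to take every lock it will need up front and, if any is unavailable, releases what it got and starts over, so it never holds some locks while waiting for others:

```python
import threading
import time

def acquire_all(locks):
    """One-shot locking: obtain every lock up front, or none at all."""
    while True:
        taken = []
        for lk in locks:
            if lk.acquire(blocking=False):
                taken.append(lk)
            else:
                for held in taken:   # missed one: back out completely
                    held.release()
                time.sleep(0.01)     # brief pause before retrying
                break
        else:
            return                   # got them all; safe to proceed

r1, r2 = threading.Lock(), threading.Lock()
acquire_all([r1, r2])                # T1 locks R1 and R2 in one step
for lk in (r2, r1):
    lk.release()
```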
② Sequential locking
Sequential locking specifies a locking order on data objects in advance, and every transaction acquires its locks in that order.
Sequential locking is an effective way to avoid deadlocks, but its problems are equally obvious. First, a database locks an enormous number of data objects, and the set keeps changing as rows are inserted and deleted, so maintaining a stable locking order over such resources is difficult and very costly. Second, a transaction’s lock requests are determined dynamically as it executes, which makes it hard to lock strictly in the prescribed order.
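A sketch of sequential locking, assuming the agreed global order is simply the lexicographic order of resource names: every transaction sorts the resources it needs and acquires them in that order, so no cycle of waits can form:

```python
import threading

locks = {name: threading.Lock() for name in ("A", "B", "C")}

def lock_in_order(names):
    """Acquire locks following the global (lexicographic) order."""
    for name in sorted(names):
        locks[name].acquire()

def unlock_all(names):
    for name in names:
        locks[name].release()

# Whether a transaction needs (B, A) or (A, C), it always locks A first,
# so two transactions can never end up waiting on each other in a cycle.
lock_in_order(["B", "A"])
unlock_all(["A", "B"])
```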
It follows that preventing deadlocks outright does not suit the characteristics of a database, so in practice the common approach is to diagnose deadlocks and then break them.
Deadlock diagnosis and resolution:
① Timeout method
If a transaction’s wait exceeds a preset time limit, the system assumes a deadlock has occurred.
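In Python’s threading terms the pattern looks like this (a sketch; real systems expose this as a configurable lock-wait timeout, e.g. InnoDB’s innodb_lock_wait_timeout):

```python
import threading

lock = threading.Lock()

# Wait at most 5 seconds for the lock; a longer wait is presumed to be
# a deadlock, so the "transaction" aborts and can be retried.
if lock.acquire(timeout=5.0):
    try:
        pass  # do the transaction's work
    finally:
        lock.release()
else:
    print("lock wait exceeded 5s: presume deadlock, roll back and retry")
```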
② Wait-for graph method
The system maintains a wait-for graph whose nodes are transactions, with an edge from Ti to Tj whenever Ti is waiting for a lock held by Tj; a cycle in this graph means a deadlock has occurred. Once a deadlock is detected, the system tries to break it. The usual approach is to pick the transaction that is cheapest to undo as the victim, roll it back, release all the locks it holds, and undo the data modifications it performed, so that the other transactions can continue to run.
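Cycle detection over a wait-for graph is a standard depth-first search; here is a minimal sketch, with the graph given as a dict mapping each transaction to the set of transactions it waits for:

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    WHITE, GRAY, BLACK = 0, 1, 2         # unvisited / on stack / finished
    color = {}

    def dfs(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            c = color.get(u, WHITE)
            if c == GRAY:                # back edge: a cycle of waits
                return True
            if c == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and dfs(t) for t in wait_for)

# T1 waits for T2, T2 waits for T3, T3 waits for T1: a deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}))  # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))                 # False
```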
Two-phase locking protocol
The two-phase locking (2PL) protocol requires that every transaction acquire and release its locks in two separate phases:

- Before reading or writing any data item, the transaction must first request and obtain a lock on it.

- After releasing any lock, the transaction may not request or obtain any further locks.

In other words, each transaction is divided into two phases. The first phase, acquiring locks, is known as the growing (expansion) phase: the transaction may obtain a lock of any type on any data item, but may not release any lock. The second phase, releasing locks, is known as the shrinking (contraction) phase: the transaction may release any lock it holds on any data item, but may not request any new lock.
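A minimal sketch of the rule itself (an illustration, not a real lock manager): a transaction object that refuses any lock request once it has released its first lock:

```python
import threading

class TwoPhaseTxn:
    """Enforces 2PL: all acquires (growing phase) precede all releases."""

    def __init__(self):
        self._held = []
        self._shrinking = False

    def lock(self, lk):
        if self._shrinking:
            raise RuntimeError("2PL violated: lock requested after a release")
        lk.acquire()             # growing phase
        self._held.append(lk)

    def unlock_all(self):
        self._shrinking = True   # shrinking phase begins
        while self._held:
            self._held.pop().release()

txn = TwoPhaseTxn()
a, b = threading.Lock(), threading.Lock()
txn.lock(a); txn.lock(b)    # growing phase
txn.unlock_all()            # shrinking phase; no further lock() allowed
```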
A transaction’s compliance with the two-phase locking protocol is a sufficient but not a necessary condition for serializability. That is, a schedule in which every transaction follows 2PL is always serializable, while a serializable schedule is not necessarily produced by transactions that all follow 2PL.
Similarities and differences between the two-phase locking protocol and one-shot locking
One-shot locking requires a transaction to lock all the data it will use in one step before it can proceed, so one-shot locking satisfies the two-phase locking protocol.
The two-phase locking protocol, however, does not require a transaction to lock everything it uses at once, so transactions that follow 2PL can still deadlock, as shown in the figure:
Database isolation level
Locking protocols and isolation levels do not correspond to one another strictly.
Concurrency problems avoided at each isolation level (per the SQL standard):

| Isolation level | Dirty read | Non-repeatable read | Phantom read |
| --- | --- | --- | --- |
| READ UNCOMMITTED | possible | possible | possible |
| READ COMMITTED | prevented | possible | possible |
| REPEATABLE READ | prevented | prevented | possible |
| SERIALIZABLE | prevented | prevented | prevented |