Concept of premise
Three scenarios of database concurrency
There are three database concurrency scenarios:
-
Read-read: There are no problems and no concurrency control is required
-
Read-write: Has isolation problems, may encounter dirty read, unreal read, non repeatable read
-
Write – write: There may be update loss problems, such as type 1 update loss, type 2 update loss
Clarification of optimistic locks and pessimistic locks
Pessimistic or optimistic locks, they are not specific lock concepts in the database, but ideas that we define to describe both types of locks. So with the classification of design, we can use this classification to classify the specific locks in the database;
However, optimistic locks in databases tend to be called optimistic concurrency control (OCC), pessimistic locks are called pessimistic concurrency control (PCC), and there is a control that is different from optimistic and pessimistic locks called MVCC, multi-version concurrency control
Optimistic and pessimistic locks are not to be confused with row, table, exclusive, and shared locks in a database. The former is a locking thought, and the latter can be classified according to whether the thoughts approaching optimistic or pessimistic locking are carried out
The concept of optimistic lock and pessimistic lock not only exists in the field of database, it can be said that there are thread safety, there are concurrent scenarios almost have optimistic lock and pessimistic lock applicable scenarios, such as Java also has the concrete implementation of optimistic lock and pessimistic lock ideas; However, the specific implementation of optimistic and pessimistic locks varies from field to field, and the problems to be solved may also vary
So if someone asks you what optimistic lock and pessimistic lock are, don’t say it’s a specific lock, it’s just a design idea of a lock, it can have a lot of specific implementation classes, right
Pessimistic locking
What is pessimism lock?
In relational database management systems, Pessimistic Concurrency Control (also known as “Pessimistic lock”) is an approach to Concurrency Control. Pessimistic locking refers to the adoption of a pessimistic and negative attitude, default data access, will inevitably conflict, so in the whole process of data processing are locked state, to ensure that at the same time, only one thread can access the data, data exclusivity; In general, pessimistic database locking is implemented by using the locking mechanism provided by the database itself.
Pessimistic concurrency control for databases can resolve read-write conflicts and write-write conflicts, which are resolved by locking
Pessimistic lock implementation
In general, the pessimistic lock of the database is realized by using the lock provided by the database itself
- To access a piece of data, the outside world first requests a lock (some kind of lock) on that data from the database.
- If it succeeds, it can manipulate the data, and no other client can manipulate the data while it does so
- If the lock fails, another client has acquired the lock at the same time and must wait for another client to release the lock
Of course, databases provide many locks, and the locks provided by each database are not always the same, so it depends on the type of lock, such as row lock, table lock, etc.
Advantages and Disadvantages
Pessimistic concurrency control is actually a conservative strategy of “lock first, access later”, which guarantees the security of data processing. However, in terms of efficiency, the mechanism for handling locking can impose additional overhead on the database and increase the chance of deadlocks. In addition, in read-only transactions, there is no conflict and no need to use locks, which can only increase the system load; In addition, parallelism is reduced. If a transaction locks a row, other transactions must wait for the transaction to complete before processing that row.
Advantages: suitable for writing more read less concurrent environment, although can’t maintain a very high performance, but in optimistic locking is unable to carry on the premise of better performance, can achieve the data security weakness: the lock can increase system overhead, although can guarantee the security of the data, but data processing throughput is low, not suitable for reading writing fewer situations
Optimistic locking
What is optimism lock?
Optimistic Concurrency Control (also known as Optimistic Concurrency Control, or OCC) is an approach to Concurrency Control in relational database management systems. Optimistic Locking is relative to pessimistic Locking. Optimistic Locking assumes that even in concurrent environment, external operations on data will not cause conflicts, so it will not be locked (Optimistic Locking is not a lock), but when data is submitted for update. If a conflict is found, the conflict information is returned so that the user can decide how to do the next step, such as retry, until the data is successfully detected. The optimistic lock of database is not realized by using the lock of database itself, but by using some realization logic to realize the idea of optimistic lock
The optimistic concurrency control of database is to solve the write – write conflict in the database concurrency scenario, which is solved in a lock – free way
CAS thought
In fact, the concrete implementation of optimistic database lock is almost the same as the CAS algorithm used in Java optimistic lock, so we can learn the design of optimistic database lock from CAS algorithm: CAS instruction is fully called Compare and Swap, which is the instruction set of the system. The whole CAS operation is an atomic operation, which is indivisible. From the specific description, we can view CAS operation as follows:
The CAS instruction requires three operands, which are the memory location V, the old expected value A, and the new value B. When the CAS instruction is executed, the processor will update the value of the built-in position V with the new value B when the present value of the built-in position V we read is equal to the old expected value A. Otherwise it does not update, but returns the old value of V whether or not it updates the value of V.
I = 2; I++, that is:
First thread 1 reads the value from memory location V and saves it as the old expected value A. (V = 2, A = 2)
And then since I is going to ++, the system will compare the present value of memory location V with the old expected value A, V =? A.
If it’s equal, B = i++ = 3, the new value B will update memory location V, so the value of memory location V will become the value of B, 3
If the value is not equal, then another thread has changed the value of memory location V. For example, thread 2 updated the value of I before thread 1 changed the value of I. , so thread 1 fails to update variable I. But instead of hanging, the thread returns to a failed state and waits for the calling thread to decide whether to retry or do something else. (Usually retries until successful)
Optimistic lock implementation at the database level is similar to implementation at the code level
Implementation of optimistic lock in database
There are two common implementations of optimistic locking, but both are inherently CAS minded:
- A:Using the data version (
version
) implementation
This is one of the most common implementations of optimistic locking. What is a data version? Add a field to the table as the version identifier for the record, for example, version. Each write to the record will give version+ 1.
Therefore, when we read the data (including version) and make an update, we will compare the obtained version with the version in the database to see if it is consistent. If it is consistent, it represents the time period. There is no other thread that has also modified the data and updated it with version + 1. If not, it means that the record has been modified by another thread during this period of time, and it is considered to be expired. The conflict information is returned, and the user can decide the next action, such as retry (re-reading the latest data and updating again).
update table set num = num + 1 , version = version + 1 where version = #{version} and id = #{id}
- Method 2:Use the timestamp (
timestamp
) implementation
Add a column to the table named update_time (timestamp)
Submitted a consistent principle and way, also is in the update check when the current data in the database update timestamp and his former to timestamp is consistent, if the agreement represents this moment without conflict, you can submit update, update timestamp to the current time at the same time, otherwise it is the time period has other threads also update submission, return conflict information, waiting for the user’s next move.
update table set num = num + 1 ,update_time = unix_timestamp(now()) where id = #{id} and update_time = #{updateTime}
But we notice that in order to realize the idea of optimistic locking at the same time, we have to make sure the CAS multiple operating atomicity, namely for the version of the database data, get before he took the database data version and the version comparison, and update the data such as execution of these operations must be coherent, with composite atomic operations; So if it is database SQL, then we need to ensure that multiple SQL operations are in the same transaction
Advantages and Disadvantages
Advantages: Avoid the overhead of database locking and improve the response performance of Dao layer in concurrent scenarios with more read and less write. In fact, in many cases, our ORM tools have implementation with optimistic locking, so these methods do not need us to artificially implement the disadvantages: In the concurrent scenario with many writes and few reads, that is, in the case of fierce write operations, CAS retries are caused. The conflict frequency is too high, resulting in higher cost than pessimistic lock
MVCC multi-version concurrency control
What is a MVCC?
Concurrency Control MVCC is a multi-version Concurrency Control. MVCC is a method of concurrency control, which can realize concurrent access to database in database management system and transaction memory in programming language.
The implementation of MVCC in MySQL InnoDB is mainly to improve the performance of database concurrency, with a better way to deal with read-write conflict, even if there is read and write conflict, can also do not lock, non-blocking concurrent read
What are current and snapshot reads?
What are current and snapshot reads under MySQL InnoDB?
The current reading
Like select lock in share mode, select for update; Update, INSERT,delete(exclusive lock) operations are a type of current read, why are they called current read? It reads the latest version of the record and locks the read record to ensure that other concurrent transactions cannot modify the current record
Read the snapshot
For example, an unlocked SELECT operation is a snapshot read, that is, an unlocked non-blocking read. The prerequisite for snapshot reads is that the isolation level is not serial. Snapshot reads at the serial level are degraded to current reads. The reason for the occurrence of snapshot read is to improve concurrency performance. The implementation of snapshot read is based on multi-version concurrency control, namely MVCC. MVCC can be considered as a variant of row lock, but it avoids locking operations and reduces overhead in many cases. Since it is based on multiple versions, snapshot reads may not necessarily read the latest version of data, but may read the previous historical version
To put it simply, snapshot read is the specific non-blocking read function of MVCC in MySQL. The purpose of MVCC multi-concurrent control is to achieve read-write conflicts without locking and improve concurrent read and write performance, and this read refers toRead the snapshot
, rather than current read, which is actually a locking operation and an implementation of pessimistic locking
conclusion
The choice between optimistic and pessimistic locks
The choice between optimistic lock and pessimistic lock is mainly reflected in the writing-writing in the choice between pessimistic lock and optimistic lock, we can consider from the following three factors:
Response time: If the Dao layer needs very high response time, especially if there are more reads and fewer writes, then we can use an optimistic locking scheme to reduce the overhead of database locking and provide concurrency
Conflict frequency: If the conflict frequency is very high, then we can use pessimistic lock to ensure the success rate; After all, if the conflict frequency is high, optimistic locking will require multiple retries to succeed, which can increase the cost significantly
Retry cost: If the retry is expensive, for example if the code execution of the retry process is too time-consuming, then I would not recommend using optimistic locking, but rather pessimistic locking
So we know:
- When there are more reads and fewer writes and CAS competition is not so fierce, we can adopt optimistic locking strategy to reduce the cost of database locking and improve the concurrent response of database
- In the scenario of more write and less read, we do not recommend using optimistic locking because of the large number of CAS contention and the high retry cost. Instead, we recommend using pessimistic locking
OCC,PCC, AND MVCC
Pessimistic concurrency control (PCC) is a kind of locking concurrency control used to solve read-write conflicts and writer-write conflicts. Each operation is locked. At the same time, only the transaction that obtains the lock can have the right to operate on the data, and the transaction that does not obtain the lock can only wait for other transactions to release the lock. So it can solve the problem of dirty read, unreal read, unrepeatable read, first type update lost, second type update lost
Optimistic Concurrency control (OCC) is a lockless concurrency control that is used to resolve write-write conflicts. It is considered that there is not much contention between transactions, so it should be modified first. Before committing a transaction, check for new commit changes since the transaction has started. Optimistic concurrency control is similar to spin locks. Optimistic concurrency control is suitable for the environment with low data contention and few write conflicts. Can’t solve dirty read, unreal read, unrepeatable read, but can solve update loss problem
Multi-version concurrency control (MVCC) is a lock-free concurrency control used to resolve read-write conflicts. This means that a transaction is assigned a time-stamp of unidirectional growth, a version is saved for each change, and the version is associated with the transaction timestamp. The read operation reads only the snapshot of the database before the transaction started. In this way, read operations do not block write operations, and write operations do not block read operations; It can not only improve the concurrency performance, but also solve the transaction problems such as dirty read, unreal read, and unrepeatable read. Except for lost updates
In general, the emergence of MVCC is the database dissatisfaction with the pessimistic lock to solve the read-write conflict problem, because of the low performance of the solution, so in the database, we can form two combinations:
MVCC + Pessimistic lock
MVCC resolves read/write conflicts, pessimistic locking resolves write conflictsMVCC + Optimistic lock
MVCC resolves read/write conflicts, optimistic locking resolves write conflicts
This combined approach maximizes database concurrency performance and solves problems caused by read/write conflicts and write conflicts.