Wechat public number: The way to grow up
Hi, I’m Leo, and the last article was about why it’s slow to query a record. Today we introduce some related knowledge of illusion, and illusion related gap lock, gap lock deadlock solution.
concept
Repeatable read
The two transactions conduct data operations without interfering with each other. Transaction A conducts data query first, transaction B conducts A transaction modification and data submission, and transaction A conducts A query again. The data remains unchanged
Submit to read
Two transactions conduct data operations. Transaction A conducts data query first, transaction B conducts A transaction modification and data submission, and transaction A conducts A query again with the data modified by B.
case
What is illusory
As shown in the picture below, let’s analyze it together.
- SessionA starts a transaction and locks d = 5 at time T1
- SessionB did not start the transaction. Alter data 0, change d to 5
- SessionA continues to execute d=5 data with write lock
- SessionC inserted a data 115
- SessionA looked up the data again and found that the data kept changing and growing
This is done from the start of a transaction to the end of a transaction if the same data sees different results. That’s what we call phantasmagoria.
For update add write lock to current read. The current read rule is to see all the data that has been submitted.
What’s wrong with illusory reading
As shown in the figure below, let’s continue our analysis
Update t set c=5 where id=0; update t set C =5 where id=0; update T set C =5 where id= 5;
Because at time T1, session A is still locking the row id=5, not the row id=0. Therefore, session B can execute both update statements at time T2. This breaks the lock declaration for session A Q1 statement to lock all d=5 rows.
Session C does the same thing. The lock declaration for Q1 is broken by changing the id=1 line.
So that’s a semantic question. Then there is the problem of data consistency
As we know, locks are designed to ensure data consistency. This consistency is not only the consistency of the data state within the database at the moment, but also the logical consistency of the data and logs.
As shown below, we continue to analyze what could go wrong.
Update t set d=100 where d=5
Update has the same meaning as for update. Lock d = 5. And then I’m going to change d to be 100
- SessionA locks data whose d is 5 at time T1. And change d to 100 (do not commit)
- SessionB at T2 changes the id of data 0 to D and c to 5. (submitted)
- Return to time T3 of sessionA, query write lock again
- SessionC at T4, insert statement is executed, alter id 1 c to 5. (commit)
There seems to be no logic or consistency problem. Take a look at the binlog
update t set d=5 where id=0; / * (0,0,5) * /
update t set c=5 where id=0; / * (0,5,5) * /
insert into t values(1.1.5); / * (1,1,5) * /
update t set c=5 where id=1; / * (1,5,5) * /
update t set d=100 where d=5;/* for all rows where d=5, d becomes 100*/
Copy the code
You will notice that the results become (0,5,100), (1,5,100) and (5,5,100) after executing the three rows. That means two numbers have been changed.
So what should we do? As shown below, with a lock
Session A locks all rows, so session B is locked on the first UPDATE statement. Session B can be executed only after Session A is submitted at time T6.
So for the row id=0, the final result in the database is (0,5,5). In binlog, the execution sequence looks like this:
insert into t values(1.1.5); / * (1,1,5) * /
update t set c=5 where id=1; / * (1,5,5) * /
update t set d=100 where d=5;/* for all rows where d=5, d becomes 100*/
update t set d=5 where id=0; / * (0,0,5) * /
update t set c=5 where id=0; / * (0,5,5) * /
Copy the code
The inconsistencies in the binlog above are resolved. And the numbers are right. So one more question!
Locking all addresses the correctness of each data, so new data cannot be guaranteed to be correct. Now it’s not a read-write lock.
How to solve phantom reading? Clearance lock!
Today we’re talking about gap locks. Just a quick introduction. For example, a table has six rows of data. Then 7 gap locks will be added. These seven locks are placed before and after each record.
Select * from t where d=5 for update Not only are row locks added to the six records already in the database, but also seven gap locks are added simultaneously. This ensures that no new records can be inserted.
Rows of data are entities that can be locked, and gaps between rows of data are entities that can be locked.
Row lock, gap lock, read lock, write lock
Row locks are classified into read locks and write locks.
A gap lock is a separate lock.
In other words, the conflict with row lock is “another row lock”.
A conflict with gap locking is the operation “insert a record into the gap”. There is no conflict between gap locks.
Let me give you an example
- SessionA starts a transaction and assigns a read lock to c = 7.
- Session B is not blocked. Session A adds A gap lock (5,10) because there is no record of c=7 in table t. Session B also uses a gap lock on this gap. They have a common goal: to protect this gap and not allow values to be inserted. However, there is no conflict between them.
A gap lock and a row lock are collectively called next-key locks. Each next-key lock is a front-open and back-closed interval
For update (-∞,0], (0,5], (5,10], (10,15], (15,20], (20, 25], (25, +supremum])
Supremum: Because +∞ is the open interval. InnoDB implements a non-existent maximum value, supremum, for each index.
Back to the case
The introduction of gap locking and next-key lock helped us solve the illusion problem, but it also brought some “headaches”.
Let’s start with a logic and continue the theory!
The ** business logic ** looks like this: lock any row, insert it if it doesn’t exist, and update its data if it does
begin;
select * from t where id=N for update;
/* If the row does not exist */
insert into t values(N,N,N);
/* If the row exists */
update t set d=N set id=N;
commit;
Copy the code
Once this logic has concurrency, it encounters a deadlock. This logic is locked with a “for UPDATE” mode before each operation.
Let’s say N is 9
- Session A Run select… For update statement, because id=9 line does not exist, add gap lock (5,10);
- Session B Execute select… The for update statement will also have a gap lock (5,10). The gap lock will not conflict, so this statement will execute successfully.
- Session B tries to insert A row (9,9,9), but is blocked by session A’s gap lock.
- Session A’s attempt to insert A row (9,9,9) is blocked by session B’s gap lock.
At this point, the two sessions enter the mutual wait state, resulting in a deadlock. Of course, InnoDB’s deadlock detection immediately detected this deadlock relationship, causing session A’s INSERT statement to return an error.
Conclusion: The introduction of gap locking may cause the same statement to be locked in a larger range, which actually affects concurrency
Business balance
As we mentioned at the beginning, illusions only occur at repeatable isolation levels. Gap locking is only effective at the repeatable read isolation level.
So, if you set the isolation level to read commit, there will be no gap locks. At the same time, however, you need to resolve possible data and log inconsistencies by setting the binlog format to row. This, also is the configuration combination that many companies use now.
conclusion
Deadlocks caused by gap locks often occur in production repositories. The introduction of gap lock will affect the concurrency of the system and increase the complexity of lock analysis, but there are also rules to follow