Recently, we are going to do a technology sharing in our company. I have been thinking about what to share. Finally, I am ready to share the transaction magic reading problem and solution –LBCC&MVCC under MySQL’s InnoDB engine with my friend’s prompt. After several days of staying up all night, I finally got this part of the content clear. Why InnoDB? The MyISAM engine does not support transactions.
The transaction
concept
A transaction consists of n units that either succeed or fail at the same time during execution, putting n units into a transaction. Take a simple example: under the premise of not considering whether the test questions are correct or not, a test paper is composed of multiple questions. When you finish answering the questions and give them to the teacher, you give the whole test paper to the teacher, instead of giving each question to the teacher separately. Here, the test paper can be understood as a business.
Properties of transactions: ACID
A: Atomicity. Atomicity refers to the fact that A transaction is an indivisible unit of work, in which all operations either occur or none occur.
Example: Suppose you add two items of clothing to your shopping cart: a top and pants. When you submit two items of clothing for payment as a single order, either both items are paid successfully or both items fail. It is impossible for the top to be paid for and the pants to be paid for, and vice versa.
C: Consistency. In a transaction, data integrity must be consistent before and after the transaction.
For example, if user A and user B have A total of 200, then no matter how many transfers are made between user A and user B, the sum of the two users’ money should still be 200 at the end of the transaction. This is the consistency of the transaction.
I: Isolation, which exists in multiple transactions. The Isolation of transactions means that when multiple users concurrently access the database, the transactions of one user cannot be disturbed by the transactions of other users, and the data of multiple concurrent transactions must be isolated from each other.
Example: For any two concurrent transactions T1 and T2, from the point of view of transaction T1, T2 either ends before T1 begins or starts after T1 ends, so that each transaction does not feel that any other transaction is executing concurrently.
D: Durability, Durability means that once a transaction is committed, its changes to the data in the database are permanent and then they should not be affected if the database fails.
Example: When we operate on a database, either a transaction commit or a rollback directly changes the value in the database.
Operation of transaction
Before using a transaction, first we need to start a transaction. We can start a transaction with the start or begin command. If we want to commit the transaction, we can execute the commit command manually. If we want to rollback the transaction, we can execute the rollback command.
In MySQL, transaction commit is enabled by default. Run the show variables like ‘autocommit’ command to check.
Concurrency problems caused by isolation
1) Dirty read: transaction B reads data that has not been committed by transaction A;
2) Non-repeatable read: transaction B reads the data submitted by transaction A, that is, the data read by transaction B before and after the submission of transaction A is inconsistent (transaction AB operates on the same data);
3) Phantom read/virtual read: transaction B reads the data already committed by transaction A, that is, transaction A performs the insert operation, and transaction B reads inconsistent data before and after transaction A.
The isolation level of the transaction
To solve the concurrency problems caused by the above isolation, the database provides a mechanism for isolating things.
- Read Uncommitted: When a transaction is not committed, its changes can be seen by other transactions, and uncommitted data can be read, solving no problem;
- Read COMMITTED: After a transaction commits, its changes can only be seen by other transactions. Reading committed data can solve dirty reads —-
oracle
The default; - Repeatable read: The data seen during the execution of a transaction is always the same as the data seen at the start of the transaction, which can solve dirty and unrepeatable reads
mysql
The default; - Serializable: As the name implies, for the same row, “write” will add “write lock”, “read” will add “read lock”. When a read-write lock conflict occurs, the last accessed transaction must wait for the previous transaction to complete before execution can continue. It can handle dirty reads, unrepeatable reads, and virtual reads – equivalent to locking tables.
Although the Serializable level solves all database concurrency problems, it locks every row of data read, which can lead to a large number of timeouts and lock contention issues, resulting in reduced efficiency. As a result, serializable is rarely used in practice, only in cases where there is a strong need to ensure data consistency and where no concurrency is acceptable.
LBCC&MVCC
The default transaction isolation level of InnoDB is REPEATable Read (hereafter referred to as RR). In order to solve the concurrency problem of phantom reads under this isolation level, two schemes LBCC and MVCC are proposed. LBCC solves the illusion of the current read, while MVCC solves the illusion of the ordinary read (snapshot read). The answer to what is a current read and what is a snapshot read will be given below.
LBCC
LBCC is short for Lock-based Concurrent Control, which stands for lock-based concurrency Control. InnoDB can be divided into shared lock (S), exclusive lock (X) and intentional lock. Intentional lock IS divided into intentional shared lock (IS) and intentional exclusive lock (IX). (I will not introduce InnoDB and Myisam engine locks in the future.) By locking algorithm, there are Record Locks, Gap Locks, and next-key Locks. Among them, the temporary key lock can be used to solve the illusory reading problem under RR. So what is a keylock? Keep reading.
We call each row of data stored in the database a record. In the figure above, 1, 5, 9, and 11 respectively represent the records whose IDS are the current numbers. Records whose key values are in the condition range but do not exist are called gaps. In the figure above, (-∞, 1), (1, 5)… (11, +∞) is the gap existing in the database. And (-∞, 1], (1, 5]… (11, plus infinity) we call it the adjacent bond, the set of open and closed.
Record Locks
Locking a row record in a table is called a record lock, or row lock for short. You can use the SQL statement select… For update, the select statement must be an exact match (=), not a range match, and the matching column field must be a unique index or primary key column. Record locks can also be added by updating rows that are indexed by the primary key or unique index of the query condition.
A record lock exists in a unique index, including a primary key index, locking a single index record.
GAP Locks
The clearance lock mentioned above is the clearance lock. A gap lock locks a range, but does not include existing index entries. You can use the SQL statement select… For update, the select statement is the range query, the matching column field is the index item, and no data is returned; Or the SELECT statement is an equivalent query with matching fields as unique indexes and no data returned.
A fatal weakness of gap locking is that when a range key is locked, even some non-existent keys will be locked innocently, resulting in the lock cannot insert any data within the range of the locked key. In some scenarios this can be detrimental to performance. Here is an example of an insert operation after locking:
select * from user where id > 15 for update;
// Insert failed because id20 is greater than 15
insert into user values(20.'20');
// Insert failed because gap lock locks record gap, not SQL, that is, 'SELECT' statement lock range is (11, +∞), and 13 is in this range, so also failed.
insert into user values(13.'13');
Copy the code
GAP Locks exist only at the RR isolation level and lock data within the GAP. After the lock is added, no other records can be inserted into the gap, and the gap is locked, not the SQL statement. There is no conflict between gap locks.
Turn on the gap lock setting: run the show variables like ‘innodb_locks_unsafe_for_binlog’ command; To check whether Innodb_locks_unsafe_for_binlog is disabled. Innodb_locks_unsafe_for_binlog The default value is OFF, that is, gap locking is enabled. Because this parameter is in read-only mode, if you want to disable gap locking, you need to modify my.cnf (Windows is my.ini) to restart.
# add to [mysqld] in my.cnf
[mysqld]
innodb_locks_unsafe_for_binlog = 1
Copy the code
Next-key Locks
When we lock both the record and the gap above, we add an adjacent key lock (open left close set lock). To prevent phantoms, a temporary key lock prevents the insertion of a new record for a specific condition, because the intent to insert is acquired during insertion and conflicts with an existing temporary key lock. You can use the SQL statement select… For update, the select statement is the range query, the matching column field is the index item, and the data return; Or the SELECT statement is an equivalent query, matching column fields as index entries, regardless of whether data is returned.
Insert intention lock is not intention lock, but a special gap lock.
conclusion
- If the query does not hit the index, it degrades to a table lock;
- If the equivalent query has a unique index and hits a unique record, it degrades to a row lock.
- If the equivalent query has a unique index and no record is hit, it degenerates into a gap lock of adjacent nodes.
- If the equivalent query is not unique index and no record is hit, it degenerates into gap lock of adjacent nodes (including nodes are also locked). If a record is hit, the temporary key locks for all hit rows are locked, along with the gap locks for the interval next to the maximum record row.
- If a range query is unique or non-unique and hits a record, the temporary key locks are locked for all hit rows and the gap locks are also locked for the interval next to the maximum record row.
- If the range index is queried and no record is hit, it degrades to a gap lock on a nearby node (including a node that is also locked).
The current reading
Locking Read, also known as Locking Read, reads the latest version of the current data. After reading the data, the data is locked to prevent other transaction changes, that is, to solve the current Read problem through next-key lock (row lock +gap lock). Read the latest version of the data record, including the following SQL types: SELECT… Lock in share mode, select… For update, update, delete, insert.
MVCC
LBCC is concurrency control based on lock. Because the granularity of lock is too large, it will lead to the decline of performance. Therefore, MVCC, which is superior to LBCC, is proposed. MVCC is short for Multi-version Concurremt Control, which is a concurrency Control protocol based on multiple versions. By Version number, the same data can be avoided from competing between different transactions. It only exists in the InnoDB engine. It is mainly to improve the performance of database concurrent read and write, without locking multiple transactions concurrent read and write. The implementation of MVCC relies on three hidden fields, Undo log, and Read View. The core idea is to find only rows whose transaction ID is less than or equal to the current transaction ID. You can only find rows whose deletion time is greater than or equal to the current transaction ID, or rows that have not been deleted. Let’s look at MVCC at the source level.
Hidden columns
MySQL generates hidden columns for each row, so let’s take a look at some of these hidden columns.
DB_TRX_ID: specifies the transaction ID, which is automatically incremented based on the transaction occurrence time. InnoDB storage engine records the ID of a transaction if the transaction is added, deleted, or modified.
DB_ROLL_PTR: A rollback pointer, essentially a pointer to the undo log corresponding to the record. InnoDB uses this pointer to find the previous version of data. All the old versions on the row are organized in a linked list in undo log.
DB_ROW_ID: row ID (hidden monomonotonically increment ID). InnoDB automatically generates a hidden primary key if the table has no primary key. The size is 6 bytes. If the data table is not set with a primary key, the cluster index is generated with it.
(4) In fact, there is a hidden deletion flag field. If the record is updated or deleted, it does not mean that it is really deleted, but that the deletion flag has changed.
undo log
Whenever we need to make a change to a record (INSERT, DELETE, UPDATE), we need to record what we need for the rollback, such as:
- Insert undo log: When inserting a record, at least write down the primary key value of the record, and delete the corresponding primary key value when rolling back.
- Delete undo log: When deleting a record, at least the contents of the record should be written down so that the records composed of these contents can be inserted into the table when the rollback is performed.
- Update undo log: When modifying a record, record at least the old value before modifying the record, so that the record can be updated to the old value when rolling back.
InnoDB calls these things recorded for rollback undo logs. One thing to note here is that since the query operation (SELECT) does not modify any user records, the corresponding Undo log does not need to be logged when the query operation is executed.
An undo log is logged every time a change is made to the record. Each undo log has a DB_ROLL_PTR attribute. You can link these undo logs together into a linked list to form a version chain. The head node of the version chain is the current record value.
case
Let’s insert a record with a transaction ID of 80 that looks like this at the momentIn factinsert undo
This type of undo log is used only when the transaction is rolled back. After the transaction is committed, this type of undo log is uselessUndo Log Segment
It’s also recycled by the system. The SQL operation is then continuedThe version chain is as follows
Many people think that undo log is used to physically restore the database to its original state before executing statements or transactions. In fact, it is not the case. Undo log is a logical log, which only restores the database logic to its original state. Because in a multi-concurrent system, you can physically restore data on one page to its original state, which may affect other transactions.
Read View
At the repeatable read isolation level, we can read each ordinary SELECT query (without the for UPDATE statement) as a snapshot, and the snapshot is the moment of the select. The generated array of all uncommitted transaction ids in the current database system (the smallest id in the array is MIN_id) and the created maximum transaction ID (max_id) is called the consistency view readView. In the process of snapshot reading, the transaction ID of each version in the version chain should be matched with readView according to certain rules to query the results we need.
Snapshot reads do not see data inserted by other transactions. Therefore, illusion only appears under “current read”. The implementation of snapshot reading is based on multi-version concurrency control (MVCC). MVCC can be considered as a variant of line locking, but it avoids locking operation and reduces overhead in many cases. Since it is based on multiple versions, snapshot reads may not necessarily read the latest version of data, but may read the previous historical version. MVCC only works at READ COMMITTED and REPEATABLE READ isolation levels. The other two isolation levels are not compatible with MVCC. Because READ UNCOMMITTED always reads the latest rows, not rows that match the current version of the transaction, SERIALIZABLE locks all READ rows. The snapshot point of the transaction (the generation time of the Read View, as described below) is confirmed by the first SELECT. So even if the transaction starts first and the SELECT is performed after a statement such as update from a later transaction, it can retrieve the data corresponding to the previous transaction.
Snapshot read and current read at the RC and RR isolation levels: In the RC isolation level, snapshot read results are the same as current read results. In the RR isolation level, the current read result is the latest committed result of another transaction, and the snapshot read result is the read result before the current transaction. The read time of a snapshot in an RR determines the read version.
For transactions using RC and RR isolation levels, it is necessary to ensure that the records modified by committed transactions are read, meaning that if another transaction has modified the record but has not committed it, the latest version of the record cannot be read directly. The core problem is that you need to determine which version in the version chain is visible to the current transaction. To this end, InnoDB proposes a concept called Read View.
A Read View is a consistent Read View produced when a transaction performs a snapshot Read (common SELECT query). A snapshot of the current database system is generated at the moment of the snapshot Read. It consists of an array of all uncommitted transaction ids (min_id is the smallest in the array) and the maximum transaction ID (max_id) that was created when the query was executed. The result of the query is compared to the read View to get the snapshot result.
Version chain alignment rules:
- If it falls in green (trx_id
- If it falls in red (trx_id>max_id), the version is generated by a future transaction and is definitely not visible.
- If it falls in yellow (min_id<=trx_id<=max_id), then there are two cases:
A. If the row’s trx_ID is in the array, it indicates that the version was generated by a transaction that has not yet committed. If it is your own transaction, it is visible; B. If the trx_id of row is not in the array, this version is generated by a committed transaction.
First of all, we need to prepare two tables, a test and an account table, then we use the undo log of account to draw the version chain, prepare the data and the original record as shown below
//test the data in the table
id=1,c1='11';
id=5,c1='22';
/ / account table data
id=1Name = 'lilei';Copy the code
As shown below, we will follow the order insidesql
When we get to line 7select
Is generated when theReadview [100200], 300
, version chain is shown in the figure:At this time, the data we query islilei300
. We need to get the latest version of the data firsttrx_id=300
toreadview
The data has already been submitted, so it is visible. Continue down, when line 10 is executedselect
Statement, becausetrx_id=100
Not committed, so the version chain is stillReadview [100200], 300
, version chain is shown in the figure:At this time, the data we query islilei300
. So let’s do it up here, match it from the latest version down, and we need to get the latest version data firsttrx_id=100
toreadview
The data is in the uncommitted array, and is not its own transaction, so it is not visible; Then we select the previous version of the data, the result is the same as above; Keep looking up and when you find ittrx_id=300
, will fall in the yellow range, and is submitted, so the data is visible. Continue down, when line 13 is executedselect
Statement, this time althoughtrx_id=100
It has been submitted because it isInnoDB
RR mode, soreadview
Will not change, remainsReadview [100200], 300
, version chain is shown in the figure:At this time, the data we query islilei300
. The reasons are the same as above and will not be repeated.
When an UPDATE statement is executed, it is read before it is written. This read, which is the current read, can only read the current value, which is distinct from the snapshot read that the readView looks up.
I just showed youInnoDB
In RC mode, the readView is generated in real time, and the execution statement is shown in the following figure:When we get to line 13select
Is generated when thereadview[200],300
, the version chain is the same as before, and the data we query at this time islilei2
. The reason is the same as the comparison rule in RR mode.
Update (trx_id); update (trx_id); update (trx_id); At the same time in the record header (deleted_flag) flag bit write true, to indicate that the current record has been deleted, in accordance with the above rules when querying the corresponding record, if the delete_flag flag bit is true, means that the record has been deleted, No data is returned.
You should also be concerned about when the undo log is deleted. The system determines that the undo log will be deleted if there is no older read view than the undo log. This is why we recommend that you avoid using long transactions. Long transactions mean that there are old transaction views in the system. Since these transactions can access any data in the database at any time, any rollback records that may be used in the database must be retained until the transaction commits, which can lead to a large storage footprint.
That’s all for today. If you are interested, you can follow the wechat official account “Ah Q Said code” or contact Ah Q: Qingqing-4132. Ah Q is looking forward to your arrival!