Column series: MySQL column series
Undo Log
The first property of a transaction is atomicity, which guarantees that all add, delete, or change operations in a transaction will either succeed or not be done at all. In this case, undo logs are required. Before making changes to the database, the corresponding Undo logs are recorded. Then, in case of transaction failure or rollback, these Undo logs can be used to roll back the data to the previous appearance.
Let’s take a quick look at the transaction ID and the hidden columns in the row record, because the rest of this is about those two things.
Transaction ID
InnoDB assigns a unique transaction ID when adding, deleting, or modifying a table during a transaction. If no add, delete, or delete operation is performed in a transaction, the transaction ID is not assigned.
InnoDB maintains a global variable in memory to represent the transaction ID. Whenever a transaction ID is assigned, InnoDB retrieves the value of this variable and increments it by 1. Whenever the value of this variable is a multiple of 256, the value of this variable is flushed to an attribute called Max Trx ID on page 5 of the system tablespace. When the system next restarts, the Max Trx ID attribute is loaded into memory and assigned to the global variable with the value 256 (this process is similar to the assignment of primary key row_id).
We can query information_schema.innodb_trx for information about transactions currently running in the system. The first field in the table trx_id is the transaction ID.
mysql> SELECT trx_id,trx_state,trx_started,trx_rows_locked,trx_isolation_level,trx_is_read_only FROM information_schema.INNODB_TRX;
+-----------+-----------+---------------------+-----------------+---------------------+------------------+
| trx_id | trx_state | trx_started | trx_rows_locked | trx_isolation_level | trx_is_read_only |
+-----------+-----------+---------------------+-----------------+---------------------+------------------+
| 164531720 | RUNNING | 2021- 05- 14 16:38:59 | 1 | REPEATABLE READ | 0 |
+-----------+-----------+---------------------+-----------------+---------------------+------------------+
Copy the code
Row records hide columns
In this article introducing InnoDB row record format, we learned that there are three hidden columns in the row record:
DB_ROW_ID
InnoDB automatically adds a primary key to a table if no primary key is explicitly defined for the table and no unique index is defined for the tablerow_id
The hidden column of theDB_TRX_ID
: When you add, delete, or change a record in a transaction, the transaction ID of the transaction is writtentrx_id
In the.DB_ROLL_PTR
: Rollback pointer, which is essentially a pointer to undo log.
The Undo Log type
Each change to a record generates one or two undo logs. A transaction may have multiple ADD, delete, or modify SQL statements. A SQL statement may generate multiple Undo logs. These undo logs in a transaction are numbered from 0, and the number is called undo NO.
Undo log is used to record the undo log of adding, deleting, or modifying a database.
Again, use the previous account table as an example to illustrate this.
CREATE TABLE `account` (
`id` bigint(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
`card` varchar(60) NOT NULL COMMENT 'number',
`balance` int(11) NOT NULL DEFAULT '0' COMMENT 'the balance'.PRIMARY KEY (`id`),
UNIQUE KEY `account_u1` (`card`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Account sheet';
Copy the code
We can query information_schema.innodb_sys_tables to obtain the tablespace ID 13881 for this table.
mysql> SELECT * FROM information_schema.INNODB_SYS_TABLES WHERE name = 'test/account';
+----------+-------------+------+--------+-------+-------------+------------+---------------+------------+
| TABLE_ID | NAME | FLAG | N_COLS | SPACE | FILE_FORMAT | ROW_FORMAT | ZIP_PAGE_SIZE | SPACE_TYPE |
+----------+-------------+------+--------+-------+-------------+------------+---------------+------------+
| 13881 | test/account | 33 | 6 | 13935 | Barracuda | Dynamic | 0 | Single |
+----------+-------------+------+--------+-------+-------------+------------+---------------+------------+
Copy the code
insert undo
The undo operation corresponding to insert data actually deletes the data according to the primary key. Therefore, the undo log corresponding to insert is mainly used to record the primary key of this record.
The type of the undo log generated by INSERT is TRX_UNDO_INSERT_REC, and its general structure is as follows:
The start and end
: points to the start and end of the record.undo type
: Type of undo log, i.eTRX_UNDO_INSERT_REC
.undo no
: Indicates the undo log number in the current transaction.table id
: indicates the ID of the tablespace.Primary key column information
Select * from ‘INSERT’; select * from ‘INSERT’; select * from ‘INSERT’;
For example, we start a transaction and insert two pieces of data into the account:
BEGIN;
INSERT INTO account(id,card,balance) VALUES (1.'AA'.0), (2.'BB'.0);
Copy the code
Given that the transaction ID of this transaction is 100, the INSERT statement inserts two data, resulting in two Undo logs. When the record is inserted, the current transaction ID is written into the hidden column transaction ID of the row record and the undo log is generated. The rollback pointer in the record saves the address of the Undo log. Multiple records on the same page are linked together via next_record to form a single linked list, which can be referenced in the previous article on row record format and data page structure.
delete undo
Deleting a piece of data can be roughly divided into two stages:
- Phase one
First, when the user thread performs a delete, it marks the delete_mask in the record header as 1 rather than removing it directly from the page, because other concurrent transactions may also need to read this data. (We’ll see why later in MVCC)
- Phase two
After the transaction is committed, there is a Purge thread in the background that actually deletes the data.
The first thing to know is that the data in the page is a one-way linked list connected by Netx_Record in the record header information (assuming this list is called a data list). There is another linked list in the page, called the garbage list, where records are actually deleted, removed from the data list and then added to the head of the garbage list for reuse.
So phase two is to remove records from the data list and add them to the head of the garbage list.
That is, the delete operation only goes through phase one, which marks the record’s delete_mask as 1, before the transaction commits.
The undo log type corresponding to DELETE is TRX_UNDO_DEL_MARK_REC. Its structure is roughly shown in the figure below. Compared with TRX_UNDO_INSERT_REC, there are three main parts:
old trx_id
: This property saves the hidden columns in the recordtrx_id
This property is used when the MVCC reads concurrently.old roll_pointer
: This property holds hidden columns in the recordroll_pointer
, so that the previous undo log can be found through this property.Index column information
This section is mainly used to actually delete the record after the phase 2 transaction commits.
Select * from table_name where id=2; select * from table_name where id=2;
BEGIN;
INSERT INTO account(id,card,balance) VALUES (1.'AA'.0), (2.'BB'.0);
DELETE FROM account WHERE id = 2;
Copy the code
Because it is in the same transaction, the hidden column trx_id in the record is unchanged, and the delete_mask in the record header is marked as 1. A new undo log is then generated with the original trx_id and roll_pointer stored in the record, so the new undo log points to the old undo log, and the roll_pointer in the record points to the new undo log. Note that the transaction numbers in the undo log are also increasing.
update undo
InnoDB handles the UPDATE and non-update primary keys in different undo log types when executing an UPDATE statement.
The primary key is not updated
Without updating the primary key, it can be broken down into situations where the storage space occupied by the updated column does not change and changes.
- The storage space is not changed
When a record is updated, if the updated column takes up the same number of bytes as the original column for each column being updated, an in-place update can be performed, that is, the value of the corresponding column can be changed directly from the original record.
- The storage space is changed. Procedure
If any of the updated columns occupy an inconsistent number of bytes before and after the update, the old record is removed from the clustered index page and a new record is created based on the updated column value and inserted into the page. Note that the delete_mask is not marked as 1, but is actually deleted, removed from the data list and added to the head of the garbage list.
If the storage space occupied by the new record is no larger than the space occupied by the old record, you can directly reuse the space occupied by the old record that was just added to the head of the garbage list. Otherwise, you will apply for a new space on the page to use.
The undo_upd_exist_rec type of undo log generated in these two cases without updating the primary key is TRX_UNDO_UPD_EXIST_REC. The general structure is shown in the figure below. Compared with TRX_UNDO_DEL_MARK_REC, there is more information about updated columns.
Assume that the size of bytes occupied by each column is unchanged when the data with id=1 is updated:
BEGIN;
INSERT INTO account(id,card,balance) VALUES (1.'AA'.0), (2.'BB'.0);
DELETE FROM account WHERE id = 2;
UPDATE account SET card = 'CC' WHERE id = 1;
Copy the code
This record is then updated in place, again generating a new Undo log pointing to the original Undo log.
Update the primary key
Remember that records are linked in a one-way list by primary key size. If you update the primary key value of one record, the position of that record will change, and it may be updated to another page.
The update in this case is divided into two steps:
- First mark the original record to delete, that is
delete_mask
Instead of1
It hasn’t really been deleted. - A new record is then created based on the values of the updated columns and inserted into the cluster index.
In this case, two undo logs are generated:
- One is created when the tag is deleted in the first step
TRX_UNDO_DEL_MARK_REC
Type of undo log. - One is created when the record is inserted in the second step
TRX_UNDO_INSERT_REC
Type of undo log.
Both types of structures have been described previously.
Select * from primary key where id=1;
BEGIN;
INSERT INTO account(id,card,balance) VALUES (1.'AA'.0), (2.'BB'.0);
DELETE FROM account WHERE id = 2;
UPDATE account SET card = 'CC' WHERE id = 1;
UPDATE account SET id = 3 WHERE id = 1;
Copy the code
When the primary key is updated, the original record is marked for deletion and a TRX_UNDO_DEL_MARK_REC undo log is added. A new record with ID =3 is inserted and a new undo log of type TRX_UNDO_INSERT_REC is created.
The undo log rolled back
The previous series of undo logs generated by adding, deleting and modifying a transaction have undo no numbers. At rollback time, the undo log from the transaction can be applied to undo operations starting from the largest to the smallest undo no.
For example, in the example above, if the rollback ends:
- Undo log 5: delete id=3;
- Then undo log 4, untag delete, add id=1 of the data
delete_mask
Instead of0
; - Then undo log no. 3 will update the column
card='CC'
I’m going to go back to the originalcard='AA'
; - Then undo log number 2, untag delete, add id=2 of the data
delete_mask
Instead of0
; - Then undo log number 1, delete the data id=2;
- Then undo log 0, delete id=1;
As you can see, by performing undo log during the rollback, the data is restored to its original state.
However, it is important to note that undo log is a logical log that logically restores the database to its original state. All changes are logically cancelled, but the data structure and the page itself may be quite different after the rollback. Because many concurrent transactions may be making changes to the database at the same time, it is not possible to roll back a page to the beginning of a transaction because this would affect the work being done by other transactions.
The Undo Log storage
The undo log classification
The previous section described several types of undo logs, which are actually divided into two main categories:
TRX_UNDO_INSERT
Undo logs of type TRX_UNDO_INSERT_REC fall into this category and are typically generated by INSERT statements or when primary keys are updated by UPDATE.
TRX_UNDO_UPDATE
Except for undo logs of type TRX_UNDO_INSERT_REC, other undo logs fall into this category, such as TRX_UNDO_DEL_MARK_REC, TRX_UNDO_UPD_EXIST_REC, Typically generated by DELETE or UPDATE statements.
The reason for the two categories is that undo logs of different categories cannot be stored together. Undo logs of the TRX_UNDO_INSERT_REC type can be deleted directly after the transaction is committed, while undo logs of other types cannot be deleted directly because they need to provide MVCC function.
Undo page linked list
Undo logs are stored in FIL_PAGE_UNDO_LOG pages. Many undo logs may be generated in a transaction, and multiple undo pages may need to be applied. Therefore, InnoDB designs it as a linked list structure, connecting multiple undo pages in a transaction.
As mentioned above, undo logs are divided into two categories and cannot be stored together. Therefore, if the two types of Undo logs are generated in a transaction, two linked lists will be created, one for the TRX_UNDO_INSERT category undo logs. An undo log that stores the TRX_UNDO_UPDATE category.
If temporary tables are also modified in a transaction, InnoDB specifies that undo logs for regular tables and temporary tables are stored separately, so there may be up to 4 linked undo pages in a single transaction.
It is important to note that these lists are not allocated at the beginning of the transaction, but are allocated when a certain type of list is needed.
Reusing undo page
If multiple concurrent transactions are executed, the undo logs generated during the execution of different transactions will be written into different undo page linked lists to improve the efficiency of writing undo logs. This means that a single transaction may allocate up to 4 linked lists, and two transactions may allocate up to 8 linked lists.
However, most transactions are short transactions that generate very few Undo logs. These Undo logs only take up a small amount of storage space on a page, which can be very wasteful. So InnoDB is designed to reuse the linked list of undo pages for a transaction after it commits in some cases.
Undo linked lists can be reused under the following conditions:
- If only one undo page is included in the undo page list, the list can be reused by the next transaction. Because if a transaction generates many undo logs, the linked list may have multiple pages, and the new transaction may use only a small portion of the linked list space, which would be wasteful.
- Then the undo page already uses less space than the entire page space
Three quarters of
Can be reused.
For an INSERT undo page linked list of type TRX_UNDO_INSERT, these undo logs are useless after the transaction commits and can be cleared. Therefore, when the linked list is reused after a transaction is committed, the previous undo log can be directly overwritten.
For a list of update undo pages of type TRX_UNDO_UPDATE, these undo logs cannot be deleted immediately after the transaction is committed because they are used for MVCC. So when you reuse the linked list, you can only append undo logs to it, which means that multiple sets of Undo logs can be written to a page.
Rollback segments
The redo log is stored in a redo log file. By default, the undo log is stored in a special segment in the system tablespace. This segment is called the Rollback segment.
In order to better manage the undo page list, InnoDB designed a page of Rollback Segment Header. Each page of Rollback Segment Header has a corresponding Rollback Segment. A Rollback Segment Header page contains 1024 undo slots. Each undo slot stores the page number of the undo page in the Header of the undo list.
A Rollback Segment Header contains only 1024 undo slots. If only one undo Segment is allocated in a transaction, the maximum number of concurrent transactions that can be executed at the same time is 1024. This is not sufficient in today’s high concurrency situation.
InnoDB provides 128 Rollback Segment headers, 128*1024=131072 undo slots, and a maximum of 131072 concurrent transactions.
The 128 Rollback Segment Header page addresses are stored in page 5 of the system tablespace.
You can configure the rollback segment using the following parameters:
-
Innodb_undo_directory: undo log is stored in the system tablespace by default, or can be configured as a separate tablespace. You can use this parameter to set the directory of the independent table space, which is the data directory by default.
-
Innodb_undo_logs: Sets the number of rollback segments. Default is 128. Note that the number of rollback segments for temporary tables is fixed at 32, so the number of rollback segments for ordinary tables is the value of this parameter minus 32. If the value is less than 32, there is only one rollback segment for ordinary tables.
-
Innodb_undo_tablespaces: Sets the number of undo tablespace files so that rollback segments are evenly distributed across multiple files. The default value of this parameter is 0, indicating that the undo independent tablespace is not created.
mysql> SHOW VARIABLES LIKE 'innodb_undo%';
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| innodb_undo_directory |. \|
| innodb_undo_logs | 128 |
| innodb_undo_tablespaces | 0 |
+--------------------------+-------+
Copy the code
Restore the undo
After the undo log is written to the undo page, the page becomes dirty, is added to the Flush list, and is flushed to disk at some point.
When a transaction commits, the undo log is put into a linked list. Whether the undo log and the page on which the undo log exists can be deleted is determined by a purge thread in the background.
Last but not least, redo logs are also generated when undo logs are written to undo pages, because undo logs also require persistence protection.
This is really about an unsolved problem with the redo log.
In the same graph of T1 and T2 concurrent transactions, at the arrow, if THE T1 transaction completes the commit transaction, the redo log is flushed. The T2 transaction has not yet completed, but its mtr_T2_1 has already been flushed to disk. If the database goes down at this point, the T2 transaction actually fails. After restarting the database, mtr_T2_1 is read to recover the data, and the T2 transaction is actually incomplete, so restoring the data here will cause problems with the data.
When redo log is restored, undo pages are also redone. When redo log mtr_T2_1 redo data pages, T2 transactions are not committed, and undo logs are used to undo those operations. That solved the problem.