Redo and undo in MySQL transactions

Transactions have four characteristics: atomicity, consistency, isolation, and durability (persistence). All operations in a transaction either take effect together or not at all; that is the purpose of a transaction. Isolation is implemented by locking mechanisms, while atomicity, consistency, and durability are guaranteed by the redo and undo logs. This article discusses several questions about redo and undo in transactions:

What are redo logs and undo logs?
How does redo ensure transaction durability?
Is undo log the reverse of redo log?

redo log

Types of Redo

Redo logs ensure the durability of transactions, the D in ACID. They can be divided into the following two types:

Physical Redo Log
Logical Redo log

In the InnoDB storage engine, redo is mostly a physical log that records physical changes to data pages. Logical redo records, by contrast, do not record the actual changes to a page; they record the type of operation that changes the page, such as creating a new data page. Logical redo involves lower-level internals, so the main thing to remember is that in most cases redo logs are physical logs, and that every page change made by DML requires redo logging.

The role of Redo

The redo log is used for crash recovery of the database.

The composition of Redo

The redo log consists of two parts:

The in-memory redo log buffer, which is volatile
The redo log files, which are persistent and stored on disk

When is Redo written?

Redo is written at the following points:

Redo is written after a data page is modified and before the dirty page is flushed to disk. Note that the data is modified first and the log is written afterwards
The redo log is written to disk before the data pages are
Changes to clustered index pages, secondary index pages, and undo pages all require redo logging

The overall flow of Redo

Take a single transaction as an example to get a general view of the redo log flow, as shown in the following figure:

Step 1: Read the original data from disk into memory and modify the in-memory copy
Step 2: Generate a redo record and write it to the redo log buffer
Step 3: When the transaction commits, flush the redo log buffer to the redo log file; records are appended to the file
Step 4: Periodically flush the modified data in memory to disk
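
The four steps above can be sketched as a toy model. This is a minimal illustration in Python, not InnoDB's actual code: the class ToyEngine and all of its fields are hypothetical names, and plain dictionaries and lists stand in for the data files, the buffer pool, and the redo log file.

```python
class ToyEngine:
    """Toy model of the redo flow; every name here is illustrative."""

    def __init__(self, disk_pages):
        self.disk_pages = dict(disk_pages)  # stands in for data files on disk
        self.redo_file = []                 # stands in for the redo log file on disk
        self.buffer_pool = {}               # in-memory page copies (volatile)
        self.log_buffer = []                # in-memory redo log buffer (volatile)

    def update(self, key, value):
        # Step 1: read the page into memory if needed, then modify the copy.
        if key not in self.buffer_pool:
            self.buffer_pool[key] = self.disk_pages.get(key)
        self.buffer_pool[key] = value
        # Step 2: generate a redo record and put it in the redo log buffer.
        self.log_buffer.append((key, value))

    def commit(self):
        # Step 3: on commit, append the buffered redo records to the redo file.
        self.redo_file.extend(self.log_buffer)
        self.log_buffer.clear()

    def checkpoint(self):
        # Step 4: periodically flush modified pages from memory to disk.
        self.disk_pages.update(self.buffer_pool)

    def crash_and_recover(self):
        # A crash loses everything in memory; recovery replays the redo file.
        self.buffer_pool.clear()
        self.log_buffer.clear()
        for key, value in self.redo_file:
            self.disk_pages[key] = value


engine = ToyEngine({"A": 1, "B": 2})
engine.update("A", 3)
engine.commit()             # the redo for A=3 is now in the redo file
engine.update("B", 4)       # never committed: its redo is only in the log buffer
engine.crash_and_recover()
print(engine.disk_pages)    # {'A': 3, 'B': 2}
```

In this sketch the committed change to A survives the crash even though its data page was never flushed, while the uncommitted change to B disappears with the in-memory buffers.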

How does Redo ensure transaction durability?

InnoDB is a transactional storage engine that implements durability through the Force Log at Commit mechanism: when a transaction commits, all of its redo must first be written from the redo log buffer to the redo log file and persisted, and only then is the commit considered complete. This practice is also known as write-ahead logging (WAL): before a data page is persisted, the corresponding log in memory is persisted first.

To ensure that every log record reaches the redo log file, InnoDB by default calls fsync each time the redo log buffer is written to the redo log file. Because the redo log file is not opened with the O_DIRECT option, redo is first written to the file system cache, and an fsync is needed to make sure it reaches the disk. fsync is a system call whose speed depends on disk performance, so disk performance also determines the performance of transaction commit, and therefore of the database. (O_DIRECT is a Linux option; with it, writes bypass the file system cache and go directly to disk.)

The parameter innodb_flush_log_at_trx_commit controls the redo log flushing policy used for Force Log at Commit. It also lets the user give up strict durability, as follows:

When the parameter is set to 1 (the default), an fsync is performed at every transaction commit. This is the safest configuration and guarantees durability
When the parameter is set to 2, only a write is performed at commit: the redo log buffer is written to the operating system page cache, without fsync. Committed transactions are therefore not lost if only the MySQL server goes down, but may be lost if the operating system goes down
When the parameter is set to 0, no redo log write is performed at commit. Writing is left to the master thread, which performs fsync on the redo log every second, so up to about one second of transactions can be lost. (The master thread asynchronously flushes data from the buffer pool to disk to keep data consistent.)
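
The three settings can be compared with a small model. This is only a sketch of the behaviour described above, written in Python; commit_actions and survives are hypothetical helper names, and the model assumes the crash happens right after commit, before any background flush has run.

```python
def commit_actions(flush_log_at_trx_commit):
    """Log actions taken at commit time for each setting (toy model)."""
    if flush_log_at_trx_commit == 1:
        return ["write", "fsync"]   # written to the page cache and forced to disk
    if flush_log_at_trx_commit == 2:
        return ["write"]            # written to the OS page cache only
    if flush_log_at_trx_commit == 0:
        return []                   # left to the master thread's periodic fsync
    raise ValueError("expected 0, 1, or 2")


def survives(flush_log_at_trx_commit, crash):
    """Does a just-committed transaction survive the crash in this model?

    crash is "mysql" (only the database process dies) or "os" (the whole
    machine goes down before any background flush happens).
    """
    actions = commit_actions(flush_log_at_trx_commit)
    if crash == "mysql":
        return "write" in actions   # the page cache outlives the process
    if crash == "os":
        return "fsync" in actions   # only fsynced data outlives the OS
    raise ValueError("expected 'mysql' or 'os'")


for setting in (1, 2, 0):
    print(setting, survives(setting, "mysql"), survives(setting, "os"))
```

The loop prints one line per setting, showing that only setting 1 protects a fresh commit against both kinds of crash in this model.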

fsync and write are system calls used in many persistence scenarios, such as Redis's AOF persistence. fsync sends data to the disk and forces synchronization: it blocks until the data has been written to the disk and only then returns, which is why it can become a performance bottleneck. write, by contrast, returns as soon as the data is in the page cache; the operating system's own scheduling later flushes the cached data to disk. The data path is user buffer -> page cache -> disk.
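
The difference between the two calls can be tried directly from Python, whose os module exposes both. The following is a minimal sketch: the file name toy_redo.log and the helper append_record are made up for illustration, and the file is an ordinary temporary file, not a real redo log.

```python
import os
import tempfile


def append_record(fd, record, durable):
    """Append one record; optionally block until the OS reports it is on disk."""
    os.write(fd, record)    # returns once the data is in the OS page cache
    if durable:
        os.fsync(fd)        # blocks until the data has been forced to disk


path = os.path.join(tempfile.mkdtemp(), "toy_redo.log")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
try:
    append_record(fd, b"A=3;", durable=False)  # write only, like setting 2
    append_record(fd, b"B=4;", durable=True)   # write + fsync, like setting 1
finally:
    os.close(fd)

with open(path, "rb") as f:
    contents = f.read()
print(contents)
```

Both records are readable afterwards because reads are served from the page cache as well; the difference only shows up if the operating system crashes before the cache is flushed.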

How is Redo implemented in InnoDB? What is its connection to mini-transactions?

The implementation of redo is closely tied to the mini-transaction. A mini-transaction is an internal InnoDB mechanism that keeps data pages consistent under concurrent transactions and when the database fails; it is not itself a transaction.

To guarantee the consistency of the data on a data page, a mini-transaction must follow three protocols:

The FIX Rules
Write-Ahead Log
Force-log-at-commit

The FIX Rules

To modify a data page, the page's X-latch (exclusive latch) must be held. To read a data page, the page's S-latch (shared, or read, latch) or X-latch must be held. The latch is held until the modification or access of the page is complete.

Write-Ahead Log

Write-ahead logging was mentioned above: before a data page is persisted, the corresponding log in memory must be persisted first. Each page carries a log sequence number (LSN); the LSN occupies 8 bytes and increases monotonically. Before a data page is written to persistent storage, the log records up to that page's LSN must be written to persistent storage first. Log writes are sequential appends, which make better use of disk performance than random writes.
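
The write-ahead rule can be expressed as a check made before any page flush. The sketch below is a toy model in Python: ToyLog, ToyPage, modify, and flush_page are hypothetical names, and the LSN is simplified to a record counter rather than the 8-byte value InnoDB uses.

```python
class ToyLog:
    """Append-only log with a monotonically increasing LSN (toy model)."""

    def __init__(self):
        self.records = []       # all log records, in LSN order
        self.flushed_lsn = 0    # the log is on "disk" up to this LSN

    def append(self, change):
        self.records.append(change)
        return len(self.records)    # LSN assigned to the new record

    def flush_up_to(self, lsn):
        self.flushed_lsn = max(self.flushed_lsn, lsn)


class ToyPage:
    def __init__(self):
        self.value = None
        self.lsn = 0            # LSN of the last change applied to this page


def modify(page, log, value):
    page.lsn = log.append(value)
    page.value = value


def flush_page(page, log, disk, name):
    # Write-ahead rule: the log must be durable up to the page's LSN first.
    if log.flushed_lsn < page.lsn:
        log.flush_up_to(page.lsn)
    disk[name] = page.value


log, page, disk = ToyLog(), ToyPage(), {}
modify(page, log, "v1")
modify(page, log, "v2")
flush_page(page, log, disk, "page_1")
print(log.flushed_lsn, disk)    # 2 {'page_1': 'v2'}
```

The page is never written with a change whose log record is still only in memory, which is the property the protocol is meant to guarantee.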

Force-log-at-commit

This is the same point made earlier about how transaction durability is ensured, summarized again here. A transaction can modify multiple pages. Write-ahead logging guarantees the consistency of a single data page, but cannot by itself guarantee the durability of a transaction. Force-log-at-commit requires that when a transaction commits, all the mini-transaction logs it generated are flushed to disk. If the database crashes before the pages in the buffer pool are flushed to persistent storage, the logs can then be used to restore data integrity when the database restarts.

Write flow of redo logs

The figure above shows the redo log writing process. Each mini-transaction corresponds to a DML operation; an UPDATE statement, for example, is protected by one mini-transaction. After the data is modified, a redo record (redo1) is generated. redo1 is first written to the mini-transaction's private buffer, and once the UPDATE statement finishes it is copied from the private buffer to the public log buffer. When the entire outer transaction commits, the redo log buffer is flushed to the redo log file.
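
The two-stage path described above, from private buffer to public log buffer to redo log file, can be sketched as follows. This is a toy Python model, not InnoDB's implementation; ToyMiniTransaction and the record text are invented for illustration.

```python
class ToyMiniTransaction:
    """Collects redo records privately, then publishes them in one step."""

    def __init__(self, shared_log_buffer):
        self.shared_log_buffer = shared_log_buffer
        self.private_buffer = []

    def log_change(self, record):
        # The record is not yet visible in the public log buffer.
        self.private_buffer.append(record)

    def commit(self):
        # Copy the private records into the public log buffer as one group.
        self.shared_log_buffer.extend(self.private_buffer)
        self.private_buffer = []


log_buffer = []     # public redo log buffer (in memory)
redo_file = []      # stands in for the redo log file on disk

mtr = ToyMiniTransaction(log_buffer)
mtr.log_change("redo1: set A=3")
visible_before_mtr_commit = list(log_buffer)   # still empty at this point
mtr.commit()                                   # the UPDATE statement has finished

# When the outer transaction commits, the log buffer is flushed to the file.
redo_file.extend(log_buffer)
log_buffer.clear()
print(visible_before_mtr_commit, redo_file)
```

Keeping the records private until the mini-transaction ends means the public buffer only ever receives complete groups of records.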

undo log

Definition of undo log

The undo log records logical changes to data. To be able to roll back earlier operations when an error occurs, those operations must be recorded beforehand so that they can be reversed.

The function of undo log

Undo is a logical log with two functions:

Rolling back transactions
MVCC

This article does not go further into MVCC (multi-version concurrency control); it focuses on the undo log as used for rolling back transactions.

The undo log only restores the database logically to its original state. On rollback it performs the opposite operation: a DELETE for each INSERT, an INSERT for each DELETE, and a reverse UPDATE for each UPDATE that puts the old row values back. Undo logs are used to roll back transactions and thus ensure the atomicity of transactions.
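
Logical rollback by inverse operations can be illustrated with a dictionary standing in for a table. This is a minimal sketch in Python; the function names and the undo record layout are assumptions made for the example, not InnoDB's format.

```python
def insert(table, undo_log, key, value):
    undo_log.append(("delete", key, None))         # inverse of INSERT is DELETE
    table[key] = value


def update(table, undo_log, key, value):
    undo_log.append(("update", key, table[key]))   # remember the old value
    table[key] = value


def delete(table, undo_log, key):
    undo_log.append(("insert", key, table[key]))   # inverse of DELETE is INSERT
    del table[key]


def rollback(table, undo_log):
    # Apply the inverse operations in reverse order of the original changes.
    while undo_log:
        op, key, old_value = undo_log.pop()
        if op == "delete":
            del table[key]
        else:   # "insert" or "update": put the old value back
            table[key] = old_value


table = {"A": 1, "B": 2}
undo_log = []
update(table, undo_log, "A", 3)
delete(table, undo_log, "B")
insert(table, undo_log, "C", 5)
rollback(table, undo_log)
print(table)    # {'A': 1, 'B': 2}
```

The table ends up with the same logical content as before the transaction, which is all that rollback promises; nothing here restores the physical layout.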

Write time of undo log

The undo log is written before a DML operation modifies the clustered index
No undo log is written when secondary index records are modified

Note that modifications to undo pages also require redo logging.

Storage location of undo

In the InnoDB storage engine, undo is stored in rollback segments. Each rollback segment records 1024 undo log segments, and undo pages are allocated within each undo log segment. Before 5.6, rollback segments live in the shared tablespace. From 5.6.3 on, undo logs can be placed in separate undo tablespaces, configured with innodb_undo_tablespaces and innodb_undo_directory.

The types of undo

In the InnoDB storage engine, the undo log is divided into:

insert undo log
update undo log

The insert undo log is the undo log generated by INSERT operations. Because a record inserted by a transaction is visible only to that transaction and not to others, the insert undo log can be deleted directly after the transaction commits, without a purge operation.

The update undo log is the undo log generated by DELETE and UPDATE operations. Because it may be needed for MVCC, it cannot be deleted when the transaction commits. Instead, at commit it is placed on the undo log linked list to wait for the purge thread to perform the final deletion.

The purge thread has two main jobs: cleaning up undo pages and removing data rows marked with the delete bit. In InnoDB, a DELETE inside a transaction does not physically remove the row. It is a delete-mark operation that sets the delete bit on the record while leaving the record in place. This is a "fake delete": the row is only marked, and the actual removal is done later by the background purge thread.
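
Delete marking followed by a later purge can be sketched like this. It is a toy model in Python with hypothetical names (delete_mark, visible_rows, purge); in particular it ignores the check a real purge must make that no active transaction still needs the old row version.

```python
def delete_mark(rows, key):
    """DELETE only sets the delete bit; the row stays in place."""
    rows[key]["delete_bit"] = True


def visible_rows(rows):
    """Rows that a reader would see: everything not delete-marked."""
    return {k: r["value"] for k, r in rows.items() if not r["delete_bit"]}


def purge(rows):
    """Background purge: physically remove rows whose delete bit is set."""
    for key in [k for k, r in rows.items() if r["delete_bit"]]:
        del rows[key]


rows = {
    1: {"value": "alice", "delete_bit": False},
    2: {"value": "bob", "delete_bit": False},
}
delete_mark(rows, 2)
rows_after_mark = len(rows)             # the marked row is still stored
visible_after_mark = visible_rows(rows)
purge(rows)
print(rows_after_mark, visible_after_mark, list(rows))
```

After the mark the row is hidden from readers but still occupies space; only the purge step actually removes it.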

Is undo log the reverse of redo log?

Is the undo log the reverse of the redo log? No. The undo log is a logical log: when a transaction is rolled back, it only restores the database logically to its original state. The redo log is a physical log that records physical changes to data pages. The two record different things, so the undo log is not the reverse of the redo log.

Redo & Undo summary

The following simplified walk-through combines the redo log and the undo log, to help understand both types of logging:

Suppose there are two data items, A and B, with values 1 and 2.
1. Start the transaction
2. Record A=1 to the undo log
3. Modify A=3
4. Record A=3 to the redo log
5. Record B=2 to the undo log
6. Modify B=4
7. Record B=4 to the redo log
8. Write the redo log to disk
9. Commit the transaction

In fact, for insert, update, and delete operations, redo and undo record different content and different amounts of data. In memory, InnoDB generally proceeds in the following order:

Write the redo for the undo
Write the undo
Modify the data page
Write the redo
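
The ordering above can be traced with a small sketch, using the values A=1 and B=2 from the earlier example. This is a toy Python model with made-up record shapes; apply_change and the state dictionary are hypothetical names.

```python
def apply_change(state, key, new_value):
    """Apply one change in the order listed above (toy model)."""
    old_value = state["pages"][key]
    undo_record = (key, old_value)
    # 1. Write the redo that protects the undo record (undo pages need redo too).
    state["redo"].append(("undo", undo_record))
    # 2. Write the undo record.
    state["undo"].append(undo_record)
    # 3. Modify the data page in memory.
    state["pages"][key] = new_value
    # 4. Write the redo for the data page change.
    state["redo"].append(("data", (key, new_value)))


state = {"pages": {"A": 1, "B": 2}, "undo": [], "redo": []}
apply_change(state, "A", 3)
apply_change(state, "B", 4)
print(state["undo"])
print(state["redo"])
```

The undo list holds the old values needed for rollback, while the redo list holds both the undo writes and the new data values needed for crash recovery.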

Summary

This article has analyzed the redo and undo logs in transactions. Some parts may be unclear; please point out any mistakes.
