1. The undo log guarantees atomicity of transactions

  • What does an INSERT undo log look like?

The undo log of the INSERT statement is of type TRX_UNDO_INSERT_REC. This undo log contains the following items:

  • Start position of the log record

  • Length and value of each primary key column

  • Table ID

  • Undo log number

  • Undo log type

  • End position of the log record
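To make these items concrete, here is a minimal Python sketch of such a record and of how it could be used to roll back an INSERT. The class, field, and function names are hypothetical and only mirror the items listed above; InnoDB's real record is a packed binary structure, and the table ID used here is made up.

```python
from dataclasses import dataclass

@dataclass
class InsertUndoRecord:
    """Illustrative stand-in for a TRX_UNDO_INSERT_REC entry (names are assumptions)."""
    undo_no: int          # undo log number within the transaction
    table_id: int         # table the row was inserted into
    primary_key: tuple    # values of the primary key columns

def rollback_insert(table: dict, rec: InsertUndoRecord) -> None:
    # Undoing an INSERT only needs the primary key: locate the row and delete it.
    table.pop(rec.primary_key, None)

table = {}
table[(2,)] = {"id": 2, "c": 0}        # the INSERT itself
undo = InsertUndoRecord(undo_no=0, table_id=138, primary_key=(2,))
rollback_insert(table, undo)           # the transaction rolls back
print(table)
```

The sketch shows why an INSERT undo record can stay small: the primary key alone is enough to find and remove the inserted row.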

Redo logs are physical logs

To avoid writing every update directly to the data files on disk, MySQL uses WAL, which stands for Write-Ahead Logging. The key point is to write to the log first and to the data files on disk later.

When a record needs to be updated, the InnoDB engine first writes the change to the redo log and updates the page in memory. At that point the update is considered done. InnoDB applies the change to the data files on disk later, at an appropriate time, usually when the system is idle.
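The write-ahead idea can be sketched as a toy key-value store in Python. This is an illustration under simplifying assumptions, not InnoDB's implementation: the class name `TinyWalStore` and its attributes are invented here, and plain Python containers stand in for the redo log, the buffer pool, and the data files.

```python
class TinyWalStore:
    """Toy store: log first, then memory; the data 'file' is updated later."""

    def __init__(self):
        self.redo_log = []    # stands in for the on-disk, append-only redo log
        self.memory = {}      # stands in for pages cached in memory
        self.data_file = {}   # stands in for the on-disk data files

    def update(self, key, value):
        self.redo_log.append((key, value))  # 1. write the log record first
        self.memory[key] = value            # 2. then update the in-memory copy

    def flush_when_idle(self):
        # 3. later, apply the logged changes to the data file and drop them
        for key, value in self.redo_log:
            self.data_file[key] = value
        self.redo_log.clear()

    def recover_after_crash(self):
        # Memory is lost in a crash; the data file plus the log rebuild it.
        self.memory = dict(self.data_file)
        for key, value in self.redo_log:
            self.memory[key] = value

store = TinyWalStore()
store.update("row2.c", 1)
store.memory.clear()            # simulate a crash before any flush happened
store.recover_after_crash()
print(store.memory)
```

Even though the data file was never touched before the simulated crash, replaying the log restores the update.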

InnoDB’s redo log is fixed in size. For example, it can be configured as a group of four 1 GB files, so the redo log can hold 4 GB of operations in total. Writing starts at the beginning of the first file and, after reaching the end of the last file, wraps around to the beginning again, as shown in the figure below.

Write pos is the position of the current record. It moves forward as records are written and returns to the beginning of file 0 after reaching the end of file 3. Checkpoint is the position up to which records can be erased. It also moves forward and wraps around, and a record must be applied to the data file before it is erased.

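The space between write pos and checkpoint is the free portion of the redo log, which can record new operations. If write pos catches up with checkpoint, the redo log is full and no new updates can be performed until some records have been applied to the data files and the checkpoint has moved forward.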
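The interplay of write pos and checkpoint can be modelled as a small ring buffer in Python. This is a sketch with assumed names (`CircularRedoLog`, `append`, `advance_checkpoint`) and a capacity counted in records rather than bytes; it only illustrates the wrap-around and "log full" behaviour described above.

```python
class CircularRedoLog:
    """Toy model of a fixed-size redo log with a write position and a checkpoint."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.write_pos = 0    # count of records written so far
        self.checkpoint = 0   # count of records already applied to the data files

    def slot(self, position: int) -> int:
        # Physical slot in the ring: positions wrap around at the end.
        return position % self.capacity

    def free_slots(self) -> int:
        return self.capacity - (self.write_pos - self.checkpoint)

    def append(self) -> bool:
        if self.free_slots() == 0:
            return False          # write pos caught up with checkpoint: log is full
        self.write_pos += 1
        return True

    def advance_checkpoint(self, records: int) -> int:
        # Applying records to the data files frees their slots for reuse.
        applied = min(records, self.write_pos - self.checkpoint)
        self.checkpoint += applied
        return applied

log = CircularRedoLog(capacity=4)
results = [log.append() for _ in range(5)]   # the fifth append finds the log full
log.advance_checkpoint(2)                    # apply two records to the data files
retry = log.append()                         # now there is room again
print(results, retry, log.slot(log.write_pos))
```

Keeping both positions as ever-increasing counters and taking the remainder only when a physical slot is needed avoids the ambiguity between "empty" and "full" that arises when both pointers sit on the same slot.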

With the redo log, InnoDB guarantees that no committed record is lost even if the database restarts unexpectedly. This capability is called crash-safe.

Benefits:

  • Redo logs take up very little space

  • The storage required to store table space ids, page numbers, offsets, and values that need to be updated is small.

  • Redo logs are written sequentially to disk

  • During a transaction, several redo logs may be generated for each statement executed, and these logs are written to disk in the order in which they were created, using sequential IO.

When the redo log is flushed to disk

  • The log buffer space is insufficient

  • The size of the log buffer is finite (specified by the system variable innodb_log_buffer_size), so it fills up if logs keep being written into it. InnoDB is designed to flush the logs to disk once the amount of redo log in the log buffer reaches about half of its total capacity.

  • When a transaction commits

  • As mentioned earlier, the main reasons for using redo logs are that they take up little space and are written sequentially. Modified Buffer Pool pages do not have to be flushed to disk when a transaction commits, but the redo logs must be flushed to disk to guarantee durability.

  • A background thread flushes periodically

  • There is a background thread that flushes the redo logs in the log buffer to disk about once every second.

  • When the server is shut down normally

  • When a checkpoint is performed
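Two of the triggers above, the half-full log buffer and the transaction commit, can be sketched in Python as follows. The `LogBuffer` class and its methods are hypothetical names for this illustration; the periodic background flush, shutdown, and checkpoint triggers are left out to keep the sketch short.

```python
class LogBuffer:
    """Toy log buffer that records why each flush to disk happened."""

    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.flush_reasons = []

    def flush(self, reason: str) -> None:
        if self.used > 0:                     # nothing to do for an empty buffer
            self.flush_reasons.append(reason)
            self.used = 0

    def write(self, nbytes: int) -> None:
        self.used += nbytes
        if self.used >= self.capacity // 2:   # about half of the capacity is used
            self.flush("buffer half full")

    def commit(self) -> None:
        self.flush("commit")                  # redo must be durable at commit

buf = LogBuffer(capacity_bytes=100)
buf.write(30)      # 30 bytes buffered, below the threshold
buf.write(30)      # 60 bytes buffered, crosses the half-full threshold
buf.write(10)      # 10 bytes buffered again
buf.commit()       # the commit forces the remaining bytes out
print(buf.flush_reasons)
```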

binlog

Binlog is implemented in the Server layer of MySQL and is available to all engines.

Analysis of the disk flushing policy of binlog

The sync_binlog parameter controls the binlog flushing policy. When it is set to 0 (the default before MySQL 5.7.7; later versions default to 1), the binlog is not written directly into the disk file at commit time but only into the OS cache. So, as analyzed earlier, if the machine goes down at this point, the binlog data still in the OS cache is lost.

If the sync_binlog parameter is set to 1, the binlog is forced to the disk file when the transaction commits, so the binlog of a committed transaction is not lost even if the machine goes down.
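The difference between the two settings corresponds to the difference between a plain write and a write followed by fsync. The Python sketch below illustrates this with the standard `os.write` and `os.fsync` calls; the function `append_binlog_event`, the file name, and the event text are assumptions made for the example, and MySQL's own code is of course not written this way. Values larger than 1, which sync after every N commits, go beyond what the text above covers but are handled for completeness.

```python
import os
import tempfile

def append_binlog_event(fd: int, event: bytes, sync_binlog: int, pending: int) -> int:
    """Append one event; return the number of commits not yet synced to disk."""
    os.write(fd, event)          # the data lands in the OS cache only
    pending += 1
    if sync_binlog > 0 and pending >= sync_binlog:
        os.fsync(fd)             # force the cached data down to the disk file
        pending = 0
    return pending               # with sync_binlog=0 the OS decides when to flush

path = os.path.join(tempfile.mkdtemp(), "binlog.000001")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
event = b"UPDATE t SET c = c + 1 WHERE ID = 2;\n"

pending_sync1 = 0
for _ in range(3):
    pending_sync1 = append_binlog_event(fd, event, sync_binlog=1, pending=pending_sync1)

pending_sync0 = 0
for _ in range(3):
    pending_sync0 = append_binlog_event(fd, event, sync_binlog=0, pending=pending_sync0)

os.close(fd)
print(pending_sync1, pending_sync0)
```

With sync_binlog=1 nothing is ever left waiting in the OS cache after a commit, while with sync_binlog=0 every commit in the loop is still exposed to a machine crash.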

Two-phase commit

The redo log and the binlog are two independent pieces of logic. Without two-phase commit, either the redo log is written first and then the binlog, or the other way round. Let’s see what happens with each of these approaches.

Again, use the previous update statement as an example. Suppose the row with ID=2 currently has a value of 0 in field c. What happens if a crash occurs while the update statement is executing, after the first log has been written but before the second one has?

  • Write the redo log first, then the binlog. Suppose MySQL restarts unexpectedly after the redo log has been written but before the binlog has. Because the system can recover the data from the redo log after the crash, the value of c in this row is 1. But the binlog was never written, so it does not record this statement, and neither does any binlog backed up later. If you need to restore a temporary database from this binlog, it will miss this update: the restored row has c = 0, which differs from the original database.

  • Write the binlog first, then the redo log. If the crash happens after the binlog has been written, the transaction is invalid after recovery because the redo log was never written, so the value of c in this row is 0. But the binlog already records “change c from 0 to 1”, so a database restored from the binlog has c = 1 in this row, which again differs from the original database.

As you can see, without two-phase commit the state of the database may be inconsistent with the state of a database recovered from its logs. Two-phase commit is a common solution for keeping data logically consistent across systems.

In short, both redo log and binlog can be used to represent the commit state of a transaction, and two-phase commit is to keep the two states logically consistent.

When a transaction commits, the redo log is flushed to disk first, then the binlog is flushed to disk, and finally the commit mark is written to the redo log. The dirty pages in the buffer pool are flushed to disk later, asynchronously, by the background I/O threads.
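The commit order can be simulated in Python to check that, wherever a crash happens, the original database and a database rebuilt from the binlog agree. The step names and both functions are invented for this sketch. The recovery rule it uses (a transaction whose redo log is prepared and whose binlog is complete is committed, and a prepared transaction without a binlog is rolled back) is how MySQL crash recovery is generally described, but it is not spelled out in the text above, so treat it as an assumption of the example.

```python
STEPS = ["redo_prepare", "binlog_write", "redo_commit"]

def run_commit(crash_after=None):
    """Commit 'change c from 0 to 1' and stop after the named step, if any."""
    redo_state = None        # None, "prepare" or "commit"
    binlog = []              # statements that reached the binlog on disk
    for step in STEPS:
        if step == "redo_prepare":
            redo_state = "prepare"
        elif step == "binlog_write":
            binlog.append("change c from 0 to 1")
        elif step == "redo_commit":
            redo_state = "commit"
        if step == crash_after:
            break            # simulated crash: later steps never run
    return redo_state, binlog

def recover(redo_state, binlog):
    """Return (c in the original database, c in a database rebuilt from the binlog)."""
    committed = redo_state == "commit" or (redo_state == "prepare" and bool(binlog))
    c_original = 1 if committed else 0
    c_restored = 1 if binlog else 0
    return c_original, c_restored

for crash_point in STEPS:
    print(crash_point, recover(*run_commit(crash_after=crash_point)))
```

In every crash scenario the two values in the returned pair are equal, which is exactly the consistency that writing the two logs one after the other without a prepare step could not provide.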

Reference: blog.csdn.net/ggh0314/art…