I heard that MySQL can be restored to any second in half a month

To start with an update statement, if the value of ID=2 is +1, the SQL statement could say:

mysql> update T set c=c+1 where ID=2;
Copy the code

Executing an update statement also goes through the query process:

  • Connecting to a Database
  • Clear the cache involved in the table
  • The parser resolves that this is an update statement using lexical and syntax and determines that tables and fields are involved
  • The optimizerDecided to use"ID"The index
  • The executor finds this row and performs an update. Unlike the query process, the update involves two log modules: redo log and bin log.

Temporary log: Redo log

The innkeeper kept a ledger and a small blackboard to keep track of credit. There are two options:

  1. Keep the books open for each transaction, and when someone pays, find the corresponding credit record and modify the status of the record
  2. First, record the operations on the blackboard. After closing, check the accounts according to the records on the blackboard

When business is booming, customers stream in an endless stream, the efficiency of the first scheme is really low, the shopkeeper must be in accordance with the second scheme to bookkeeping. Similarly, if every update operation of MySQL needs to write to disk, find the corresponding record in disk, and then update, the IO cost and search cost of this process are too high. To solve this problem, MySQL uses something like a blackboard-ledger model to improve efficiency. This mode is known as WAL technology, and the whole process is write-ahead Logging. The key point is: Write log first, then Write disk, that is, Write blackboard first, then Write ledger.

InnoDB writes records to the redo log and updates memory. The update is complete. The innoDB engine will update the operation record to disk when appropriate, which is usually done when the system is idle. The redo log is a fixed size of 4 files, 1 gb each. Logically, you can think of the four files as a circle, starting at the beginning and then starting a new round at the end, as shown below






"crash-safe"

Archive logs: bin log

Differences between redo log and bin log:

  1. The redo log belongs to innoDB engine. The bin log is provided by the server. All engines can use it
  2. Redo log Indicates physical logs"What changes have been made to a certain data page?"Bin log records the logical log of the statement"Add one to the value of c in the row where ID=2." [1]
  3. The redo log file is used in a circular manner. The bin log file is appended and does not overwrite previous records

That is, the server does not have redo logs with other engines, and therefore does not have crash-safe capabilities

Update specific process

Take a closer look at the update process based on your knowledge of the two log files

  1. The executor uses tree search to find the row ID=2 through the engine. If the data page on which the record is located is already in memory, it returns the executor directly. Otherwise, it reads from disk and returns
  2. The actuator takes the data, increments the value of C by one, and writes the modified data through the engine’s write interface
  3. Engine J updates the new data to memory and records the change in the redo log. The redo log sets the status to prepare and tells the executor that the update is complete and the transaction is ready to commit
  4. The executor generates the bin log of this operation and writes the bin log to the disk
  5. The executor calls the commit transaction interface of the engine, the engine sets the redo log to commit, and the update ends

Here is a flow chart from MySQL In Action:

Two-phase commit

As you can see, the redo log is committed in two phases. This is to keep the two logs logically consistent. What happens if you don’t do a two-phase log?

  1. If the redo log is not recorded, c changes to 1. When the MySQL server crashes and restarts, use the redo log to restore the database. The value of C in the restored data is 1. Since this change is not recorded in the bin log, the value of C is still 0 when the bin log is backed up later. If a standby database needs to be restored from the bin log one day, the c value recovered from the bin log is still 0, which is inconsistent with the value in the original database
  2. When the bin log crashed and the redo log was not written, the transaction was invalid, so the value of c was 0, but the bin log added c +1. Therefore, there is an extra transaction in the data recovered by bin log, making the value of C 1, which is inconsistent with the data in the original library.
  3. After the redo log was updated, the bin log crashed when the bin log was not recorded. The status of the redo log was prepare, the transaction was not committed, and the bin log was not recorded. Therefore, due to the crash-safe mechanism, the record is not restored, and the value of C is still 0. Since there is no record in bin log, the value of C is still 0 when the data is recovered from bin log, which is consistent with the data in the original database. Another case: C =1; c=1; c=1; c=1; When the new database is recovered directly from the bin log, the value of C +1 has been recorded, so it is the same as the original database value

The summary is as follows:

There are two ways to determine the completeness of records:

  1. The redo log status is commit
  2. Redo log: Prepare and bin log complete (before commit)

conclusion

This section focuses on the use of log files redo log to ensure crash-safe capability and bin log to restore data integrity

  • Innodb_flush_log_at_trx_commit if set to 1, the redo log of each transaction is persisted directly to disk. In this way, data will not be lost after MySQL restarts abnormally.
  • When sync_binlog is set to 1, the binlog of each transaction is persisted to disk. This ensures that the binlog will not be lost after MySQL restarts abnormally.

Comment section:

  • Binlog is not used for crash recovery. You can disable binlog if you have permission to “set SQL_log_bin =0”. So relying only on binlog for recovery is unreliable

  • Gao Zhen’s comments simply and succinct express the working state under the two-phase submission mechanism: There are three processes in logging:

    1. Prepare stage
    2. Write the binlog
    3. commit
    • If a crash occurs before 2, restart recovery: then find no COMMIT, rollback. Backup and restore: no binlog. The backup is consistent with the original library
    • If the system crashes before 3, the system will automatically commit after the restart. Backup: there is a binlog. The backup is the same as the original library
  • From @gold sun

    • Q:
    1. The redo log is also a file. The redo log is also a disk write. How is that different from offline writing?
    2. Do you want to operate on two log files simultaneously? So write to disk twice, right?
    • The author replies:
    1. Redo logs are written sequentially and do not have to “locate” data. Updating data requires location, so redo logs are written faster

    2. It’s actually 3 times (redolog twice binlog once). However, it merges writes in concurrent updates

ACID I: Transaction isolation


  1. The redo log does not record the “updated state” of a data page. Instead, it records what changes were made to the page. There are two modes of bin log. Statements are used to record SQL statements. The row format records the contents of the row, both before and after the update. ↩ ︎