Please go to DobbyKim’s Question of the Day for more questions

MySQL > MySQL > MySQL > MySQL > MySQL > MySQL > MySQL

Let’s take a look at what a redo log two-phase commit is.

MySQL > update MySQL > update MySQL

update T set c=c+1 where ID=2;
Copy the code

Server layer

  1. Before we execute this statement, we first need to connect to the database, which is the job of the connector.

  2. Since this statement is an update statement, the query cache related to the table will be invalidated when an update operation is performed on the table, so this statement will clear all cached results on table T.

  3. Knowing from lexical analysis and parsing that this is an update statement, the optimizer decides to use the index ID.

Storage engine layer

In the figure, the light-colored boxes represent what is executed inside the InnoDB engine and the dark boxes represent what is executed inside the executor.

  1. The executor first finds the engine and fetches the line ID=2. ID is the primary key, and the engine uses the tree search directly to find this row. If the row ID=2 is already in memory, it is returned directly to the executor. Otherwise, you need to read into the memory from disk and then return.

  2. The executor takes the row given by the engine, adds 1 to it, for example, N+1 to get a new row of data, and then calls the engine interface to write the new row of data.

  3. The engine updates the data to memory and logs the update to the redo log. The redo log is prepared. The executor is then told that the execution is complete and the transaction can be committed at any time.

  4. The executor generates a binlog of this operation and writes the binlog to disk.

  5. The executor calls the commit transaction interface of the engine, and the engine changes the redo log to the commit state.

We saw that the redo log write was split into two parts: prepare and commit. This is called “two-phase commit”.

Why are redo logs committed in two phases

To think about it in contradiction, redo logs and redo logs are separate logics. if redo logs were not committed in two phases, either redo logs were written before redo logs were written, or redo logs were written before redo logs. Let’s see what goes wrong with both approaches.

SQL > alter table update (ID=2, c = 0); SQL > alter table update (ID=2);

  • Write redo log and then binlog.

MySQL restarts unexpectedly when the redo log is complete. Because the system can recover data from a redo log crash, the value of c is 1. But because the binlog didn’t finish writing the crash, the binlog didn’t record the statement. Therefore, when the log is backed up later, the saved binlog does not contain this statement. Then you will notice that if you need to restore the temporary library with this binlog, the temporary library will miss the update because the binlog of this statement is lost, and the restored line of C will have a value of 0, which is different from the original library.

  • Write binlog and redo log.

If there is a crash after the binlog has written, the transaction is invalid after the crash because the redo log has not been written, so the value of c in this row is 0. But binlog already records “change c from 0 to 1”. The value of c in the restored row is 1, which is different from that in the original library.

So, why do redo logs need to be committed in two phases? This is because two-phase commit is consistent with the redo log and binlog logic that represent the commit status of transactions.

The redo log is associated with the binlog. The redo log is used to restore unupdated physical data when a host fails, and the binlog is used to back up data. Log operations at each stage are recorded on the disk. During data restoration, if the redo log status is COMMIT, the binlog is successfully restored. If the redo log is prepare, check whether the corresponding binlog transaction is successful and decide whether to roll back or execute the transaction.