SQL execution link

Unlike the query process, the update process also involves two important logging modules, which are the main ones we will discuss today

Corner: Redo log and binlog. If you’re dealing with MySQL, you can’t get around these two words

And I’ll keep telling you that later on. Anyway, redo logs and binlogs are a lot of things in design

These design ideas can also be applied to your own programs.

Important log

MySQL as a whole, in fact, there are two: one is the Server layer, which mainly does the function of MySQL

Level things; There is also the engine layer, which takes care of the details of storage. The redo log we talked about above is

InnoDB engine specific log, and Server layer also has its own log, called binlog (archive log).

Why are there two logs, I’m sure you ask?

Because there was no InnoDB engine in MySQL at first. MySQL comes with MyISAM, but MyISAM doesn’t

With crash-safe capability, binlog can only be used for archiving. InnoDB is another company that introduced MySQL as a plug-in

InnoDB uses another log system, the redo log, to implement crash-safe functionality, since binlogs alone do not provide crash-safe functionality.

There are three differences between the two types of logs.

  1. Redo logs are unique to InnoDB; Binlog is implemented in the Server layer of MySQL and is available to all engines.

  2. A redo log is a physical log of what was changed on a data page. Binlog is a logical log

Is the original logic of the statement, such as “add 1 to the C field on the line ID=2”.

  1. Redo logs are written in a redo log cycle. Binlog can be appended. “Append” refers to binlog files

After writing to a certain size, it switches to the next log without overwriting the previous log.

With a conceptual understanding of these two logs, let’s look at the internal flow of the executor and InnoDB engine as they execute this simple UPDATE statement

  1. The executor first finds the engine and fetches the line ID=2. ID is the primary key, and the engine uses the tree search directly to find this row. If ID is equal to 2

The data page of the row is already in memory and is returned directly to the executor; Otherwise, you need to read into memory from disk first, however

And then come back.

  1. The actuator takes the row given by the engine and adds 1 to it, for example, N+1, to get a new row

Data, and then call the engine interface to write this new line of data.

  1. The engine updates the new row to memory and logs the update to the redo log at the redo log

In the prepare state. The executor is then told that the execution is complete and the transaction can be committed at any time.

  1. The executor generates a binlog of this operation and writes the binlog to disk.

  2. The executor calls the commit transaction interface of the engine, and the engine changes the redo log to the commit state

The new finish.

Here I show the flow chart of the update statement. The light boxes indicate that the update statement is executed inside InnoDB

The display is executed in the actuator.

Redo log and binlog are two separate logics. if you do not commit in two phases, either redo log is written first and then binlog is written, or the reverse order is used. Let’s see what’s wrong with these two approaches

Again, use the previous update statement as an example. Assume the current row with ID=2 and the value of field C is 0

In the process of writing the first log, crash occurred before writing the second log. What would happen?

  1. Write redo log and then binlog. MySQL process changes when redo log is complete and binlog is not

Often to restart. As we mentioned earlier, after a redo log is written, the system can still recover data even if it crashes

So the value of c in this restored row is 1.

But because the binlog didn’t finish writing the crash, the binlog didn’t record the statement. Therefore, back up later

When logging, the stored binlog does not contain this statement.

Then you’ll notice that if you need this binlog to restore the temporary library, since the binlog of this statement is missing, this

The value of c in the restored row is 0, which is different from the value in the original library.

  1. Write binlog and redo log. If a crash occurs after a binlog has been written, the redo log has not been written

The last transaction is invalid, so the value of c in this row is 0. But the date “change c from 0 to 1” is already recorded in binlog

Tzu chi. So, there’s one more transaction that comes out when you use binlog to restore, and the value of the restored row is c

1, different from the value of the original library.