series

  • A select statement is executed in MySQL like this.
  • MySQL series 2: Redo log

preface

Kaka to chat

Last period according to a query statement query process analysis of MySQL overall architecture. Again, this issue uses a query SQL statement as a lead. To be sure, the process update statement executed by the query statement is also executed.

So the focus of this article is not on the MySQL architecture diagram, but on the redo log and binlog.

First, redo log

The first step is to create a table user with the primary key id. The following is the create statement.

CREATE TABLE `user` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(255) NOT NULL,
 `age` tinyint(4) NOT NULL,
 `time` int(11) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
Copy the code

Insert a piece of data

Insert into user (' name ', 'age', 'time') VALUES (" k_id ","25",unix_timestamp(now()))Copy the code

To change the age of the inserted data to 26, you need to execute the statement

update user set age = 26 where id = 1;
Copy the code

The first article mentioned a query statement execution flow that is the same as the update statement. So let’s take that picture and familiarize ourselves with it.

The functionality of each module can be reviewed back in the first article.

In MySQL8.0, redo log and binlog files are located in /var/lib/mysql, as shown in the figure

The ib_logfile file is a redo log, and the undo file is a rollback log. The latter part of the rollback log is discussed in detail.

The redo log is a necessary element for transaction persistence. When a transaction is committed, it does not directly modify the database data. Instead, it guarantees that the related operations are recorded in the redo log.

The redo log size in the Innodb storage engine is fixed. The figure above shows a set of two files, each with a default size of 48 MB. The innodb_log_file_size parameter is used to control the size of individual files.

The redo log records 48M operations. The redo log is a closed loop. The specified number of files and file size do not increase.

Write pos will record the current position while moving backwards, after the end of the ib-log-file-3 file, and then return to the ib-logFILg-0 file to start writing.

The check point records the current erasure position, which must be erased while the file is being written to the loop. The prerequisite for clarity is to update records to data files.

The green part above is the writable part. What if writepos catches up to checkpoint?

You must understand that write POS is pushed because an update operation is being performed, so updates cannot be performed until the record is updated to the data file and then erased by the Check Point.

Innodb_log_file_size also has some calculation rules, which are described below.

If innodb_log_file_size is too small, the redo log file frequently switches, triggering database check points. This increases the number of records updated to data files, affecting I/O performance.

Similarly, if there is a large transaction and all redo logs are full, but not yet complete, the log cannot be switched, causing MySQL to block.

Innodb_log_file_size is too large, which greatly improves I/O performance. However, in the event of MySQL restart or downtime, the recovery time is prolonged due to the excessively large redo log file. And that recovery time is often out of control.

By setting the appropriate size and number of redo logs, Innodb guarantees that previously committed records will not be lost even if the database restarts unexpectedly. This is also known as crash-safe.

Crash-safe is understood here without mentioning what it is. Later articles will make it clear.

How to set innodb_log_file_size according to project

For innodb_log_FILES_IN_group, 3 or 4 values are sufficient. No optimization is required.

Focus on innodb_log_file_size size Settings or optimization Settings.

Prior to MySQL8.0, it was common to calculate the size of transaction logs generated over a period of time, and MySQL log files should hold a minimum of one hour of business logs.

The duration depends on the service status. You can calculate the log volume of one minute or one hour.

Pager is a command on the MySQL client. In daily operation of MySQL, you can greatly improve work efficiency by setting the display mode of pager.

Mysql > show engine Innodb status\ G select sleep (60); show engine innodbstatus\ G; The result returned.

Pager disables the nopager command. If the command is not executed, the command becomes invalid only after the next restart.

Here click is done on the virtual machine, you can see that there is no operation for a minute, so the values are the same, you can test on the test server.

Select (back data – front data)/1024/1024*60 asMB_per_hour; The value is the size of the redo log after one hour

However, it is definitely not appropriate to use this method to calculate the value of busy or idle time within a minute will produce a large error.

The proper approach is to determine a few points in the day, use a script to execute at regular intervals, record the corresponding values, and then average them to minimize the error in the calculation.

What is a sequece? As each binlog is generated, the value starts at 1 and increases, with sequenumber adding 1 for each additional transaction.

Second, the binlog

You can see that the MySQL architecture is divided into two layers, one is the Server layer and the other is the storage engine layer.

The Server layer is of course responsible for the functional aspects, while the storage engine layer handles storage-related operations.

The redo log is unique to the Innodb storage engine. Other storage engines do not have the redo log, and the Server has its own log, which is the binlog.

Redo log and binlog

Redo log is Innodb engine specific, binlog is MySQLserver layer specific, all engines can use.

The redo log is a physical log that records the changes made in an update operation. The binlog is a logical log that records the execution logic of an update statement

Redo logs are written in a circular manner. For example, there are 4 1GB redo log files. Binlogs are appended. And that’s why you often see that as long as you have a full binlog file it will give you the data that you want.

Why does MySQL have two logs?

Before Innodb storage engine, MySQL default storage engine was MyIsam, but MyIsam has no restart recovery capability and binlog is only used for archiving.

Innodb is another company that introduced Mysql as a plugin. Since binlog does not have the ability to restart and restore, I used redo log to implement the restart and restore function.

This results in two logs when you use the Innodb storage engine.

What is two-phase commit

Now that you know about redo logs and binlogs, take a look at how an update statement executes.

update user set age = age + 1 where id = 1;

  • The executor first finds the row id = 1 in the engine layer. Since ID is the primary key, it will find the row in the primary key index tree. If the data page on which the line ID=2 is located is already in memory, it is returned directly to the actuator. Otherwise, you need to read memory from disk and then return.
  • When the executor gets the result of id = 2 from the storage engine, it adds 1 to age (originally 25, now 26) and writes the new row to the call engine interface.
  • The engine first updates the data to memory and logs the update to the redo log, which is in the prepare state. The executor is then told that the execution is complete and the transaction can be committed at any time.
  • The executor then generates a binlog of the operation and writes the binlog to disk.
  • The executor calls the commit transaction interface of the engine, and the engine changes the redo log to commit.

It should be clear at this point that an update to SQL will write redo log first and then binlog, which is why the title is called Lifelong Best Friend Redo log, binlog.

Why is two-phase commit required

To make the redo log logical and the binlog logical, look at the following two cases.

Write redo log and then binlog

  • Update statement age = age +1
  • Symptom The MySQL process abnormally restarts after data was written to the redo log
  • The binlog hasn’t even started yet
  • After the system restarts, the value is 26
  • When setting up a slave database, you need to use binlog to restore data, but the operation of age = age +1 is not recorded in binlog
  • At this time, there will be less updates in the slave library, and the age recovered is still 25, resulting in data inconsistency in the master library.

Write binlog and redo log

  • Update statement age = age +1
  • MySQL restarts unexpectedly when data is written to binlog
  • I haven’t written the redo log yet
  • MySQL restarts. This update does not exist for the redo log, so the value is still 25
  • But the value in the binlog is going to be 26
  • When the secondary database needs to be set up, the value of the secondary database is 26, and that of the primary database is 25, causing data inconsistency between the primary database and the secondary database

Therefore, if two-phase commit is not used, the library data recovered from the original library is inconsistent with its binlog.

What does Kong Yiji Teach you about redo Log

Look at a junior high school grade nine language class article “Kong Yiji” this article, even if I do not remember the content, the title always remember ha!

This case was also mentioned in Ding’s article. Why did Ding use this case flexibly to talk about redo log while we didn’t think of it?

The essential reason is not to understand the knowledge thoroughly, using life cases to explain the technology is the easiest to understand and not hard to forget.

The protagonist in Kong Yiji is called the hotel manager, who has two magic weapons that make him work much more efficiently than other bosses. One is a small blackboard and the other is a ledger.

When a customer asks for credit, is it more efficient to write to the board, or is it faster to flip through a book full of books?

The shopkeeper will choose to write on the blackboard first, and then write the blackboard notes into the account book when there are fewer people or they are not busy.

On the other hand, if the boss does not have a blackboard, he can only find the name of the credit person in the dense ledger. If there is an additional credit record, he can find it again and add it.

This process is not only tedious but also unacceptably inefficient. If the hotel has many guests, the owner cannot record them.

This problem also exists in MySQL. Every time you execute an update statement, you need to find that record first and then update it. The whole process costs a lot of IO and search costs. So MySQL has also taken advantage of the wisdom of hotel owners to use blackboards to improve execution efficiency.

Draw a picture to give you a better understanding of the relationship between the owner, the blackboard, and MySQL.

The relationship between the hotel manager and MySQL

Redo log parameter description

Transaction persistence is achieved through redo logging.

After a transaction is committed, the database data is not directly modified, but the related operations are logged to the redo log.

The database flushes the dirty page data from memory to disk according to the corresponding mechanism.

Redo log writing process

The figure above shows a simple redo log writing process.

There are two strange concepts mentioned in the figure above: Buffer pool and redo log Buffer, both of which are part of the memory area of Innodb storage engine.

The redo log file is located on disk.

This means that when DML (INSERT, UPDATE, delete) operations are performed, data is written to the Buffer pool first and then to the redo log Buffer.

The redo log buffer is written to the redo log according to the flush mechanism.

This mechanism sets innodb_flush_log_at_trx_COMMIT to 0,1, and 2 respectively

Brush set strategy

The figure above shows the write strategy for redo logs.

  • When the value is 0, data is stored in the redo log buffer after a transaction is committed and written to disk every second
  • When the value is 1, the redo log buffer must be flushed from memory to disk after the transaction is committed. Once the transaction is committed, the redo log must be stored on disk.
  • If the value of this parameter is 2, the redo log buffer is written to the OS cache after the transaction is committed, not to the disk file. The OS cache is written to the disk file after 1 second.

How does the server respond to an unexpected transaction stop (transaction write process)

  • When the parameter is 0, the previous second’s logs are stored in the log buffer, or memory, and if the machine goes down, 1 second of transaction data may be lost.
  • When the parameter is set to 1, the database has high REQUIREMENTS on I/O. If the IOPS provided by the underlying hardware is poor, the concurrency of the MySQL database will soon fail to improve due to hardware I/O problems.
  • When the parameter is 2, the data is directly written into the OS cache, which belongs to the operating system. If the operating system is damaged or powered off, transaction data within 1 second will be lost. Compared with the first strategy, this strategy is much safer and has less IO requirements.

summary

About performance: 0>2>1

On security: 1>2>0

In the MySQL database, the default flush policy is set to 1 to ensure that data will not be lost after a transaction is committed.

Insist on learning, insist on writing, insist on sharing is the belief that Kaka has been upholding since he started his career. May the article in the big Internet can give you a little help, I am kaka, see you next time.