In MySQL database and InnoDB storage engine, there are many kinds of files, such as: parameter file, log file, socket file, PID file, MySQL table structure file, storage engine file.
This section focuses on log files. Important functions of MySQL, such as replication and transactions, are related to log files. Log files include error log files, binary log files, slow query log files, query log files, and redo log files. The redo log files are InnoDB engine files.
1. Introduction to log files
1.1 error log
Error log files record the startup, running, and shutdown processes of MySQL and are the first key to locate MySQL problems.
1.2 slow Query log
The slow query log is used to record queries that have taken longer to execute than the length defined by the variable long_query_time. By slowly querying logs, you can find out which query statements are executed inefficiently for optimization.
1.3 General Log
The general query log records information about all requests to the MySQL database, whether or not the request was executed correctly.
By default, the general log function is disabled. Enabling the general log function increases disk I/O. Therefore, you are not advised to enable the general log function for debugging and troubleshooting purposes.
1.4 Binary Log (bin log)
With regard to binary logs, it records all DDL and DML statements executed by the database (except for data query statements SELECT, show, and so on) in the form of events and stored in binary files.
- Log_bin: specifies whether binlog is enabled and the file name.
- Server_id: specifies the unique ID of a server. This parameter must be set when binlog is enabled.
- Binlog_format: Specifies the binlog mode. ROW is recommended.
- Max_binlog_size: controls the size of a single binary log file. If the size of the current log file exceeds this variable, the switch is performed.
- Expire_logs_days: specifies the retention days of binary log files. The default value is 0, indicating that binary log files will not be automatically deleted. The value ranges from 0 to 99.
Binary log files are very important. You are advised to enable binary log files. Binary log files have the following functions:
restore
: Recovers data using binary files. The principle is to retrieve operation records of logs and perform them again
copy
: Similar to the recovery principle, it is generally divided into master and slave libraries.The audit
: By looking at the binary file, you can determine whether there are operations that are dangerous to the database.
Redo log
For InnoDB, redo logs are critical because they record transaction logs for InnoDB’s storage engine.
When an instance or medium fails, such as when a database instance fails due to a power outage on the host, InnoDB’s storage engine reverts to the time before the outage to ensure data integrity.
InnoDB updates data to a redo log and then updates the data to disk when the system is idle or according to a specified update policy. This is known as Write Ahead logging. This technology greatly reduces the frequency of I/O operations and improves the efficiency of data refreshing.
Instead of writing directly to the redo log file, a redo log buffer is first written and then the log files are written in a conditional order.
The size of the redo log is fixed. To enable continuous writing of the redo log, log checkpoint and write_pos indicate the position of the erased log and the position of the written log, respectively. Redo log data writing is shown in the following figure.
When the write_pos flag reaches the end of the log, it jumps from the end to the head of the log for re-writing. So the logical structure of the redo log is not linear, but rather can be viewed as a circular motion. The space between write_pos and checkpoint can be used to write new data. Writes and erasures are carried back and forth.
When write_pos catches checkpoint, the redo log is full. In this case, you cannot continue to execute new database update statements. You need to stop to delete some records and run the checkpoint rule to free up writable space.
Checkpoint rule: When checkpoint is triggered, dirty data pages and dirty log pages in the buffer are flushed to disks.
What is the difference between bin log and transaction log?
-
Bin log records all database logs, including InnoDB, MyISAM and other storage engine logs, while redo log only records InnoDB storage engine logs.
-
Different contents are recorded. Bin logs record specific operations on a transaction. That is, the log is a logical log. The redo log records physical changes made to each Page.
-
The write time is different. The bin log is committed only before the transaction is committed, that is, the disk is written only once. Redo ertries are written to redo logs during a transaction.
-
The redo log is a circular write and erase log. The bin log is an appending log that does not overwrite an already written file.
1.6 rolling back Logs (undo log)
Speaking of redo logs, let’s take a quick look at undo logs.
Rollback logs are also logs provided by InnoDB engine. As the name implies, rollback logs are used to roll back data. When a transaction changes a database, InnoDB logs redo logs and generates undo logs. If a transaction fails or a rollback is called and the transaction needs to be rolled back, you can use the information in the undo log to rollback the data to where it was before the modification.
The undo log is different from the redo log. It is a logical log. It logs information about the execution of SQL statements. When a rollback occurs, the InnoDB engine does the reverse based on the records in the undo log. For example, for each data insert operation (INSERT), a rollback will perform a data delete operation (DELETE); For each data delete operation (DELETE), the rollback will perform a data insert operation (INSERT); For each update, a reverse update is performed to change the data back. Undo log has two functions: one is to provide rollback, and the other is to implement MVCC.
Update statement execution
Let’s take a closer look at bin and redo logs with the execution of an update statement.
update t set c=c+1 where ID=2;
Copy the code
2.1. Update the statement execution process
Let’s look at the internal flow of the executor and InnoDB engine as it executes this simple UPDATE statement.
-
The executor first finds the engine and fetches the line ID=2. ID is the primary key, and the engine uses the tree search directly to find this row. If the row ID=2 is already in memory, it is returned directly to the executor. Otherwise, you need to read into the memory from disk and then return.
-
The executor takes the row given by the engine, adds 1 to it, for example, N+1 to get a new row of data, and then calls the engine interface to write the new row of data.
-
The engine updates the data to memory and logs the update to the redo log. The redo log is prepared. The executor is then told that the execution is complete and the transaction can be committed at any time.
-
The executor generates a binlog of this operation and writes the binlog to disk.
-
The executor calls the commit transaction interface of the engine, and the engine changes the redo log to the commit state.
The flow chart for executing the UPDATE statement is as follows:
2.2 two-stage submission
In this statement, the redo log is written in two stages: prepare and commit. Why two-phase commit?
Let’s start with the popular way:
For example, Xiao Ming goes to the supermarket to buy a bottle of Coke:
- Xiaoming: Boss give me a bottle of coke! The one with the cool heart.
- Boss: The machine scans the coke and tells Xiao Ming that the coke is $2.50. Give me the money.
- Take money and put it in the cash box (record the binlog, the fundamental basis for whether the transaction actually completed, in the commit phase)
- Then let Ming take the Coke away (redo log status commit indicates the closed loop). This is the end of a deal.
- Then synchronize the transaction information of the day’s sales to the database before accounting.
If the transaction is interrupted before the money is collected (in the prepare stage (Step 3)), the transaction fails and the record on the small board is deleted.
If received after money (commit or to commit phase, step 4 | | 5) transaction is interrupted, and then go back and found a record on the system (prepare) and coffers have the income (bin log), then the transaction effectively, complement commit state changes and updates to the inventory.
Use more formal language.
Instead of a two-phase commit, either redo log first and then binlog, or reverse the log order. So let’s see what happens with both of these approaches.
Again, use the previous update statement as an example. If the current row ID=2 and the value of field C is 0, then if the update statement is executed and the first log is written before the second log is written, what happens?
- Write redo log and then binlog. MySQL restarts unexpectedly when the redo log is complete. Because the system can recover data from a redo log crash, the value of c is 1. But because the binlog didn’t finish writing the crash, the binlog didn’t record the statement. Therefore, when the log is backed up later, the saved binlog does not contain this statement. Then you will notice that if you need to restore the temporary library with this binlog, the temporary library will miss the update because the binlog of this statement is lost, and the restored line of C will have a value of 0, which is different from the original library.
- Write binlog and redo log. If there is a crash after the binlog has written, the transaction is invalid after the crash because the redo log has not been written, so the value of c in this row is 0. But binlog already records “change c from 0 to 1”. The value of c in the restored row is 1, which is different from that in the original library.
As you can see, if two-phase commit is not used, the state of the database may be inconsistent with the state of the library recovered from its logs.
You might say, isn’t that a very low probability, and there aren’t many scenarios where temporary repositories need to be restored at every turn?
No, it’s not just a misoperation that requires this procedure to recover data. When you need to expand the capacity of the system, that is, to build more secondary databases to increase the read capacity of the system, the common practice is to use full backup and binlog to achieve this “inconsistency” will cause your online primary and secondary database inconsistency.
In short, both redo log and binlog can be used to represent the commit state of a transaction, and two-phase commit is to keep the two states logically consistent.
2.3. Logs fall down
To ensure that the transaction succeeds, the log must fall to the disk, so that the data of a transaction will not be lost after the database crashes
-
Innodb_flush_log_at_trx_commit if set to 1, the redo log of each transaction is persisted directly to disk. In this way, data will not be lost after MySQL restarts abnormally.
-
When sync_binlog is set to 1, the binlog of each transaction is persisted to disk. This ensures that the binlog will not be lost after MySQL restarts abnormally.
Reference:
[1] : High performance MySQL
[2] : geek time MySQL Practice 45 Lecture
[3] : MySQL Technology Insider InnoDB Storage Engine
[4] : After I finished talking about redo log and binlog, the interviewer blushed
[5] : IN order to make sure you understand the MySQL transaction log thoroughly, I spent all night reading this diagram!
[6] : MySQL several common log
[7] : Mysql learning notes — redo logs and archive logs
[8] : Analyze redo and undo in MySQL transaction