Introduction
In general terms, a transaction is a multi-step operation whose steps either all succeed or all fail, leaving the data in a consistent final state. To simplify applications and let them ignore many potential error and concurrency issues, the database layer provides uniform support for the ACID properties of transactions.
Basic characteristics of transactions
Transactions have four characteristics: Atomicity, Consistency, Isolation, and Durability, commonly known as ACID.
- Atomicity: Either all operations succeed, or, if any operation fails, all of them are rolled back to the original state.
- Consistency: A transaction can only bring the database from one valid state to another, maintaining database invariants: any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This prevents an illegal transaction from corrupting the database, but it does not guarantee that the transaction itself is correct.
- Isolation: A transaction's changes to the data are not visible to other transactions until the transaction commits.
- Durability: Once a transaction commits, its results are persisted and survive a crash.
MySQL transaction
Let’s look at how MySQL guarantees the ACID properties for transactions. (Unless noted otherwise, the InnoDB engine is assumed below.)
Atomicity:
A transaction is started with BEGIN or START TRANSACTION (START TRANSACTION can additionally specify a read-only or read-write transaction, but this is less often used), committed with COMMIT, and rolled back with ROLLBACK.

From the client’s point of view, we start a transaction and execute SQL statements. If COMMIT succeeds, all the statements take effect; if the transaction is rolled back, all of them go back to the original state.

MySQL always records an undo log for data-change operations. The undo log records a chain of data modifications; when a rollback needs to be performed, the rollback point is found according to the corresponding transaction ID and the data is restored. On commit, MySQL flushes the result to disk. To guarantee that a transaction can still be rolled back after an abnormal outage during execution, undo log data is also persisted to disk. If a large transaction’s undo pages have been evicted from the in-memory cache, the rollback operation may involve a large amount of disk IO.
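The all-or-nothing behavior described above can be sketched with SQLite through Python's `sqlite3` module; the table, amounts, and simulated failure are invented for illustration, but the BEGIN/COMMIT/ROLLBACK semantics shown are the same ones MySQL exposes.

```python
import sqlite3

# Autocommit mode; we manage transaction boundaries ourselves with SQL.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 100)")

# A transfer must debit one account and credit the other, or do neither.
try:
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
    raise RuntimeError("simulated failure before the credit step")
    conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
    conn.execute("COMMIT")
except RuntimeError:
    conn.execute("ROLLBACK")  # the whole transfer is undone

balances = [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
print(balances)  # [100, 100] -- the partial debit did not survive the rollback
```

Without the explicit ROLLBACK, the half-finished debit would leave the data in exactly the inconsistent state atomicity is meant to rule out.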
Consistency:
Consistency in MySQL mainly means that the state of the data is valid at all times, including after an instance crash. MySQL relies primarily on Doublewrite and Crash Recovery to ensure transaction consistency. InnoDB stores data in pages, 16KB by default, while the minimum IO unit of ext-family file systems is usually 1KB, 2KB, or 4KB, and a disk sector is 512 bytes. A 16KB page write is therefore not atomic: if the machine loses power partway through writing a page, the page on disk is left damaged (a "torn page"). To solve this problem, MySQL writes dirty pages to disk in two steps:
- First copy the data to the Doublewrite buffer
- Write data sequentially to a file in the shared tablespace (the first write is more efficient because it is sequential)
- Then write data to each tablespace separately (second write, random write is inefficient)
If an exception occurs while writing a table’s own tablespace, the copy in the shared tablespace can be used to recover the page. Crash Recovery relies on MySQL’s binlog and InnoDB’s redo log to recover data after an instance restart; I won’t describe that process here.
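The two-step write and the recovery path can be modeled in a few lines of Python. This is a toy sketch, not InnoDB internals: a page is considered "torn" when its stored checksum no longer matches, and names like `write_page` and `doublewrite_area` are invented for illustration.

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

doublewrite_area = {}  # sequential write to the shared tablespace (first write)
tablespace = {}        # the page's real location (second, random write)

def write_page(page_no: int, data: bytes, crash_mid_write: bool = False) -> None:
    # Step 1: copy the full page to the doublewrite area first.
    doublewrite_area[page_no] = data
    # Step 2: write the page in place; a crash here can tear the page.
    if crash_mid_write:
        tablespace[page_no] = (data[: len(data) // 2], checksum(data))  # torn page
    else:
        tablespace[page_no] = (data, checksum(data))

def recover(page_no: int) -> bytes:
    data, expected = tablespace[page_no]
    if checksum(data) != expected:            # torn page detected on restart
        data = doublewrite_area[page_no]      # restore from the doublewrite copy
        tablespace[page_no] = (data, checksum(data))
    return data

write_page(1, b"A" * 16384, crash_mid_write=True)
print(recover(1) == b"A" * 16384)  # True: the intact copy came from the doublewrite area
```

The key point the sketch captures is that the doublewrite copy is written *before* the in-place write, so at least one intact copy of the page always exists.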
Isolation:
InnoDB supports four isolation levels:
- READ UNCOMMITTED: a transaction’s changes can be read by other transactions before it commits
- READ COMMITTED: a transaction’s changes are visible to other transactions only after it commits
- REPEATABLE READ: the data read within a transaction stays consistent for the whole transaction
- SERIALIZABLE: when read/write lock conflicts occur, transactions are executed sequentially
InnoDB defaults to the REPEATABLE READ isolation level, which is also the most used level in business scenarios, so below we mainly look at how repeatable read is implemented.

At the repeatable-read level, a “view” of the data is created when a transaction starts. The view does not exist physically; it is a logical snapshot. As mentioned earlier, undo log records are generated whenever we change data with INSERT/UPDATE/DELETE. The change records for INSERT are relatively simple and can be deleted right after the transaction commits; UPDATE and DELETE records are slightly more complex, because multiple versions of the row must be maintained. Each InnoDB transaction is also assigned an increasing transaction ID (trx_id). Consider a record with three versions produced by three transactions: based on the current trx_id, the undo log “chain” can be walked to find the state of the record as this transaction should see it. This is the implementation principle of MVCC.
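The version-chain walk can be sketched as follows. This is a heavily simplified model: real InnoDB read views also track the set of active transactions, but here visibility is reduced to a single cutoff trx_id, and all names are illustrative.

```python
# Each row version carries the trx_id that wrote it; the undo "chain" links
# newer versions back to older ones. A snapshot sees the newest version
# written by a transaction it considers committed.
versions = [
    {"trx_id": 10, "teacher": 1},    # oldest version
    {"trx_id": 20, "teacher": 100},
    {"trx_id": 30, "teacher": 101},  # newest; the chain is walked newest-first
]

def visible_value(snapshot_max_trx_id: int) -> int:
    # Walk from the newest version back along the undo chain until we reach
    # a version this snapshot is allowed to see.
    for version in reversed(versions):
        if version["trx_id"] <= snapshot_max_trx_id:
            return version["teacher"]
    raise LookupError("no visible version for this snapshot")

print(visible_value(15))  # 1   -- snapshot taken before trx 20 wrote
print(visible_value(25))  # 100 -- trx 20's write is visible, trx 30's is not
```

Each snapshot keeps returning the same value no matter how many newer versions appear, which is exactly the repeatable-read guarantee.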
Here’s a concrete example. Let’s construct a curriculum table:
| id | teacher |
|----|---------|
| 1  | 1       |
| 3  | 3       |
| 5  | 5       |
Order of execution of the two transactions:
| Transaction 1 | Transaction 2 |
|---|---|
| START TRANSACTION WITH CONSISTENT SNAPSHOT | START TRANSACTION WITH CONSISTENT SNAPSHOT |
| | update curriculum set teacher=100 where id = 1; |
| | commit; |
| select teacher from curriculum where id=1; | |
| commit; | |
At this point, the teacher value found in transaction 1 is 1. This is easy to understand: we started the transaction and created the view. Although the record with id = 1 was changed to 100 in transaction 2, transaction 1 can still follow the trx_id and undo log “chain” to find the version of the record that is visible to its own trx_id.
Here’s another example:
| Transaction 1 | Transaction 2 |
|---|---|
| START TRANSACTION WITH CONSISTENT SNAPSHOT | START TRANSACTION WITH CONSISTENT SNAPSHOT |
| | update curriculum set teacher=100 where id = 1; |
| | commit; |
| update curriculum set teacher=teacher+1 where id = 1; | |
| select teacher from curriculum where id=1; | |
| commit; | |
Compared to the example above, we added an update statement before transaction 1’s query, so what value of teacher does the query return? The value is 101. Why 101 and not 2? If transaction 1 updated the data based on its original snapshot, transaction 2’s update would be lost and the data would be inconsistent, so in this case the latest committed data must be read. In summary, *query operations default to a “snapshot read”, while update operations (and queries after an update in the same transaction) use a “current read”.*
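The snapshot-read / current-read distinction can be made concrete with a toy model of the scenario in the table above (transaction 2 has committed teacher = 100, then transaction 1 runs teacher = teacher + 1). The two dictionaries and function names are invented for illustration.

```python
committed = {"teacher": 100}  # latest committed value (after transaction 2)
snapshot = {"teacher": 1}     # what transaction 1's consistent view recorded

def snapshot_read() -> int:
    # A plain SELECT reads from the transaction's own snapshot.
    return snapshot["teacher"]

def current_read_update(delta: int) -> None:
    # UPDATE always reads the latest committed row ("current read");
    # otherwise transaction 2's change would be silently lost.
    committed["teacher"] += delta
    # Within transaction 1, later reads of this row see its own update.
    snapshot["teacher"] = committed["teacher"]

print(snapshot_read())  # 1: before the update, SELECT uses the snapshot
current_read_update(1)
print(snapshot_read())  # 101, not 2: the update built on the current value
```

The asymmetry is the whole point: reads are repeatable, but writes must land on the latest committed version to avoid lost updates.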
A further evolution of this example:
| Transaction 1 | Transaction 2 |
|---|---|
| START TRANSACTION WITH CONSISTENT SNAPSHOT | START TRANSACTION WITH CONSISTENT SNAPSHOT |
| | update curriculum set teacher=100 where id = 1; |
| update curriculum set teacher=teacher+1 where id = 1; | |
| | commit; |
| select teacher from curriculum where id=1; | |
| commit; | |
Here the commit of transaction 2 comes after the update statement of transaction 1. Because transaction 2 holds the row lock, the update statement in transaction 1 blocks and only executes after transaction 2 completes its commit. MySQL locking is a more complex topic that deserves a separate article.
Durability:
MySQL records multiple logs to prevent data loss. The binlog is written in two steps:
- Write log to binlog cache
- Flush the binlog cache to disk
When a transaction commits, the MySQL executor writes the whole transaction from the binlog cache into the binlog file and clears the binlog cache; each thread has its own binlog cache.

Writing the binlog file involves two further operations: write and fsync. write only puts the data into the file system’s buffer, while fsync forces the data onto disk. MySQL provides the sync_binlog parameter to control this: with sync_binlog = 0, each transaction commit only writes but never fsyncs; with sync_binlog = n, an fsync is issued once every n committed transactions.

Redo log writing is similar to binlog writing and also has two phases:
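The sync_binlog policy can be sketched as a simple counter. This is a model of the policy only, assuming an invented `BinlogWriter` class; it deliberately ignores group commit and the sync_binlog = 1 special case being the durable default.

```python
class BinlogWriter:
    """Toy model: write() every commit, fsync() once every sync_binlog commits."""

    def __init__(self, sync_binlog: int):
        self.sync_binlog = sync_binlog
        self.pending = 0       # commits written to the FS cache but not fsync'ed
        self.fsync_calls = 0

    def commit(self) -> None:
        # write(): events go to the file system cache only.
        self.pending += 1
        # fsync(): force them to disk once every sync_binlog commits.
        if self.sync_binlog > 0 and self.pending >= self.sync_binlog:
            self.fsync_calls += 1
            self.pending = 0

writer = BinlogWriter(sync_binlog=10)
for _ in range(25):
    writer.commit()
print(writer.fsync_calls)  # 2: fsync fired at the 10th and 20th commit
print(writer.pending)      # 5: these commits could be lost if the OS crashed now
```

The `pending` count makes the trade-off visible: larger n means fewer disk flushes but more transactions at risk on an OS crash.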
- Write transaction logs to the redo log buffer
- Data in the redo log buffer is flushed to disks
The flush policy of the redo log buffer is determined by the innodb_flush_log_at_trx_commit parameter:
- Value 0: each transaction commit leaves the redo log in the redo log buffer only
- Value 1: each transaction commit fsyncs the redo log directly to disk
- Value 2: each transaction commit writes the redo log to the file system cache
InnoDB also periodically checks the transaction logs and flushes them to disk according to the configured policy. (Note: to improve disk-write efficiency, MySQL uses a “group commit” mechanism to reduce the number of flushes.) To guarantee that the final data is correct, MySQL adopts a two-phase commit strategy:
- Write the redo log and enter the prepare phase
- Write the binlog; once the binlog write completes, commit the redo log (commit phase)
On crash recovery: if the redo log is already in the commit state, the transaction is committed. If the redo log is only in the prepare state, check whether the corresponding binlog is complete: if it is, commit the transaction; if not, roll the transaction back using the data version in the undo log.
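The recovery rule above reduces to a small decision function. The state names and function are illustrative, not InnoDB internals:

```python
def recover_decision(redo_state: str, binlog_complete: bool) -> str:
    """Decide a transaction's fate at crash recovery from its log state."""
    if redo_state == "commit":
        return "commit"                    # redo log already reached commit
    if redo_state == "prepare":
        # Prepare-only: the binlog decides the transaction's fate.
        return "commit" if binlog_complete else "rollback"
    return "rollback"                      # no redo record: never prepared

print(recover_decision("commit", binlog_complete=False))   # commit
print(recover_decision("prepare", binlog_complete=True))   # commit
print(recover_decision("prepare", binlog_complete=False))  # rollback
```

The middle case is the interesting one: a complete binlog may already have been shipped to replicas, so the transaction must commit locally too, keeping primary and replicas consistent.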
References:
- mysql.taobao.org/monthly/201…
- mysql.taobao.org/monthly/201…
- mysql.taobao.org/monthly/201…
- dev.mysql.com/doc/refman/…