This article mainly discusses the implementation principle of ACID under MySQL InnoDB engine. It does not elaborate too much on the basic knowledge such as what is transaction and the meaning of isolation level.

ACID MySQL as a relational database, in terms of the most common InnoDB engine, how to guarantee ACID.

Atomicity: A transaction is the smallest unit of execution and cannot be split. Atomicity ensures that actions are either done completely or not at all; Consistency: Data is consistent before and after a transaction is executed. Isolation: A transaction is not disturbed by other transactions when accessing the database concurrently. (Durability) : After a transaction is submitted. Changes to data in the database are persistent, even if the database fails.

Isolation,

Let’s start with isolation, starting with the four levels of isolation.

Different isolation levels are designed to address different problems. That is, dirty reading, magic reading, unrepeatable reading.

So the different levels of isolation, how is isolation achieved, why can things not interfere with each other? The answer is lock and MVCC.

The lock

Let’s start with locks. How many locks does MySQL have?

Granularity refers to table locks, page locks, and row locks. Table locks have intention to share locks, intention to exclude locks, automatic locks and so on. Row locking is implemented by the engines themselves at the engine level. However, not all engines support row locking, such as MyISAM engine does not support row locking.

Types of row locks In InnoDB transactions, row locks are implemented by locking index entries on indexes. This means InnoDB uses row-level locks only if data is retrieved by index criteria, otherwise it uses table locks. Row-level locking is also divided into two types: shared locks and exclusive locks, and intentional shared locks and intentional exclusive locks that need to be obtained before locking.

Shared lock: a read lock that allows other transactions to lock S, but does not allow other transactions to lock X. That is, other transactions are read-only and cannot be written. select… Lock in share mode Lock. Exclusive lock: write lock that does not allow other transactions to have an S or X lock. Insert, update, delete, for update lock. Row locks are added when they are needed, but are not released immediately when they are no longer needed, but at the end of a transaction. This is the two-phase lock protocol.

Implementation algorithm of line lock

Record Lock A Lock on a single row Record that always locks the index Record.

The Gap Lock can only Lock rows, but the new insert record this action, to update the “Gap” between records. So add gap lock to solve phantom read.

Next-key Lock Gap Lock + Record Lock, open and close left.

Locking is an overview of isolation, as you can see. With a lock, when a transaction is writing data, other transactions cannot obtain the write lock and cannot write data, which ensures the isolation between transactions to a certain extent. If you add a write lock, why can other transactions also read data?

MVCC stated earlier that with locks, the current transaction cannot modify data without a write lock, but it can still read, and when read, even if the row of data has been modified and committed by other transactions, it can still read the same value repeatedly. That’s MVCC, Multi-version Concurrency Control.

Version of the chain

Innodb row records are stored in a format with some additional fields: DATA_TRX_ID and DATA_ROLL_PTR.

DATA_TRX_ID: indicates the version number of a data row. Identifies the transaction ID that made the most recent change to the row record.

DATA_ROLL_PTR: pointer to the rollback segment of the row. All the old versions on the row are organized in a linked list in undo log.

Undo log: Records the logs before data modification, which will be explained later.

ReadView is created at the beginning of each SQL entry and has several important properties:

Trx_ids: set of active (uncommitted) transaction version numbers for the current system.

Low_limit_id: Maximum transaction version number +1 when the current read view is created.

Up_limit_id: “system was in active transaction minimum version” when the current Read view was created

Creator_trx_id: transaction version number for creating the current Read View;

Began to query

Now we start the query, and a SELECT comes along and finds a row.

DATA_TRX_ID <up_limit_id: Indicates that data exists before the current transaction.

DATA_TRX_ID >= low_limit_id: Indicates that this data is generated after the current read View is created and is not displayed.

Undo log undo roll_ptr undo log undo roll_ptr undo log undo roll_ptr undo log

Up_limit_id <DATA_TRX_ID <low_limit_id: Depends on the isolation level.

RR level illusion

With locks and MVCC, transaction isolation is resolved. The default RR level resolves illusory reading. Phantom reading is usually for INSERTS, and non-repeatability is for updates.

Our expectation is that

id  name
1   A
2   B
Copy the code

Actually it is

id  name
1   B
2   B
Copy the code

In fact, the isolation level of repeatable reads in MySQL does not completely solve the illusion problem, but solve the illusion problem in the case of read data. However, the modification operation still has the illusion problem, that is, MVCC for the illusion solution is not complete.

Atomicity then atomicity. The undo log rollback was mentioned earlier. The isolation MVCC actually relies on it, as does atomicity. The key to achieving atomicity is the ability to undo all SQL statements that have been successfully executed when the transaction is rolled back.

When a transaction changes the database, InnoDB generates the corresponding undo log. If a transaction fails or a rollback is called, and the transaction needs to be rolled back, you can use the information in the undo log to rollback the data to where it was before the modification. Undo logs are logical logs that record information about SQL execution. When a rollback occurs, InnoDB does the reverse based on the contents of the undo log:

For each INSERT, delete is performed during the rollback; For each DELETE, the rollback will perform an INSERT; For each update, a reverse update is performed to change the data back. Take the update operation as an example. When a transaction performs an update, the generated Undo log contains the primary key of the row that was modified (to know which rows were modified), the columns that were modified, and the values of the columns before and after the changes. This information can be used to restore the data to the state before the update.

Persistence Innnodb has a number of logs, persistence is redo log.

The persistence of an SQL update statement must be related to Write. WAL is often mentioned in MySQL. WAL is short for write-Ahead Logging. Like a small shop business, there is a powder board, there is a ledger, visitors to write powder board first, such as not busy time to write ledger.

The redo log is the pink board. When a log is updated, InnoDB writes it to the redo log. When appropriate, this operation record is updated to disk, and this update is usually done when the system is idle, which is like the shopkeeper after closing time. The redo log has two characteristics: constant size, circular writing

Crash-safe has two phases for redo logs: commit and prepare. If you do not commit in two phases, the state of the database may be inconsistent with the state of the database recovered from its logs. All right, let’s stop there and look at another one.

Buffer Pool

InnoDB also provides a Buffer. The Buffer Pool contains the mapping of some data pages in the disk, which serves as a Buffer for accessing the database. When reading data, it is read from the Buffer Pool first, and if not, it is read from the disk and put into the Buffer Pool. When data is written to the database, it is first written to the Buffer Pool. The modified data in the Buffer Pool is periodically flushed to disks. The use of Buffer Pool greatly improves the efficiency of reading and writing data, but it also brings a new problem: if MySQL crashes and the modified data in the Buffer Pool is not flushed to disk, the data will be lost and the persistence of transactions cannot be guaranteed.

So I added a redo log. When data is modified, the redo log logs data in the Buffer Pool.

When a transaction commits, the fsync interface is called to flush the redo log.

If the MySQL database is down, you can read the redo log data during the restart to restore the database.

Redo log uses write-ahead logging (WAL). All changes are written to the log first and then updated to the Buffer Pool. This ensures that data will not be lost due to MySQL downtime and meets the persistence requirements. And there are two more to do that

Advantages: Dirty Page brushing is random. The redo log sequence I/O dirty Page brushing is performed in the unit of Page. While the redo log contains only the redo logs that actually need to be written, invalid I/OS are reduced.

There is also a binlog that is used to write data and restore data, what’s the difference?

Redolog is a physical log that records “what was changed on a data page”. For example, “Add 1 to the C field in the row where ID=2” is written: The redolog loop is written at many times, and the binlog is appended and written at transaction commit time

Update T set c=c+1 where ID=2; update T set c=c+1 where ID=2;

The executor first finds the engine and fetches the line ID=2. ID is the primary key, found directly by tree search. If the data page on the row ID = 2 is in memory, it is returned directly to the actuator; Otherwise, you need to read the data into the memory from the disk and then return to the memory. The executor takes the row of data given by the engine, adds 1, N+1, to get a new row of data, and then calls the engine interface to write the new row of data.

The engine updates the data to memory and logs the update to the redo log. The redo log is prepared. The executor is then told that the execution is complete and the transaction can be committed at any time. The executor generates a binlog of this operation and writes the binlog to disk. The executor calls the commit transaction interface of the engine, and the engine changes the redo log to the commit state

Why do YOU write redo logs first? Bin: binlog is lost after redo log. Bin after redo: one more transaction, 1 after recovery.

Consistency consistency is the ultimate goal of transaction pursuit. The atomicity, persistence and isolation mentioned in the preceding question are all to ensure the consistency of database state. Of course, the above is the guarantee at the database level, and the implementation of consistency also needs to be guaranteed at the application level. That is, your business, such as the purchase operation only deducts the balance of the user, not the inventory, certainly cannot guarantee the consistency of the state.

How to implement ACID in MySQL? Sometimes, when you know there are undo logs and redo logs but you may not know why, it becomes clearer when you know the purpose of the design.

reference

MVCC implementation principle MySQL transaction lock ACID implementation principle MySQL transaction