Our starting point is the implementation principle of MySQL transactions.

What are the techniques and ideas used to implement transactions?

Atomicity: the undo log, used for rollback.
Persistence (durability): the redo log, used for crash recovery.
Isolation: locks and MVCC, which separate reads from writes so that reads can run in parallel.
Consistency: achieved through rollback, recovery, and isolation working together in a concurrent environment.

What is MySQL?

MySQL is a relational database management system originally developed by the Swedish company MySQL AB and now owned by Oracle. It is one of the most popular relational database management systems and one of the best RDBMS choices for web applications. As a relational database, MySQL keeps data in separate tables rather than putting everything into one large warehouse, which increases speed and flexibility. The SQL language used by MySQL is the most commonly used standardized language for accessing databases. MySQL is released under a dual licensing policy, with a community edition and a commercial edition. Thanks to its small size, speed, low total cost of ownership, and above all its open source nature, MySQL is commonly chosen as the database for small and medium-sized websites.

Atomicity

Atomicity means a transaction is indivisible: either all of its operations succeed and are committed, or they all fail and are rolled back. In the classic transfer scenario, where A transfers 100 to B, wrapping the transfer in a transaction guarantees there will never be a state in which deducting 100 from A succeeded but adding 100 to B failed. MySQL implements atomicity through the undo log: before an SQL statement modifies data, the corresponding undo record is written, and if the statement or the transaction fails, the modifications are rolled back based on those undo records.
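As a minimal sketch of this guarantee (the accounts table and its two rows are hypothetical, used only for illustration), wrapping both updates of a transfer in one transaction makes them commit or roll back as a unit:

-- Hypothetical schema, for illustration only
CREATE TABLE accounts (
    id      INT PRIMARY KEY,
    balance DECIMAL(10, 2) NOT NULL
) ENGINE = InnoDB;

START TRANSACTION;

-- Deduct 100 from A
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Add 100 to B
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- If both statements succeeded, make the changes permanent
COMMIT;
-- If anything went wrong instead, ROLLBACK would undo both updates

If the second UPDATE fails, or the client issues ROLLBACK, InnoDB reverts the first UPDATE as well, so A is never debited without B being credited.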

Implementation principle: Undo log

Before explaining the atomicity principle, let’s first introduce MySQL transaction logging. There are many types of MySQL logs, such as binary logs, error logs, query logs, and slow query logs.

In addition, the InnoDB storage engine provides two types of transaction logs:

Redo log: used to ensure transaction persistence.
Undo log: the basis for transaction atomicity and isolation.

Let’s go back to undo log. The key to achieving atomicity is the ability to undo all SQL statements that have been successfully executed when the transaction is rolled back.

InnoDB implements rollback using undo log:

When a transaction modifies the database, InnoDB generates the corresponding undo log. If the transaction fails or ROLLBACK is called, the information in the undo log can be used to roll the data back to its state before the modification. Undo logs are logical logs that record information about SQL execution. When a rollback occurs, InnoDB performs the reverse operation based on the contents of the undo log:

For each INSERT, a DELETE is performed during the rollback.
For each DELETE, an INSERT is performed during the rollback.
For each UPDATE, a reverse UPDATE is performed to change the data back.

Take the UPDATE operation as an example. When a transaction performs an update, the generated undo log contains the primary key of the modified row (so InnoDB knows which row was changed), the columns that were modified, and the values of those columns before and after the change. This information can be used to restore the data to its state before the update.
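As a rough illustration (this is not InnoDB's actual undo record format, just the logical effect), rolling back an UPDATE on the hypothetical accounts table behaves as if the inverse statement were applied:

START TRANSACTION;

-- Suppose the balance of row id = 1 is currently 500.
-- Conceptually, the undo log now records: primary key = 1,
-- modified column = balance, old value = 500, new value = 400.
UPDATE accounts SET balance = 400 WHERE id = 1;

-- Rolling back is logically equivalent to applying the reverse update,
-- i.e. UPDATE accounts SET balance = 500 WHERE id = 1;
ROLLBACK;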

Persistence

Definition

Persistence means that once a transaction is committed, its changes to the database should be permanent. Subsequent operations or failures should not affect it in any way.

Implementation principle: Redo log

Redo log and undo log are InnoDB transaction logs. Let’s talk about why redo logs exist.

As the storage engine of MySQL, InnoDB stores data on disk. However, if every read and write required disk I/O, efficiency would be very low.

To address this, InnoDB provides a Buffer Pool, which caches copies of some of the data pages on disk in memory and acts as a buffer for accessing the database:

When data is read, it is read from the Buffer Pool first; if the page is not in the Buffer Pool, it is read from disk and placed into the Buffer Pool.
When data is written, it is first written to the Buffer Pool, and the modified pages in the Buffer Pool are periodically flushed to disk (a process called flushing).

The Buffer Pool greatly improves the efficiency of reading and writing data, but it also introduces a new problem: if MySQL crashes while modified data in the Buffer Pool has not yet been flushed to disk, that data is lost and the persistence of transactions cannot be guaranteed.
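You can observe this mechanism directly: the Buffer Pool size and the number of dirty pages (pages modified in memory but not yet flushed to disk) are exposed through standard InnoDB variables and status counters, for example:

-- Size of the Buffer Pool in bytes
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Pages modified in memory but not yet flushed to disk
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';

A non-zero dirty-page count represents exactly the data that would be lost if the server crashed before flushing.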

The redo log was introduced to solve this problem: when data is modified, a record is also written to the redo log in addition to modifying the copy in the Buffer Pool. When a transaction commits, fsync is called to flush the redo log to disk.

If the MySQL database is down, you can read the redo log data during the restart to restore the database.

Redo log uses write-ahead logging (WAL). All changes are written to the log first and then updated to the Buffer Pool. This ensures that data will not be lost due to MySQL downtime and meets the persistence requirements.
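How aggressively the redo log is flushed at commit time is controlled by the InnoDB setting innodb_flush_log_at_trx_commit (a real MySQL variable; the sketch below only shows how to inspect and set it, with 1 being the default and the only fully durable choice):

-- 1 (default): write and fsync the redo log on every commit (full durability)
-- 2: write on commit, fsync about once per second (an OS crash may lose ~1s of transactions)
-- 0: write and fsync about once per second (a mysqld crash may lose ~1s of transactions)
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Changing it requires the appropriate privileges and trades durability for throughput
SET GLOBAL innodb_flush_log_at_trx_commit = 1;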

Since the redo log also needs to write the log to disk at transaction commit time, why is it faster than writing the modified data in the Buffer Pool directly to disk (i.e., flushing)?

There are two main reasons:

Writing the redo log is an append operation, i.e. sequential I/O, whereas flushing data pages is random I/O.
The default InnoDB page size is 16 KB, and every small change to a page forces the whole page to be written. The redo log records only the parts that actually changed, which greatly reduces the amount of wasted I/O.

Transaction isolation

MVCC

MVCC stands for Multi-Version Concurrency Control. MVCC enables InnoDB to perform consistent reads at certain transaction isolation levels.

The basic idea is that a query that touches rows being updated by another transaction sees their values as they were before the update. This is a powerful technique for increasing concurrency, because queries do not have to wait for another transaction to release its locks.

MVCC adds three hidden fields to each row: DB_TRX_ID, DB_ROLL_PTR, and DB_ROW_ID.

Insert, delete, update, and select

In InnoDB, MVCC is implemented by adding two hidden fields to each row, one to record when the row was created and the other to record when the row expired.

In practice, it is not a timestamp that is stored but a transaction version number, which increases with each newly started transaction. The role of the version numbers in insert, delete, update, and select is as follows (a short two-session example follows the list):

Select: reads records whose creation version number is less than or equal to the current transaction version number, and whose deletion version number is either empty or greater than the current transaction version number. This ensures that the record existed, and had not been deleted, from the transaction's point of view at read time.

Insert: saves the current transaction version number as the creation version number of the new row.

Update: inserts a new row with the current transaction version number as its creation version number, and sets the deletion version number of the original row to the current transaction version number.

Delete: saves the current transaction version number as the deletion version number of the row.
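Here is the two-session sketch promised above, run at InnoDB's default REPEATABLE READ level against the hypothetical accounts table. The first transaction keeps seeing the version of the row that existed when it first read it, even after another transaction commits a change:

-- Session A
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;   -- returns 500

-- Session B, running concurrently
START TRANSACTION;
UPDATE accounts SET balance = 400 WHERE id = 1;
COMMIT;

-- Session A, still inside the same transaction
SELECT balance FROM accounts WHERE id = 1;   -- still returns 500 (snapshot read)
COMMIT;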

Snapshot read and current read

Snapshot read: Reads the snapshot version, that is, the historical version

Current read: reads the latest version

UPDATE, DELETE, INSERT, SELECT ... LOCK IN SHARE MODE, and SELECT ... FOR UPDATE are all current reads.
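Continuing the two-session sketch above: inside the same transaction, a plain SELECT is a snapshot read while SELECT ... FOR UPDATE is a current read, so after another transaction has committed a change they can return different values:

-- Session A, after Session B has committed balance = 400
SELECT balance FROM accounts WHERE id = 1;              -- snapshot read: 500
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;   -- current read: 400 (also locks the row)
COMMIT;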

Consistency

Consistency is a constraint on data visibility: it ensures that the intermediate states produced by the multiple operations within a transaction are not visible to other transactions, because these intermediate, transitional states are inconsistent with the transaction's start and end states.

Let’s look at this with a concrete example: two concurrent transactions that each query an account balance and, if it is positive, withdraw cash.

First, let’s introduce the concept of a transaction unit: a transaction unit is the smallest unit of work that completes a specific piece of business. The example above contains two transaction units. On a normal timeline, the two units execute one after the other, which avoids any conflict. But when the two transaction units run in parallel, their operations interleave and conflicts occur. To avoid conflicts, access to the shared resource must be made mutually exclusive, which can be described with the following Java code:

import java.util.concurrent.locks.ReentrantLock;

public class Consistency {

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        try {
            int balance = query();
            if (balance > 0) {
                drawingOutCash();
            }
        } catch (Exception e) {
            // handle the failed withdrawal
        } finally {
            lock.unlock();
        }
    }

    // Stubs used by the example; in a real system these would access the database.
    private static int query() { return 100; }
    private static void drawingOutCash() { }
}

To be strongly consistent, all transaction units must execute serially, which corresponds to the SERIALIZABLE transaction isolation level, and this is what leads us to the topic of transaction isolation levels.

The isolation level of the transaction

To be strongly consistent, all transaction units must execute serially, which is the SERIALIZABLE isolation level. But the performance of a fully serialized system is so poor that it is almost unusable, so the locking requirements have to be relaxed, and the other isolation levels emerged. A transaction isolation level is a compromise of consistency for the sake of performance: it exists to break consistency, not to maintain it. There are only four possible relationships between transaction units: read-read, read-write, write-read, and write-write.

Serializable

All transaction units execute serially under mutual exclusion, so there is no parallelism at all. To further improve performance, read/write locks were introduced, which gives two more isolation levels: REPEATABLE_READ (repeatable read) and READ_COMMITTED (read committed).

Repeatable read

A read lock cannot be upgraded to a write lock: while a shared resource is being read, writes cannot get in, so repeated reads within a transaction return the same data and read-read operations run in parallel. Phantom reads are still possible, because at this level the table as a whole is not treated as a shared resource, so other transactions can still insert new rows.

Read committed

A read lock can be upgraded to a write lock: while a shared resource is being read, a write request can still get in by upgrading to a write lock. In this way reads and writes run in parallel, which leads to non-repeatable reads, phantom reads, and so on.

Read uncommitted

Only write locks are taken; reads acquire no lock at all. Read-read, read-write, and write-read can therefore all run in parallel; only write-write cannot, so writes are still serialized. This allows dirty reads, non-repeatable reads, phantom reads, and so on.

This is the essence of transaction isolation levels: the lower the isolation level, the better the parallelism and the weaker the consistency.
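In MySQL the isolation level can be inspected and changed per session or globally; a minimal sketch (the @@transaction_isolation variable applies to MySQL 8.0, older versions use @@tx_isolation):

-- Check the current isolation level (MySQL 8.0)
SELECT @@transaction_isolation;

-- Change it for the current session
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Or only for the next transaction
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

InnoDB's default is REPEATABLE READ, which is the trade-off point described above.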