What is a transaction?

A transaction is the mechanism by which a group of DML statements submitted to the database either all succeed or, if any operation fails, are all rolled back to the original state. The database ensures data integrity and consistency through transactions.
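As a minimal sketch (the accounts table and amounts are hypothetical), the classic transfer example shows this all-or-nothing behavior:

START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;  -- on any error, issue ROLLBACK instead and neither update takes effect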

ACID

A complete transaction must exhibit four characteristics: atomicity, consistency, isolation, and durability.

  • Atomicity: The transaction executes as a whole; the operations on the database it contains are either all performed or none are.
  • Consistency: A transaction moves the database from one consistent state to another. A consistent state is one in which the data satisfies all integrity constraints.
  • Isolation: When multiple transactions execute concurrently, the execution of one transaction does not affect the execution of the others.
  • Durability: Changes made to the database by a committed transaction are stored permanently.

Problems with concurrent transactions

In a multi-threaded environment, multiple threads may modify the same data at the same time, leading to conflicting operations. According to concurrency control theory, there are two ways to resolve such conflicts:

  • Avoid conflicts, for example through serialization or locking.
  • Allow conflicts by keeping multiple versions of the data, for example with MVCC.

Concurrent transactions also raise the question of isolation: when multiple transactions execute at the same time, can the data changes made by one transaction be observed by another? Database theory defines a set of terms to describe the anomalies that can occur:

  • Dirty read: One transaction reads uncommitted data from another transaction, which often leads to inconsistent data.
  • Non-repeatable read: The same row is read twice within one transaction and returns different data.
  • Phantom read: Within one transaction, the same statistical or aggregate query returns inconsistent results, because another transaction has inserted or deleted matching rows in the meantime.

To address these problems, databases define transaction isolation levels. As the table below shows, the stronger the consistency guarantee, the lower the achievable concurrency.

Isolation level      Dirty read      Non-repeatable read      Phantom read
Read Uncommitted     Possible        Possible                 Possible
Read Committed       Not possible    Possible                 Possible
Repeatable Read      Not possible    Not possible             Possible
Serializable         Not possible    Not possible             Not possible
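As a hedged illustration (the accounts table is hypothetical), the following two sessions produce a non-repeatable read under Read Committed; under Repeatable Read, the second SELECT in session A would still return the original value:

-- Session A
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;  -- returns, say, 100

-- Session B
UPDATE accounts SET balance = 50 WHERE id = 1;  -- autocommits

-- Session A, still in the same transaction
SELECT balance FROM accounts WHERE id = 1;  -- now returns 50: a non-repeatable read
COMMIT;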

InnoDB transactions

In the computer world, data consistency and crash recovery are usually achieved through WAL (Write-Ahead Logging): when a user changes data in the database, the log must be persisted to disk before the data itself. In the InnoDB engine, data consistency and integrity are guaranteed by two logs, the redo log and the binlog (the redo log is written while the transaction executes, and the binlog is written when the transaction commits). If a crash occurs while data pages are being flushed, MySQL can restore the data from the logs on restart, ensuring consistency. In MySQL, the log files are the most important data; the data files are comparatively less so.
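For reference, two server settings control when each log is forced to disk; the values below are the strictest choices (and the defaults in recent MySQL versions):

-- Flush and fsync the redo log at every transaction commit.
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
-- fsync the binlog after every committed transaction.
SET GLOBAL sync_binlog = 1;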

MVCC mechanism

The InnoDB engine is a multi-version storage engine that implements transactions by preserving older versions of data rows.

Snapshot read and current read

In MVCC, data can be read in two forms:

  • Snapshot read: reads the version visible to the current transaction without taking locks. Plain SELECT statements use snapshot reads, which greatly improves query performance.
  • Current read: reads the latest version and locks it. INSERT, UPDATE, DELETE, and locking reads such as SELECT ... FOR UPDATE use the current read, as the sketch below shows.
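A minimal sketch of both read types within one transaction (the accounts table is hypothetical):

START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;                -- snapshot read: no locks taken
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;     -- current read: locks the row
UPDATE accounts SET balance = balance - 10 WHERE id = 1;  -- current read as well
COMMIT;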

Undo Log

Older versions of data rows are kept as undo logs and are stored in rollback segments of the tablespace. The undo log typically serves two purposes:

  • Rollback
  • Consistent reads

The undo log is a logical log that records the state of a row before the current transaction modified it. Undo logs are divided into insert undo logs and update undo logs.

  • Insert undo log: generated when a row is inserted and used only for rollback. Because the row is brand new, no earlier version exists; there are no multiple versions to maintain, so the log can be discarded as soon as the transaction commits.
  • Update undo log: needed not only for rollback but also for consistent reads, because other transactions may still need the old version of the row. These logs can be discarded only when no transaction depends on them any longer. Long-running transactions therefore deserve attention: as long as a transaction stays open, its corresponding undo logs cannot be purged, and under high throughput they can occupy a large amount of tablespace.
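One practical consequence: long-running transactions that pin undo logs can be spotted through the information_schema.INNODB_TRX view, for example:

SELECT trx_id, trx_started, trx_mysql_thread_id
FROM information_schema.INNODB_TRX
ORDER BY trx_started;  -- oldest open transactions first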

Data row format

The InnoDB engine implicitly adds the following three hidden fields to every data row and implements transactions through their interplay. The row format is as follows:

  • DB_ROW_ID: a 6-byte row ID that uniquely identifies the row. If no primary key is specified, InnoDB automatically generates this row ID and uses it as the primary key.
  • DB_TRX_ID: a 6-byte transaction ID that records the last transaction that inserted or updated the row.
  • DB_ROLL_PTR: a 7-byte rollback pointer that points to the row's undo log record in the rollback segment.

For details on how MVCC is implemented, see “Do you really understand MVCC? How about doing it by hand?”, which walks through the process in great detail.

The Delete process

In InnoDB, DELETE is actually handled as an update. When a row needs to be deleted, InnoDB first sets the delete-mark bit on the row and on its secondary index records in memory. Later, once the row’s undo log is no longer needed, the purge thread physically removes the record.

Similarly, InnoDB’s “physical” deletion is itself a logical process: it merely marks the record as deleted and nothing more. Space occupied by delete-marked records cannot be reused immediately; it is only overwritten when a later insert happens to land there. So after deleting all rows with DELETE, the .ibd file does not shrink. This phenomenon, in which index space is reusable but not actually reclaimed, is usually called index holes. To reclaim the disk space, rebuild the table with the following command:

ALTER TABLE table_name ENGINE = InnoDB;
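For InnoDB tables, OPTIMIZE TABLE has the same effect, since MySQL maps it to a table rebuild followed by a statistics refresh:

OPTIMIZE TABLE table_name;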

Clustered indexes differ from secondary indexes

We know that InnoDB primary keys use a clustered index. In a clustered index, the leaf nodes contain all the fields of the row, so when the data changes the row can be updated in place.

For secondary indexes, things work a little differently. When a secondary index column is updated, InnoDB first delete-marks the old index record and then inserts a new one; the delete-marked record is later removed by the purge thread. As a consequence, if a secondary index record has been updated or delete-marked by an in-flight transaction, the covering-index optimization cannot be used and the query must go back to the clustered index to fetch the row.
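A quick way to check whether a query is covered by a secondary index (the users table and index name are hypothetical):

CREATE INDEX idx_name ON users (name);
EXPLAIN SELECT name FROM users WHERE name = 'alice';
-- "Using index" in the Extra column means the secondary index alone satisfies
-- the query; if it disappears, the query is going back to the clustered index.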

References

  • InnoDB and the ACID Model
  • How does MVCC (Multi-Version Concurrency Control) work
  • MySQL · Engine features · InnoDB Undo log roaming
  • Do you really understand MVCC? How about doing it by hand?