Mysql Essay – Database transaction isolation

Referring to transactions we are certainly not unfamiliar, and the database will always use transactions. The classic example is transferring money, let’s say you have 100 yuan in your account. The balance will be queried and updated in the transfer process, but these operations must be guaranteed to be integrated. Otherwise, after the program is checked and before subtraction, you can check the 100 yuan again by taking advantage of the time difference, and then transfer to another friend. If the bank is so whole, it will be a mess, won’t it? This is where the concept of “transactions” comes in.

In simple terms, transactions are about ensuring that a set of database operations are ‘on the line’. It’s all or nothing. In Mysql, transaction support is at the engine level. But not all engines support transactions, Mysql’s native MyISAM engine does not support transactions, which is one of the reasons MyISAM was banned by Innodb, so this article will use Innodb as an example.

Isolation and isolation level

This concept sounds confusing to beginners, at least when I first started working with databases. When it comes to transactions, you think of ACID (Atomicity, Consistency, Isolation, Durability) and they offer “I” or “Isolation”. When multiple transactions are executed on the database at the same time, dirty reads, non-repeatable reads and phantom reads may occur. In order to solve these problems, the concept of “isolation level” is developed. Before talking about isolation levels, you should first know that there is no solution in the world without side effects. In other words, the higher the isolation level, the higher the security level, the lower the efficiency, the larger the space, etc. Mysql is no exception. There is a balance to be struck, but in most companies the DBA does the database isolation level setting for you. But as developers, it is important for us to understand and master this knowledge for screening and interviewing.

The transaction isolation levels of THE SQL standard include read uncommitted, Read Committed, Repeatable Read, and serializable. Let me explain one by one:

Read uncommitted means that changes made by a transaction before it is committed can be seen by other transactions, resulting in dirty reads.

Read Committed means that after a transaction commits, its changes are not seen by other transactions. Notice change, it’s important to understand what change is, and I’ll talk about that later.

Repeatable read means that the data seen during the execution of a transaction is always the same as the data seen at the start of the transaction. Of course, at the repeatable read isolation level, uncommitted changes are also invisible to other transactions. Is that a human word? I don’t understand.

Serialization, as the name implies, is that for the same row, “write” will be added “write lock”, “read” will be added “read lock”. When a read-write lock conflict occurs, the last accessed transaction must wait for the previous transaction to complete before execution can continue.

“Read committed” and “repeatable read” are more difficult to understand, so I use an example to illustrate these isolation levels. Given that there is only one column in table T and one row has the value 1, the following is the behavior of executing two transactions in chronological order.

mysql> create table T(c int) engine=InnoDB;
insert into T(c) values(1);
Copy the code
A transaction Transaction B
Start transaction query until value 1 Starting the transaction
The query returns the value 1
Change the 1 to the 2
The query results in the value v1
Commit transaction B
Query v2
Commit transaction A
The query results in the value v3

Let’s take A look at the different returns of transaction A at different isolation levels, i.e., the return values of V1, V2, and V3 in the figure.

If the isolation level is read uncommitted, the value of V1 is 2. Transaction B has not committed yet, but the result has already been seen by A. So V2 and V3 are also 2.

If the isolation level is read Committed, V1 is 1 and V2 has a value of 2. Transaction B’s updates are not seen by A until they are committed. So V3 is also going to be 2.

If the isolation level is Repeatable, V1, V2 are 1, and V3 are 2. The reason why V2 is still 1 is that the transaction must see consistent data during execution. The difference between a repeatable read and a committed read is that a committed read is an UPDATE operation that is visible to other transactions, whereas a repeatable read is an inssert operation that is visible to other transactions, and an UPDATE operation that is not.

If the isolation level is serialized, it will be locked while transaction B performs “change 1 to 2”. Transaction B cannot continue execution until transaction A commits. So from A’s point of view, V1 and V2 are equal to 1, and V3 is equal to 2.

In implementation, a view is created in the database and accessed based on the logical result of the view. At the Repeatable Read isolation level, this view is created when the transaction is started and is used for the entire life of the transaction. At the Read Commit isolation level, this view is created at the start of each SQL statement execution. It is important to note here that the latest value on the record is returned directly at the read Uncommitted isolation level, with no concept of view; The serialization isolation level uses locking directly to avoid parallel access.

We can see that the database behaves differently at different isolation levels. The default isolation level of Oracle database is “read Commit”, so for some applications that migrate from Oracle to MySQL, you must remember to set the isolation level of MySQL to “read Commit” to ensure that the database isolation level is consistent.

This can be configured by setting the value of the startup parameter transaction-ISOLATION to read-committed. You can use Show variables to see the current values.

mysql> show variables like 'transaction_isolation';
 +-----------------------+----------------+
 | Variable_name | Value |
 +-----------------------+----------------+
 | transaction_isolation | READ-COMMITTED |
 +-----------------------+----------------+
Copy the code

In summary, existence is logic, and each isolation level has its own usage scenario, depending on your business situation. I think you might ask, when do you need a “repeatable” scenario? Let’s look at an example of data proofreading logic. Suppose you are managing a personal bank account. One holds the balance at the end of each month and the other holds the statement details. At this time, you need to do data proofreading, that is, judge whether the difference between the balance of last month and the current balance is consistent with the statement details of this month. You want to be able to proofread without affecting your results if a user makes a new transaction.

This is where the repeatable read isolation level comes in handy. The view when a transaction is started can be considered static and not affected by updates to other transactions. See? It’s an update.

The implementation of transaction isolation, which I haven’t quite figured out yet, is very complex and abstract. And I’ll talk a little bit more about how this thing works in one video.

How a transaction is started

As mentioned earlier, long transactions have these potential risks, which I certainly recommend you avoid. In fact, many times business development students do not intentionally use long transactions, usually due to misuse. MySQL transactions can be started in the following ways:

  1. Explicitly start a transaction statement, begin or Start transaction. The corresponding commit statements are COMMIT and rollback statements are rollback.
  2. Set autocommit=0, this command will turn off autocommit for this thread. This means that if you only execute a SELECT statement, the transaction is started and will not commit automatically. The transaction persists until you actively execute a COMMIT or ROLLBACK statement, or disconnect.

Some client connection frameworks will execute a set autocommit=0 command by default after a successful connection. This results in subsequent queries being in a transaction, which results in an unexpectedly long transaction if the connection is long.