Properties of transactions (ACID)

A group of friends in the interview, encountered the interviewer asked a very difficult question:

Why Mysql uses Repeatable read as default isolation level?

Let’s get down to business:

We all know the properties of transactions: atomicity, consistency, isolation, and durability (ACID)

In order to maintain consistency and isolation, locking is generally used for processing. However, locking reduces concurrent processing capacity

The database is a highly concurrent application, so the handling of locking is the essence of transactions.

Now let’s look at the blocking protocol and what does a transaction do in the database

Locking Protocol

MySQL lock system: Shared and exclusive locks are shared and exclusive locks, also known as read locks (S) and write locks (X). Exclusive locks can be divided into row locks and table locks.

Locking Protocol: Rules for Locking data using X or S locks. For example, when to apply X or S lock, duration, when to release lock, etc.

Level one, level two, and level three lockdown protocols

Different rules for the blocking mode will form different blocking protocols, which provide different guarantees for the correctness of concurrent operations

Lockdown protocol

Level 1 lockdown protocol definition: transaction T must lock data R before modifying it (exclusive lock), until the end of the transaction. Transaction end includes normal end (COMMIT) and abnormal end (ROLLBACK).

A level 1 lockdown protocol prevents lost modifications and guarantees that transaction T is recoverable. The problem of lost modifications can be solved by using a level 1 lockdown protocol.Copy the code

In a tier 1 lockdown protocol, if only the data is read without modification, there is no need to lock, and it does not guarantee repeatable and unread “dirty” data.

Secondary lockdown protocol

Second level lockdown protocol definition: the first level lockdown protocol plus transaction T must add S lock (shared lock) to data R before reading it, and release S lock after reading. The locking and unlocking of transactions are strictly divided into two phases: the first phase is locked and the second phase is unlocked.

  • Lock phase: An S lock (shared lock, but not exclusive lock) is acquired before any read operation, and an X lock (exclusive lock, no other lock) is acquired before any write operation. If the lock fails, the transaction enters the wait state and does not continue until the lock succeeds.

  • Unlock phase: After a transaction has released a lock, the transaction enters the unlock phase. During this phase, the transaction can only be unlocked but cannot be locked.

In addition to preventing lost changes, the second-level lockdown protocol further prevents “dirty” data from being read. However, in a second-level lockdown protocol, the s-lock is released after reading the data, so it is not guaranteed to be repeatable.

The purpose of secondary lockdown is to ensure the correctness of concurrent scheduling. That is, if the transaction satisfies the two-segment locking protocol, the concurrent scheduling strategy of the transaction is serial. Ensure that concurrent scheduling of transactions is serialized (serialization is important, especially for data recovery and backup)

Tertiary lockdown protocol

Three-level lockdown protocol definition: the first level lockdown protocol plus transaction T must lock data R (shared lock) before it is released until the end of the transaction. An S lock is added on the basis of the first level lockdown protocol (first level lockdown protocol: X lock is added before modification, and the transaction is released after completion), and the S lock is released after the transaction ends

In addition to preventing the loss of modified and unread “dirty” data, the three-level lockdown protocol further prevents unrepeatable reads. The main differences in the above three-level protocols are what operations require a lockdown and when they are released.

There are four isolation levels for transactions

In database operations, transaction isolation levels are proposed to ensure the correctness of concurrent data reads. The lockdown protocols mentioned above also exist to build these isolation levels.

Isolation level Dirty Read NonRepeatable Read Phantom Read
Read uncommitted may may may
Read committed Can’t be may may
Repeatable read Do notcan Do notcan may
Serializable Can’t be Can’t be Can’t be

The problems associated with transactional concurrent access and the isolation levels were described in detail in my last article

A book makes sense of business

Why RR

In general DBMS systems, read-comrepeatable (RC) isolation level is used as the default isolation level by default, such as Oracle and SQL Server, while MySQL uses read-repeatable (RR) isolation level. Remember, the higher the isolation level, the more data consistency problems can be solved, the greater the theoretical performance loss, and the lower the concurrency. The isolation levels are SERIALIZABLE > RR > RC > RU

We can set and get the isolation level of the database by:

To view the isolation level of the system:

mysql> select @@global.tx_isolation isolation;
+-----------------+
| isolation       |
+-----------------+
| REPEATABLE-READ |
+-----------------+
1 row in set.1 warning (0.00 sec)
Copy the code

To view the isolation level of the current session:

mysql> select @@tx_isolation;
+----------------+
| @@tx_isolation |
+----------------+
| READ-COMMITTED |
+----------------+
1 row in set.1 warning (0.00 sec)
Copy the code

Set the isolation level of the session in ascending order:

set session transacton isolation level read uncommitted;
set session transacton isolation level read committed;
set session transacton isolation level repeatable read;
set session transacton isolation level serializable;
Copy the code

Set the isolation level of the current system in descending order:

set global transacton isolation level read uncommitted;
set global transacton isolation level read committed;
set global transacton isolation level repeatable read;
set global transacton isolation level serializable;
Copy the code

Repeated Read: Repeated Read. A LOCK based concurrency control DBMS requires read locks and write locks on selected objects until the end of a transaction, but does not require “range-locks.” Therefore, “phantom reads” can occur at this transaction level to ensure that data retrieved from the same transaction is consistent from start to finish. Is the default transaction level of Mysql.

So let’s think about two questions

  • What if unrepeatable reads occur at the Read Commited level? Need to be fixed?

Don’t solve it, this problem is acceptable! After all, your data has been submitted, read itself is not too big a problem! The default isolation level for Oracle and SqlServer is RC, and we have not changed its default isolation level.

  • Mysql > Select Read Commited (Read Commited) as the default isolation level. What about choosing Repeatable Read as the default isolation level?

Due to historical reasons, the default format of Binlog statements in Mysql(before version 5.1) is to record the SQL requests received by the system in sequence. 5.1 and thereafter, MySQL provides three types of Binlog format: Row,Mixed and Statement. When the Binlog format is statement, there will be a BUG when using RC isolation level, so MySQL uses Repeatable Read as the default isolation level!

Binlog profile

Mysql binlog is a binary log file used to record data updates or potential updates to the Mysql database (for example, when a DELETE statement is executed and no data is actually deleted). To view the specific event types of binlog, run the “show binlog events in ‘binlogfile'” statement. All operations recorded by binlog actually have corresponding event types

MySQL binlog can work in three different modes: Row (select Row if you need to use MySQL’s special functions, such as stored procedures, triggers, functions, and maximizes data) The modified data on each row is recorded in logs, and the same data is modified on the slave side. Advantages: Clearly records the details of each row of data modification. Disadvantages: Large amount of data

Statement (Default) Description: Each MODIFIED SQL Statement is recorded in the master’s bin-log. The SLAVE SQL process parses the modified SQL Statement into the same SQL Statement executed by the master. The statement mode is not recommended for master/slave synchronization because some statements are not supported. For example, some statements contain the UUID function and LOAD DATA IN FILE statements. The advantages of Row level are solved. Data changes in each Row do not need to be recorded, reducing the amount of bin-log logs, saving disk I/O, and improving the performance. Disadvantages: Inconsistency between primary and secondary replication is easy to occur

Mixed: Combines the advantages of Row level and Statement level, with a more complex binlog structure.

A binlog is a file that records database changes and is required for master/slave replication

Inconsistent master-slave operation

If the binlog format is STATEMENT and the isolation level is Read Commited **, what is the bug? Test table:

mysql> select * from test;
+----+------+------+
| id | name | age  |
+----+------+------+
|  1 | NULL | NULL |
|  2 | NULL | NULL |
|  3 | NULL | NULL |
|  4 | NULL | NULL |
|  5 | NULL | NULL |
|  6 | NULL | NULL |
+----+------+------+
6 rows in set (0.00 sec)

Copy the code
Session1 Session2
mysql> set tx_isolation = ‘read-committed’;
Query OK, 0 rows affected (0.00 SEC) mysql> set tx_isolation = ‘read-committed’;
Query OK, 0 rows affected (0.00 SEC)
begin;

Query OK, 0 rows affected (0.00 sec)
begin;

Query OK, 0 rows affected (0.00 sec)
delete from test where 1=1;
Query OK, 6 rows affected (0.00 sec)
insert into test values (null,’name’,100);
Query OK, 1 row affected (0.00 sec)
commit;
Query OK, 0 rows affected (0.01sec)
commit;
Query OK, 0 rows affected (0.01sec)

And the Master says

select * from test;
+----+------+------+
| id | name | age  |
+----+------+------+
|  7 | name |  100 |
+----+------+------+
1 row in set (0.00 sec)

Copy the code

However, you execute the statement on the slave at this point to get the output


mysql> select * from test;
Empty set (0.00 sec)

Copy the code

The order of execution on master is delete first, then insert! The binary log is cached before the transaction is submitted, and written to the record after the transaction is submitted. Therefore, the sequence is inserted first and then deleted. The slave synchronizes binglog, so the order of execution on the slave machine is different from that on the host! The slave deletes all data after insertion.

There are two solutions! (1) The isolation level is set to Repeatable Read, and clearance lock is introduced under this isolation level. When Session 1 executes the DELETE statement, the gap is locked. Ssession 2 will block the insert statement! (2) Change the format of binglog to row format, at this time is based on the row copy, naturally there will not be SQL execution order is not the same problem! However, this format was only introduced in mysql5.1. For historical reasons, mysql set the default isolation level to Repeatable Read to ensure that master/slave replication is not a problem!

RU and Serializable

** Read UnCommitted and Serializable ** isolation levels are not used in projects because:

Read UnCommitted

Dirty reads are allowed, that is, data that may be read from uncommitted transactions in other sessions from one transaction to another

Serialization (Serializable)

Use of pessimistic locking theory, simple implementation, more secure data, but very poor concurrency. This can be used if your business has very little or no concurrency and requires timely and reliable data. This isolation level is typically used when mysql is using its own distributed transaction functionality

RC and RR

There should be only one question left: should the isolation level be read committed or repeatable?

Next, compare the two levels in the first case:

In RR isolation level, there are gap locks, resulting in a much greater chance of deadlock than RC!

Implement a simple gap lock example

select * from test where id <11 ;
+----+------+------+
| id | name | age  |
+----+------+------+
|  1 | NULL | NULL |
|  2 | NULL | NULL |
|  3 | NULL | NULL |
|  4 | NULL | NULL |
|  5 | NULL | NULL |
|  6 | NULL | NULL |
|  7 | name |   7  |
+----+------+------+
7 rows in set (0.00 sec)
Copy the code
session1 session2
mysql> set tx_isolation = ‘repeatable-read’;
Query OK, 0 rows affected (0.00 SEC) mysql> set tx_isolation = ‘repeatable-read’;
Query OK, 0 rows affected (0.00 SEC)
Begin;
select * from test where id <11 for update;
insert into test values(null,’name’,9); // blocked!
commit;
Query OK, 0 rows affected (0.00 sec)
Query OK, 1 row affected (12.23 SEC) // The operation was completed after the lock was released

At RR isolation level, the gap (-∞, 10) can be locked to prevent other transactions from inserting data! At the RC isolation level, there are no gap locks and other transactions can insert data!

Ps: Deadlocks are not impossible at the RC isolation level, but less likely than RR

Lock table and lock row

In RR isolation level, a table is locked if a condition column does not match an index! At the RC isolation level, only rows are locked

select * from test;
+----+------+------+
| id | name | age  |
+----+------+------+
|  8 | name |   11 |
|  9 | name |    9 |
| 10 | name |   15 |
| 11 | name |   15 |
| 12 | name |   16 |
+----+------+------+
Copy the code

Example of a lock table:

session1 session2
Begin;
update test set age = age+1 where age = 15;
Rows matched: 2 Changed: 2 Warnings: 0
insert into test values(null,’test’,15);
ERROR 1205 (HY000): Lock wait timeout exceeded;
Commit;

Session2 insert failed

select * from test;
+----+------+------+
| id | name | age  |
+----+------+------+
|  8 | name |   11 |
|  9 | name |    9 |
| 10 | name |   16 |
| 11 | name |   16 |
| 12 | name |   16 |
+----+------+------+
Copy the code

Semi-consistent is the feature

At RC isolation levels, the semi-consistent feature increases the concurrency of UPDATE operations!

In 5.1.15, InnoDB introduced a concept called “semi-consistent” to reduce conflicts and lock waits when updating the same row. A semi-consistent read is an UPDATE statement that reads a row of locked records. InnoDB returns the most recently committed version of the record to determine if it meets the WHERE condition. If yes, retry the read operation, the latest version of the row will be read and locked!

advice

At the RC level, binlogs are in row format, which is row-based copy, and Innodb’s founders recommended binlogs in this format

For Internet projects, use the Read Commited isolation level

conclusion

Due to historical reasons, the statement format is used in the binlog of older versions of Mysql. If the RR isolation level is not used, the primary and secondary files are inconsistent

Currently (after version 5.1) we use binlog in row format with RC isolation level to achieve better concurrency performance.

Concern public number: Java baodian