Author’s words: this is my first time to write a blog, originally just want to be used as their own memo, but write to write to find that write a blog is not so simple, is not familiar with this operation, I thought I clicked save later will save the draft box, but the fact is directly released. It had been seen in this interval and mocked mercilessly (sad…) .

background

Because in ready to change my job recently, so start adding some basic knowledge, I always before going to hard back some knowledge points, this time spent a lot of time to ask ask why, years ago about Mysql content has a little bit of experience, in order not to forget, but also to only speak out is one of the really understand, through this blog himself in review.

Four elements of a transaction

  1. Atomicity: A transaction is an atomic unit of operation in which all or none of the modifications to data are performed
  2. Consistency: Database integrity constraints are not broken before and after a transaction.
  3. Isolation: Only one transaction is allowed to request the same data at a time, with no interference between different transactions.
  4. Persistence: After the transaction completes, all updates made by the transaction to the database are saved to the database and cannot be rolled back.

Problems with different isolation levels of transactions

Transaction isolation level Dirty read Unrepeatable read Phantom read
Read uncommitted is is is
Reading has been submitted no is is
Repeatable read no no is
serialization no no no

The difference between illusory and unrepeatable

As for the four isolation levels of transactions in Mysql InnoDB storage engine, I will not elaborate on the specific content here. Here I will mainly say the word illusion, before I have not quite understood this word, now I try to explain it. In High Performance Mysql, illusory reads are defined as follows:

Phantom reading refers to that when a transaction reads a record in a range, another transaction inserts a new record in the range. When the previous transaction reads the record in the range again, phantom rows will be generated.

According to the definition, the meaning of this expression is not very different from that of non-repeatable read. So what is the difference between them? Simply put, it can be understood as: non-repeatable read refers to multiple reads with inconsistent data content, while unreal read refers to: multiple reads with inconsistent data amount.

Phantom read can be understood as the result set of a certain select cannot support the next operation (the operation here mainly refers to insert). More specifically, in the current transaction, select a record is found not to exist, then perform insert operation, indicating that the record already exists, and select again to find the record still does not exist. For the above phenomenon, I think it is a hallucination. This is my understanding of the concept of illusory reading.

When does phantasm occur

According to the four isolation levels of transaction, read uncommitted, read committed, repeatable read and serialization, phantom can not be avoided at the first three levels, but in reality, different databases can avoid phantom at the repeatable read (RR) level. For Mysql is the use of MVCC+ next-key implementation to avoid phantom reading. So what is MVCC and Next-key? Let’s say it one by one:

MVCC

Concurrency Control is called Multiversion Concurrency Control. The concurrency Control is designed to address the problem of repeating a transaction and ensuring that all the data read to the same transaction is the same. Unreal unreal, especially read, in MVCC there are two types of read: snapshot read and current read,

  1. Snapshot read: Reads a version of data
  2. Current read: Reads the latest data from the current database

Select reads snapshot reads. Insert, Update, and Delete are current reads. In simple terms, the database engine will implicitly add two fields to each record, like create time and delete time, but instead of storing the timestamp, it will store some version of the system (i.e. transaction ID).

Although the Select operation reads a snapshot, it can be understood that each read is a historical version of the read, so it appears to solve the phantom problem, but MVCC alone can solve the phantom problem? Come to think of it, this is not the case. Let’s assume that only MVCC is currently used to avoid illusions, such as the following case:

Current data;

id name age
1 John 13
2 Mike 14
3 Bob 12

Two transactions arrive

Transactions/Steps Transaction 1 Transaction 2
1 begin;
2 Select * from table; The begin.
3 Insert into table (name.age) values(‘Bruce’, 15);
4 commit;
5 Select * from table;
6 update table set age = 16 where id = 4;
7 Select * from table;

The isolation level of the database is repeatable read. We can see that transaction 1 has the same number of select in step 2 and step 5, so we can not see the data committed by transaction 2. After the update is complete, the select operation is performed again. It is found that you can see the data submitted by transaction 2, and it is the updated version after transaction 1.

Note: the scene described above is simulated after understanding the realization principle of MVCC, and has not been tested in practice.

Although the above scenario is not tested, it is theoretically possible, so the conclusion is that MVCC alone cannot solve the illusion problem.

Next-Key Lock

MVCC alone cannot solve the illusion problem, so Mysql introduces the concept of next-key Lock. What is next-key Lock? In fact, it is not a separate Lock, it is made up of Record Lock and Gap Lock.

  1. Record Lock: Lock the Mysql index Record
  2. Gap Lock: a Gap Lock

As we all know, in the Innodb engine, is the primary key index clustering index (an index contains all the data), average index is a clustering index, so under the Innodb engine, if by ordinary index query data you need to query two index (first by ordinary index to the primary key index, again by the primary key index to data) to get the data.

For Record Lock, Innodb is in the index is row level Lock, he locked in fact is the index, Lock index is equivalent to Lock this data.

Current database data;

id name age card
1 John 13 2
4 Mike 14 5
8 Bob 12 7
13 Bob 12 16
20 Bob 12 19

If id is the primary key index and card is the common index, when a transaction changes the value of card=7, the Gap lock range is (5,7),(7,16). In other words, no other transaction is allowed to insert, change, or delete values in the upper two ranges of card. At the same time, more importantly, we said, ordinary index is a clustering index, he have to go through the primary key index search to find the real data, so the clearance of the lock to lock up more than ordinary index, he will also need to at the same time corresponding to the scope of the primary key index interval and clearance lock, the lock the primary key id in (4, 8], (the two interval 8, 13]. In other words, the primary key id in this interval is not allowed to insert, update, delete data.

In this way, record-lock locks the required values, and Gap locks the Gap between the index and the primary key, so that other transactions can not modify the corresponding data, with MVCC in RR (repeatable read) level to solve the phantom operation.

Mysql logs

There are three types of Mysql logs that we need to care about: binlog, redolog, undolog.

  1. Binlog: In Mysql, the Server layer records write operations to the database in the form of addend. You can set the maximum value by using max_binlog_size. It is mainly used for primary/secondary replication and data recovery.

Brushing time:

  • 0: not mandatory. The value is determined by the OS
  • 1: flush to disk immediately after each commit
  • N: Flushed to disk immediately after N transactions
  1. Redolog: Log generated by InnoDB engine transaction operations. There are two parts: log buffer in memory and log file on disk.

Write-ahead Log (WAL): writes logs for each transaction before flushing them to the disk.

Redo log flush time:

  • 0: lazy write, lazy flush. Logs are now written to the redo log buffer. Logs are written to the OS buffer every 1 second and are flushed to disk by calling fsync().
  • 1: Real-time write, real-time flush. Every time a transaction commits, the log is immediately written to the OS buffer through the redo log buffer and flushed to disk by calling fsync().
  • 2: Real-time write, delayed flush, write to THE OS buffer through the redo log buffer immediately upon transaction commit, and the operating system calls fsync() every second to flush to disk.

Record format: A fixed size circular record is adopted. Two Pointers, one to record the current log position and the other to record the data to the disk. When the specified size is reached, the pointer to record the current log position points to the beginning of the file.

  1. Undo log: InnoDB to ensure atomicity of transactions, used to do rollback operations. Each Insert statement in the transaction corresponds to an Undo log DELETE statement, and each Update statement corresponds to an opposite Update statement. This is also the key to MVCC.

SQL optimization

  1. The leftmost prefix matches
  2. Paging query cloud.tencent.com/developer/a…
  3. Reduce back table queries: create joint indexes with primary keys

References:

Segmentfault.com/a/119000001…

Juejin. Cn/post / 684490…

www.jianshu.com/p/cef49aeff…

Segmentfault.com/a/119000002…