My original blog address: blog.csdn.net/weixin_3931…

Overview

Our database will generally execute multiple transactions concurrently, and those transactions may concurrently insert, delete, update, and query the same batch of data, which can lead to dirty writes, dirty reads, non-repeatable reads, and phantom reads. To solve these problems, the database provides a transaction isolation mechanism, a locking mechanism, and MVCC (multi-version concurrency control), and uses this set of mechanisms together to handle multi-transaction concurrency. In the following sections, we’ll take a closer look at these mechanisms to give you a thorough understanding of how the database works internally.

Transactions

A transaction is a logical processing unit consisting of a set of SQL statements, and it has the following four properties, often referred to simply as the ACID properties:

  • Atomicity: a transaction is an atomic unit of operation; either all of its modifications to the data are performed, or none of them are.
  • Consistency: data must be in a consistent state both when a transaction begins and when it completes. This means all relevant data rules must be applied to the transaction’s modifications to preserve data integrity.
  • Isolation: the database system provides isolation mechanisms to ensure that a transaction executes in a “separate” environment unaffected by external concurrent operations. Intermediate states during transaction processing are invisible to the outside world, and vice versa.
  • Durability: changes to the data after a transaction finishes are permanent and survive even system failures.
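
As a minimal sketch of these properties in action, here is a transfer against the account table created later in this article (the amounts are illustrative); the two updates either both take effect or neither does:

begin;
update account set balance = balance - 50 where name = 'lilei';
update account set balance = balance + 50 where name = 'hanmei';
commit;    -- both updates become durable together
-- on any error before the commit, run: rollback;  (then neither update survives)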

Problems caused by concurrent transactions

  • Lost updates: when two or more transactions select the same row and then update it based on the value originally selected, the Lost Update problem occurs. Because each transaction is unaware of the other transactions, the last update overwrites the updates made by the others.
  • Dirty reads: a transaction is modifying a record, and before it completes and commits, that record’s data is in an inconsistent state. If another transaction reads the same record at this point and, left unchecked, processes the “dirty” data further, it creates a dependency on uncommitted data. Bottom line: transaction A reads data that transaction B has modified but not yet committed, and acts on that data. If B’s transaction is then rolled back, the data A read is invalid and does not meet consistency requirements.
  • Non-repeatable reads: a transaction re-reads data it read earlier, some time later, only to find that the data has been changed or some records have been deleted. In other words, the same query statement inside transaction A returns inconsistent results at different times, which does not conform to isolation and forces the application to make a lot of defensive checks.
  • Phantom reads: a transaction re-reads previously retrieved data using the same query criteria, only to find that other transactions have inserted new rows that satisfy those criteria. Transaction A reads new data committed by transaction B, which does not comply with isolation.

Explanations like these can be found all over Baidu. They are dry, but still important to understand.

Transaction isolation levels

“Dirty reads”, “non-repeatable reads” and “phantom reads” are all database read-consistency problems, which must be solved by the transaction isolation mechanism provided by the database.

(The isolation-level summary referenced here as a picture can be found all over the internet; it is reproduced as a table below.) Solving the isolation-level problem is essentially solving a concurrency problem. If only one thread accessed the database at a time, there would certainly be no concurrency problems, but performance would be unacceptable. Each scenario requires a different degree of data concurrency, so we should choose the lowest isolation level that still keeps the data safe in our scenario, much like tuning multithreaded concurrency.
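
The picture is the standard matrix of isolation levels against these read-consistency problems; reconstructed here for reference (note that InnoDB’s repeatable read also largely prevents phantom reads via MVCC and gap locks):

Isolation level      Dirty read    Non-repeatable read    Phantom read
Read uncommitted     possible      possible               possible
Read committed       prevented     possible               possible
Repeatable read      prevented     prevented              possible
Serializable         prevented     prevented              prevented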

  • Check the current transaction isolation level: show variables like 'tx_isolation';
  • Set the transaction isolation level: set tx_isolation='REPEATABLE-READ';
  • MySQL’s default transaction isolation level is repeatable read. For example, when developing programs with Spring, MySQL’s default isolation level is used if none is configured, and the level set by Spring is used if one is configured.
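
A quick sketch of checking and setting the level per session. Note that the tx_isolation variable was removed in MySQL 8.0 and replaced by transaction_isolation:

-- MySQL 5.x
show variables like 'tx_isolation';
set session tx_isolation='READ-COMMITTED';              -- current session only
-- MySQL 8.0+
show variables like 'transaction_isolation';
set session transaction_isolation='READ-COMMITTED';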

Locks

A lock is a mechanism by which a computer coordinates concurrent access to a resource by multiple processes or threads. In databases, in addition to the contention for traditional computing resources (such as CPU, RAM, I/O, etc.), data is also a resource that needs to be shared by users. How to ensure the consistency and validity of concurrent data access is a problem that all databases must solve, and lock conflict is also an important factor that affects the performance of concurrent data access.

Classification of locks

  • Optimistic locking: access the data assuming it will not be changed, but check before modifying it; only if it really has not changed is the modification actually applied. Typically a CAS check in a retry loop, or a version column in SQL (see the sketch after this list).
  • Pessimistic locking: access the data assuming it will definitely change, and add a mutex up front to prevent other threads from accessing the data until the lock is released.
  • Read lock: lock the data when reading it, allowing others to read but not write. Concurrency improves to some extent because read requests are not blocked.
  • Write lock: lock the data when writing it, allowing others neither to read nor to write; they must wait until I release the lock.
  • Row lock: The resource granularity of a lock is a row of data
  • Table lock: The resource granularity of a lock is a table
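
As a minimal sketch of optimistic locking in SQL, assuming a hypothetical version column on the account table (it is not part of the schema created later in this article):

-- read the row together with its version
select balance, version from account where id = 1;   -- say balance=450, version=7
-- write back only if nobody bumped the version in the meantime
update account
   set balance = 400, version = version + 1
 where id = 1 and version = 7;
-- 0 rows affected means another writer got there first:
-- re-read and retry (the CAS + retry loop mentioned above)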

Locks in MySQL

Table locks

Each operation locks the entire table. Low overhead and fast locking; no deadlocks occur; but the large lock granularity gives the highest probability of lock conflicts and the lowest concurrency. This is suitable for scenarios such as whole-table data migration.

-- create the table
CREATE TABLE `mylock` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `NAME` VARCHAR(20) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MYISAM DEFAULT CHARSET=utf8;
-- insert data
INSERT INTO `test`.`mylock` (`id`, `NAME`) VALUES ('1', 'a');
INSERT INTO `test`.`mylock` (`id`, `NAME`) VALUES ('2', 'b');
INSERT INTO `test`.`mylock` (`id`, `NAME`) VALUES ('3', 'c');
INSERT INTO `test`.`mylock` (`id`, `NAME`) VALUES ('4', 'd');
  • Manually add table locks
lock table table_name_1 read(write), table_name_2 read(write);
  • View the locks added to the table
show open tables;
  • Release table locks
unlock tables;

Case study: adding a read lock

Add a read lock to mylock, then open another session and try reading and writing: the table can be read, but it cannot be written.
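
A sketch of the two sessions (session 2’s insert simply hangs until session 1 unlocks):

-- Session 1:
lock table mylock read;
select * from mylock;                      -- OK
insert into mylock(`NAME`) values ('e');   -- ERROR: table was locked with a READ lock
-- Session 2:
select * from mylock;                      -- OK: reads are not blocked
insert into mylock(`NAME`) values ('e');   -- blocks...
-- Session 1:
unlock tables;                             -- session 2's insert now proceeds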

Case study: adding a write lock

Add a write lock to mylock, then open another session and try reading and writing: at this point the table can be neither read nor written.

Conclusion:

  1. A read operation (read lock) on a MyISAM table does not block other sessions’ read requests on the same table, but it blocks their write requests; only after the read lock is released can other sessions write. The session holding the read lock is itself not allowed to write.
  2. A write operation (write lock) on a MyISAM table blocks other sessions’ read and write operations on the same table; only after the write lock is released can other sessions read or write it.

Row locks

Each operation locks one row of data. High overhead and slow locking; deadlocks can occur; but the lock granularity is the smallest, so the probability of lock conflicts is the lowest and concurrency is the highest. There are two major differences between InnoDB and MyISAM:

  • InnoDB supports transactions
  • InnoDB supports row-level locking

Why doesn’t MyISAM support transactions? Simple: transactions exist to solve the dirty read, non-repeatable read, and related problems above. Since MyISAM locks the whole table, those problems cannot arise, so why support transactions?

Row locking demo

If one session starts a transaction, updates a row, and does not commit, then updating the same record in another session blocks, while updating a different record does not. MyISAM automatically adds read locks to all tables involved before executing a SELECT, and automatically adds write locks to all tables involved before an UPDATE, INSERT, or DELETE. InnoDB does not lock plain SELECT queries (at non-serializable isolation levels), but UPDATE, INSERT, and DELETE take row locks. In short: read locks block writes but not reads; write locks block both reads and writes.
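
A sketch of the blocking behavior, using the account table created in the next section:

-- Session 1:
begin;
update account set balance = 600 where id = 1;   -- row lock on id=1
-- Session 2:
update account set balance = 150 where id = 2;   -- different row: succeeds
update account set balance = 700 where id = 1;   -- same row: blocks...
-- Session 1:
commit;                                          -- session 2's update now proceeds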

Row locking under different transaction isolation levels

CREATE TABLE `account` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `balance` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `test`.`account` (`name`, `balance`) VALUES ('lilei', '450');
INSERT INTO `test`.`account` (`name`, `balance`) VALUES ('hanmei', '16000');
INSERT INTO `test`.`account` (`name`, `balance`) VALUES ('lucy', '2400');

Read uncommitted

Open client A, set the current transaction mode to read uncommitted, and query the initial values of the account table: set tx_isolation='read-uncommitted';

Client A has not committed its transaction yet, but client B can already see A’s modifications. If your Java code reads data that has not been committed and does no special checking, you will get bugs; the read committed isolation level solves this problem.
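
A sketch of the dirty read (balances illustrative):

-- Client B:
set tx_isolation='read-uncommitted';
-- Client A:
begin;
update account set balance = 400 where id = 1;   -- not committed
-- Client B:
select balance from account where id = 1;        -- sees 400: a dirty read
-- Client A:
rollback;                                        -- the 400 that B read never officially existed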

Read committed

Open client A, set the current transaction mode to read committed, and query all rows of the account table: set tx_isolation='read-committed';

Read committed: as the name implies, data can only be read once the other transaction has committed it. Data that has not yet been committed is not read, because the other transaction might roll back.

Repeatable read

Open client A, set the transaction mode to repeatable read, and query all records of the account table: set tx_isolation='repeatable-read'; Within the same transaction, the data read is always the same; it does not change even if other transactions commit new data.

Questions:

  1. Why do it this way? Everything else has been committed, so why can’t the SELECT in my transaction see it? Mainly to keep application code simple: otherwise you would have to do a lot of validation every time you read the data.
  2. If I only look the data up once after starting the transaction and it never changes inside my transaction, then when I modify the data in this transaction, am I not modifying it based on an old version? You thought of this, and so did MySQL: the MVCC mechanism guarantees correctness here. During a write operation the snapshot is refreshed, so writes are based on the latest version (see the sketch after this list). I’ll talk about MVCC in the next post.
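
A sketch of point 2 under repeatable read: snapshot reads stay stable, but the update operates on the latest committed data (numbers illustrative):

-- Session A (repeatable read):
begin;
select balance from account where id = 1;                 -- 450
-- Session B, elsewhere, commits:
--   update account set balance = balance - 50 where id = 1;   -- now 400
-- Session A:
select balance from account where id = 1;                 -- still 450 (snapshot)
update account set balance = balance - 50 where id = 1;   -- applied to the latest value: 400 - 50
select balance from account where id = 1;                 -- 350: A sees its own write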

Phantom read

Under repeatable read, client B cannot read the row with id=4 that client A inserted and committed, yet B can still modify that very row, and after modifying it, B’s subsequent queries can see it. That is a phantom read.
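
A sketch of the phantom, assuming client A inserts and commits a row with id=4 partway through B’s transaction:

-- Client B (repeatable read):
begin;
select * from account;                            -- 3 rows, no id=4
-- Client A, elsewhere, commits:
--   insert into account(id, name, balance) values (4, 'tom', 0);
-- Client B:
select * from account;                            -- still 3 rows: the snapshot hides id=4
update account set balance = 800 where id = 4;    -- 1 row affected!
select * from account;                            -- id=4 now appears: a phantom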

Serializable

Open client A and set the current transaction mode to serializable; query the initial values of the account table: set tx_isolation='serializable';

Open client B and set its transaction mode to serializable. Updating the record with id 1 blocks and waits, while updating the record with id 2 succeeds. If client A performs a range query, then all rows within that range, including the gaps between rows, are locked, even where no row has been inserted yet. If client B inserts data in that range, it blocks, so phantom reads are avoided.

This isolation level is extremely low concurrency and is rarely used in development.

Gap locks

A gap lock locks the gap between two index values. MySQL’s default isolation level is repeatable read, and gap locking can solve the phantom read problem in some cases.

Suppose the account table now contains rows with id 1, 2, 3, 10, and 20 (the extra rows are implied by the gaps listed here). The gaps are then: (3,10), (10,20), and (20, +∞).

If one session executes update account set name = 'zhuge' where id > 8 and id < 18; then other sessions cannot insert or modify any data within the locked range, whether on the row records themselves or in the gaps between them; that is, no data with id in (3,20] can be modified. Note that the upper bound 20 is included. Gap locking takes effect only at the repeatable read isolation level.
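
A sketch under the id 1, 2, 3, 10, 20 assumption above; each blocked statement waits until session 1 commits:

-- Session 1 (repeatable read):
begin;
update account set name = 'zhuge' where id > 8 and id < 18;   -- locks (3,20]
-- Session 2:
insert into account(id, name, balance) values (5, 'x', 0);    -- blocks: falls in gap (3,10)
insert into account(id, name, balance) values (15, 'y', 0);   -- blocks: falls in gap (10,20)
update account set balance = 1 where id = 20;                 -- blocks: 20 itself is locked
update account set balance = 1 where id = 2;                  -- succeeds: outside (3,20]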

Next-key Locks

Next-key locks are the combination of a row lock and a gap lock. The whole interval (3,20] in the example above can be seen as covered by next-key locks.

Supplement (problems in practice)

A row lock without an index is upgraded to a table lock

Session 1 executes: update account set balance = 800 where name = 'lilei'; Session 2 then blocks when updating any row in this table, because InnoDB row locks are taken on index entries, not on the records themselves, and name has no index. The index must not fail to be used, or the row lock is upgraded to a table lock. You can also lock a row explicitly with lock in share mode or for update, for example: select * from test_innodb_lock where a = 2 for update; other sessions can then only read that row, and their changes are blocked until the locking session commits. A sketch of the escalation follows.
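
A sketch of the escalation (idx_account_name is a hypothetical index; name has no index in the schema above):

-- Session 1:
begin;
update account set balance = 800 where name = 'lilei';   -- no index on name: full scan, all rows locked
-- Session 2: touches a different row, yet still blocks:
update account set balance = 200 where name = 'lucy';    -- waits for session 1
-- Fix: index the column so locking stays row-level
-- create index idx_account_name on account(name);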

Conclusion: the InnoDB storage engine implements row-level locking. Although its locking overhead may be higher than that of table-level locking, it is far superior to MyISAM’s table-level locking in overall concurrent processing capacity. When system concurrency is high, InnoDB’s overall performance is significantly better than MyISAM’s.

However, InnoDB’s row-level locking also has a weaker side: when used incorrectly, InnoDB’s overall performance may not only fail to match MyISAM’s, it can even be worse.

Row lock analysis

Analyze row lock contention on the system by examining the InnoDB_row_lock status variables:

show status like 'innodb_row_lock%';

The description of each state quantity is as follows:

  • Innodb_row_lock_current_waits: number of lock waits currently in progress
  • Innodb_row_lock_time: total time spent waiting for row locks since system startup
  • Innodb_row_lock_time_avg: average time spent per wait
  • Innodb_row_lock_time_max: longest single wait since system startup
  • Innodb_row_lock_waits: total number of waits since system startup

For these 5 state variables, the most important ones are:

  • Innodb_row_lock_time_avg (average wait time)
  • Innodb_row_lock_waits
  • Innodb_row_lock_time (total wait time)

Especially when the number of waits is very high and the wait times are not small, we need to analyze why the system has so many waits, and then draw up an optimization plan based on the findings.

Lock-related tables in the INFORMATION_SCHEMA system database

-- view transactions
select * from INFORMATION_SCHEMA.INNODB_TRX;
-- view locks
select * from INFORMATION_SCHEMA.INNODB_LOCKS;
-- view lock waits
select * from INFORMATION_SCHEMA.INNODB_LOCK_WAITS;
-- release a lock; the trx_mysql_thread_id can be read from INNODB_TRX
kill trx_mysql_thread_id;
-- view lock wait details
show engine innodb status\G

Deadlock analysis

Deadlock demo

set tx_isolation='repeatable-read';
-- Session_1 executes:
select * from account where id=1 for update;
-- Session_2 executes:
select * from account where id=2 for update;
-- Session_1 executes:
select * from account where id=2 for update;
-- Session_2 executes:
select * from account where id=1 for update;

To view recent deadlock logs, run show engine innodb status\G. In most cases MySQL can detect deadlocks automatically and roll back the transaction that caused the deadlock, but in some cases it cannot.

Lock Optimization Suggestions

  • As much as possible, make all data retrievals through indexes to avoid non-indexed row locks upgrading to table locks
  • Design indexes properly to minimize the scope of locks
  • Reduce the range of search conditions as much as possible to avoid gap locking
  • Try to control transaction size, reduce the amount of locked resources and length of time, and try to execute SQL related to transaction locking at the end of the transaction

  • Use the lowest feasible transaction isolation level