Table of Contents
- 1. Foreword
- 2. ACID properties of transactions
-
- 2.1 Transactions
- 2.2 Theory: ACID properties of transactions
- 2.3 Explaining the four ACID properties with a bank transfer
- 3. Transaction isolation mechanism
-
- 3.1 Isolated Objects (Mutually exclusive database resources)
- 3.2 Four isolation levels and three errors
- 3.3 Four isolation levels and three errors
-
- 3.3.1 Serialized read
-
- 3.3.1.1 Isolation Level – Serialized read – Treat the entire database as a mutually exclusive resource (underlying lock: database lock that locks the target database and then reads and writes to tables in that database)
- 3.3.1.2 Isolation Level – Serialized read – Use the tables of the database as mutually exclusive resources (underlying lock: table-level lock, which locks the target table first and then reads and writes to the table)
- 3.3.2 Isolation Level – Repeatable read
-
- 3.3.2.1 Isolation Level – Repeatable Read: Row-level lock
- 3.3.2.2 Isolation Level – Problems caused by repeatable reads: phantom reads
- 3.3.3 Read Committed (READ_COMMITTED)
-
- 3.3.3.1 Read committed – row-level lock on committed rows
- 3.3.3.2 Problems caused by read committed: virtual read (non-repeatable read)
- 3.3.4 Read Uncommitted (READ_UNCOMMITTED)
-
- 3.3.4.1 Read uncommitted – row-level lock on uncommitted rows
- 3.3.4.2 Problems caused by read uncommitted: dirty read
- 4. Other transaction topics
-
- 4.1 Transaction deadlocks
- 4.2 Transaction logging: undo log guarantees atomicity, redo log guarantees durability (Emphasis 004)
- 4.3 Transaction application in MySQL
-
- 4.3.1 Practice: Enable and disable AUTOCOMMIT
- 4.3.2 Practice: Set transaction isolation level
- 4.3.3 Problem scenario: Mixing storage engines in transactions
- 4.3.4 Two-phase Locking protocol: Implicit locking and explicit locking
- 4.3.5 Triggers and Stored Procedures
-
- 4.3.5.1 What is a stored procedure (a set of SQL statements implementing a specific function)?
- 4.3.5.2 What is a trigger (a special stored procedure that executes automatically)?
- 4.3.6 Database concurrency control and database locks
-
- 4.3.6.1 Database concurrency control (optimistic locking, pessimistic locking, timestamps)
- 4.3.6.2 Database locks (row-level, table-level, page-level)
- 5. Interview Goldfinger
-
- 5.1 Ways to start a transaction
- 5.2 Transaction ACID
-
- 5.2.1 Transaction ACID
- 5.2.2 Bank transfer and transaction ACID
- 5.2.3 Summary: the underlying implementation of the four ACID properties
- 5.3 Four isolation levels for transactions
-
- 5.3.1 Highlights: Isolated objects (mutually exclusive database resources) + gap locks
-
- 5.3.1.1 Isolated Objects (Mutually exclusive Database resources)
- 5.3.1.2 Gap Locking (InnoDB's tool for preventing phantom reads at the repeatable read isolation level)
- 5.3.2 Interview Goldfinger: Four isolation levels and three errors
-
- 5.3.2.1 Q1: Explain the four isolation levels
- 5.3.2.2 Q2: Four concepts: isolation level, resource mutual-exclusion granularity, transaction concurrency, and data consistency. The essence of an isolation level is its resource mutual-exclusion granularity, namely the database lock level
- 5.3.2.3 Q3: Explain a table: four isolation levels and three errors
- 5.3.2.4 Practice: set the transaction isolation level + start a transaction
- 5.3.2.5 Q5: Transaction views (ReadView), the underlying implementation of transaction isolation
-
- 5.3.2.5.1 Overview: transaction views and table views; the transaction view is the underlying implementation of transaction isolation
- 5.3.2.5.2 Detail 1: Transaction snapshot
- 5.3.2.5.3 Detail 2: Version chain (the cornerstone of transaction views): trx_id and roll_pointer
- 5.3.2.5.4 Detail 3: The ReadView is generated at different times (the essential difference between the read committed and repeatable read isolation levels)
- 5.3.2.5.5 Detail 4: MVCC (multi-version concurrency control, explained from the source)
- 5.4 Three additional transaction topics
-
- 5.4.1 Transaction deadlocks (analogous to Java concurrent deadlocks)
- 5.4.2 Transaction logging (Undolog undo guarantees atomicity of transactions, redolog guarantees persistence of transactions)
- 5.4.3 Transaction application in MySQL
-
- 5.4.3.1 Practice: Enable and Disable AUTOCOMMIT
- 5.4.3.2 Practice: Set the transaction isolation level
- 5.4.3.3 Problem scenario: Mixing storage engines in a transaction
- 5.4.3.4 Two-phase Locking protocol: Implicit locking and explicit locking
- 6. Summary
1. Foreword
The four ACID properties of transactions: atomicity is guaranteed by the undo log; durability (persistence) is guaranteed by the redo log plus the doublewrite buffer; isolation is guaranteed by LBCC (lock-based concurrency control) plus MVCC (multi-version concurrency control). Consistency covers both database-level integrity constraints (for example, primary keys must remain unique before and after the transaction) and business-level data integrity (for example, in a transfer, account A's balance must decrease by exactly the amount account B's increases, and A cannot be debited 1000 when its balance is only 500). Database-level integrity follows once atomicity, durability, and isolation are ensured; business-level integrity, as in bank transfers or e-commerce orders, must be handled in the application logic.
2. ACID Properties of Transactions
2.1 Transactions
Meaning: a transaction is a group of atomic SQL statements, a single unit of work. The statements take effect only if the database can successfully apply all of them; if any statement fails for any reason, none of them take effect. In other words, the statements in a transaction either all succeed (with the redo log guaranteeing durability) or all fail (with the undo log guaranteeing atomicity when one statement fails).
In terms of storage media, both databases and files live on disk, but transactions are one of the key features that distinguish database storage from file storage: a transaction moves the database from one consistent state to another. When the database commits, you are guaranteed that either all changes are saved or none are.
By default, MySQL runs in autocommit mode: every individual update statement (INSERT, UPDATE, DELETE) forms its own transaction and commits automatically (a plain SELECT does not modify data). If a table uses the InnoDB engine, you can manage transactions manually: you must issue COMMIT to persist your changes, or ROLLBACK to undo them.
The following shows the two ways InnoDB can start a transaction manually. The first:

START TRANSACTION;
SELECT @A:=SUM(salary) FROM table1 WHERE type=1;
UPDATE table2 SET summary=@A WHERE type=1;
COMMIT; -- or ROLLBACK; the statements form one atomic unit

The second:

BEGIN;
SELECT @A:=SUM(salary) FROM table1 WHERE type=1;
UPDATE table2 SET summary=@A WHERE type=1;
COMMIT; -- or ROLLBACK

Summary: for InnoDB tables, a transaction can be started in two ways, with either BEGIN or START TRANSACTION.
With Navicat or JDBC, if the connection drops or the session is closed before commit, the transaction is rolled back.
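The commit/rollback behavior above can be sketched with Python's built-in sqlite3 module standing in for a MySQL/InnoDB session (the accounts table and its values are illustrative, not from the article). Passing isolation_level=None puts the driver in autocommit mode, mirroring MySQL's default, so BEGIN/COMMIT/ROLLBACK must be issued explicitly:

```python
import sqlite3

# Autocommit mode, like MySQL's default; BEGIN/COMMIT/ROLLBACK are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 1000)")  # auto-committed

# Manual transaction, then ROLLBACK: every statement in the unit is undone.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 200 WHERE id = 1")
conn.execute("ROLLBACK")
print(conn.execute("SELECT balance FROM accounts").fetchone()[0])  # 1000

# Manual transaction, then COMMIT: the whole unit is persisted.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 200 WHERE id = 1")
conn.execute("COMMIT")
print(conn.execute("SELECT balance FROM accounts").fetchone()[0])  # 800
```

SQLite likewise rolls back an open transaction when the connection is closed without a commit, matching the Navicat/JDBC note above.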
2.2 Theory: ACID properties of transactions
Transaction properties: atomicity, consistency, isolation, durability (persistence), i.e., ACID.
Atomicity: all operations in a transaction either complete entirely or not at all; execution cannot stop at an intermediate stage. If a transaction fails midway, it is rolled back to the state before it began, as if it had never run. (Goldfinger: atomicity is achieved through rollback, and rollback is implemented with the undo log, so atomicity ultimately rests on the undo log.)
Consistency: Database integrity constraints are not broken before and after a transaction.
Isolation: each transaction executes as if it were the only operation in the system at that time. If two transactions run concurrently and perform the same work, isolation ensures each behaves as though it alone were using the system. This property is sometimes called serializability: to prevent transactions from interfering, concurrent access to the same data must be serialized so that only one request touches it at a time.
Durability (persistence): once a transaction completes (i.e., commits successfully), its changes to the database are permanent and survive even a system crash; this is guaranteed by the redo log, even though flushing to disk is asynchronous. Goldfinger: a successfully committed transaction is never rolled back; rollback itself is implemented by the undo log, whose two roles are undo-based rollback and MVCC multi-version control.
Atomicity and durability as a pair: atomicity means an erroneous transaction is rolled back; durability means a correctly completed transaction is never rolled back.
Why ACID exists (it provides the theoretical basis for correct database execution), plus an explanation of each property:
ACID names the four basic requirements for a database transaction to execute correctly; only when all four are satisfied simultaneously is correct execution guaranteed.
Atomicity: an event's execution cannot be interrupted, so a transaction has only two observable states, before and after. Atomicity guarantees two things: (1) if the transaction fails, it is rolled back; rollback is implemented by the undo log. (2) If it succeeds, nothing further is needed.

Consistency: since atomicity leaves only the before and after states, consistency requires that database integrity constraints hold in both. It is guaranteed by two points: (1) while a transaction is executing, a failed transaction is never persisted to disk; (2) a successful transaction is persisted.

Isolation: isolation exists only because transactions execute concurrently; without concurrency there would be nothing to isolate. The four isolation levels trade off safety against concurrency (see the Interview Goldfinger discussion in section 5.3).

Durability: "saved to the database" means saved to a persistent medium; memory is not one. Durability guarantees two things: (1) once a transaction commits, its changes are persisted to disk, guaranteed by the redo log; (2) a successfully committed transaction is never rolled back, guaranteed by the undo log, since rollback is implemented through it.
2.3 Explaining the four ACID properties with a bank transfer
A banking application is the classic example for explaining MySQL transactions. Transferring $200 from Jane's checking account to her savings account takes at least three steps:
1. Check that the checking account balance exceeds $200.
2. Subtract $200 from the checking account balance.
3. Add $200 to the savings account balance.
All three operations must be packaged in a transaction, and if any of the steps fails, all steps must be rolled back.
You can start a transaction with the START TRANSACTION statement, then either COMMIT to persist the changed data or ROLLBACK to undo all the changes. Sample transaction SQL:
START TRANSACTION; -- manually start a transaction on an InnoDB table
SELECT balance FROM checking WHERE customer_id = 10233276;
UPDATE checking SET balance = balance - 200.00 WHERE customer_id = 10233276;
UPDATE savings SET balance = balance + 200.00 WHERE customer_id = 10233276; -- the fourth statement
COMMIT; -- or ROLLBACK
Scenario 1, atomicity: what happens if the server crashes at the fourth statement? $200 has left the checking table, but $200 was never added to the savings table because that statement failed. Without a rollback mechanism for failed transactions, that is, without atomicity, the data would be corrupted. Atomicity is implemented through rollback, which in turn is implemented through the undo log.
Scenario 2, consistency: what if the system crashes after the third statement executes but before the fourth starts? Transactions are consistent: the database always moves from one consistent state to another. Here, consistency ensures that no $200 evaporates from the checking account, because the transaction never committed, so none of its changes were saved to the database.
Scenario 3, isolation: after the third statement but before the fourth, another process removes the entire checking balance; without isolation, the bank could effectively hand Jane $200 without noticing. With isolation, changes made by one transaction are invisible to others until the final commit, so an account-summary query run between the third and fourth statements still sees the checking balance before the $200 debit.
Scenario 4, durability: what if, right after a successful commit, the database crashes before writing the data pages to disk? The transaction's durability mechanism, guaranteed by the redo log, ensures the committed changes survive.
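The three transfer steps and the rollback-on-failure behavior can be sketched in runnable form. This uses Python's sqlite3 in-memory database as a stand-in for MySQL/InnoDB; the customer_id comes from the article's SQL, while the starting balances and the transfer helper are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit, like MySQL's default
conn.execute("CREATE TABLE checking (customer_id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("CREATE TABLE savings  (customer_id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO checking VALUES (10233276, 500.0)")
conn.execute("INSERT INTO savings  VALUES (10233276, 100.0)")

def transfer(amount):
    conn.execute("BEGIN")  # wrap all three steps in one transaction
    try:
        (balance,) = conn.execute(
            "SELECT balance FROM checking WHERE customer_id = 10233276").fetchone()
        if balance < amount:  # step 1: check the balance
            raise ValueError("insufficient funds")
        conn.execute("UPDATE checking SET balance = balance - ? "
                     "WHERE customer_id = 10233276", (amount,))  # step 2
        conn.execute("UPDATE savings  SET balance = balance + ? "
                     "WHERE customer_id = 10233276", (amount,))  # step 3
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # atomicity: undo any partial debit
        raise

transfer(200.0)          # succeeds: checking 500 -> 300, savings 100 -> 300
try:
    transfer(1000.0)     # fails the balance check; the transaction rolls back
except ValueError:
    pass
print(conn.execute("SELECT balance FROM checking").fetchone()[0])  # 300.0
print(conn.execute("SELECT balance FROM savings").fetchone()[0])   # 300.0
```

After the failed second call the balances are unchanged: either all three steps take effect or none do.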
Storage engines: InnoDB and MyISAM
Programmers can choose a storage engine based on whether the business needs transaction processing. For query-only applications that do not need transactions, a non-transactional engine (such as MyISAM) offers higher performance. Even without engine-level transaction support, applications can get some protection from the LOCK TABLES statement. These choices are up to the programmer.
3. Transaction Isolation Mechanism
3.1 Isolated Objects (Mutually exclusive database resources)
First, macroscopically, isolation concerns client operations: ideally, when different clients run transactions, each proceeds as if unaware of the others' existence.
Second, microscopically, what is actually isolated is mutually exclusive access to database resources; isolation is realized by partitioning those resources at different granularities.
Whether it is a multithreaded lock in Java code or a lock in MySQL (row locks, including exclusive/write locks and shared/read locks, and table locks), both are about competing for mutually exclusive resources: in Java the critical section, in MySQL the locked rows or tables.
3.2 Four isolation levels and three errors
The SQL standard defines four isolation levels; each specifies which changes made inside a transaction are visible, within the transaction and to other transactions, and which are not.
Lower isolation levels generally allow higher concurrency and lower system overhead; higher levels offer stronger safety, but with less concurrency and more overhead.
Let's start with the meaning of a few isolation-related anomalies:
Dirty read: a transaction reads data that another transaction has not yet committed. Non-repeatable read (virtual read): a transaction reads an UPDATE committed by another transaction between its own two reads. Phantom read: a transaction reads rows from an INSERT committed by another transaction between its own two reads.
The first anomaly is reading another transaction's uncommitted changes, so that two reads return inconsistent data. It is called a dirty read because the uncommitted data still sits in in-memory pages that have not yet been flushed to disk; such pages are called dirty pages and the data dirty data, hence the name.
The second is that a transaction reads changes that another transaction has committed, causing its two reads to disagree. This error is called a virtual read or non-repeatable read.
The third is that a transaction reads rows inserted and committed by another transaction, so the two reads bracketing the insert disagree. Only when the inconsistency between two reads is caused by an insert is it called a phantom read.
Dirty read vs. non-repeatable read: whether the data read had been committed. Non-repeatable read vs. phantom read: whether the inconsistency between two reads in one transaction was caused by another transaction's insert.
The four isolation levels are briefly described below.
READ UNCOMMITTED: at this level, a transaction's changes are visible to other transactions even before they are committed. Transactions can read uncommitted data, known as a dirty read, which causes many problems. Performance-wise READ UNCOMMITTED is not much faster than the other levels while lacking many of their benefits; it is rarely used in practice unless there is a truly compelling reason.
READ COMMITTED: most database systems default to this level (though MySQL does not). It satisfies the simple definition of isolation given above: a transaction can only "see" changes made by transactions that have already committed. In other words, changes a transaction makes between start and commit are invisible to other transactions. This level is sometimes called non-repeatable read, because running the same query twice may return different results.
REPEATABLE READ solves the dirty-read problem and guarantees that multiple reads of the same record within one transaction are consistent. In theory, however, it still cannot prevent another problem, the phantom read: when a transaction reads a range of records and another transaction inserts new rows into that range, re-reading the range produces phantom rows. The InnoDB and XtraDB storage engines solve the phantom problem with multi-version concurrency control (MVCC). REPEATABLE READ is MySQL's default transaction isolation level.
SERIALIZABLE is the highest isolation level. It avoids phantoms by forcing transactions to execute serially: in short, it locks every row it reads, which can cause heavy timeouts and lock contention. It is rarely used in practice, considered only when data consistency is paramount and no concurrency is acceptable.
Serialization goes to the other extreme: locking the whole table so that only one transaction occupies it at a time eliminates concurrency entirely, and with it all transaction-concurrency problems.
Summarized in a table:

| Isolation level | Dirty read | Non-repeatable read | Phantom read |
|---|---|---|---|
| Read uncommitted | possible | possible | possible |
| Read committed | not possible | possible | possible |
| Repeatable read | not possible | not possible | possible |
| Serializable | not possible | not possible | not possible |
In fact, MySQL/InnoDB defaults to repeatable read because InnoDB's gap locks already prevent phantom reads, achieving both read consistency and efficiency.
Summary: The higher the level, the more secure the data is, but the lower the performance is.
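The table above can also be expressed as data, which makes the "higher level, fewer anomalies" relationship checkable. A small sketch (per the SQL standard; as noted above, InnoDB's gap locks close the phantom-read hole at REPEATABLE READ in practice):

```python
# Which read anomalies each SQL-standard isolation level still permits.
permitted = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED":   {"non-repeatable read", "phantom read"},
    "REPEATABLE READ":  {"phantom read"},
    "SERIALIZABLE":     set(),
}

# Each higher level permits a strict subset of the anomalies below it.
levels = ["READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"]
for lower, higher in zip(levels, levels[1:]):
    assert permitted[higher] < permitted[lower]  # proper subset

print(sorted(permitted["READ COMMITTED"]))  # ['non-repeatable read', 'phantom read']
```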
Distinguishing non-repeatable reads from phantom reads: the two are similar in that both involve reading different data within the same transaction; the core distinction is the insert.
Non-repeatable read (virtual read): within one transaction, two identical queries return different data, caused by another transaction committing a modification in between. For example, transaction T1 reads some data, transaction T2 reads and modifies it, and when T1 reads again to verify, it gets a different result. Put more simply: the same data is read multiple times in one transaction while another active transaction modifies it in between, so the first transaction's two reads disagree. The original read cannot be repeated, hence "non-repeatable read".
Phantom read: transaction A reads rows matching a search condition; before A commits, transaction B changes A's result set by inserting or deleting rows. More intuitively: the first transaction modifies, say, "all rows" in a table while a second transaction inserts a "new row"; the first transaction's user then discovers, as if in a hallucination, a row it never modified. The general remedy is a range lock (e.g. a RangeS lock) that locks the queried range, preventing phantoms. In short, phantom reads are caused by inserts or deletes.
Difference between non-repeatable reads (virtual reads) and phantom reads: roughly, non-repeatable reads come from another transaction's updates, phantom reads from its inserts or deletes. In overall effect both look like two inconsistent reads. From the standpoint of concurrency control they differ considerably: to prevent the former, only the matching records must be locked; to prevent the latter, the matching records and the gaps adjacent to them must be locked.
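The distinction can be made concrete with a toy model (plain Python, not a real database engine: the "table" is a list of row dicts, T2's "commits" are direct mutations, and the rows and predicate are invented for illustration). T1 runs the same range query three times; an UPDATE in between changes a value T1 already saw (non-repeatable read), while an INSERT adds a row matching T1's predicate (phantom read):

```python
# Toy illustration of non-repeatable vs. phantom reads; no real isolation here.
table = [{"id": 1, "age": 20}, {"id": 2, "age": 30}]

def query(pred):
    """T1's repeated range query: all rows matching the predicate."""
    return [dict(r) for r in table if pred(r)]

pred = lambda r: r["age"] >= 20

first = query(pred)                  # T1, first read: two rows
table[0]["age"] = 25                 # T2 commits an UPDATE to a row T1 saw
second = query(pred)                 # T1, second read: same rows, changed value
non_repeatable = first != second and len(first) == len(second)

table.append({"id": 3, "age": 40})   # T2 commits an INSERT into T1's range
third = query(pred)                  # T1, third read: an extra "phantom" row
phantom = len(third) > len(second)

print(non_repeatable, phantom)       # True True
```

The UPDATE changes *what* a matching row contains (same row count, different values); the INSERT changes *which* rows match (extra row), which is exactly why preventing phantoms requires locking the gaps, not just the existing records.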
3.3 Four isolation levels and three errors
3.3.1 Serialized read
3.3.1.1 Isolation Level – Serialized read – Treat the entire database as a mutually exclusive resource (underlying lock: database lock that locks the target database and then reads and writes to tables in that database)
If the entire database is treated as one mutually exclusive resource, access to it has the following property:
Only one client at a time can connect to the database and run a transaction; until that client's transaction completes, no other client can run one.
When clients access the database, each client accesses the database in exclusive mode. The interaction mode is as follows:
Ideally, at this isolation level client transactions cannot interfere with one another, because all clients' transactions simply queue.
A database needs practical throughput as well as theoretical rigor. TPS (Transactions Per Second) measures this: the higher the TPS, the better the database performs.
Note: TPS is the key metric for comparing the four isolation levels. Assuming each client transaction takes T seconds and the database is never idle, its maximum TPS is 1/T. Maximum TPS = 1 / T (T is the average client transaction time). For example, with T = 10 ms, TPS = 1/0.01 = 100: the database can complete 100 transactions per second.
Rationale: is it necessary to treat the whole database as the mutually exclusive resource?
Treating database-level access as the mutually exclusive resource does guarantee complete transaction isolation, but real application scenarios rarely need such coarse granularity.
For example, suppose database MALL contains two tables, t_user and t_order, and four external clients A, B, C, and D, where clients A and B only operate on t_user and clients C and D only operate on t_order.
From the resource point of view there are two independent mutually exclusive pairs: A <-- t_user --> B and C <-- t_order --> D. There is no need to make the whole database the unit of isolation; the mutually exclusive resource can be subdivided down to the table level.
3.3.1.2 Isolation Level – Serialized read – Use the tables of the database as mutually exclusive resources (underlying lock: table-level lock, which locks the target table first and then reads and writes to the table)
Following the example above, we treat the tables of the database as mutually exclusive resources, and the interaction mode after subdivision is as follows:
T_user -[A,B] and t_order-[C,D] are mutually exclusive and can be processed in parallel, as shown in the following figure:
As the mutual exclusion level of resources is refined from the database level to the table level, the TPS quantity of the database is also increased a lot. Below, we simply estimate the TPS under the full load state, assuming that the average transaction operation time of the client is T and the number of resource mutual exclusion groups is N, then:
Maximum TPS = (1 / T) * N In this example, if T= 10ms, N = 2, TPS = (1/0.01) * 2 = 200
In contrast to the database as a mutually exclusive resource, you can see that TPS increases as the mutually exclusive granularity drops to the table level.
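The back-of-envelope formula above, max TPS = (1 / T) * N, is easy to check numerically (T is the average transaction time, N the number of independent mutually exclusive resource groups; the 10 ms figure is the article's example):

```python
def max_tps(t_seconds, n_groups=1):
    """Maximum throughput under full load: (1 / T) * N, per the text's formula."""
    return (1.0 / t_seconds) * n_groups

# Database-level lock: one resource group, so N = 1.
print(max_tps(0.01))      # 100.0 transactions per second

# Table-level locks with two independent groups (t_user and t_order): N = 2.
print(max_tps(0.01, 2))   # 200.0 transactions per second
```

Refining the lock granularity does not make any single transaction faster; it raises N, the number of transactions that can proceed in parallel without contending.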
Note: in a real transaction, a client may operate on several tables, and each of those tables is treated as a mutually exclusive resource.
Mainstream database implementations all provide table locks for mutually exclusive resource access. Transaction isolation via table locking is, in its operation ordering, essentially queuing; this is the highest isolation level, namely SERIALIZABLE read.
Summary: serialized reads are easy to picture as locking at table granularity: lock the target table first, then read and write it.
3.3.2 Isolation Level – Repeatable read
3.3.2.1 Isolation Level – Repeatable Read: Row-level lock
Locking the entire table causes client transactions on the same table to become queued serialized operations. Now look at another scenario:
Suppose we now have clients A and B using the same table T_USER for transactions, but with different rows:
In the figure above, clients A and B access table t_user in a mutually exclusive way, but the row records they operate on are not actually in conflict, so we can refine the lock granularity further, from table-level to row-level, which further improves the system's concurrency. With row locks, the isolation level drops to repeatable read.
Expand the above example and reflect it through the model, as shown in the figure below:
Clients A and B attempt to access the same row at the same time; clients C and D also attempt to access the same row at the same time. In this arrangement, up to two clients can access table t_user simultaneously, which raises overall client concurrency by an order of magnitude compared with serialized reads.
Expressed by client timing relation as follows:
With concurrency this much higher under row locking, why would anyone still use SERIALIZABLE read? Before answering that, let's look at what goes wrong with row locking.
3.3.2.2 Isolation Level – Problems caused by repeatable reads: phantom reads
Row locks let clients lock individual rows. But during a transaction, other transactions may insert new rows into the table, and those new rows are not covered by any existing lock (the root cause of phantom reads). So the same SQL query may return additional matching rows, each acquiring a new row lock, as shown below:
As shown above, two identical queries within the same transaction return an inconsistent number of records, as if they had Read too much data. This is called Phantom Read.
The use of this row lock for resource isolation is called REPEATABLE READ at the database isolation level.
Note: although row locks make row-level operations mutually exclusive, phantom reads can still occur. To avoid them, table-level locks can be used, widening the transaction isolation boundary back to SERIALIZABLE read.
3.3.3 Read Committed (READ_COMMITTED)
3.3.3.1 Read committed – row-level lock on committed rows
In implementing atomicity, a given table row effectively has two states: uncommitted and committed. We can subdivide the resource (the row) further on that basis, as shown below:
A table row is divided into uncommitted and committed states based on atomicity.
To further improve the concurrency of the database, as shown in the figure above, a read-write split lock mechanism is used on a row of data:
Although clients A, B, C, and D all operate on the same row of the same table:
Client B and client D read the row record and use the read lock to read the data. The read lock is a shared lock, so it can be performed simultaneously.
Clients A and C write the row using a write lock, which is exclusive. A transactional write has two phases, uncommitted then committed, and the write lock is needed in exclusive mode only at commit time, shrinking the window of mutually exclusive access to the resource.
The scheme above, in which clients B and D read only committed data, is called READ_COMMITTED at the isolation level.
By following the above process, the concurrency of our database can be increased by another order of magnitude.
3.3.3.2 Problems caused by read committed: virtual read (non-repeatable read)
But this rosy picture has a flaw. Suppose the following sequence: a transaction reads a row's AGE twice, and between the two reads another transaction commits an UPDATE to AGE; the transaction's two reads then disagree (which is exactly what the REPEATABLE READ level prevents).
Although concurrent read capability at the READ COMMITTED isolation level increases by orders of magnitude, it can produce non-repeatable reads within a transaction.
What does read committed's non-repeatable read mean for developers? It means two reads of the same row may disagree, which demands care when operating on the database inside a transaction. Mitigations: avoid feeding an earlier query's result directly into a subsequent UPDATE statement; prefer a state machine (conditional updates) to protect data integrity; or use the REPEATABLE READ isolation level. This opens a separate topic: how to use the database to guarantee the logical integrity of business data.
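The "state machine" mitigation above can be sketched as a conditional (compare-and-set) UPDATE: instead of reading a status and then blindly updating, put the expected state into the WHERE clause and check the affected-row count. This sketch uses sqlite3 as a stand-in; the orders table and its PAID/SHIPPED states are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'PAID')")

# State-machine update: the transition only applies if the row is still in
# the expected state, so a concurrent committed change cannot be overwritten.
cur = conn.execute(
    "UPDATE orders SET status = 'SHIPPED' WHERE id = 1 AND status = 'PAID'")
r1 = cur.rowcount
print(r1)  # 1 -> the transition was applied

# A second attempt finds the precondition gone: rowcount 0 means some other
# actor already moved the state on, and the caller should retry or report.
cur = conn.execute(
    "UPDATE orders SET status = 'SHIPPED' WHERE id = 1 AND status = 'PAID'")
r2 = cur.rowcount
print(r2)  # 0 -> precondition failed, nothing overwritten
```

Checking the row count turns "my earlier read may be stale" from a silent corruption into an explicit, handleable outcome.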
3.3.4 Read Uncommitted (READ_UNCOMMITTED)
3.3.4.1 Read Uncommitted – Uncommited row-level lock
The essence of READ_COMMITTED is to limit the granularity of mutually exclusive access to committed rows. The granularity can be refined further, to uncommitted rows, as shown in the following figure:
3.3.4.2 Problems Caused by Read Uncommitted: Dirty Read
In this way, because the granularity of the resource lock is more refined, the concurrency capability of the client is further improved. However, at the same time, there will be a new problem – dirty read phenomenon, as shown in the following figure:
As shown above: during a transaction, a reading client reads uncommitted data from another, updating client, and may then operate on that data as if it had been persisted. If the updating client rolls back, the reader has read data that never logically existed: a dirty read.
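A minimal sketch of a dirty read (the `accounts` table is assumed for illustration; only READ UNCOMMITTED allows this):

```sql
-- Session A (updater)
START TRANSACTION;
UPDATE accounts SET balance = balance + 500 WHERE id = 1;  -- not committed yet

-- Session B (reader)
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT balance FROM accounts WHERE id = 1;  -- already sees the +500

-- Session A rolls back: Session B has read data that never logically existed
ROLLBACK;
```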
In summary:
- Serialized read: locks the database or table; a single database is safe, but a distributed database is not, which calls for distributed transactions.
- Repeatable read: existing rows are locked, but newly inserted rows are not, so an INSERT from transaction B can land between two reads of transaction A. Transaction A then sees different row sets within the same transaction: a phantom read. Only INSERT can slip in; DELETE cannot, because all existing rows are locked.
- Read committed: an UPDATE from transaction B can land between two reads of transaction A, producing a non-repeatable read.
- Read uncommitted: a read can land in the middle of another transaction's write and see updated values that have not been persisted: a dirty read.
Fourth, other knowledge of affairs
4.1 Transaction deadlocks
Deadlock is the phenomenon in which two or more transactions each occupy a resource and request a lock on a resource held by the other, forming a cycle. In short: a transaction holds one resource while waiting for another, without releasing what it holds. Case one: a deadlock may occur when multiple transactions attempt to lock resources in different orders. Case two: a deadlock also occurs when multiple transactions simultaneously lock the same set of resources.
For example, imagine the following two transactions processing the StockPrice table simultaneously; they lock rows in opposite orders, matching the first case:
```sql
-- Transaction 1
START TRANSACTION;
UPDATE StockPrice SET close = 45.50 WHERE stock_id = 4 AND date = '2019-05-01';
UPDATE StockPrice SET close = 19.80 WHERE stock_id = 3 AND date = '2019-05-02';
COMMIT;

-- Transaction 2
START TRANSACTION;
UPDATE StockPrice SET high = 20.12 WHERE stock_id = 3 AND date = '2019-05-02';
UPDATE StockPrice SET high = 47.20 WHERE stock_id = 4 AND date = '2019-05-01';
COMMIT;
```
How transaction deadlocks arise: two or more transactions access mutually exclusive resources and the four deadlock conditions are met; it is worth studying alongside deadlocks in Java concurrency.
If both transactions have executed their first UPDATE statement, each has updated one row and locked it. Each then attempts its second UPDATE, only to find the row already locked by the other. Neither releases its lock, each waits for the other, and both hold locks the other needs: an infinite wait. The deadlock can only be resolved by external intervention.
MySQL resolves deadlocks in two ways, deadlock detection and lock-wait timeout, and once a deadlock is found a transaction is rolled back. Deadlock detection: InnoDB can detect circular lock dependencies and immediately return an error. This works well; otherwise deadlocks would surface as very slow queries. Lock-wait timeout: abandoning a lock request once the wait reaches the lock-wait timeout setting, which is generally less desirable. When a deadlock occurs, InnoDB rolls back the transaction holding the fewest row-level exclusive locks (a relatively simple victim-selection algorithm). Once a deadlock has occurred, it can only be broken by partially or completely rolling back one of the transactions. For a transactional engine such as InnoDB this is unavoidable, so applications must be designed with deadlocks in mind; in most cases it is enough to re-execute a transaction that was rolled back due to deadlock.
Additionally, MySQL deadlocks have two causes: (1) genuine data conflicts, which are often hard to avoid; (2) the storage engine's implementation, since the behavior and order of locking depend on the engine. Given the same statement order, some storage engines deadlock and some do not.
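One standard remedy for deadlocks caused by inconsistent lock order is to make every transaction touch rows in the same order (here, ascending stock_id). This is a sketch, not the article's own code:

```sql
-- Both transactions now update stock_id 3 before stock_id 4,
-- so neither can hold a lock the other is waiting for
START TRANSACTION;
UPDATE StockPrice SET high  = 20.12 WHERE stock_id = 3 AND date = '2019-05-02';
UPDATE StockPrice SET close = 45.50 WHERE stock_id = 4 AND date = '2019-05-01';
COMMIT;
```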
4.2 Transaction Logging Undolog guarantees atomicity Redolog guarantees persistence (Emphasis 004)
The MySQL logging system has three kinds of logs, only two of which relate to transactions: the undo log guarantees atomicity (rollback), and the redo log guarantees durability.
Transaction logging can help make transactions more efficient. With transaction logging, the storage engine only needs to modify the in-memory copy of a table’s data and record the modification to a transaction log that persists on disk, rather than persisting the modified data itself to disk each time.
Transaction logging is append-only, so it is sequential I/O within a small area of the disk; unlike random I/O, which moves the head to multiple places on the disk, it is relatively fast (Kafka writes to disk the same way). Once the transaction log is persisted, the modified data in memory can be flushed back to disk gradually in the background. This is what most storage engines implement today, commonly called write-ahead logging; a data change thus requires two disk writes.
If the data changes have been recorded in the transaction log and persisted, but the data itself has not been written back to disk, the system crashes and the storage engine automatically recovers the modified data when it restarts. The specific recovery method depends on the storage engine.
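The durability/performance trade-off of redo log flushing is tunable in InnoDB. The variable below is real MySQL, though defaults and exact crash-loss behavior depend on the version; treat this as a sketch:

```sql
-- 1 = flush and sync the redo log at every commit (full durability)
-- 0 or 2 relax syncing for speed, risking up to ~1s of transactions on a crash
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
```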
4.3 Transaction application in MySQL
MySQL provides two transactional storage engines: InnoDB and NDB Cluster.
Remember 1: of the two most common storage engines, MyISAM does not support transactions and InnoDB does. Remember 2: in MySQL, transactions are supported not only by InnoDB but also by NDB Cluster.
4.3.1 Practice: Enable and disable AUTOCOMMIT
MySQL uses AUTOCOMMIT mode by default, which means that if a transaction is not explicitly started, each query is committed as a transaction. In the current connection, you can disable AUTOCOMMIT mode by setting the AUTOCOMMIT variable.
```sql
SET AUTOCOMMIT = 1; -- enable automatic transaction commit
SET AUTOCOMMIT = 0; -- disable automatic transaction commit
```
1 or ON enables it; 0 or OFF disables it.
When AUTOCOMMIT=0, automatic transaction commit is disabled and all queries belong to one transaction until an explicit COMMIT or ROLLBACK ends it, at which point a new transaction begins.
Use the SET AUTOCOMMIT command to enable or disable autocommit. Note 1: changing AUTOCOMMIT has no effect on non-transactional tables such as MyISAM tables; such tables have no concept of COMMIT or ROLLBACK and effectively always run in autocommit mode. Note 2: some statements force a COMMIT of the current active transaction before executing. Example 1: data definition language (DDL) statements that cause large data changes, such as ALTER TABLE. Example 2: other statements such as LOCK TABLES have the same effect. If necessary, check the official documentation for your version for the full list of statements that cause an implicit commit.
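With autocommit off, several statements form one transaction until an explicit COMMIT or ROLLBACK. A sketch (the `accounts` table is assumed for illustration):

```sql
SET AUTOCOMMIT = 0;
UPDATE accounts SET balance = balance - 200 WHERE id = 1;
UPDATE accounts SET balance = balance + 200 WHERE id = 2;
COMMIT;      -- ends this transaction; the next statement starts a new one
-- ROLLBACK; -- would have discarded both updates instead
```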
4.3.2 Practice: Set transaction isolation level
Summary: MySQL sets the isolation level with the SET TRANSACTION ISOLATION LEVEL command. The new isolation level takes effect at the start of the next transaction. You can set the isolation level for the whole server in the configuration file, or change it only for the current session.
```sql
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
```
MySQL recognizes all four ANSI isolation levels, and the InnoDB engine supports all isolation levels.
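To check or change the level at session and global scope (the variable is `@@transaction_isolation` in MySQL 8.0; older versions use `tx_isolation`):

```sql
SELECT @@transaction_isolation;         -- current session's level
SELECT @@global.transaction_isolation;  -- server-wide level
SET GLOBAL TRANSACTION ISOLATION LEVEL REPEATABLE READ;
```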
4.3.3 Problem scenario: Mixing storage engines in transactions
The MySQL server layer does not manage transactions; transactions are implemented by the underlying storage engine. Therefore, it is not reliable to use multiple storage engines in the same transaction.
Mysql architecture is divided into two layers: mysql service layer and mysql storage engine layer
MySQL service layer: connector, query cache, parser, optimizer, executor
Below the mysql service layer is the storage engine layer
Scenario: a transaction mixes transactional and non-transactional tables (e.g. InnoDB and MyISAM tables). (1) Under normal commit there is no problem. (2) If the transaction must roll back, the changes to the non-transactional MyISAM tables cannot be undone, leaving the database in an inconsistent state that is hard to repair; the final outcome of the transaction becomes undefined. Summary: choose the appropriate storage engine for each table at creation time. Note that MySQL usually does not warn or report errors when transactional operations touch non-transactional tables; sometimes a warning appears only at rollback: "Changes on some non-transactional tables cannot be rolled back." In most cases, though, operations on non-transactional tables give no prompt at all.
4.3.4 Two-phase Locking protocol: Implicit locking and explicit locking
Summary: InnoDB uses the two-phase locking protocol. Locks can be performed at any time during a transaction. Locks are released only when a COMMIT or ROLLBACK is performed, and all locks are released at the same time.
The locks described above are implicit locks that InnoDB automatically locks when needed, depending on the isolation level.
In addition, InnoDB also supports explicit locking with specific statements that are not part of the SQL specification:
```sql
SELECT ... LOCK IN SHARE MODE;
SELECT ... FOR UPDATE;
```
MySQL also supports LOCK TABLES and UNLOCK TABLES statements, which are implemented in the MySQL server layer, independent of the storage engine layer. They have their uses, but are not a substitute for transaction processing. If your application requires transactions, you should still choose a transactional storage engine.
It is common to find applications that have converted tables from MyISAM to InnoDB but still explicitly use LOCK TABLES statements. This is not only unnecessary but can seriously hurt performance; InnoDB's row-level locking actually works better.
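A sketch of how the two explicit-lock forms are used inside a transaction (the `accounts` table is assumed for illustration):

```sql
START TRANSACTION;
-- shared lock: other sessions may read the row, but not modify it
SELECT balance FROM accounts WHERE id = 1 LOCK IN SHARE MODE;
-- exclusive lock: blocks other locking reads and writes on the row
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance - 50 WHERE id = 1;
COMMIT;  -- two-phase locking: all locks are released here
```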
4.3.4 Triggers and Stored Procedures
4.3.4.1 What is a stored procedure (a set of SQL statements implementing a specific function)?
A stored procedure is a set of SQL statements that performs a particular function, stored in the database; after the first compilation it can be called again without recompiling. Users execute it by specifying its name and supplying parameters (if it has any). Stored procedures are important database objects.
Stored procedure optimization ideas:
- SQL to avoid small loops: try to use some SQL statements to replace some small loops, such as aggregate functions, average functions, etc.
- SQL avoids creating large loops itself: try not to place lookup statements inside loops.
- Intermediate results: store intermediate results in temporary tables, which can be indexed to speed up access.
- Cursor: use cursors sparingly. SQL is a set-oriented language with high performance on set operations, whereas cursors are procedural. For example, for a query over a million rows, a cursor requires a million reads of the table, versus a handful of reads without one.
- Transactions: the shorter the better. SQL Server supports concurrent operations; transactions that are too numerous or too long, or isolation levels that are too high, can block concurrent operations and cause deadlocks, resulting in extremely slow queries and high CPU usage.
- Exception: Use try-catch to handle error exceptions.
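A minimal MySQL stored-procedure sketch following these ideas; the `transfer` procedure name, `accounts` table, and columns are all assumed for illustration:

```sql
DELIMITER //
CREATE PROCEDURE transfer(IN from_id INT, IN to_id INT, IN amount DECIMAL(10,2))
BEGIN
  -- roll back the whole transfer if either UPDATE fails
  DECLARE EXIT HANDLER FOR SQLEXCEPTION
  BEGIN
    ROLLBACK;
    RESIGNAL;
  END;
  START TRANSACTION;
  UPDATE accounts SET balance = balance - amount WHERE id = from_id;
  UPDATE accounts SET balance = balance + amount WHERE id = to_id;
  COMMIT;
END //
DELIMITER ;

CALL transfer(1, 2, 200.00);
```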
4.3.4.2 What is a trigger (a special stored procedure that executes automatically)?
Trigger is a special stored procedure that can execute automatically.
The difference between a trigger and an ordinary stored procedure is that a trigger fires when an operation is performed on a table: when UPDATE, INSERT, or DELETE runs, the system automatically invokes the triggers defined on that table.
Triggers can automatically respond to a behavior, so triggers are appropriate for situations where a business-level response must be made to a behavior.
Triggers can be divided into two categories: DML triggers and DDL triggers. DML triggers contain T-SQL code that responds to INSERT, UPDATE, or DELETE operations on a table or view. DDL triggers respond to server or database events rather than data modification.
First, triggers and stored procedures are both sets of SQL statements; the difference is how they execute. A trigger cannot be invoked with an EXECUTE statement; it fires automatically when a user executes a Transact-SQL statement, whereas a stored procedure is invoked directly by name. Second, a trigger is a stored procedure that runs when data in a specified table is modified: a special stored procedure. Third, triggers are often created to enforce referential integrity and consistency of logically related data across tables. Since users cannot bypass them, triggers can enforce complex business rules to ensure data integrity. For example, when UPDATE, INSERT, or DELETE runs against a table, SQL Server automatically executes the SQL defined by that table's triggers, ensuring data is processed according to those rules.
MySQL allows six trigger timings per table: BEFORE INSERT, AFTER INSERT, BEFORE UPDATE, AFTER UPDATE, BEFORE DELETE, AFTER DELETE.
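A minimal MySQL trigger sketch using one of the six timings; the `accounts` and `balance_audit` tables are assumed for illustration:

```sql
CREATE TRIGGER audit_balance
AFTER UPDATE ON accounts
FOR EACH ROW
  -- record every balance change automatically; OLD/NEW refer to the row
  -- before and after the UPDATE
  INSERT INTO balance_audit (account_id, old_balance, new_balance, changed_at)
  VALUES (OLD.id, OLD.balance, NEW.balance, NOW());
```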
4.3.5 Database concurrency control and database lock
4.3.5.1 Database concurrency control (Optimistic lock, pessimistic lock, timestamp)
Concurrency control generally adopts three methods, namely optimistic locking, pessimistic locking and timestamp.
Optimistic locking assumes that while a user reads data, no one else will write the data being read. Pessimistic locking assumes the opposite: while you read the database, someone else may be writing the data you just read; it is a conservative stance. Timestamps control concurrency without locking at all.
1. Optimistic lock. Optimistic locking assumes that while a user reads data, others will not write the data being read, so no lock is taken.
2. Pessimistic lock
Pessimistic locking: when reading data, to prevent others from modifying it, the reader locks the data first; only after the reader is done may others modify it. Conversely, when modifying data, others are not allowed to read it until the modifying transaction commits and releases its lock, at which point other users may access that data. The "lock" here covers two kinds: exclusive locks (write locks) and shared locks (read locks).
3. Timestamp
The timestamp approach adds a dedicated column to the table, such as a TimeStamp column. Each read retrieves this field along with the data; on write-back the field is incremented by 1 and, before commit, compared with the current value in the database. If the new value is greater than the database's value, the save is allowed; otherwise it is rejected. Although this approach does not use the database's locking mechanism, it can greatly increase concurrency.
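A sketch of the version-column variant of this technique (the `accounts` table and `version` column are assumed for illustration):

```sql
-- Read the row together with its version
SELECT id, balance, version FROM accounts WHERE id = 1;  -- suppose version = 7

-- Write back only if no one changed the row since the read
UPDATE accounts
SET balance = 300.00, version = version + 1
WHERE id = 1 AND version = 7;
-- 0 affected rows means another writer got there first: re-read and retry
```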
4.3.5.2 Database Lock (Row-level lock, table-level lock, page-level lock)
1. Row-level locking
Row-level locking is an exclusive lock that prevents other transactions from modifying this row; MySQL automatically applies row-level locking when using the following statement:
- Lock: INSERT, UPDATE, DELETE, and SELECT … FOR UPDATE [OF columns] [WAIT n | NOWAIT] statements take row-level locks; SELECT … FOR UPDATE lets the user lock multiple records at once for update.
- Unlock: Release locks using COMMIT or ROLLBACK statements.
2. Table level locking
Table level locks include shared table read locks (shared locks) and exclusive table write locks (exclusive locks). Lock the entire table in the current operation. It is simple to implement, consumes less resources, and is supported by most MySQL engines. The most commonly used MYISAM and INNODB both support table-level locking.
3. Page-level locking
Page-level locking is a kind of lock whose granularity is between row-level locking and table-level locking in MySQL.
Table locking is fast but has many conflicts, while row locking is slow but has few conflicts.
So a compromise page level is taken, locking adjacent sets of records at a time. BDB supports page-level locking.
What lock types does MySQL provide? MyISAM supports table locks only; InnoDB supports both table locks and row locks, and defaults to row locks. Table lock: low overhead, fast to acquire, no deadlocks; coarse granularity, highest probability of lock conflicts, lowest concurrency. Row lock: high overhead, slow to acquire, deadlocks possible; fine granularity, lowest probability of lock conflicts, highest concurrency.
5. Interview Goldfinger
5.1 Transaction starting mode
Goldfinger: three opening questions (to remember). 1. Where do transactions come from? In MySQL, of the two most important storage engines, MyISAM has no transactions and InnoDB does. Remember 1: of the two most common storage engines, MyISAM does not support transactions and InnoDB does. Remember 2: in MySQL, transactions are supported not only by InnoDB but also by NDB Cluster.
2. Why innoDB transactions? Why are transactions needed? Benefits of transactions? The benefits of transactions are the four features of transactions that cannot be done without transactions. Atomicity: There are only two states before and after a transaction. Transaction execution cannot be interrupted, so multiple atomic SQL statements can be placed in a single transaction. This cannot be done without transactions. Consistency: As noted above, there are only two states of a transaction, before and after the transaction is executed, in which the database integrity constraint is not broken. This cannot be done without transactions. Isolation: The isolation of transactions is the isolation of mysql, creating mysql parallel execution. This cannot be done without transactions. Persistence: Save to a database. This is a persistence medium. Memory is not a persistence medium. This cannot be done without transactions.
3. What does transaction knowledge consist of? Three parts: ACID properties and the transaction isolation levels, plus deadlocks, transaction logs, and transactions in MySQL.
5.2 transaction ACID
5.2.1 transaction ACID
Summary of rollback: under atomicity, a transaction that errors during execution is rolled back; under durability, a transaction that executes correctly and completes is never rolled back.
Goldfinger: the point of ACID (the theoretical basis for correct database execution). ACID names the four basic elements for the correct execution of a database transaction; only when all four are satisfied at once is correct execution guaranteed.
Atomicity: a transaction's execution cannot be interrupted, so a transaction has only two states: before and after. (1) If the transaction fails, it is rolled back; atomicity is implemented by rollback, which is implemented by the undo log. (2) If the transaction succeeds, nothing more is needed.
Consistency: as above, a transaction has only two states, before and after execution, and in both the database's integrity constraints hold. Remember two points: (1) if execution fails, nothing is persisted to disk; (2) if execution succeeds, the changes are persisted to disk.
Isolation: isolation only matters for transactions executing in parallel; if transactions could only run serially, isolation would be irrelevant. Isolation is the basis of MySQL's parallel execution. Remember five points: see the interview Goldfinger material.
Durability: committed data is saved to a persistent medium (the database on disk; memory is not a persistent medium). Remember two points: (1) once a transaction commits, it is persisted to disk, guaranteed by the redo log; (2) a successfully committed transaction is never rolled back; rollback itself is implemented by the undo log.
5.2.2 Bank transfer transaction ACID
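The scenarios below refer to statement numbers from a transfer transaction like the following sketch (the `checking`/`savings` tables and the customer id are assumptions for illustration):

```sql
START TRANSACTION;                                                            -- statement 1
SELECT balance FROM checking WHERE customer_id = 10233276;                    -- statement 2
UPDATE checking SET balance = balance - 200.00 WHERE customer_id = 10233276;  -- statement 3
UPDATE savings  SET balance = balance + 200.00 WHERE customer_id = 10233276;  -- statement 4
COMMIT;                                                                       -- statement 5
```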
Scenario 1: atomicity. What happens if the server crashes at the fourth statement? $200 has been deducted from the checking table but not added to the savings table, because the statement adding $200 to savings failed. Without a rollback mechanism for failed transactions, that is, without atomicity, this would leave an error. Atomicity is implemented through rollback, which in turn is implemented through the undo log.
Scenario 2: Consistency, the system crashes after the third statement is executed but before the fourth statement is started? Transactions have consistency. The database always transitions from one consistent state to another consistent state. In the previous example, consistency ensures that even if the system crashes after the third statement is executed but before the fourth statement is started, no $200 will be lost in the checking account because the transaction does not commit, so changes made in the transaction will not be saved to the database.
Scenario 3: isolation. After the third statement executes but before the fourth starts, another process deletes all balances from the checking account; the result could be that the bank gives Jane $200 without being aware of it. Without transaction isolation, parallel transactions cannot be made safe. With isolation, changes made by one transaction are invisible to other transactions until the final commit: if another session runs an account summary between the third and fourth statements, it does not see the checking balance reduced by $200.
Scenario 4: durability. What if, after successful execution, the database crashes before the data is written to disk? Transactions have a durability mechanism, guaranteed by the redo log.
5.2.3 Summary: The bottom implementation of the four properties of ACID
Summary: the underlying implementation of ACID's four properties. Atomicity is implemented by the undo log (rollback); consistency needs no separate mechanism, following from the other three; isolation is achieved through the isolation levels; durability guarantees that a completed transaction persists, implemented by the redo log.
5.3 Four isolation levels for transactions
5.3.1 Highlights: Isolated objects (mutually exclusive database resources) + gap locks
5.3.1.1 Isolated Objects (Mutually exclusive Database resources)
First, macroscopically, the isolation is the operation of the client side: isolation refers to the fact that when different clients do transaction operations, ideally, each client does not have any interaction, as if unaware of each other’s existence.
Second, microscopically, what is actually isolated is mutually exclusive access to the database resources the clients operate on; isolation is realized through different granularities of partitioning those resources.
5.3.1.2 Gap Locking (InnoDB is a magic tool for solving phantom errors with repeatable read isolation level)
For MySQL/InnoDB, the default isolation level is repeatable read, because InnoDB's gap locks already prevent phantom reads.
A phantom read error at the repeatable read isolation level: transaction B queries for rows whose appId is testAppID and finds none, then prepares to insert a testAppID record. Meanwhile transaction A inserts a row with appId testAppID. Transaction B then attempts its insert and fails, even though its own query saw no such row. There are two ways to solve this phantom problem:
Option 1: Lock the entire table when a transaction to read data is started. This is the highest level of transaction isolation: Serializable, serialized.
Scheme 2: at the repeatable read level, take a shared or exclusive lock on the read; the InnoDB engine then actively adds gap locks, avoiding phantom reads.
So the better way to avoid phantom reads is the second: RR + gap locks (RR stands for the repeatable read isolation level).
What is a gap lock?
Gap lock: locks a range between index records, but not the records themselves. It takes effect at the RR level, where InnoDB actively applies gap locks for locking operations (locking reads, updates, deletes). The locked gap depends on the match condition:
1. Range condition: the whole range is locked.
2. Equality condition with a non-empty match: the open intervals on both sides of the matching record are locked.
3. Equality condition with an empty match: the open interval around where the value would fall is locked.
The data in the leaf nodes of the index B+ tree is ordered; the space between two adjacent entries is called a gap. When an RR-level read takes locks, MySQL locks the gaps covering the query condition, preventing other transactions from inserting new rows between leaves and breaking repeatable read.
How not to Use gap locks:
1. Unique index: MySQL does not take a gap lock when an equality match uses a unique index.
2. Lower the isolation level: set it to RC (read committed) and gap locks are not used.
3. Insert intention lock: for INSERT operations, MySQL first tries to take an insert intention lock, which does not block other transactions' inserts and improves concurrent insert efficiency; it only waits where a gap lock is already held by a SELECT … FOR UPDATE, SELECT … LOCK IN SHARE MODE, DELETE, or UPDATE statement.
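A sketch of a gap lock blocking a concurrent insert at RR (the `users` table with a secondary index on `age` is assumed for illustration):

```sql
-- Session A (REPEATABLE READ): a locking read over a range takes gap locks
START TRANSACTION;
SELECT * FROM users WHERE age BETWEEN 20 AND 30 FOR UPDATE;

-- Session B: blocks until A commits, because the range around 20..30 is gap-locked
INSERT INTO users (name, age) VALUES ('alice', 25);

-- Session A
COMMIT;  -- Session B's insert can now proceed
```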
5.3.2 Interview Goldfinger: Four isolation levels and three errors
5.3.2.1 Q.1: Explain the four isolation levels
SERIALIZABLE read has the highest isolation level: clients access database resources in mutually exclusive mode, and at any given time a resource can be accessed by only one client, as if clients were queuing. Note: on the same row, writes take a write lock and reads take a read lock; when read and write locks conflict, the later transaction must wait for the earlier one to finish before continuing. This is the highest, least efficient, and safest isolation level, and it guarantees that transactions do not execute in parallel.
REPEATABLE_READ, repeatable read, uses row-level locks. Row-level locks guarantee that within one transaction, a client accessing the same resource multiple times gets the same result, hence the name. This isolation level can still produce phantom reads (newly inserted rows are not yet covered by row-level locks). Explanation: the data seen during a transaction is always consistent with the data seen when the transaction started; of course, at this level, uncommitted changes remain invisible to other transactions.
READ_COMMITTED, read committed: each time a query executes within a transaction, the client reads the latest committed data from the database.
READ_UNCOMMITTED: A client can read uncommitted data of other clients in a transaction.
5.3.2.2 Q2: Four concepts: isolation level, resource mutual-exclusion granularity, transaction concurrency, and data consistency. Is the essence of an isolation level the granularity of resource mutual exclusion, i.e. the database lock level?
The four transaction isolation levels above are explained in terms of mutually exclusive access to resources. The lower the isolation level, the finer the granularity of resource mutual exclusion, which has two effects: higher transaction concurrency, but reduced data consistency.
Transaction concurrency and data consistency are two opposing ideal metrics; the direction of database research and development is to improve both at once, minimizing the extent to which one degrades the other.
5.3.2.3 Problem 3: Explain a table: Four isolation levels and three errors
Serialized read: table lock; both reads and writes take exclusive locks. Repeatable read: row lock; both reads and writes take exclusive locks. Committed read: row lock; the write lock is held only until the write transaction commits, so readers see only committed data. Uncommitted read: readers can see a writer's data before it commits; if the updater rolls back, the reader has read dirty data, and if it does not roll back, the reader happened to read real data.
Distinguish the repeatable read isolation level from the non-repeatable read (also called virtual read) error: repeatable read, as the name implies, returns the same result when the same resource is accessed multiple times within one transaction; a non-repeatable read (virtual read) error is when accessing the same resource multiple times in one transaction yields different results, because updates (and unlocked newly inserted rows) become visible between reads.
Work through the table below until you understand how the lock level determines the degree of resource exclusivity, and how the degree of resource exclusivity determines the isolation level (4 levels x 3 errors = 12 cells).
| Isolation level | Dirty read | Non-repeatable read | Phantom read |
|---|---|---|---|
| Read uncommitted | Possible | Possible | Possible |
| Read committed | Not possible | Possible | Possible |
| Repeatable read | Not possible | Not possible | Possible |
| Serializable | Not possible | Not possible | Not possible |
1.1 Why does serialization have no phantom reads? The database transaction is atomized, similar to synchronized or Lock in Java, so two reads within the same transaction cannot be interrupted by another transaction's insert.
1.2 Why does serialization have no non-repeatable (virtual) reads? For the same reason: two reads within the same transaction cannot be interrupted by another transaction's update.
1.3 Why does serialization have no dirty reads? Table-level locks make both reads and writes exclusive, and the atomized transaction means an updater cannot be interleaved with a reader from another transaction.
2.1 Why do phantom read errors occur under repeatable read? Root cause: newly inserted rows are not locked. Row-level locks lock all existing rows in the table that match the WHERE clause and make operations on them atomic. Scenario: transaction A locks the rows matching its WHERE clause, but a row newly inserted by another transaction is not locked; if that new row also matches the WHERE clause, transaction A's second query returns one more row. One solution: raise the isolation level to serializable read.
2.2 Why does repeatable read avoid non-repeatable read (virtual read) errors? Row-level locks lock all rows in the table that match the WHERE clause and make operations on them atomic, so the non-repeatable read scenario cannot arise: an updater targets existing rows, and the existing rows matching the WHERE clause are locked.
2.3 Why does repeatable read avoid dirty read errors? Row-level locks lock all rows matching the WHERE clause and make operations on them atomic, so the dirty read scenario cannot arise either: a reader targets existing rows, and the existing rows matching the WHERE clause are already locked.
3.1 Why does a non-repeatable read (virtual read) error occur under read committed? Root cause: by definition, read committed separates read and write locking: committed write operations take exclusive locks, but read operations take no lock, not even a shared lock. A reader reads rows matching its WHERE clause without locking them, which allows another transaction B to modify them. This is the fundamental difference between read committed and repeatable read. Scenario: under read committed, an updater can slip in between a reader's two queries because the rows are not locked; under repeatable read it cannot, because the rows are locked. So the two reads return different values for the same rows. Mitigations for non-repeatable reads: avoid using query results as parameters for subsequent UPDATE statements; use a state machine where possible to ensure data integrity; or use the repeatable read isolation level.
3.2 Why does a phantom read error occur under read committed? Root cause: again by definition, committed write operations take exclusive locks but reads take no lock, and newly inserted rows are not locked either, so an insert from another transaction can land between two reads.
3.3 Why does no dirty read error occur under read committed? Root cause: committed write operations take exclusive locks, so the dirty read scenario cannot arise: an updater's uncommitted changes cannot be observed by a reader.
4.1 Why does read uncommitted produce dirty reads? Root cause: by the isolation level definition, even write operations take no lock. Transaction B's writes do not lock the rows matching its WHERE clause, so transaction A can read those rows before B commits. If the updater rolls back, the reader has read dirty data; if the updater does not roll back, the reader happened to read real data. Fix: upgrade to read committed, which places an exclusive lock on the updater's writes so that a reader cannot observe them mid-transaction.
4.2 Why does read uncommitted produce non-repeatable reads (virtual reads)? Root cause: by the isolation level definition, neither writes nor reads are locked. In transaction A, an update from another transaction can land between two reads because the read takes no lock. Solution: upgrade to repeatable read, which holds row-level locks so updates cannot slip in between the two reads.
4.3 Why does the read uncommitted isolation level produce phantom reads? Root cause: by the isolation level definition, neither writes nor reads are locked, so another transaction can insert matching rows between two reads. Solution: upgrade to serialized read, which locks the table directly.
5.3.2.4 Practice 4: Set transaction isolation level + start transaction
First, change the isolation level of the transaction:
SET [GLOBAL|SESSION] TRANSACTION ISOLATION LEVEL level; -- level is one of the four levels above
Second, there are two ways to start a transaction:
- BEGIN or START TRANSACTION. The corresponding commit statement is COMMIT and the rollback statement is ROLLBACK.
- Set autocommit=0. This command will turn off auto-commit for this thread, meaning that if you execute only one SELECT statement, the transaction will start and will not commit automatically. The transaction persists until you actively execute a COMMIT or ROLLBACK statement, or disconnect.
Note: be careful with long transactions. Once set autocommit=0 implicitly starts a transaction, every subsequent query runs inside it, and a slow SQL statement can easily drag down the database. I recently ran into exactly this: the database kept firing alerts, and a long transaction turned out to be the cause.
All the rest is InnoDB based, as MyISAM does not support transactions.
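The begin/commit/rollback life cycle above can be exercised with Python's built-in sqlite3 module as a stand-in. SQLite is of course not MySQL/InnoDB, and the table and values here are made up for illustration, but the transaction semantics shown are the same idea:

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so BEGIN /
# COMMIT / ROLLBACK are fully under our control, like the MySQL session above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

# Transaction 1: ROLLBACK undoes the update entirely.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 60 WHERE id = 1")
conn.execute("ROLLBACK")
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 100: the debit never happened

# Transaction 2: COMMIT makes the update durable.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 60 WHERE id = 1")
conn.execute("COMMIT")
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 40
```

This is also why a forgotten COMMIT after set autocommit=0 leaves a long transaction open: every statement joins the transaction until something explicitly ends it.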
5.3.2.5 Underlying mechanism 5: Transaction views, the low-level implementation of transaction isolation
5.3.2.5.1 Overview: Transaction views and table views; the transaction view is the underlying implementation of transaction isolation
The transaction view is the basis of the transaction isolation implementation: when a transaction runs, the database creates a view and accesses data according to the logical result of that view. That view is the ReadView.
Note: in MySQL there are two concepts of "view". One is the ordinary view: a virtual table defined by a query statement, which executes the query and generates its result when invoked; the syntax for creating it is CREATE VIEW ..., and it is queried the same way as a table. The other is the consistent read view that InnoDB uses to implement MVCC, which supports the READ COMMITTED (RC) and REPEATABLE READ (RR) isolation levels. This second kind is what "transactions and views" refers to here.
5.3.2.5.2 Detail 1: Transaction Snapshot
At the repeatable read isolation level, a transaction "takes a snapshot" when it is started. Note that this snapshot is logically based on the entire library.
You might say: how is that possible? If a library is 100 GB, does MySQL have to copy 100 GB of data when I start a transaction? How slow would that be? Who could accept that? Yet in practice, ordinary transactions start quickly.
In fact, the database does not need to copy the 100 GB of data. So how is the snapshot implemented?
First, every transaction in InnoDB has a unique transaction ID, called the transaction id. It is requested from InnoDB's transaction system at the beginning of the transaction, and ids are assigned in strictly increasing order.
Second, each row in the database can have multiple versions. Every time a transaction updates a row, it generates a new version of the data and stamps that version with its own transaction id; this is called the row trx_id. The old version of the data is kept, and from the new version it can be reached again via the undo log.
That is, a row in a data table may actually have multiple versions, each with its own row trx_id.
To summarize the two ids: each transaction has a transaction id, obtained from InnoDB's transaction system when the transaction starts; each row version has a row trx_id, the id of the transaction that produced that version.
trx_id is a hidden column, and there is another hidden column, roll_pointer: every time a clustered index record is changed, the old version of the data is written to the undo log, and this hidden column acts as a pointer for finding the record's state before the modification.
Why does a snapshot of the entire database not require copying the entire database? Transactions and snapshots: at the repeatable read isolation level, a transaction "takes a snapshot" when it starts, and the snapshot is logically based on the entire library. Snapshots and undo logs: an undo log entry is written for every change to a data row, so the "snapshot" only needs to reference the current undo log position rather than copy any data.
5.3.2.5.3 Detail 2: The version chain (the cornerstone of transaction views): trx_id and roll_pointer
trx_id and roll_pointer are both stored in the InnoDB clustered index record and look something like this:
The undo log rollback mechanism also relies on this version chain. Every time a record is changed, an undo log entry is written, and each undo log entry has a roll_pointer attribute (the undo log for an INSERT does not, because the row has no earlier version). These undo log entries can be linked together into a list, so the situation now looks like the following:
In the diagram, trx_id is the id of the transaction that wrote each version, and roll_pointer is just an address pointing at the historical version. Versions 2, 3 and 4 below are historical version information recorded in the undo log, while version 1 is the current row in the database; as long as the undo log exists, any historical version can be restored, which is exactly what the snapshot means.
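The version chain described above can be sketched as a linked list: each version carries a trx_id and a roll_pointer to the previous version. This is a toy model, not InnoDB's on-disk format; the field names mirror the hidden columns, and the visibility rule is deliberately simplified to "a reader sees the newest version whose trx_id is within its view limit":

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowVersion:
    value: str
    trx_id: int                            # id of the transaction that wrote this version
    roll_pointer: "Optional[RowVersion]"   # previous version, reachable via the undo log

def visible_version(head, reader_view_limit):
    """Walk the chain from newest to oldest and return the first version
    the reader may see (simplified rule: trx_id <= reader_view_limit)."""
    v = head
    while v is not None and v.trx_id > reader_view_limit:
        v = v.roll_pointer
    return v

# Build the chain: trx 10 inserted the row, trx 40 and trx 200 updated it.
v1 = RowVersion("inserted", trx_id=10, roll_pointer=None)
v2 = RowVersion("updated-a", trx_id=40, roll_pointer=v1)
head = RowVersion("updated-b", trx_id=200, roll_pointer=v2)

print(visible_version(head, reader_view_limit=50).value)   # 'updated-a'
print(visible_version(head, reader_view_limit=250).value)  # 'updated-b'
```

Rolling back is the same walk in disguise: discard the head and the roll_pointer already leads to the previous consistent version.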
5.3.2.5.4 Detail 3: The transaction view ReadView is generated at different times (this is the essential difference between the read committed and repeatable read isolation levels)
The relationship between the transaction isolation levels and MVCC (read committed + repeatable read) is as follows. Multi-Version Concurrency Control (MVCC) is used by two isolation levels: READ COMMITTED and REPEATABLE READ. Under MVCC, an ordinary SELECT accesses the version chain of a record (this is the definition and meaning of MVCC), so that read-write and read-read operations from different transactions can run concurrently (read committed: updates between two reads; repeatable read: inserts between two reads), improving system performance. Put simply, MVCC is the process by which SELECT operations at these two isolation levels access the version chain of records. From the version chain you can see three transactions (trx ids 200, 40, 10) doing different things. At the read committed isolation level, the view is created at the start of each SQL statement: a transaction generates a fresh ReadView for every query.
At the REPEATABLE READ isolation level, a ReadView is generated only when the first query statement reads data; subsequent queries reuse it, which is why a transaction's query results are the same every time.
(1) The read uncommitted isolation level simply returns the latest value on the record; there is no concept of a view at all. Dirty reads, phantom reads, and non-repeatable reads are all possible. (2) The serializable isolation level directly uses locking (table locks) underneath to prevent parallel access. In summary, only two isolation levels are related to MVCC multi-version concurrency control: repeatable read and read committed.
During project development, our company's database isolation level was read committed by default, but we turned on repeatable read in some scenarios; serialization was rare.
Applications of repeatable read and serializable: we enabled repeatable read in scenarios related to order amounts, and it is also useful in flows with many data modifications, because many values must first be queried and then used as the basis for further operations; if repeated queries return different values, the later operations are affected. Serializable is known as the gold standard of database isolation levels: it locks tables directly, it is the highest isolation level most commercial database systems provide, and its table-level locks guarantee atomic reads and writes. Some widely deployed systems cannot offer an isolation level as high as serializable at all. It suits scenarios as demanding as finance, where performance matters least; after all, the bank does not care about a few extra seconds.
How to view the isolation level of the database you are using:
show variables like 'transaction_isolation'; -- named tx_isolation before MySQL 8.0
Summary: one big difference between the two isolation levels is when the transaction view (ReadView) is generated. (1) READ COMMITTED generates a new ReadView before every ordinary SELECT operation; (2) REPEATABLE READ generates a ReadView only before the first ordinary SELECT operation. Repeatable reading of data is, in fact, the repeated reuse of that one ReadView.
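The timing difference above can be shown with a toy model. The class names and the single-value "database" are made up for illustration; the only rule modeled is that RC re-snapshots on every SELECT while RR snapshots once and reuses it:

```python
class Database:
    """A one-value stand-in for the committed state of the database."""
    def __init__(self, value):
        self.committed_value = value

class Txn:
    """Minimal model of ReadView timing: RC takes a fresh snapshot on every
    SELECT; RR takes one at the first SELECT and reuses it afterwards."""
    def __init__(self, db, level):
        self.db, self.level, self.snapshot = db, level, None

    def select(self):
        if self.level == "READ COMMITTED" or self.snapshot is None:
            self.snapshot = self.db.committed_value  # (re)generate the ReadView
        return self.snapshot

db = Database(100)
rc = Txn(db, "READ COMMITTED")
rr = Txn(db, "REPEATABLE READ")

assert rc.select() == 100 and rr.select() == 100
db.committed_value = 200          # another transaction commits an update
print(rc.select())  # 200: RC sees the newly committed value (non-repeatable read)
print(rr.select())  # 100: RR keeps reusing its first ReadView
```

The RC transaction exhibits exactly the non-repeatable read from section 3.1; the RR transaction does not, because it never regenerates its view.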
5.3.2.5.5 Detail 4: MVCC (from the source: the MVCC mechanism, Multi-Version Concurrency Control)
Goldfinger: isolation is implemented through MVCC, MVCC is implemented through the undo log, and persistence is implemented through the redo log. Within isolation, only read committed and repeatable read are implemented by MVCC; read uncommitted and serializable are not.
The MVCC mechanism: the transaction isolation levels rely on Multi-Version Concurrency Control (MVCC) to increase the concurrency of read operations. Principle: each transaction has a transaction ID that increases over time; each row record has two hidden columns maintaining two version numbers (transaction IDs), and each row can exist in multiple versions (recorded in the undo log). Insert, delete, select and update all revolve around these two version numbers:
1. create_version: the transaction version number that created the row.
2. delete_version: the transaction version number that deleted the row.
Taking the RR level as an example, MVCC handles the four operations as follows:
1. Select: only rows whose create_version is no later than the current transaction version, and whose delete_version is unset or later than it, are read.
2. Insert: create a new record and record the current transaction's version number as its create_version.
3. Delete: set the current transaction's version number as the row's delete_version.
4. Update: copy the old record to generate a new row version, set the current transaction's version as the new row's create_version, and also set it as the old row's delete_version.
At the RC isolation level, MySQL simply reads the latest committed version. The two read modes are called snapshot read and current read. Advantage of MVCC: it supports concurrent reads without locking, increasing concurrency. Disadvantage of MVCC: it maintains multiple versions of the same row, consuming more space.
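The four rules above can be written out directly. This is a toy in-memory row store, not InnoDB (the function names and the flat `rows` list are illustrative); it only implements the create_version / delete_version bookkeeping just described:

```python
# Toy MVCC row store following the create_version / delete_version rules.
rows = []  # each row: {"data", "create_version", "delete_version"}

def insert(data, trx):
    rows.append({"data": data, "create_version": trx, "delete_version": None})

def delete(row, trx):
    row["delete_version"] = trx

def update(row, data, trx):
    delete(row, trx)   # old version gets the deleter's version number
    insert(data, trx)  # new version gets the creator's version number

def select(trx):
    """Rule 1: a row is visible if it was created no later than trx and is
    not deleted, or was deleted only by a later transaction."""
    return [r["data"] for r in rows
            if r["create_version"] <= trx
            and (r["delete_version"] is None or r["delete_version"] > trx)]

insert("row-a", trx=1)
insert("row-b", trx=2)
update(rows[0], "row-a2", trx=3)
print(select(trx=2))  # ['row-a', 'row-b']  - trx 2 cannot see trx 3's update
print(select(trx=3))  # ['row-b', 'row-a2'] - trx 3 sees its own new version
```

Note how update never overwrites anything: it only stamps versions, which is exactly why old snapshots stay readable and why MVCC costs extra space.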
5.4 Three additional topics on transactions
5.4.1 Transaction deadlocks (analogous to Java concurrent deadlocks)
Deadlock refers to the phenomenon where two or more transactions occupy resources and each requests a lock on a resource held by the other, producing a circular wait. Summary: a transaction holding one resource waits for another transaction's resource without releasing its own. Case one: a deadlock may occur when multiple transactions attempt to lock resources in different orders. Case two: a deadlock can also occur when multiple transactions lock the same set of resources simultaneously.
MySQL deadlock resolution: first deadlock detection, second lock wait timeouts; once a deadlock is found, a transaction is rolled back. Deadlock detection (finding the deadlock immediately): the InnoDB storage engine can detect circular dependencies among lock waits and returns an error right away. This works well; without it, deadlocks would surface only as very slow queries. Lock wait timeout (the fallback): abandoning a lock request when the wait reaches the lock-wait-timeout setting is generally not ideal. Rollback after deadlock: InnoDB currently handles deadlocks by rolling back the transaction that holds the fewest row-level exclusive locks (a relatively simple victim-selection algorithm). Once a deadlock occurs, it can only be broken by partially or completely rolling back one of the transactions. For a transactional engine like InnoDB this is unavoidable, so applications must be designed with deadlocks in mind; in most cases you simply re-execute the transaction that was rolled back.
Additional: MySQL deadlocks occur for two reasons. (1) Real data conflicts; this is often hard to avoid. (2) The storage engine implementation: the behavior and order of lock acquisition depend on the storage engine, so with statements executed in the same order, some storage engines will deadlock and some will not.
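InnoDB's deadlock detection amounts to finding a cycle in the waits-for graph between transactions. A minimal sketch of that idea (not InnoDB's actual algorithm; it assumes each transaction waits on at most one other, as with a single row-lock wait):

```python
def has_deadlock(waits_for):
    """Detect a circular wait in a waits-for graph given as a dict
    {txn: txn_it_waits_on}. Returns True if any cycle exists."""
    for start in waits_for:
        seen, t = set(), start
        while t in waits_for:          # follow the chain of waits
            if t in seen:              # revisited a transaction: cycle
                return True
            seen.add(t)
            t = waits_for[t]
    return False

# T1 waits on a lock held by T2, and T2 waits on a lock held by T1:
# circular wait -> deadlock, so one victim must be rolled back.
print(has_deadlock({"T1": "T2", "T2": "T1"}))  # True
print(has_deadlock({"T1": "T2", "T2": "T3"}))  # False (a plain wait chain)
```

The classic prevention from case one above follows directly: if every transaction acquires locks in the same global order, no cycle can ever form in this graph.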
5.4.2 Transaction logging (the undo log guarantees atomicity of transactions; the redo log guarantees persistence of transactions)
The MySQL logging system has three types of logs, only two of which are related to transactions: the undo log supports rollback and thus guarantees atomicity, and the redo log guarantees persistence.
5.4.3 Transaction application in mysql
5.4.3.1 Practice: Enable and Disable AUTOCOMMIT
Use the SET AUTOCOMMIT command to enable or disable autocommit for transactions. Note 1: changing AUTOCOMMIT has no effect on non-transactional tables such as MyISAM tables; for that type of table there is no concept of COMMIT or ROLLBACK, so it behaves as if autocommit were always enabled. Note 2: some commands force a COMMIT of the current active transaction before executing. Example 1: in the Data Definition Language (DDL), operations that cause large data changes, such as ALTER TABLE. Example 2: other statements such as LOCK TABLES have the same effect. If necessary, check the official documentation for your version for the full list of statements that cause an implicit commit.
5.4.3.2 Practice: Set the transaction isolation level
Summary: MySQL can set the isolation level by executing the SET TRANSACTION ISOLATION LEVEL command. The new isolation level takes effect at the start of the next transaction. You can set the isolation level for the entire database in the configuration file, or change it just for the current session.
5.4.3.3 Problem scenario: Mixing storage engines in a transaction
The MySQL server layer does not manage transactions; transactions are implemented by the underlying storage engine. Therefore, it is not reliable to use multiple storage engines in the same transaction.
MySQL's architecture is divided into two layers: the MySQL server layer and the storage engine layer.
MySQL server layer: connector, query cache, parser, optimizer, executor.
Below the server layer sits the storage engine layer.
Scenario: a mix of transactional and non-transactional tables (e.g. InnoDB and MyISAM tables) is used in a transaction. (1) Under normal commit conditions there is no problem. (2) Under abnormal conditions there is: when the transaction fails and needs to be rolled back, the changes to the non-transactional MyISAM table cannot be undone, which leaves the database in an inconsistent state that is difficult to repair, and the final outcome of the transaction becomes uncertain. Summary: it is therefore important to select the appropriate storage engine for each table when creating it. Note that MySQL usually does not warn or report errors when transaction-related operations are performed on non-transactional tables; sometimes a warning is issued only during rollback: "Changes on some non-transactional tables cannot be rolled back." In most cases, operations on non-transactional tables give no prompt at all.
5.4.3.4 Two-phase Locking protocol: Implicit locking and explicit locking
Summary: InnoDB uses the two-phase locking protocol. Locks can be acquired at any time during a transaction, but they are released only when a COMMIT or ROLLBACK is executed, and all locks are released at the same time.
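The protocol above can be sketched in a few lines. This is an illustrative model (class name, lock-table dict, and single lock mode are all made up), showing only the two-phase shape: a growing phase that may acquire locks at any point, and a shrinking phase that happens all at once at commit:

```python
class TwoPhaseLockingTxn:
    """Toy two-phase locking: acquire anywhere, release everything at commit."""
    def __init__(self, lock_table):
        self.lock_table = lock_table   # shared dict: resource -> owning txn or None
        self.held = []                 # growing phase: this list only grows

    def lock(self, resource):
        if self.lock_table.get(resource) is not None:
            raise RuntimeError(f"{resource} is already locked")
        self.lock_table[resource] = self
        self.held.append(resource)

    def commit(self):
        # shrinking phase: all locks are released together, and only here
        for r in self.held:
            self.lock_table[r] = None
        self.held.clear()

locks = {}
t1 = TwoPhaseLockingTxn(locks)
t1.lock("row:42")
t1.lock("row:43")        # acquiring mid-transaction is fine (implicit locking)
print(sorted(locks))     # ['row:42', 'row:43']
t1.commit()              # every lock released at once
print(all(v is None for v in locks.values()))  # True
```

Holding everything until commit is what makes the schedule serializable, and it is also exactly why two transactions growing in opposite orders can end up in the deadlock described in 5.4.1.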
Six, the summary
This article mainly covered MySQL transactions and the transaction isolation levels, from the locking behavior of each level down to the ReadView and MVCC machinery underneath.
Play code every day, progress every day!