Related reading: 2.7W! Java basic interview questions/knowledge summary! (2021 latest)

This post has been posted before, but I’ve recently refactored it and fixed a number of minor issues. So, sync again!

The content is hard! It is highly recommended that you take about 10 minutes to read it!

MySQL based

Introduction to relational databases

As the name implies, relational database is a database built on the basis of a relational model. A relational model represents the relationships (one-to-one, one-to-many, many-to-many) between data stored in a database.

In a relational database, our data is stored in various tables (such as a user table), and each row of the table contains one piece of data (such as a user’s information).

Most relational databases use SQL to manipulate data in the database. Also, most relational databases support the four properties of transactions (ACID).

What are the common relational databases?

MySQL, PostgreSQL, Oracle, SQL Server, SQLite (wechat local chat records are stored in SQLite)…… .

MySQL is introduced

MySQL is a relational database that is used to persistently store some data in our system, such as user information.

As MySQL is an open source, free and mature database, it is widely used in various systems. Anyone can download it under the General Public License (GPL) and modify it according to their personal needs. The default port number of MySQL is 3306.

The storage engine

Storage engine-related commands

Check out all the storage engines that MySQL provides

mysql> show engines;
Copy the code

From the above figure, we can see that the default storage engine of MySQL is InnoDB, and only InnoDB of all storage engines in 5.7 is transactional storage engine, that is, only InnoDB supports transactions.

View the current default storage engine of MySQL

We can also view the default storage engine by using the following command.

mysql> show variables like '%storage_engine%';
Copy the code

View the storage engine for the table

show table status like "table_name" ;
Copy the code

Difference between MyISAM and InnoDB

Before MySQL 5.5, the MyISAM engine was the default storage engine for MySQL, which was in its heyday.

Although, MyISAM’s performance is ok and its various features (such as full-text indexing, compression, spatial functions, etc.) are ok. However, MyISAM does not support transaction and row-level locking, and its biggest drawback is that it cannot safely recover from a crash.

After version 5.5, MySQL introduced InnoDB (transactional database engine), the default storage engine after MySQL 5.5 is InnoDB. Remember InnoDB, you use this storage engine every time you use MySQL database, right?

Back to business! Let’s compare the two briefly:

1. Check whether row-level locking is supported

MyISAM only has table-level locking, while InnoDB supports row-level locking and table-level locking. The default row-level locking is row locking.

In other words, MyISAM locks the entire table, which is silly in the case of concurrent writing. This is why InnoDB performs better when writing concurrently!

2. Whether transactions are supported

MyISAM does not provide transaction support.

InnoDB provides transaction support with the ability to commit and rollback transactions.

3. Check whether foreign keys are supported

MyISAM does not support it, InnoDB does.

🌈 expand:

We generally do not recommend using foreign keys at the database level, but at the application level. However, this poses a threat to the consistency of the data. Whether or not to use foreign keys will depend on your project.

4. Check whether security recovery after database crashes is supported

MyISAM does not support it, InnoDB does.

When InnoDB is restarted after an abnormal crash, the database will be restored to the state before the crash. This recovery process relies on the redo log.

🌈 expand:

  • The MySQL InnoDB engine uses redo log to ensure transaction persistence and undo log to ensure atomicity.
  • MySQL InnoDB engine uses lock mechanism, MVCC and other means to ensure transaction isolation (the default isolation level is REPEATABLE-READ).
  • Consistency can only be guaranteed when transaction persistence, atomicity, and isolation are ensured.

5. Check whether MVCC is supported

MyISAM does not support it, InnoDB does.

To be honest, this comparison is a bit of a dud, since MyISAM doesn’t even support row-level locking.

MVCC can be seen as an upgrade to row-level locking that effectively reduces locking operations and provides performance.

About MyISAM and InnoDB choice problem

Most of the time we use the InnoDB storage engine, and in some read-intensive cases MyISAM is appropriate. However, if your project doesn’t mind MyISAM’s lack of transaction support, crash recovery, etc. (we all do!) .

MySQL High Performance has a sentence that reads:

Don’t believe the rule of thumb that “MyISAM is faster than InnoDB”. This conclusion is not always absolute. InnoDB is faster than MyISAM in many known scenarios, especially with clustered indexes or applications where data that needs to be accessed can be stored in memory.

In most cases InnoDB is fine, but in some cases MyISAM is a good choice if you don’t care about scalability and concurrency, transaction support, and security recovery after a crash. But in general, we need to consider these issues.

Therefore, there is little reason for you to use MyISAM as a storage engine for your MySQL database for everyday business systems.

Locking mechanism and InnoDB locking algorithm

MyISAM and InnoDB storage engines use locks:

  • MyISAM uses table-level locking.
  • InnoDB supports row-level locking and table-level locking. The default row-level locking is performed

Table level locking vs. row level locking

  • Table-level lock: the lock with the largest granularity in MySQL. It locks the entire table in the current operation. It is simple to implement, consumes less resources, and locks quickly, avoiding deadlocks. It has the largest locking granularity, the highest probability of triggering lock conflicts, and the lowest concurrency. Both MyISAM and InnoDB engines support table-level locking.
  • Row-level lock: the lock with the smallest granularity in MySQL. Only the current row is locked. Row-level locking can greatly reduce conflicts in database operations. It has the smallest locking granularity and high concurrency, but the cost of locking is also the largest, and locking is slow, resulting in deadlock.

InnoDB storage engine lock algorithm has three kinds:

  • Record lock: a lock on a single row Record
  • Gap Lock: A Gap lock that locks a range, excluding the record itself
  • Next-key lock: Record + GAP locks a range, including the record itself

The query cache

When executing a query statement, the cache is queried first. However, MySQL 8.0 was removed because this feature is not very useful

My.cnf add the following configuration, restart MySQL to enable query cache

query_cache_type=1
query_cache_size=600000
Copy the code

MySQL can also enable query cache by running the following command

set global query_cache_type=1;
set global query_cache_size=600000;
Copy the code

As shown above, if the query cache is enabled, the results are directly returned in the cache under the same query conditions and data. The query criteria include the query itself, the current database to be queried, and the client protocol version number, which may affect the results. So any two queries that differ on any character will result in a cache miss. In addition, if the query contains any user-defined functions, storage functions, user variables, temporary tables, or system tables in the MySQL library, the query results will not be cached.

After the cache is created, MySQL’s query caching system keeps track of each table involved in the query. If these tables (data or structure) change, all cached data associated with that table is invalidated.

Although caching can improve the performance of database queries, it also brings additional overhead. After each query, a cache operation must be done, and destruction must be done after failure. Therefore, be careful about enabling query caching, especially in write-intensive applications. If this function is enabled, properly control the size of the cache space. Generally, it is appropriate to set the size to tens of MB. Sql_cache and SQL_NO_cache can also be used to control whether a query needs to be cached:

select sql_no_cache count(*) from usr;
Copy the code

The transaction

What is a transaction?

In a nutshell, a transaction is a logical set of operations that either all or none of them execute.

Can you give a quick example?

The most classic and often cited example of a transaction is the transfer of money. If Xiao Ming wants to transfer 1000 yuan to Xiao Hong, the transfer will involve two key operations:

  1. Reduce Xiao Ming’s balance by 1000 yuan
  2. Increase xiao Hong’s balance by 1000 yuan.

The transaction treats these two operations as a logical whole that contains operations that either succeed or fail.

In this way, Ming’s balance will not decrease while Hong’s balance does not increase.

What are database transactions?

Database transactions are the most common part of our daily development. If your project is a monolithic architecture, you are often exposed to database transactions.

When we talk about transactions, if we don’t specifically refer to distributed transactions, we tend to refer to database transactions.

So what does a database transaction do?

Simply put: a database transaction ensures that multiple operations on the database (that is, SQL statements) form a logical whole. The database operations that make up the logical whole follow: either all execute successfully or none.

Start a transactionSTARTTRANSACTION; # multipleSQLStatements SQL1, SQL2... ## Commit transactionCOMMIT;
Copy the code

In addition, relational database transactions (e.g. MySQL, SQL Server, Oracle, etc.) have ACID properties:

What is ACID property?

  1. atomic(Atomicity) : Transactions are the smallest unit of execution and are not allowed to be split. The atomicity of the transaction ensures that the action either completes completely or does not work at all;
  2. consistency(Consistency) : Data should be consistent before and after the execution of a transaction. For example, in a transfer business, the total amount of the remitter and the remittee should remain unchanged no matter whether the transaction is successful or not;
  3. Isolation,(Isolation) : When accessing the database concurrently, a user’s transaction is not disturbed by other transactions, and the database is independent between the concurrent transactions;
  4. persistence(Durabilily) : After a transaction is committed. Its changes to the data in the database are persistent and should not be affected if the database fails.

How do data transactions work?

Here we take MySQL’s InnoDB engine as an example to briefly say.

The MySQL InnoDB engine uses redo log to ensure transaction persistence and undo log to ensure atomicity.

MySQL InnoDB engine uses lock mechanism, MVCC and other means to ensure transaction isolation (the default isolation level is REPEATABLE-READ).

Consistency can only be guaranteed when transaction persistence, atomicity, and isolation are ensured.

What are the problems associated with concurrent transactions?

In a typical application, multiple transactions run concurrently, often working on the same data to accomplish their respective tasks (multiple users working on the same data). Concurrency, while necessary, can lead to the following problems.

  • Dirty read: When a transaction is accessing data and making changes to the data that have not been committed to the database, another transaction also accesses the data and then consumes it. Because this data is not committed yet, another transaction reads this data as “dirty data” and may not be doing the right thing based on “dirty data”.
  • Lost to modify: When a transaction reads data that is also accessed by another transaction, the first transaction changes the data and the second transaction changes the data. The result of the modification within the first transaction is then lost and is therefore called a lost modification. For example, transaction 1 reads A table A=20, transaction 2 also reads A=20, transaction 1 changes A=A-1, transaction 2 also changes A=A-1, the final result A=19, the modification of transaction 1 is lost.
  • Unrepeatable read: Read the same data multiple times within a transaction. While this transaction is not finished, another transaction also accesses the data. Then, between the two reads in the first transaction, the data read by the first transaction may not be the same because of modifications made by the second transaction. This occurs when the data read twice within a transaction is not the same and is therefore called a non-repeatable read.
  • Phantom read: Phantom read is similar to unrepeatable reading. It occurs when one transaction (T1) reads several rows of data, followed by another concurrent transaction (T2) inserts some data. In subsequent queries, the first transaction (T1) will find more records that did not exist before, as if an illusion occurred, so it is called phantom read.

The difference between unrepeatable and phantom reads:

Non-repeatable reads focus on modifying such as reading a record many times and finding that the values of some columns have been changed. Phantom reads focus on adding or deleting such as reading a record many times and finding that the number of records has increased or decreased.

What are the transaction isolation levels?

The SQL standard defines four isolation levels:

  • Read-uncommitted: The lowest isolation level that allows UNCOMMITTED data changes to be READ, potentially resulting in dirty, illusory, or unrepeatable reads.
  • Read-committed: Allows concurrent transactions to READ data that has been COMMITTED, preventing dirty reads, but magic or unrepeatable reads can still occur.
  • REPEATABLE-READ: Multiple reads of the same field are consistent, unless the data is modified by the transaction itself. This can prevent dirty reads and unrepeatable reads, but phantom reads are still possible.
  • SERIALIZABLE: Highest isolation level, fully subject to ACID isolation level. All transactions are executed one by one so that interference between transactions is completely impossible. That is, this level prevents dirty reads, unrepeatable reads, and phantom reads.

Isolation level Dirty read Unrepeatable read Phantom read
READ-UNCOMMITTED Square root Square root Square root
READ-COMMITTED x Square root Square root
REPEATABLE-READ x x Square root
SERIALIZABLE x x x

What is the default isolation level for MySQL?

The default isolation level supported by MySQL InnoDB storage engine is REPEATABLE-READ. We can do this by SELECT @@tx_isolation; MySQL 8.0 SELECT @@transaction_isolation;

mysql> SELECT @@tx_isolation;
+-----------------+
| @@tx_isolation  |
+-----------------+
| REPEATABLE-READ |
+-----------------+
Copy the code

Note here that the difference with SQL standard is that InnoDB storage engine inREPEATABLE-READThe transaction isolation level uses a next-key Lock locking algorithm, which avoids phantom reads, unlike other database systems, such as SQL Server. So the isolation level InnoDB storage engine supports by default isREPEATABLE-READThe isolation requirements of transactions are fully guaranteed, that is, the SQL standardSERIALIZABLEIsolation level.

🐛 error: REPEATABLE READ of MySQL InnoDB is not guaranteed to avoid phantom READ. Lock READ is required. The mechanism used for this locking is next-key Locks.

Most database systems have read-committed isolation because the lower the isolation level is, the less locks there are on transaction requests, but remember that InnoDB storage engine uses REPEATABLE READ by default without any performance penalty.

The InnoDB storage engine typically uses the SERIALIZABLE isolation level for distributed transactions.

🌈 (excerpt from Chapter 7.7 of MySQL Tech Insider: InnoDB Storage Engine (2nd edition)) :

InnoDB storage engine provides XA transaction support and distributed transaction implementation through XA transaction. Distributed transactions allow multiple independent transactional resources to participate in a global transaction. Transactional resources are typically relational database systems, but can be other types of resources. A global transaction requires all participating transactions to either commit or roll back, which increases the ACID requirement for a transaction. Also, when using distributed transactions, the transaction isolation level of the InnoDB storage engine must be set to SERIALIZABLE.

Afterword.

Finally, I recommend a great open source project for Java tutorial classes: JavaGuide. I started JavaGuide in my junior year of college when I started preparing for my fall job interview. At present, this project has 100K + STAR, related reading: 1049 days, 100K! Simple replay! .

It’s great for learning Java and preparing for your Java interview! As the author says, this is a Java learning + interview guide that covers the core knowledge most Java programmers need to master!

Related recommendations:

  • Graphic Computer fundamentals!
  • Ali ACM masters open source study notes! TQL!
  • Computer quality books search + learning route recommended!

I’m Guide, embrace open source and love to cook. Author of JavaGuide, Github: Snailclimb-Overview. In the next few years, I hope to continue to improve JavaGuide, and strive to help more people learn Java! ‘! 凎! Click here for my 2020 work Report!

reference

  • High Performance MySQL

  • www.omnisci.com/technical-g…