Foreword
For a programmer, databases are required knowledge, and MySQL is undoubtedly one of the most widely used. System stability, efficiency, and the ability to handle high concurrency depend largely on database performance, so it is no surprise that database tuning comes up in almost every interview.
That is the main reason I decided to write this series of articles on MySQL performance optimization. I hope the series helps you master the skills and techniques of MySQL performance optimization systematically and comprehensively. The series will be updated continuously; if you think it may be useful to you now or in the future, please bookmark it.
Before optimizing MySQL performance, it is worth revisiting how MySQL works, which makes the topics involved in optimization easier to understand. This article, the first in the series, focuses on MySQL's architecture and core concepts.
MySQL logical architecture
Before delving into MySQL, it is necessary to understand the logical architecture of MySQL. The logical architecture diagram is as follows:
The logical architecture of MySQL is divided into three layers, as shown in the red dotted box in the figure above.
The top layer is not unique to MySQL. Most client/server systems and services have a similar layer, which handles connections, authorization, security control, and so on.
The second layer is the core of MySQL. It includes query parsing, analysis, optimization, caching, and all built-in functions (such as date and time functions). All functionality that spans storage engines is implemented in this layer, such as stored procedures, triggers, and views.
The third layer is the storage engines. Storage engines are responsible for storing and retrieving data in MySQL. Like file systems in Linux, different storage engines have their own advantages and disadvantages, and you can use different engines in different scenarios. Storage engines do not communicate with each other; they simply respond to requests from the upper layer.
How to control highly concurrent reads and writes?
Highly concurrent reads and writes are very common for a database. When the same record is modified and queried at the same time, concurrency control problems arise, and handling them improperly produces a large amount of dirty data. So how do you control highly concurrent read and write operations?
1. Read and write locks
In any programming language, locking is the classic way to solve concurrency control problems, and MySQL is no exception. MySQL handles highly concurrent reads and writes with two types of locks, commonly called shared locks and exclusive locks, also known as read locks and write locks.
Read locks are shared, that is, they do not block each other. Multiple requests can read the same record at the same time without interfering with each other.
Write locks are exclusive. A write lock blocks other write locks and read locks, so no other request can read or write the data while it is being written. This keeps the data accurate and clean. In a database, locking happens all the time: when one request changes data, MySQL uses locks to prevent other requests from reading the same data.
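As a minimal sketch of the two lock modes, assuming an InnoDB table, the statements below request a shared lock and an exclusive lock explicitly. The `account` table, its columns, and the two-session layout are hypothetical and not part of the original article. `LOCK IN SHARE MODE` and `FOR UPDATE` are InnoDB locking reads, and they hold their locks only inside a transaction.

```sql
-- Hypothetical demo table; the name and columns are assumptions for illustration.
CREATE TABLE account (
  id      INT PRIMARY KEY,
  balance DECIMAL(10, 2) NOT NULL
) ENGINE = InnoDB;

INSERT INTO account (id, balance) VALUES (1, 100.00);

-- Session A: take a shared (read) lock on the row.
START TRANSACTION;
SELECT balance FROM account WHERE id = 1 LOCK IN SHARE MODE;

-- Session B (a second connection): another shared lock on the same row
-- is granted at once, because read locks do not block each other.
START TRANSACTION;
SELECT balance FROM account WHERE id = 1 LOCK IN SHARE MODE;

-- Session B: an exclusive (write) lock request now waits until session A
-- commits or rolls back, because it conflicts with A's shared lock.
SELECT balance FROM account WHERE id = 1 FOR UPDATE;

-- Session A: releasing the shared lock lets session B proceed.
COMMIT;

-- Session B: finish its transaction as well.
COMMIT;
```

Under InnoDB's default isolation level, a plain SELECT is normally served without taking row locks, which is why the locking clauses are spelled out here.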
2. Lock strategy
With a locking mechanism in place, highly concurrent reads and writes can be controlled. Locks also have a scope, and choosing that scope is the harder part. Ideally, lock only the data that needs to be modified rather than all of the data: the more precise the locking range, the less data is locked and the higher the concurrency of the system. Locking itself consumes resources, however. Acquiring a lock, checking whether it is free, and releasing it all add overhead, so a lock strategy is a trade-off between lock overhead and concurrency.
This choice is known as lock granularity. MySQL provides a variety of options, and each MySQL storage engine can implement its own lock strategy and lock granularity. Here are the two most common locking strategies.
2.1 Table Lock
Table-level locking is the most basic locking strategy in MySQL and has the lowest overhead. It locks the entire table. Before a request can write to the table (insert, update, or delete), it must obtain a write lock, which blocks all reads and writes on the table by other requests. Other requests can obtain a read lock only when no write lock is held. Read locks do not block each other.
Although storage engines can manage their own locks, MySQL itself also uses table-level locks for various purposes. For example, the server uses a table-level lock for statements such as ALTER TABLE, regardless of the storage engine's locking mechanism.
Characteristics: low overhead and fast locking; no deadlocks; coarse lock granularity, so the probability of lock conflicts is highest and concurrency is lowest.
Table-level locking is better suited to query-heavy applications in which only a small amount of data is updated through indexed conditions.
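The read and write table locks described above can also be requested by hand with `LOCK TABLES`. The following is only a sketch: the `page_view` table is hypothetical, and MyISAM is chosen here simply because it relies on table-level locking.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE page_view (
  id  INT PRIMARY KEY,
  url VARCHAR(255) NOT NULL
) ENGINE = MyISAM;

-- Session A: a READ lock lets every session read, but nobody can write.
LOCK TABLES page_view READ;
SELECT COUNT(*) FROM page_view;   -- allowed
-- An INSERT from session A would fail here with an error, and an INSERT
-- from another session would wait until the lock is released.
UNLOCK TABLES;

-- Session A: a WRITE lock gives this session exclusive access;
-- reads and writes from other sessions wait until UNLOCK TABLES.
LOCK TABLES page_view WRITE;
INSERT INTO page_view (id, url) VALUES (1, '/index');
UNLOCK TABLES;
```

In normal operation the engine takes these locks implicitly for each statement; the explicit form just makes the blocking behaviour easy to observe from two connections.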
2.2 Row Lock
Row-level locking maximizes concurrent processing (while incurring the maximum lock overhead).
Characteristics: high overhead and slow locking; deadlocks can occur; the finest lock granularity, so the probability of lock conflicts is lowest and concurrency is highest.
Row-level locking is better suited to applications that concurrently update a large number of different rows through indexed conditions while also serving concurrent queries, such as online transaction processing (OLTP) systems.
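To illustrate why row-level locking allows more concurrency, here is a small sketch with two sessions updating different rows of the same InnoDB table. The `orders` table and its data are hypothetical.

```sql
-- Hypothetical table; `id` is the primary key, so each UPDATE below
-- locates its row through an index.
CREATE TABLE orders (
  id     INT PRIMARY KEY,
  status VARCHAR(20) NOT NULL
) ENGINE = InnoDB;

INSERT INTO orders (id, status) VALUES (1, 'new'), (2, 'new');

-- Session A: locks only the row with id = 1.
START TRANSACTION;
UPDATE orders SET status = 'paid' WHERE id = 1;

-- Session B: touches a different row, so it does not wait for session A.
START TRANSACTION;
UPDATE orders SET status = 'shipped' WHERE id = 2;
COMMIT;

-- Session A: finish the first transaction.
COMMIT;
```

If the WHERE clause used an unindexed column such as `status` instead, InnoDB would have to scan, and lock, many more rows, which is one reason the text stresses updates "by index".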
How does MySQL storage engine work?
In the file system, MySQL stores each database (i.e., schema) as a subdirectory of the data directory. When a table is created, MySQL creates a .frm file with the same name as the table in that database's subdirectory to store the table definition. (This applies to MySQL 5.7 and earlier; MySQL 8.0 replaced .frm files with a transactional data dictionary.)
Different storage engines store data and indexes differently, but table definitions are handled uniformly at the MySQL service layer.
You can use the show table status like 'table name' \G command to check a table's storage engine and other information about it. For example, to check the user table in the mysql database:
mysql> use mysql;
No connection. Trying to reconnect...
Connection id: 20587
Current database: *** NONE ***
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show table status like 'user' \G
*************************** 1. row ***************************
           Name: user
         Engine: MyISAM
        Version: 10
     Row_format: Dynamic
           Rows: 3
 Avg_row_length: 125
    Data_length: 512
Max_data_length: 281474976710655
   Index_length: 4096
      Data_free: 136
 Auto_increment: NULL
    Create_time: 2019-07-12 14:45:17
    Update_time: 2019-12-20 15:55:44
     Check_time: NULL
      Collation: utf8_bin
       Checksum: NULL
 Create_options:
        Comment: Users and global privileges
1 row in set (0.00 sec)
The Engine field in the result shows that the storage engine of the user table is MyISAM.
Before getting to know the individual storage engines, you can use the show engines command to see which storage engines your MySQL server supports. MySQL 5.7.25, for example, lists the following:
InnoDB, MRG_MYISAM, MEMORY, BLACKHOLE, MyISAM, CSV, ARCHIVE, PERFORMANCE_SCHEMA, FEDERATED.
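Both checks can be run as ordinary statements. The second query is an alternative to `SHOW TABLE STATUS` that reads the same information from `information_schema`; it is offered here as a convenience, not as something the original article used.

```sql
-- List every storage engine known to the server; the Support column
-- shows YES, NO, DEFAULT, or DISABLED for each one.
SHOW ENGINES;

-- Look up the engine of a single table through information_schema,
-- here the same mysql.user table that was inspected above.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mysql'
  AND TABLE_NAME = 'user';
```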
This article only focuses on InnoDB and MyISAM, two of the most common storage engines. Other storage engines are only briefly explained. For details, please refer to the official documentation.
1. InnoDB storage engine
InnoDB is MySQL's default transactional engine and its most important and widely used storage engine. It provides row-level locking and foreign key constraints.
It is designed to handle a large number of short-lived transactions, most of which commit normally and are rarely rolled back. InnoDB's performance and automatic crash recovery make it popular for non-transactional storage needs as well. Unless there is a very specific reason to use another storage engine, InnoDB should be preferred.
InnoDB applies to the following scenarios/features:
- Well suited to frequently updated tables that must handle many concurrent update requests.
- Supports transactions.
- Can recover from failures (through the redo log, the binlog, and so on).
- Supports foreign key constraints. Of the engines covered in this article, only InnoDB supports foreign keys.
- Supports auto-increment columns through the AUTO_INCREMENT attribute.
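The features listed above can be seen together in a short sketch. The `customer` and `customer_order` tables are hypothetical names chosen for illustration.

```sql
-- Hypothetical parent/child tables; names and columns are assumptions.
CREATE TABLE customer (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50) NOT NULL
) ENGINE = InnoDB;

CREATE TABLE customer_order (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  amount      DECIMAL(10, 2) NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES customer (id)
) ENGINE = InnoDB;

-- Both inserts belong to one transaction, so ROLLBACK undoes them together.
START TRANSACTION;
INSERT INTO customer (name) VALUES ('Alice');
INSERT INTO customer_order (customer_id, amount)
VALUES (LAST_INSERT_ID(), 19.90);
ROLLBACK;

-- Both tables are empty again after the rollback.
SELECT COUNT(*) FROM customer;
SELECT COUNT(*) FROM customer_order;

-- This insert is rejected by the foreign key: no customer has id 999.
INSERT INTO customer_order (customer_id, amount) VALUES (999, 5.00);
```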
2. MyISAM storage engine
MyISAM provides a number of features, including full-text indexing and compression, but it does not support transactions or row-level locking; it supports only table-level locking. MyISAM can still be a reasonable choice for read-only data, or for tables small enough that repair operations are tolerable.
MyISAM applies to the following scenarios/features:
- Transactions are not supported by design. That does not mean a project with transactional operations cannot use MyISAM; the necessary control can be implemented at the application level according to business needs.
- Table designs with foreign keys are not supported.
- Queries are fast. Because every write locks the whole table, it fits best where INSERT and UPDATE operations are comparatively rare.
- Scenarios in which tables are locked for long periods.
- MyISAM places a strong emphasis on fast reads.
- MyISAM stores the number of rows in the table, so SELECT COUNT(*) FROM table only needs to read the saved value and does not require a full table scan. If reads on the table far outnumber writes and no transaction support is required, MyISAM is also a good choice.
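The stored row count mentioned in the last item applies only to a COUNT(*) without conditions. A minimal sketch, using a hypothetical `article` table:

```sql
-- Hypothetical read-mostly table stored with MyISAM.
CREATE TABLE article (
  id    INT PRIMARY KEY,
  title VARCHAR(100) NOT NULL
) ENGINE = MyISAM;

INSERT INTO article (id, title) VALUES (1, 'first'), (2, 'second');

-- Without a WHERE clause, MyISAM can answer from its stored row count.
SELECT COUNT(*) FROM article;

-- With a WHERE clause the stored count no longer applies,
-- and the matching rows have to be counted.
SELECT COUNT(*) FROM article WHERE title LIKE 'f%';
```

Running EXPLAIN on each query is one way to compare how the server plans them.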
3. Other storage engines built into MySQL
MySQL also has some special-purpose storage engines that can be quite useful in particular situations. In newer versions of MySQL, some may no longer be supported, while others remain supported but must be explicitly enabled before use.
3.1 Archive Storage engine
The Archive engine supports only INSERT and SELECT operations, and prior to MySQL 5.1 it did not even support indexes.
The Archive engine buffers all writes and compresses inserted rows with zlib, so it needs less disk I/O than the MyISAM engine. However, every SELECT query requires a full table scan, so Archive is better suited to logging and data collection applications, where analysis often requires a full table scan anyway.
The Archive engine supports row-level locking and a dedicated buffer, so highly concurrent inserts are possible. To provide consistent reads, a SELECT stops once it has returned the number of rows that existed in the table when the query began. Batch inserts are likewise invisible to reads until they complete.
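A small sketch of an Archive table used as an append-only log follows. The `access_log` table is hypothetical, and the example assumes the ARCHIVE engine is available on your server (see SHOW ENGINES).

```sql
-- Hypothetical log table; ARCHIVE must be available on the server.
CREATE TABLE access_log (
  logged_at DATETIME     NOT NULL,
  message   VARCHAR(255) NOT NULL
) ENGINE = ARCHIVE;

INSERT INTO access_log (logged_at, message)
VALUES (NOW(), 'user login'), (NOW(), 'user logout');

-- Reads are supported, but every SELECT scans the whole table.
SELECT COUNT(*) FROM access_log;

-- UPDATE and DELETE are not supported, so this statement is rejected.
DELETE FROM access_log WHERE message = 'user login';
```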
3.2 Blackhole Storage Engine
The Blackhole engine does not implement any storage mechanism: it discards all inserted data without saving anything. Strange; isn't that useless?
However, the server still records statements against Blackhole tables in its logs, so the engine can be used to replicate data to a replica or simply to record it in the log. This makes it useful for special replication architectures and for log auditing.
Even so, the purpose of such a storage engine remains a little hard to grasp.
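The discard behaviour is easy to observe. In this sketch the `audit_sink` table is a hypothetical name, and the BLACKHOLE engine is assumed to be available on the server.

```sql
-- Hypothetical table; rows written to it are accepted and thrown away.
CREATE TABLE audit_sink (
  id      INT NOT NULL,
  payload VARCHAR(255) NOT NULL
) ENGINE = BLACKHOLE;

INSERT INTO audit_sink (id, payload) VALUES (1, 'discarded');

-- Returns an empty set: the row was accepted but never stored.
SELECT * FROM audit_sink;
```

If binary logging is enabled, the INSERT is still written to the binary log, which is what makes the replication and auditing uses described above possible.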
3.3 CSV Storage Engine
The CSV engine can treat regular CSV files as MySQL tables, but these tables do not support indexes.
The CSV engine lets you copy files into or out of the database while the server is running. You can export data from a spreadsheet such as Excel as a CSV file and copy it into the MySQL data directory, and it can then be opened and used in MySQL. Similarly, if you write data into a CSV engine table, external programs can immediately read the CSV data from the table's data file.
Therefore, the CSV engine can be very useful as a mechanism for data exchange.
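As a rough sketch of the data exchange idea, the statements below create a CSV table and write two rows to it. The `city_export` table and its contents are hypothetical.

```sql
-- The CSV engine requires every column to be NOT NULL
-- and does not allow indexes. The table name is hypothetical.
CREATE TABLE city_export (
  id   INT          NOT NULL,
  name VARCHAR(100) NOT NULL
) ENGINE = CSV;

INSERT INTO city_export (id, name) VALUES (1, 'Beijing'), (2, 'Shanghai');

-- The rows now live in a plain-text file named city_export.CSV
-- inside the database's directory under the data directory.
SELECT * FROM city_export;
```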
3.4 Memory Storage Engine
The Memory engine is useful if you need fast access to data that never changes or that you can afford to lose after a restart. It can be an order of magnitude faster than MyISAM because all data is kept in memory and no disk I/O is required. The structure of a Memory table survives a restart, but its data is lost.
The Memory engine can be useful in a number of scenarios:
- Lookup or mapping tables, such as a table that maps postal codes to state names.
- Caching the results of periodically aggregated data.
- Storing intermediate data produced during data analysis.
The Memory engine supports hash indexes, so lookups are very fast. Although Memory tables are very fast, they cannot replace traditional disk-based tables. Memory tables use table-level locking, so concurrent write performance is low.
If MySQL needs a temporary table to hold intermediate results while executing a query, the internal temporary table it creates uses the Memory engine. If the intermediate result is too large for a Memory table, or contains BLOB or TEXT columns, the temporary table is converted to an on-disk table (a MyISAM table in older versions; MySQL 5.7 uses InnoDB by default).
Memory tables are often confused with temporary tables. A temporary table is a table created with the CREATE TEMPORARY TABLE statement. It can use any storage engine, so it is not the same thing as a Memory table. A temporary table is visible only within a single connection and ceases to exist when that connection closes.
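The difference between the two concepts can be sketched as follows. Both table names are hypothetical: the first statement creates a Memory table with a hash index, the second a temporary table that happens to use InnoDB.

```sql
-- A MEMORY lookup table with a hash index (hypothetical names).
CREATE TABLE zip_state (
  zip   CHAR(5)     NOT NULL,
  state VARCHAR(30) NOT NULL,
  PRIMARY KEY (zip) USING HASH
) ENGINE = MEMORY;

INSERT INTO zip_state (zip, state) VALUES ('10001', 'New York');
SELECT state FROM zip_state WHERE zip = '10001';

-- A temporary table is a separate concept: it can use any engine
-- and is visible only to the connection that created it.
CREATE TEMPORARY TABLE tmp_report (
  id    INT PRIMARY KEY,
  total DECIMAL(10, 2) NOT NULL
) ENGINE = InnoDB;

INSERT INTO tmp_report (id, total) VALUES (1, 42.00);

-- Dropped automatically when the connection closes; explicit drop shown here.
DROP TEMPORARY TABLE tmp_report;
```

After a server restart, `zip_state` would still exist but would be empty, whereas `tmp_report` disappears as soon as its connection ends.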
For more information about temporary tables and Memory engines, see MySQL · Engine features · temporary tables.
There are many more storage engines, both built into MySQL and from third parties. I will not introduce them one by one here; if needed, I will cover them in a later article.
4. How to choose the right storage engine
There are so many storage engines. How do we choose?
In most cases, InnoDB is the right choice, which is why Oracle finally made InnoDB the default storage engine in MySQL 5.5.
How to choose the right storage engine can be summed up in one sentence: "Unless you need a feature that InnoDB does not have and there is no alternative, choose the InnoDB engine first."
For example, if you need full-text search, it is better to use InnoDB plus Sphinx than to choose MyISAM just because it supports full-text indexes. Of course, if you do not need InnoDB's features and another engine meets your needs better, you can consider that engine.
It is recommended not to mix storage engines unless absolutely necessary; mixing them can lead to a host of complex problems, as well as potential bugs and edge cases.
If you need to use different storage engines, consider the following factors.
- Transactions
- Backups
- Crash recovery
- Unique features
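If, after weighing these factors, you decide to move an existing table to InnoDB, one possible approach is `ALTER TABLE ... ENGINE`. The sketch below uses a hypothetical `legacy_note` table and is only an illustration, not a migration procedure recommended by the original article.

```sql
-- Show which engine new tables get when none is specified.
SELECT @@default_storage_engine;

-- Hypothetical legacy table created with MyISAM.
CREATE TABLE legacy_note (
  id   INT PRIMARY KEY,
  body VARCHAR(255) NOT NULL
) ENGINE = MyISAM;

-- Convert it to InnoDB. The table is rebuilt, which can take a long
-- time on large tables, so test on a copy first.
ALTER TABLE legacy_note ENGINE = InnoDB;

-- Confirm the change.
SELECT ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'legacy_note';
```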