This article is a brief rundown of what I know about MySQL, drawn from interview experience and recent review. The points covered are somewhat scattered and of moderate depth; I will expand on some of them in later posts.
1. Storage Engines
A storage engine is the component that stores data, builds indexes, and updates and queries data. In MySQL, storage engines operate at the table level, not the database level, so a storage engine is also called a table type.
Databases such as Oracle and SQL Server have a single storage engine. MySQL, by contrast, provides a pluggable storage-engine architecture, so it ships with several engines, and you can choose among them or even write your own as needed.
To list the storage engines supported by the current server, run SHOW ENGINES:
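```sql
SHOW ENGINES;  -- lists each engine with its comment and whether it supports transactions, XA, and savepoints
```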
The following describes several commonly used storage engines and compares their differences.
| Feature | InnoDB | MyISAM | MEMORY | MERGE | NDB |
|---|---|---|---|---|---|
| Storage limit | 64TB | Yes | Yes | No | Yes |
| Transaction safety | Supported | | | | |
| Locking granularity | Row locks (good for high concurrency) | Table locks | Table locks | Table locks | Row locks |
| B-tree indexes | Supported | Supported | Supported | Supported | Supported |
| Hash indexes | | | Supported | | |
| Full-text indexes | Supported (since 5.6) | Supported | | | |
| Clustered indexes | Supported | | | | |
| Data caching | Supported | | Supported | | Supported |
| Index caching | Supported | Supported | Supported | Supported | Supported |
| Compressed data | | Supported | | | |
| Disk space usage | High | Low | N/A | Low | Low |
| Memory usage | High | Low | Medium | Low | High |
| Bulk insert speed | Low | High | High | High | High |
| Foreign key support | Supported | | | | |
Here we will focus on the two most commonly used storage engines: InnoDB and MyISAM.
1.1 InnoDB
InnoDB is the storage engine we deal with most, and it has been MySQL's default since 5.5. You can check the default engine with `show variables like '%storage_engine%';`. Compared with the older ISAM and MyISAM engines, InnoDB's biggest feature is support for ACID-compliant transactions; its default isolation level is REPEATABLE READ, and it prevents phantom reads through MVCC plus next-key locking. We only consider other storage engines when some InnoDB feature fails to meet the needs of a particular scenario. Compared with MyISAM, InnoDB writes less efficiently and takes more disk space to hold data and indexes. Use `start transaction` and `commit` to begin and commit a transaction, and `rollback` to roll one back.
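A minimal sketch of explicit transaction control; the `account` table and its rows are hypothetical, for illustration only:

```sql
-- Hypothetical account table for the classic transfer example
start transaction;
update account set balance = balance - 100 where id = 1;
update account set balance = balance + 100 where id = 2;
commit;    -- both updates become durable together; use rollback; instead to undo both
```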
InnoDB is the only MySQL storage engine that supports foreign keys. When a foreign key is created, the parent table must have a corresponding index, and the child table automatically creates one. That said, foreign keys and cascading updates suit low-concurrency, single-machine setups; they are a poor fit for distributed, high-concurrency clusters. Cascading updates block heavily and risk update storms in the database, and foreign keys slow down inserts. For these reasons the Alibaba Java Development Manual explicitly forbids cascades and foreign keys, requiring that all foreign-key semantics be handled in the application layer.
InnoDB also supports true online hot backup; MySQL's other storage engines do not. With them, obtaining a consistent view requires stopping all writes to the tables, and in a mixed read/write workload stopping writes often means stopping reads too.
1.2 MyISAM
MyISAM supports neither transactions nor foreign keys. Its advantage is fast access; applications with no requirement for transactional integrity, or that are mainly SELECT/INSERT based, can use this engine. Each MyISAM table is stored on disk as three files whose names match the table name but whose extensions differ: `.frm` (table definition), `.MYD` (MYData, the data), and `.MYI` (MYIndex, the indexes).
MyISAM does not support row-level locking; only whole tables can be locked. It takes shared locks on all tables it reads and exclusive locks on all tables it writes. However, new records can still be inserted into a table while it is being read, which is called a CONCURRENT INSERT.
1.3 Comparison
- Transactions: InnoDB is transactional and can use Commit and Rollback statements.
- Concurrency: MyISAM only supports table level locking, while InnoDB also supports row level locking.
- Foreign keys: InnoDB supports foreign keys.
- Backup: InnoDB supports online hot backup.
- Crash recovery: MyISAM is far more likely than InnoDB to be corrupted after a crash, and it recovers more slowly.
- Other features: MyISAM supports compressed tables and spatial data indexes.
2. Transactions
2.1 The Four ACID Properties
- Atomicity: a transaction is the smallest indivisible unit of work; its operations either all succeed or all fail.
- Consistency: a transaction moves the database from one consistent state to another; for example, the total amount of data (such as the sum of balances in a transfer) is the same before and after.
- Isolation: concurrent transactions are independent of one another.
- Durability: once a transaction commits, its changes are stored permanently, even across crashes.
2.2 Isolation Level
Transaction isolation levels: transactions are isolated and independent of one another, but when several transactions operate on the same batch of data, problems can arise; these are addressed by choosing an isolation level. The problems concurrent operations can cause are:
Dirty read: a transaction reads data that another transaction has not yet committed.
Non-repeatable read (also called virtual read): within one transaction, reading the same row twice returns different data, because another transaction modified it in between.
Phantom read: within one transaction (the same read view), the same range query run twice returns rows the first run did not see; this usually involves inserts.
Isolation levels:

- Read uncommitted: permits dirty reads, non-repeatable reads, and phantom reads.
- Read committed: prevents dirty reads; non-repeatable reads and phantom reads remain possible (Oracle's default).
- Repeatable read: prevents dirty reads and non-repeatable reads; phantom reads remain possible (MySQL's default).
- Serializable: prevents all three.
To view the current isolation level: `select @@tx_isolation;`
To set it: `set session transaction isolation level <level>;`
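For example (note that `@@tx_isolation` works up to MySQL 5.7; MySQL 8.0 renamed the variable to `@@transaction_isolation`):

```sql
select @@tx_isolation;                                   -- e.g. REPEATABLE-READ
set session transaction isolation level read committed;  -- affects the current session only
select @@tx_isolation;                                   -- now READ-COMMITTED
```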
3. Indexes
Indexes are an unavoidable topic in MySQL and have a huge effect on database efficiency. An index is an (ordered) data structure that helps MySQL retrieve data efficiently. Alongside the data itself, the database maintains data structures that reference (point to) the data and satisfy particular lookup algorithms, making advanced lookups possible; these structures are the indexes.
3.1 Disk I/O Background
Disk I/O: reading data from a mechanical disk takes three steps: seeking, rotational latency, and the copy into memory. Seek time is what the arm needs to reach the specified track, usually under 5 ms. Rotational latency averages half a revolution; on a 7200 RPM disk that is 60000/7200/2 ≈ 4.17 ms. The copy into memory is negligible next to the other two, so an average I/O costs around 9 ms. That sounds fast, but for a table with millions of rows, millions of such I/Os add up to on the order of 9000 seconds, which is a disaster.
Because disk I/O is so expensive, operating systems optimize with read-ahead: each I/O reads not only the data at the requested address but also the adjacent data into the memory buffer, since once a program accesses one address, the neighboring data will likely be accessed soon.
The data read by one such I/O is called a page; its exact size depends on the operating system, usually 4 KB or 8 KB, and reading one page costs exactly one I/O. So to optimize database queries we want to minimize disk I/O, and that is where indexes come in.
3.2 Index Advantages and Disadvantages
Advantages
- Like a book's table of contents, an index improves data-retrieval efficiency and reduces the database's I/O cost.
- Sorting by an indexed column reduces the cost of sorting and the CPU it consumes.
Disadvantages
- An index is effectively a table of its own that stores the primary key and indexed fields and points at the actual records, so indexed columns also take up space.
- Although indexes greatly speed up queries, they slow down writes such as INSERT, UPDATE, and DELETE, because MySQL must update the index file every time a column belonging to an index changes.
3.3 Index Structure
Indexing is implemented in MySQL's storage-engine layer, not in the server layer, so not every engine supports every index type. InnoDB uses B+ tree indexes.
The B+ tree is a variant of the B tree, so we first introduce B trees.
A BTree is also called a multiway balanced search tree. An m-way BTree has the following properties:
- Each node in the tree has at most m children.
- Each node other than the root and the leaves has at least `ceil(m/2)` children.
- If the root is not a leaf, it has at least two children.
- All leaf nodes are on the same level.
- Each non-leaf node holds n keys and n+1 pointers, where `ceil(m/2)-1 <= n <= m-1`.
Compared with a binary tree, a B tree is more efficient for lookups, because for the same amount of data a BTree is much shallower, so a search touches fewer levels and is faster.
B+Tree structure
B+Tree is a variant of BTree. The differences between B+Tree and BTree are as follows:
- A B+Tree node with n children holds n keys, whereas a BTree node with n children holds n-1 keys.
- The leaf nodes of a B+Tree store all key information in order of key size.
- All non-leaf nodes can be regarded as index levels over the keys.
In InnoDB's B+ tree, each leaf node also has a pointer to the adjacent leaf, which speeds up access, especially range queries. In addition:
- Since only the leaf nodes of a B+Tree store the actual data, every lookup must walk from root to leaf, so query cost is very stable.
- In a B tree, each node stores data values alongside keys. A page's storage is limited, so if rows are large, each node (one page) holds few keys, the tree gets deeper, and queries need more disk I/O, hurting performance.
3.4 Index Classification
- Single-column index: an index over a single column; a table can have multiple single-column indexes.
- Unique index: the indexed values must be unique, though NULLs are allowed.
- Composite index: an index that covers multiple columns.
3.5 Basic Index Operations
- Create: `CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name [USING index_type] ON table_name (index_col_name, ...);`
- View: `SHOW INDEX FROM table_name;`
- Alter:
  - `ALTER TABLE table_name ADD PRIMARY KEY (column_list);` adds a primary key: the values must be unique and cannot be NULL.
  - `ALTER TABLE table_name ADD UNIQUE index_name (column_list);` creates a unique index: the values must be unique (NULL may appear multiple times).
  - `ALTER TABLE table_name ADD INDEX index_name (column_list);` adds a normal index: values may repeat.
  - `ALTER TABLE table_name ADD FULLTEXT index_name (column_list);` creates a FULLTEXT index for full-text search.
- Drop: `DROP INDEX index_name ON table_name;`
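A concrete sketch of these operations; the `user` table and index names are hypothetical:

```sql
CREATE TABLE user (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  name  VARCHAR(50),
  email VARCHAR(100)
);

CREATE UNIQUE INDEX idx_user_email ON user (email);   -- unique index
ALTER TABLE user ADD INDEX idx_user_name (name);      -- normal index
SHOW INDEX FROM user;                                 -- inspect the table's indexes
DROP INDEX idx_user_name ON user;                     -- remove an index
```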
3.6 Design Principles
- MySQL limits a single index to 16 columns, and an InnoDB table to at most 64 secondary indexes.
- Create indexes for tables that are queried frequently and contain large amounts of data.
- When choosing index columns, extract the best candidates from the conditions in the WHERE clause; if the WHERE clause offers many combinations, pick the combination of the most frequently used columns with the best filtering effect.
- The more distinct the values of an indexed column (unique indexes being the extreme), the more selective the index and the more efficient its use.
- Indexes improve query efficiency, but more is not always better: each additional index raises maintenance costs. For tables with frequent DML operations (insert, update, delete), too many indexes slow those operations down and lengthen them. Too many indexes also make the optimizer's choice harder; it will eventually find a usable index, but the selection itself becomes more expensive.
- Prefer short indexes. Indexes are stored on disk as well, so when the combined length of the indexed fields is short, a storage block of a given size can hold more index entries, which effectively improves the I/O efficiency of index access and with it the overall access efficiency.
- If the WHERE clause uses the leading columns of a composite index, the query can use that index to improve efficiency.
4. Locks
4.1 Lock Classification
By the granularity of the data they cover:
1) Table locks: the whole table is locked for the duration of the operation.
2) Row locks: only the current row is locked.
By the type of operation:
1) Read locks (shared locks): multiple reads of the same data can proceed simultaneously without interfering.
2) Write locks (exclusive locks): block all other write and read locks until the current operation completes.
Compared with other databases, the locking mechanism of MySQL is relatively simple. The most significant feature of MySQL is that different storage engines support different locking mechanisms. The following table lists the lock support by storage engine:
| Storage engine | Table-level locks | Row-level locks | Page-level locks |
|---|---|---|---|
| MyISAM | Supported | Not supported | Not supported |
| InnoDB | Supported | Supported | Not supported |
| MEMORY | Supported | Not supported | Not supported |
| BDB | Supported | Not supported | Supported |
4.2 Acquiring and Releasing Locks
Before executing a SELECT, MyISAM automatically read-locks every table involved; before executing a write operation (UPDATE, DELETE, INSERT, etc.) it automatically write-locks the tables involved. This needs no user intervention, so you normally do not have to lock MyISAM tables explicitly with LOCK TABLE.
Explicit table-lock syntax:
Read lock: `lock table table_name read;` Write lock: `lock table table_name write;`
Read locks block writes, but not reads. Write locks, on the other hand, block both reads and writes.
- Shared lock (S): also called a read lock. Multiple transactions can share an S lock on the same data; they can all read it but none can modify it.
- Exclusive lock (X): also called a write lock. An exclusive lock cannot coexist with any other lock: once a transaction holds an X lock on a row, no other transaction can take any lock on that row, shared or exclusive. The holder of the X lock can both read and modify the data.
For UPDATE, DELETE, and INSERT statements, InnoDB automatically takes an exclusive lock (X) on the rows involved. For ordinary SELECT statements InnoDB takes no locks at all; you can explicitly request a shared or exclusive lock on a result set:
Shared lock (S): `SELECT * FROM table_name WHERE ... LOCK IN SHARE MODE;` Exclusive lock (X): `SELECT * FROM table_name WHERE ... FOR UPDATE;`
If the rows are not located through an index, InnoDB locks every record in the table, which in effect behaves like a table lock.
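A sketch of that pitfall, assuming a hypothetical table `t_lock(id int primary key, name varchar(10))` with no index on `name`:

```sql
-- Session 1: name has no index, so InnoDB locks every row, not just name='a'
begin;
select * from t_lock where name = 'a' for update;

-- Session 2: blocks until session 1 commits, even though it touches a different row
update t_lock set name = 'x' where id = 2;
```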
5. SQL Optimization
5.1 Using Indexes to Improve Query Efficiency
Avoid invalidating indexes in your queries. Suggestions for index use, and situations where indexes stop being used:
- Use covering indexes where possible, avoid SELECT *, and match the full value, specifying values for all columns in the index.
- Leftmost-prefix rule: with a multi-column index, the query must start from the leftmost column of the index and must not skip columns. If a middle column is skipped, only the left portion of the index is used.
- Do not apply operations or functions to an indexed column; the index will not be used.
- Quote string literals: an unquoted string triggers an implicit type conversion, which invalidates the index.
- Conditions joined with OR: if the column before the OR is indexed but the column after it is not, none of the involved indexes is used.
- LIKE patterns that start with % invalidate the index.
- IN can use the index; NOT IN cannot.
- If MySQL estimates that using an index would be slower than a full table scan, the index is not used.
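A few of these cases, sketched against a hypothetical table `t_user` with a composite index on `(name, age)`:

```sql
EXPLAIN SELECT * FROM t_user WHERE name = 'xie' AND age = 20;     -- composite index used
EXPLAIN SELECT * FROM t_user WHERE age = 20;                      -- leftmost column skipped: index unused
EXPLAIN SELECT * FROM t_user WHERE SUBSTRING(name, 1, 3) = 'xie'; -- function on indexed column: index unused
EXPLAIN SELECT * FROM t_user WHERE name = 123;                    -- implicit cast from number: index unused
EXPLAIN SELECT * FROM t_user WHERE name LIKE '%ie';               -- leading %: index unused
EXPLAIN SELECT * FROM t_user WHERE name LIKE 'xi%';               -- no leading %: index usable
```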
5.2 Basic Steps and Logic of SQL Optimization
5.2.1 Viewing the SQL Execution Frequency
Once the MySQL client is connected, the `show [session|global] status` command provides server status information. Add `session` or `global` as needed to see session-level counts (the current connection) or global counts (since the server last started); if neither is given, `session` is the default.
`Com_xxx` counters record how many times each xxx statement has been executed; `Innodb_xxx` counters are InnoDB-specific.
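For example (the seven-underscore LIKE pattern matches the common Com_ counters whose suffix is seven characters long, such as Com_select):

```sql
show global status like 'Com_______';     -- Com_select, Com_insert, Com_update, Com_delete, ...
show global status like 'Innodb_rows_%';  -- Innodb_rows_read, Innodb_rows_inserted, ...
```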
5.2.2 Finding Inefficient SQL
You can locate inefficient SQL statements in either of two ways:
- Slow query log: started with the `--log-slow-queries[=file_name]` option, mysqld writes a log file containing every SQL statement whose execution time exceeds `long_query_time` seconds. Use it to find the queries that execute inefficiently.
- `show processlist`: the slow query log is written only after a query finishes, so it cannot reveal problems while they are happening. The `show processlist` command shows the current MySQL threads, including each thread's state and whether it is waiting on a table lock, letting you watch SQL execution in real time and tune operations such as table locking as they occur.
Column meanings:
- Id: the connection_id the server assigned when the user connected; the same value returned by the connection_id() function.
- User: the current user; unless you are root, you only see the statements within your own privileges.
- Host: the IP and port the statement was issued from; useful for tracking down the user behind a problem statement.
- Db: the database this thread is currently connected to.
- Command: what the connection is doing right now, e.g. sleep, query, or connect.
- Time: how long the thread has been in its current state, in seconds.
- State: the most important column; the state of the statement on this connection. A single statement such as a query may pass through states like copying to tmp table, sorting result, and sending data.
- Info: the SQL statement itself; a key basis for judging a problem statement.
5.2.3 Analyzing Execution Plans with EXPLAIN
By analyzing the execution plan you can learn the order in which tables are read, the type of each read operation, which indexes could be used, which index is actually used, the references between tables, how many rows of each table the optimizer expects to examine, and so on.
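For example, prefix any query with EXPLAIN (t_cas is the sample table used later in this article):

```sql
explain select * from t_cas where id = 1;
-- one row per table in the plan; key columns:
-- id | select_type | table | type | possible_keys | key | key_len | rows | Extra
```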
Field parsing:
- Id: the sequence number of the SELECT; a group of numbers indicating the order in which the select clauses or table operations execute within the query.
- Select_type: the type of the SELECT. Common values:
  - SIMPLE: a simple select with no subqueries or unions.
  - PRIMARY: if the query contains any complex subquery, the outermost select is marked PRIMARY.
  - SUBQUERY: a subquery in the select or where list.
  - DERIVED: a subquery in the FROM list; MySQL executes it recursively and puts the result in a temporary table.
  - UNION: a second or later select after a UNION; if the UNION is contained in a FROM-clause subquery, the outer select is marked DERIVED.
  - UNION RESULT: the select that retrieves the result from the union table.
- Table: the table the output row refers to.
- Type: the access (join) type. MySQL 5.7 has as many as 14 of them; the common ones, from best to worst performance, are: system --> const --> eq_ref --> ref --> ref_or_null --> index_merge --> index_subquery --> range --> index --> all. Generally our optimization target is ref, and at minimum range.
  - NULL: MySQL returns the result without touching any table or index.
  - system: the table has exactly one row (a system table); a special case of const that you rarely encounter.
  - const: a primary key or unique index is compared with a constant; only one row matches, so it is fast. With the primary key in the WHERE list, MySQL can treat the query as a constant.
  - eq_ref: like ref, but via a unique index; a primary key or unique index scan where each lookup matches exactly one record.
  - ref: a non-unique index scan returning all rows that match a single value; an index access that may return multiple rows.
  - range: a bounded index scan, better than a full index scan; appears with where between, <, >, in, and so on.
  - index: traverses the whole index tree; it differs from all in that it reads the index rather than the data file, which is usually faster.
  - all: a full table scan over the data file to find matching rows.
- Possible_keys: indexes that could be used for the query.
- Key: the index actually used.
- Key_len: the length of the index field used.
- Rows: the estimated number of rows to scan.
- Extra: additional execution notes.
  - Using filesort: MySQL sorts the rows with an external sort instead of reading them in index order; this is called a file sort and is inefficient.
  - Using temporary: a temporary table holds intermediate results; MySQL uses one when sorting query results, often with order by and group by, and it is inefficient.
  - Using index: the select is satisfied by a covering index, avoiding access to the table's rows.
5.2.4 SHOW PROFILES Analysis
The SHOW PROFILES statement helps us see where time goes during SQL tuning. The have_profiling variable shows whether the current MySQL build supports profiling:
Profiling is off by default and can be enabled at the session level with `set profiling=1;`. Run a few statements, such as `select * from t_cas where id<5;` and `select count(*) from t_cas;`, and then `show profiles;` lists the execution time of each one.
`show profile [all | cpu | block io | context switch | page faults] for query <query_id>;` shows the state of each thread and the time spent in each stage of that statement's execution:
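Put together, a minimal profiling session looks like this:

```sql
select @@have_profiling;                  -- YES if profiling is supported
set profiling = 1;                        -- enable for this session

select * from t_cas where id < 5;
select count(*) from t_cas;

show profiles;                            -- Query_ID, Duration, Query for each statement
show profile cpu, block io for query 2;  -- per-stage breakdown of statement 2
```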
5.3 Improving Data Import Efficiency
- Insert in primary-key order: InnoDB tables are stored in primary-key order, so importing the data already sorted by primary key is noticeably faster. If an InnoDB table has no primary key, the system creates a hidden internal column as one by default, so give the table an explicit primary key and exploit this.
- Disable uniqueness checks: run `SET UNIQUE_CHECKS=0` before the import and `SET UNIQUE_CHECKS=1` afterwards to restore the checks; this improves import speed.
- Commit transactions manually: if the application uses autocommit, run `SET AUTOCOMMIT=0` before the import to turn it off and `SET AUTOCOMMIT=1` afterwards to turn it back on; this too improves import efficiency.
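Combined, a bulk-load session might look like this sketch (the data file path is hypothetical):

```sql
set unique_checks = 0;   -- skip unique checks during the load
set autocommit = 0;      -- group the load into one transaction

-- load rows already sorted by primary key, e.g.:
load data local infile '/tmp/tb_test.sorted.txt' into table tb_test;

commit;
set unique_checks = 1;   -- restore normal behavior
set autocommit = 1;
```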
5.4 Optimizing INSERT Statements
- If you must insert many rows into one table, use a single INSERT with multiple VALUES lists wherever possible. This greatly reduces the connection and statement overhead between client and server and is faster than the equivalent separate INSERT statements.

For example, instead of:

```sql
insert into tb_test values(1,'xie');
insert into tb_test values(2,'wu');
insert into tb_test values(3,'you');
```

write:

```sql
insert into tb_test values(1,'xie'),(2,'wu'),(3,'you');
```

- Insert the data inside one transaction:

```sql
start transaction;
insert into tb_test values(1,'xie');
insert into tb_test values(2,'wu');
insert into tb_test values(3,'you');
commit;
```

- Insert the data in primary-key order. Instead of:

```sql
insert into tb_test values(4,'li');
insert into tb_test values(1,'xie');
insert into tb_test values(3,'you');
insert into tb_test values(2,'wu');
```

prefer:

```sql
insert into tb_test values(1,'xie');
insert into tb_test values(2,'wu');
insert into tb_test values(3,'you');
insert into tb_test values(4,'li');
```
5.5 Optimizing ORDER BY Statements
Two kinds of sorting:

- FileSort: sorting the returned data after fetching it; any sort that does not return ordered results directly from an index is called a FileSort.
- Using index: the ordered data is returned directly by a sequential scan of an ordered index; no extra sorting is needed, and this is efficient.

Therefore, minimize extra sorting and return ordered data directly through the index: the WHERE condition and the ORDER BY should use the same index, the column order in ORDER BY should match the index, and the ORDER BY columns should be all ascending or all descending. Otherwise extra work is unavoidable and a FileSort results.
MySQL has two sorting algorithms:
1) Two-pass scan: used before MySQL 4.1. It first reads the sort keys and row pointers and sorts them in the sort buffer (spilling to a temporary table if the buffer is too small), then reads the rows back by row pointer, which can cause a large number of random I/O operations.
2) Single-pass scan: reads all the needed columns at once, sorts them in the sort buffer, and outputs the result set directly. It costs more memory but sorts more efficiently than the two-pass scan.
MySQL compares the total size of the columns to fetch with the max_length_for_sort_data variable: if the total is smaller, it uses the single-pass algorithm, otherwise the two-pass one.
You can raise sort_buffer_size and max_length_for_sort_data appropriately to enlarge the sort area and make sorting more efficient.
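Both variables are dynamic and can be inspected or raised per session:

```sql
show variables like 'sort_buffer_size';
show variables like 'max_length_for_sort_data';

set session sort_buffer_size = 1048576;       -- 1 MB sort area for this session
set session max_length_for_sort_data = 4096;  -- favor the single-pass algorithm
```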
5.6 Optimizing GROUP BY Statements
GROUP BY actually performs the same sort operation as ORDER BY; compared with ORDER BY, it mainly adds a grouping step after the sort (plus some computation if aggregate functions are used). So indexes can be used for GROUP BY just as for ORDER BY.
If a query contains GROUP BY but the user does not care about the order of the results, add `order by null` to suppress the sort: `explain select age, count(*) from user group by age order by null;` no longer needs a filesort.
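Side by side (the `user` table is the example table from the statement above; the Extra values are what one would typically see):

```sql
explain select age, count(*) from user group by age;                -- may show Using temporary; Using filesort
explain select age, count(*) from user group by age order by null; -- the filesort disappears
create index idx_user_age on user(age);                            -- better yet, index the grouped column
explain select age, count(*) from user group by age;                -- may now show Using index
```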
5.7 Optimizing OR Conditions
As mentioned in 5.1, for a query containing OR to use indexes, every column in the OR conditions must be individually indexed (composite indexes do not count); add indexes where they are missing. When possible, replace OR with UNION: `select * from user where id=1 or age=30;` becomes `select * from user where id=1 union select * from user where age=30;`.
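Assuming `user` has a primary key on `id` and a single-column index on `age`, the rewrite lets each branch use its own index:

```sql
explain select * from user where id = 1 or age = 30;  -- may use index_merge or fall back to a full scan

explain
select * from user where id = 1
union
select * from user where age = 30;                    -- each select can use its own index
```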
5.8 Optimizing Paging Queries
In general, paging queries can be sped up by creating covering indexes. The pain appears deep into the data: with something like `limit 10000000,10`, MySQL must sort the first 10,000,010 records just to return rows 10,000,000 through 10,000,010 and discard the rest, so the sort becomes extremely expensive.
Optimization 1: sort and page on the index in a subquery that returns only primary keys, then join back to fetch the full rows: `explain select * from t_cas t, (select id from t_cas order by id limit 1000000,10) a where t.id = a.id;`
Optimization 2: for tables with a self-incrementing, gap-free primary key, convert the LIMIT offset into a position on the primary key: `explain select * from t_cas where id > 10000 limit 10;`
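Both rewrites side by side, using the article's `t_cas` table:

```sql
-- Naive deep paging: sorts and discards ~1,000,000 rows
explain select * from t_cas order by id limit 1000000, 10;

-- Optimization 1: page over the index on id, then join back for the full rows
explain select *
from t_cas t, (select id from t_cas order by id limit 1000000, 10) a
where t.id = a.id;

-- Optimization 2: if ids are continuous, seek directly on the primary key
explain select * from t_cas where id > 1000000 limit 10;
```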
5.9 Using SQL Hints
- Add `use index` after the table name to suggest a list of indexes for MySQL to consider: `explain select * from t_cas use index(idx_name) where name='xie';`
- To make MySQL simply ignore one or more indexes, use `ignore index`: `explain select * from t_cas ignore index(idx_name) where name='xie';`
- To force MySQL to use a specific index, use `force index`: `explain select * from t_cas force index(idx_name) where name='xie';`
6. Cache and Concurrency Tuning
6.1 Query Cache
If the MySQL query cache is enabled, the server returns results directly from the cache when an identical SQL statement is executed again; once the underlying data is modified, the cached entries become invalid, so frequently modified tables are poor candidates for query caching. A query flows through the server as follows:
- The client sends a query to the server.
- The server first checks the query cache; on a hit it immediately returns the cached result, otherwise it moves to the next stage.
- The server parses and preprocesses the SQL, and the optimizer generates an execution plan.
- MySQL calls the storage-engine API to execute the query according to that plan.
- The result is returned to the client.
Check whether the server supports the query cache: `SHOW VARIABLES LIKE 'have_query_cache';`
Check whether it is enabled: `SHOW VARIABLES LIKE 'query_cache_type';`. The query cache is off by default and is enabled through the query_cache_type parameter, whose values are: ① OFF or 0: query cache disabled; ② ON or 1: enabled; SELECT results that meet the caching conditions are cached unless SQL_NO_CACHE is specified explicitly; ③ DEMAND or 2: caching on demand; only SELECT statements that explicitly specify SQL_CACHE are cached.
Check the cache size: `SHOW VARIABLES LIKE 'query_cache_size';`
Check the query cache status variables: `SHOW STATUS LIKE 'Qcache%';`
| Parameter | Meaning |
|---|---|
| Qcache_free_blocks | Number of free memory blocks in the query cache |
| Qcache_free_memory | Amount of free memory in the query cache |
| Qcache_hits | Number of query cache hits |
| Qcache_inserts | Number of queries added to the query cache |
| Qcache_lowmem_prunes | Number of queries evicted from the cache due to insufficient memory |
| Qcache_not_cached | Number of non-cached queries (not cacheable, or not cached due to the query_cache_type setting) |
| Qcache_queries_in_cache | Number of queries registered in the query cache |
| Qcache_total_blocks | Total number of blocks in the query cache |
Two query-cache options can be specified in a SELECT statement:

- SQL_CACHE: the result is cached if it is cacheable and the query_cache_type system variable is ON or DEMAND.
- SQL_NO_CACHE: the server skips the query cache entirely; it neither checks for a cached result nor caches the new one.

Example:

```sql
SELECT SQL_CACHE id, name FROM customer;
SELECT SQL_NO_CACHE id, name FROM customer;
```
The cache is not used (or is invalidated) when:
- The SQL statements are not byte-for-byte identical.
- The query contains nondeterministic elements, such as now(), current_date(), curdate(), curtime(), rand(), uuid(), user(), or database(); such queries are not cached.
- The statement references no table at all.
- The query touches a table in the mysql, information_schema, or performance_schema database; those do not go through the query cache.
- A table changes: every cached query that uses that table is invalidated and removed from the cache.
6.2 InnoDB Memory Optimization
InnoDB uses a memory area as its I/O cache pool, caching both InnoDB index blocks and InnoDB data blocks. The engine also uses a fixed redo log space (a circular structure), sized by the innodb_log_file_size and innodb_log_files_in_group parameters (the latter defaults to 2); multiplying them gives the total available redo log space.
- innodb_log_file_size: the size of each transaction (redo) log file. Set too small: whenever a log file fills up, InnoDB switches to the next one and triggers a database checkpoint, which forces frequent small flushes of dirty pages and noticeably hurts InnoDB performance; the server can also stall when redo cannot be written because no log space is reusable. Another risk is a very large transaction that has not finished writing its log: the log cannot be rotated (it is still needed for instance recovery and must not be overwritten), and MySQL hangs. You can judge the rotation frequency from the log files' modification times; rotating too frequently means the files are too small. Set too large: checkpoints become rare and I/O performance improves greatly, since the redo log is sequential I/O. But if the database goes down unexpectedly and the logs must be replayed to restore committed transactions (rolling forward during instance recovery, using the redo to rebuild data in the buffer cache), a huge log can make recovery unacceptably long.

- innodb_buffer_pool_size: the maximum cache for InnoDB table data and index data; the default is 128 MB. As long as the operating system and other programs still have enough memory, the larger the buffer pool, the higher the cache hit ratio, the less disk I/O is needed to access InnoDB tables, and the better the performance.

```
innodb_buffer_pool_size=512M
```

- innodb_log_buffer_size: the size of InnoDB's redo log buffer; the default is 1 MB. For large transactions that generate many update records, increase innodb_log_buffer_size so InnoDB does not have to write unnecessary log to disk before the transaction commits.

```
innodb_log_buffer_size=10M
```
6.3 Concurrency Parameters
In implementation, MySQL Server has a multi-threaded structure consisting of background threads and client-service threads. Multithreading uses server resources effectively and improves the database's concurrent performance.

- max_connections: the maximum number of connections; the default is 151. If the status variable connection_errors_max_connections is non-zero and keeps growing, connection requests are failing because the limit has been reached, which is a good reason to raise max_connections. How many connections MySQL can actually support depends on many factors: the quality of the platform's thread library, memory size, per-connection load, CPU speed, expected response times, and so on. On Linux, a reasonably powerful server can support 500 to 1000 connections without difficulty, but the value should be judged against the server's capacity.

- back_log: the size of the backlog queue MySQL sets when listening on its TCP port. Once the number of connections reaches max_connections, new requests wait in this queue for a connection to free resources; if the queue is also full, further connection attempts are refused with an error. Before 5.6.6 the default is 50; in later versions it is 50 + (max_connections / 5), capped at 900. If the database must absorb many connection requests in a short burst, consider raising back_log.

- table_open_cache: controls the number of table caches that all SQL execution threads together can hold open; each thread needs at least one open table cache while executing a statement. Set it from max_connections and the maximum number of tables a single connection's joined queries touch: max_connections x N.

- thread_cache_size: to speed up new connections, MySQL caches a number of client-service threads for reuse; this parameter controls how many.

- innodb_lock_wait_timeout: how long an InnoDB transaction waits for a row lock, 50 seconds by default; it can be changed dynamically. For interactive systems that need fast feedback, lower it to avoid long-hanging transactions; for background batch processing, raise it to avoid large rollbacks.
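A my.cnf sketch pulling these together; the values are illustrative only and must be tuned to the actual workload:

```
[mysqld]
max_connections=500            # raise if connection_errors_max_connections keeps growing
back_log=150                   # queue for bursts once max_connections is reached
table_open_cache=2000          # ~ max_connections x tables per joined query
thread_cache_size=64           # threads kept around for reuse
innodb_lock_wait_timeout=50    # seconds a transaction waits for a row lock
```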
7. Master/Slave Replication and Read/Write Separation
Replication transfers the master database's DDL and DML operations to the slave servers via the binary log; the slaves re-execute (redo) those logs to keep their data synchronized with the master's. MySQL can replicate from one master to several slaves simultaneously, and a slave can in turn serve as the master of other slaves, enabling chained replication.
7.1 Replication Principles
At the top level, replication has three steps:
- When a transaction commits, the master records the data changes as events in its binary log file (binlog).
- The master's binlog events are pushed to the slave's relay log.
- The slave replays the events in the relay log, applying the changes to its own data.
The binary log records all DDL (data definition language) and DML (data manipulation language) statements, but not data query statements. It is vital for disaster recovery, and MySQL's master/slave replication is built on it. Binary logging is disabled by default; enable it and configure the log format in the MySQL configuration file my.cnf.
Binlog log formats:

- STATEMENT: the log records SQL statements; every statement that modifies data is written to the file. The mysqlbinlog tool shipped with MySQL can show each statement's text. During replication, the slave parses the log back into the original statements and executes them again.
- ROW: the log records each row's data changes rather than SQL statements. For example, after `update tb_book set status='1';`, STATEMENT format logs that one line of SQL, whereas ROW format logs the change to every affected row.
- MIXED: a mix of STATEMENT and ROW (the default in some MySQL versions): statements are used by default, with rows in certain special cases. MIXED tries to combine the strengths of both modes while avoiding their weaknesses.
MySQL keeps several other logs besides the binlog:
1. Error log: records information about mysqld starting and stopping and any serious errors during operation. Check it first whenever the database cannot be used normally because of a fault. By default it lives in the MySQL data directory as hostname.err (hostname is the host name).
2. Query log (general log): records every statement the clients run, including the queries the binary log omits. It is disabled by default; enable it in my.cnf with general_log=1 and general_log_file=file_name.
3. Slow query log: records every SQL statement whose execution time exceeds long_query_time and that examines at least min_examined_row_limit rows. long_query_time defaults to 10 seconds, can be as small as 0, and has microsecond precision. The mysqldumpslow tool shipped with MySQL sorts and summarizes this log. It is disabled by default and controlled by two parameters: slow_query_log=1 and slow_query_log_file=slow_query.log.
7.2 Setting Up Master/Slave Replication
1. Create the database on both servers:

```sql
create database db_name;
```
2. Configure the master:

```
# Open the MySQL configuration file
vi /etc/my.cnf

# Add the following under the [mysqld] section
log-bin=master-bin        # enable the binlog and name its files
binlog-format=ROW         # binlog format (MIXED if unset)
server-id=1               # must differ on every server
binlog-do-db=shop         # only log this database (optional)
# binlog-ignore-db=...    # optionally, databases to exclude from the binlog
```
3. On the master, grant a replication account for the slave to log in with:

```sql
-- Authorization
set global validate_password_policy=0;
set global validate_password_length=1;
grant replication slave on *.* to 'root'@'%' identified by 'root';
-- Refresh privileges
flush privileges;
```
4. Configure the slave:

```
# Open the MySQL configuration file
vi /etc/my.cnf

# Add the following under the [mysqld] section
log-bin=master-bin        # binary log file name
binlog-format=ROW         # binary log format
server-id=2               # server id, different from the master's

# replicate-do-db              databases to replicate (separate several with commas)
# replicate-ignore-db          databases to ignore (separate several with commas)
# replicate-do-table           tables to replicate
# replicate-ignore-table       tables to ignore
# replicate-wild-do-table      like replicate-do-table, but accepts wildcards
# replicate-wild-ignore-table  like replicate-ignore-table, but accepts wildcards
```
5. Restart mysqld on the master:

```
# Restart the mysql service
service mysqld restart
# Log in to the mysql database
mysql -uroot -p
# Check the master status
show master status;
```
6. Restart the slave and configure it:

```
# Restart the mysql service
service mysqld restart
# Log in to mysql
mysql -uroot -p
# Point the slave at the master
change master to master_host='192.168.150.11', master_user='root', master_password='123456', master_port=3306, master_log_file='master-bin.000001', master_log_pos=334;
# Start the slave
start slave;
# Check the slave status
show slave status\G
```
7.3 Read/Write Separation
Read/write separation is closely related to splitting databases and tables. Although they are two different concepts, they are usually applied together, and the split tables generally live in different databases.
7.3.1 Horizontal vs. Vertical Splitting
Horizontal splitting splits a table by rows. Once a table passes roughly two million rows it starts to slow down, and its data can then be split across several tables; we normally route rows by taking a key modulo the number of tables.
For example: to improve query efficiency on a 4,000,000-row users table, we split it into four tables users1, users2, users3, and users4, scattering the rows by taking the id modulo 4; queries, updates, and deletes locate the right table the same way. An auxiliary table uid_temp is also needed to hand out auto-incremented ids on insert; providing those ids is its only purpose.
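A sketch of that modulo routing, assuming the four tables above and ids handed out by uid_temp (mapping (id % 4) + 1 to users1..users4 is one possible convention):

```sql
-- id = 4000123; 4000123 % 4 = 3, so under the (id % 4) + 1 convention the row
-- lives in users4, and every read and write for this id goes to the same table
insert into users4 (id, name) values (4000123, 'xie');
select * from users4 where id = 4000123;
update users4 set name = 'wu' where id = 4000123;
```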
Vertical splitting splits a table by columns, breaking a wide table into several tables. Such a table may not have many rows, but its fields are very long, so it occupies a lot of space and scanning it requires heavy I/O, which seriously hurts performance. Split the large fields out into a separate table that has a one-to-one relationship with the original. The usual principles: ① put rarely used fields in a separate table; ② split out large fields such as TEXT and BLOB into an auxiliary table; ③ keep fields that are often queried together in the same table.
7.3.2 Read/Write Separation Middleware
We generally use a MySQL proxy middleware to implement read/write separation. The application connects to the proxy at a fixed address, so even if a database address changes, the project's connection configuration does not. Common middleware includes mysql-proxy, Atlas, Cobar, MyCAT, TDDL, Tinnydbrouter, and MySQL Router. See this article for the differences between them; I will add a demo of read/write separation with MyCAT later.
References
- High Performance MySQL (3rd Edition)
- MySQL Crash Course
- Blog.itpub.net/29654823/vi…