MySQL Basic Architecture

MySQL can be roughly divided into Server layer and storage engine layer

The Server layer includes connectors, query caches, parsers, preprocessors, optimizers, actuators, etc. All cross-storage engine functionality is implemented in this layer, such as stored procedures, triggers, views, etc

The connector

  • Responsible for establishing connections with clients, obtaining permissions, maintaining and managing connections
  • After the user name and password are authenticated, the connector will query the user’s permission in the permission table. After that, the permission judgment logic in the link depends on the permission read at this time. Modifying user permissions using an administrator account does not affect the permissions of existing connections
  • The wait_timeout parameter controls how long the server remains idle with the client. The default value is 8 hours
  • MySQL memory usage skyrocketed due to long connection accumulation
    • Consider periodically disconnecting long connections
    • After MySQL5.7, execute mysql_reset_CONNECTION to re-initialize the connection resource (this process does not require reconnection and re-authentication permissions, but restores the connection to the newly created state)

MySQL: : MySQL 5.7 Reference Manual: : 27.8.7.60 mysql_reset_connection () dev.mysql.com/doc/refman/…

The query cache

  • If the query happens to hit the query cache, the query is not parsed, the execution plan is not generated, and MySQL checks the user’s permissions before returning the query result
  • Updates to a table can cause the query cache of that table to be emptied (the frequent invalidation of the query cache does more harm than good)
  • It is recommended to use query caching on demand
    • Set query_cache_type to DEMAND. By default, SQL does not use query cache. SQL_CACHE specifies that query cache is used
select SQL_CACHE * from T where ID = 10;
Copy the code
  • MySQL 8.0 has removed the entire query cache function

Parsers (what they do)

  • Lexical analysis, identifying what strings are in SQL
  • Syntax analysis
    • On the basis of lexical analysis, according to the syntax rules, judge whether SQL meets MySQL syntax, and finally get a syntax tree
    • For SQL with different parameters but otherwise identical, the execution time is different but the hard parsing time is the same; With the change of the same SQL query data, the execution time of multiple queries may be different, but the hard parsing time is unchanged

Lexical analysis -> parsing can be called hard parsing

The preprocessor

  • Further check whether the syntax tree is valid according to MySQL rules

For example, check whether tables, data columns exist; Are aliases ambiguous, etc

Optimizer (how to do it)

  • The optimizer converts the syntax tree into an execution plan (multiple execution plans can be generated) and then finds the most appropriate execution plan (with the least cost)
  • You can use Last_query_cost to find out the cost of the query

As shown in the figure above, the optimizer believes that this SQL will require approximately 1.399 random data pages to complete the query

  • The optimizer can be asked to resolve various elements of the optimization process (Explain SQL)
  • When there are multiple indexes in a table, determine which index to use
  • You can influence the optimizer’s decision process by prompting the optimizer with special keywords (for index, for example)
  • After the optimizer phase is complete, the execution plan is determined
  • The optimizer sometimes selects the wrong execution plan
    • Inaccurate index statistics
    • The optimizer’s understanding of the optimal execution plan is not necessarily the most responsive

Storage engine layer

  • Responsible for data storage and extraction, plug-in architecture mode, support InnoDB, MyISAM and other storage engines
  • Different storage engines share the same Server layer

redo log

InnoDB engine unique log, storage engine layer log

  • When a record needs to be updated, the InnoDB engine updates memory, writes the record to the redo log, and updates the record to disk when appropriate
  • The size of the redo log is fixed
    • Write pos is the position of the current record, moving back as you write, and starting again at the end
    • The check point is the current position to erase, which is also moved back and iterated, erasing records, updating the log to disk first

With redo log, InnoDB guarantees that all previously committed records will not be lost in the event of an unexpected database restart (crash recovery, crash-safe)

bin log

  • Server layer logs can only be used for archiving, without carsh-safe capability

More details on that later

Difference between redo log and bin log

  1. Redo logs are unique to InnoDB; Binlog is implemented on the Server layer and is available to all storage engines
  2. The redo log is a physical log that records changes made on a data page. A binlog is a logical log that records the original logic of a statement
  3. Redo logs are written in a circular, fixed space; Binlog can be appended and does not overwrite previous logs

The execution flow of an update statement

Log two-phase submission ensures that the redo log and bin log are logically consistent

  • If the transaction crashes at ①, the transaction is rolled back
  • If the transaction crashes at (2), the redo log is in the prepare state and the binlog has already been written, then the transaction is automatically committed when it resumes

Write-Ahead Logging

  • With logging, the storage engine can update data in memory and persist the updates to disk log files without flushing the updated data to disk each time (random IO)
  • Logging is appending, and logging is sequential IO (much faster than random IO) on a small area of disk
  • After logs are persisted to disk, the modified data (dirty pages) in memory can be flushed to disk slowly in the background

Recommended Parameter Settings

  • If innodb_flush_log_at_trx_commit is set to 1, the redo log of each transaction is persisted to disk
  • If the sync_binlog parameter is set to 1, the binlog of each transaction is persisted to disk