Performance optimization, it is stored engineers the pursuit of forever, in our opinion, in addition to adjust the storage architecture, optimization of IO path, is applied to make targeted optimization, is also very important and meaningful things, this means that, in addition to understand the store itself, also need the understanding of the upper applications or middleware has enough. This time, we will take a look at MySQL IO features and storage for MySQL optimization ideas.

This section describes the MySQL architecture components

MySQL and its successor MariaDB are the most popular Relational Database Management System in the market at present. In many application scenarios, MySQL is the preferred RDBMS of users.

MySQL roughly includes the following basic module components:

  • MySQL Client Connectors
  • Management Service & Utilities components
  • Connection Pool component
  • SQL Interface
  • Parser Component (Parser)
  • Query Optimizer component
  • Caches & Buffers
  • Pluggable Storage Engines
  • File System

InnoDB storage engine

Storage engine is located in the third layer of MySQL architecture, which is responsible for storing and extracting data in MySQL. It is a subsystem dealing with files. It is a file access mechanism customized according to the abstract interface of the file access layer provided by the bottom layer. Starting with MySQL 5.5, InnoDB is used as the storage engine by default. Therefore, optimizing the performance of the underlying storage for MySQL services starts with understanding and analyzing how the storage engine interacts with the underlying storage system.

The following figure shows the official InnoDB engine architecture. InnoDB storage engine is mainly divided into memory structure and disk structure.

InnoDB disks mainly contain Tablespaces, InnoDB Data Dictionary, Doublewrite Buffer, Redo Log, and Undo Logs. Redo Log and Binlog are two important mechanisms in MySQL Log system. This article mainly discusses the analysis and storage optimization of Redo Log and Binlog.

MySQL IO model and features

MySQL has two important Log files: Redo Log and Binlog. Redo Log records transaction logs to InnoDB storage engine. Redo Log write IO is a circular write to a fixed file. IO size is 512 bytes aligned (some offsets are equal and write overwrites are performed). The Binlog records all changes to the MySQL database. The write IO of the Binlog is the file append write. The IO is not aligned.

MySQL write request storage behavior: when a single thread performs MySQL insert write data, each insert corresponds to a write operation. MySQL will merge some OF the IO and send it to the file system (if using a remote file system, the IO will be captured by the client of the remote file system, such as the client of YRCloudFile). Call write request to write to /MySQL/ib_logfile. Fsync is called immediately after a write IO.

If the Binlog function is enabled, the system writes a second Binlog after Redo Log writing and fsync the Binlog to ensure data security. When a certain amount of log data is written, another thread in MySQL background will collate all the data in IO 16K size and write it to the /MySQL/ ibData table file in aiO mode.

Storage behavior of MySQL read requests: When MySQL reads data, it looks for data from the MySQL cache. If a cache hit, it does not actually send read IO to the underlying file system.

YRCloudFile is optimized for THE I/O behavior of MySQL logs

Because Redo Log and Binlog are both I/O writes accompanied by fsync, fsync has high storage overhead. Therefore, to optimize MySQL performance, we need to adjust the I/O behavior of the two logs on the premise of fully ensuring the data security of the two Log files.

On the YRCloudFile server, Redo Log files are written in direct mode, and fsync is performed automatically after the Redo Log is written. Binlog Cannot be written in direct mode because I/OS are not aligned. You need to write data to the system cache first and then perform fsync. This eliminates the need for clients to do remote_fsync on Redo Log and Binlog files, eliminating the overhead of client calls to fsync. Here is a comparison of measured data:

From the measured results, we can see that after adjusting the specific write logic of YRCloudFile backend, the performance of MySQL single-thread write has been doubled.

Storage r&d engineers should not only master the core skills of storage, but also pay attention to and analyze the business behavior of the upper application, so as to make targeted optimization of the application. Stay tuned for more application-oriented optimization and analysis in the future.