| About the author: Wen-an Wang, database engineer at Tencent CSIG, mainly responsible for Tencent Cloud Database MySQL. A technology enthusiast; feel free to leave a message to discuss.


Sometimes in daily work you will find that MySQL is not in a healthy state. Looking at the monitoring, you may see that write QPS has started to spike, or that the IO metrics are high. What should you do at that point? This article starts from the Linux level and, according to different IO characteristics, analyzes the problems a MySQL database may encounter, offering some optimization and mitigation ideas for reference.

1. How to understand IO metrics?

The metrics below come from iostat -x 1 -m on Debian 10.2:

(screenshot: iostat -x output)

The avg-cpu section shows CPU-related metrics; %iowait is the one to watch when diagnosing IO problems. The per-device metrics are as follows:

· r/s and w/s: read and write requests issued to the device per second (after merging), which can be taken as IOPS.

· rMB/s and wMB/s: read and write throughput of the disk.

· rrqm/s and wrqm/s: read and write requests merged per second.

· %rrqm and %wrqm: percentage of read and write requests that were merged.

· r_await and w_await: average response time of read and write requests, including both queue wait time and actual service time (ms).

· aqu-sz: average queue depth.

· rareq-sz and wareq-sz: average size of a read request and a write request (KB).

· svctm: a computed average IO service time that is no longer accurate; it can be ignored.

· %util: ignore this metric on RAID arrays and SSDs; it is only accurate for a single mechanical disk.

Generally speaking, to evaluate whether an IO device is heavily loaded (ignoring mechanical disks, which are not worth evaluating here), look at the following metrics: r/s, w/s, rMB/s, wMB/s, r_await, w_await, aqu-sz.
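As a convenience, the key columns can be pulled straight out of iostat. A minimal sketch, assuming the sysstat 12.x column names shipped with Debian 10 and a data disk named nvme0n1 (adjust both to your environment):

DEV=nvme0n1   # assumption: replace with your data disk
iostat -x 1 -m | awk -v dev="$DEV" '
  $1 == "Device" { for (i = 1; i <= NF; i++) col[$i] = i; next }  # map header names to columns
  $1 == dev {
    printf "r/s=%s w/s=%s rMB/s=%s wMB/s=%s r_await=%s w_await=%s aqu-sz=%s\n",
      $(col["r/s"]), $(col["w/s"]), $(col["rMB/s"]), $(col["wMB/s"]),
      $(col["r_await"]), $(col["w_await"]), $(col["aqu-sz"])
  }'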

2. MySQL and IO

Because MySQL has many IO-related parameters, only a few frequently used ones are listed here; the tests and simulations below use the following settings:

Parameter | Setting | Note
innodb_io_capacity | 16000 | IOPS budget available to background tasks
innodb_io_capacity_max | 32000 | Maximum IOPS background tasks may use
innodb_flush_log_at_trx_commit | 1 | Controls the transaction commit flush policy; see the official documentation for details
sync_binlog | 1 | Controls how often the binlog is flushed to disk; see the official documentation for details

innodb_io_capacity and innodb_io_capacity_max are direct IOPS limits. In most cases SSDs can be set to 16000 or higher; for cloud hosts or other shared storage devices, find out the IOPS cap first and then adjust. innodb_flush_log_at_trx_commit and sync_binlog are included because different combinations of the two produce very different IO pressure. The common combinations are "double 1" (both set to 1) and "2-0" (innodb_flush_log_at_trx_commit=2, sync_binlog=0). As the official documentation describes, double 1 flushes to disk on every transaction commit, putting much higher pressure on IO; 2-0 flushes lazily, so the IO pressure is smaller and the write QPS correspondingly higher.

Another detail worth noting is that innodb_io_capacity is described in terms of background tasks: it is MySQL's background flush and purge operations that this parameter setting limits.
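For reference, a minimal sketch (from the shell, assuming a locally reachable server) of inspecting these parameters and temporarily lowering the background IOPS budget; the values are illustrative, not recommendations:

mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN
          ('innodb_io_capacity', 'innodb_io_capacity_max',
           'innodb_flush_log_at_trx_commit', 'sync_binlog');"
# Lower the background-task budget during a write peak
# (innodb_io_capacity must stay <= innodb_io_capacity_max)...
mysql -e "SET GLOBAL innodb_io_capacity = 4000;"
# ...and roll it back once the peak has passed.
mysql -e "SET GLOBAL innodb_io_capacity = 16000;"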

3. Test environment

This test uses a Tencent Cloud high-IO IT3 instance with 3TB of local NVMe. IO performance was measured with fio at a 16K block size (the default innodb_page_size); a sketch of the fio invocation follows, and the results are in the table below.
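A minimal sketch of the kind of fio run used here (random write shown; swap --rw for randread or randrw). The file path, size, and queue settings are assumptions:

# 16K random write benchmark against a local file
fio --name=mysql-16k --filename=/data/fio.test --size=20G \
    --direct=1 --ioengine=libaio --bs=16k --rw=randwrite \
    --iodepth=32 --numjobs=4 --runtime=60 --time_based --group_reporting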

Type | IOPS | Throughput (MB/s)
Random read | 121959 | 1905
Random write | 98326 | 1536
Random read/write (read portion) | 47129 | 750
Random read/write (write portion) | 47152 | 754

So why use a test environment that has no IO bottleneck at all? Simply to show what happens when you tune MySQL. If the IO device of the whole system stays heavily loaded over a long period, the best optimization strategy is to upgrade the IO device, not to tune MySQL. All of the analysis and mitigation below therefore targets high IO load over short to medium time spans.

4. IO analysis

4.1 Pure write

Let’s start with a purer, but less common, IO load scenario:

(screenshot: iostat output under a write-only load)

This kind of profile has a distinct characteristic: there is little or no read pressure in the IO load. The typical cause is that the cache is large enough to hold all the data, so nothing needs to be read from disk and all the pressure falls on writes.

The first parameters to look at are innodb_flush_log_at_trx_commit and sync_binlog. The QPS gain from switching double 1 to 2-0 is easy to understand: instead of flushing once per transaction, the flushes of multiple transactions are merged, which improves the "transaction efficiency" of each IOPS, for example from 1 transaction/IOPS to N transactions/IOPS. A sketch of the runtime switch follows.
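A minimal sketch of the runtime switch. Note that 2-0 trades durability for throughput: after a crash, roughly the last second of committed transactions can be lost, so confirm the business can tolerate this before using it:

# Relax the commit policy to "2-0" during the write peak
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 2; SET GLOBAL sync_binlog = 0;"
# Restore "double 1" afterwards
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 1; SET GLOBAL sync_binlog = 1;"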

Besides improving "transactions per IOPS", there is another angle: appropriately limiting the IOPS of background tasks. MySQL writes actually involve a large amount of buffer, log, and data work performed by background tasks. Under sustained high write pressure, background tasks tend to slowly accumulate flush and purge backlog. If innodb_io_capacity and innodb_io_capacity_max are set too high, background tasks may consume too many IO resources. In that case, lowering the values appropriately can hold up the QPS for a period of time, and the settings can be rolled back after the write peak has passed.

A more fine-grained control would work even better, but for now this parameter is all we have. Do not set it too low, though: once the background tasks accumulate too much backlog and trigger the forced flush/checkpoint mechanism, they will grab a large share of IO resources and cause very violent write QPS fluctuations. This deserves attention, and can be watched with the dirty-page counters sketched below.
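A minimal sketch of watching the backlog while io_capacity is lowered: if the dirty ratio keeps climbing toward innodb_max_dirty_pages_pct, roll the setting back before the forced flush kicks in.

while true; do
  mysql -N -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_%'" | awk '
    $1 == "Innodb_buffer_pool_pages_dirty" { dirty = $2 }
    $1 == "Innodb_buffer_pool_pages_total" { total = $2 }
    END { if (total) printf "dirty pages: %d / %d (%.1f%%)\n", dirty, total, 100 * dirty / total }'
  sleep 5
done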

Here is the effect of adjusting in the "wrong" direction, with log data from a sysbench client where io_capacity was significantly increased midway through the run:

(screenshot: sysbench client log)

4.2 Pure read

The other pure scenario is, of course, pure read. For example:

(screenshot: iostat output under a read-only load)

A pure-read IO profile indicates that the cache is not large enough and hot data has to be read from disk. The best solution is to add memory and increase innodb_buffer_pool_size so that more data fits in memory. As for how much memory to add, combine the response time of the actual business SQL (after proper indexing) with the buffer pool hit ratio. As a rule of thumb, a hit ratio above 99.5% (see SHOW ENGINE INNODB STATUS) is good. If the actual SQL response time does not meet business requirements, the required memory can be estimated from the actual hit ratio; a sketch for computing it from status counters follows.
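A minimal sketch that approximates the hit ratio from two cumulative counters (one minus disk reads over logical read requests); the "Buffer pool hit rate" line in SHOW ENGINE INNODB STATUS gives a recent-window view of the same thing:

mysql -N -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'" | awk '
  $1 == "Innodb_buffer_pool_reads"         { disk = $2 }  # reads that had to go to disk
  $1 == "Innodb_buffer_pool_read_requests" { req  = $2 }  # logical read requests
  END { if (req) printf "buffer pool hit ratio: %.4f%%\n", (1 - disk / req) * 100 }'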

Since MySQL 5.7, innodb_buffer_pool_size can be resized dynamically, so the impact of the change is relatively small, but it is not free: try to do it during off-peak hours. A sketch of the online resize follows.
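A minimal sketch of the online resize; the 32G target is an assumption, and the pool actually grows in multiples of innodb_buffer_pool_chunk_size:

mysql -e "SET GLOBAL innodb_buffer_pool_size = 32 * 1024 * 1024 * 1024;"  # 32G, adjust to your host
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_resize_status';"    # follow the progress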

4.3 Mixed read/write

The most common scenario is definitely mixed read/write, like this one:

(screenshot: iostat output under a mixed read/write load)

The analysis here is a bit more complicated, but combining the pure-read and pure-write analyses, two possibilities readily come to mind:

Scenario 1: the business itself issues a mix of reads and writes.

Scenario 2: the workload is pure write, but the data cannot all fit in memory, so pages have to be read from disk before they can be modified.

Consider the simpler scenario 2 first. It is essentially the same as the pure-write scenario, just without enough memory. So after checking MySQL's read/write SQL ratio (the Com_xxx counters in global status; see the sketch below), the analysis and handling from the pure-write section apply.
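A minimal sketch of the ratio check. The Com_ counters are cumulative since startup, so for a current-traffic view take two samples and diff them:

mysql -N -e "SHOW GLOBAL STATUS WHERE Variable_name IN
             ('Com_select', 'Com_insert', 'Com_update', 'Com_delete')" | awk '
  $1 == "Com_select" { r = $2 }   # reads
  $1 != "Com_select" { w += $2 }  # writes
  END { if (w) printf "read:write ~ %.1f:1\n", r / w }'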

Scenario 1 is a bit more complex, but combining the pure-write and pure-read material provides a line of analysis; think through the following questions in order:

· What is the business read/write ratio?

· Is the IO system's read pressure or its write pressure the more severe?

Then, depending on the answers:

· Business reads dominate (e.g. >4:1) and the IO system's read pressure is the more severe: increase innodb_buffer_pool_size as described in the pure-read section.

· Business reads dominate (e.g. >4:1) but the IO system's write pressure is the more severe: adjust the transaction commit policy or io_capacity as described in the pure-write section. This pattern can also come from bulk data changes, so consider optimizing that business behavior as well.

· Business writes dominate (e.g. <4:1) and the IO system's read pressure is the more severe: this is essentially scenario 2, where memory cannot hold the working data, so combine the pure-read handling (more memory) with the pure-write handling.

· Business writes dominate (e.g. <4:1) and the IO system's write pressure is the more severe: handle it as in the pure-write section.

· The business read/write ratio shows no obvious pattern and both read and write pressure are severe: consider all of the above, including a hardware upgrade.

4.4 Miscellaneous: throughput, IOPS, and ways to spread read/write pressure

Throughput and IOPS are generally the most intuitive measures of an IO system. They are not discussed separately because they are easy to judge: if the iostat numbers have reached or are approaching the actual hardware limits (say 75% of them), plan a hardware upgrade or other measures to spread the read/write pressure according to traffic growth.

As for conventional measures, pick whichever suits the scenario: read-mostly workloads suit read/write splitting; write-mostly workloads suit splitting databases/tables (sharding) and caching.

4.5 Checking MySQL's IO status

If you suspect MySQL is blocked on IO, observe the following status variables:

Status variable | Meaning | Note
Innodb_data_pending_fsyncs | fsync operations currently pending | Usually 0; related to the innodb_flush_method setting, so check that setting if it stays high
Innodb_data_pending_reads | Read operations currently pending | Usually 0; if it stays high and the business is affected, see the read-pressure handling above
Innodb_data_pending_writes | Write operations currently pending | Usually 0; if it stays high and the business is affected, see the write-pressure handling above
Innodb_os_log_pending_fsyncs | fsync operations currently pending for redo log writes | Greater than 0 means the IO device is the bottleneck; consider moving the redo log to an SSD or isolating its IO
Innodb_os_log_pending_writes | Write operations currently pending for redo log writes | Usually 0; if it stays high and the business is affected, see the write-pressure handling above

InnoDB also exposes many other read/write counters, visible via statements like SHOW GLOBAL STATUS LIKE '%innodb%read%'. These counters are generally cumulative, so the difference between two samples is what indicates MySQL's current read/write ratio. Use the pending values in the table above together with the other system metrics to judge the overall IO load; these indicators are also worth adding to monitoring. A sampling sketch follows.
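A minimal sketch of such sampling, assuming a locally reachable server: mysqladmin's -r (relative) flag prints the difference between consecutive samples, which is exactly what cumulative counters need.

mysqladmin -r -i 1 extended-status \
  | grep -E 'Innodb_(data|os_log)_pending|Innodb_data_(reads|writes|fsyncs)'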

5. Summary

IO problems can be tackled in a variety of ways: the simplest is to upgrade the hardware; the fastest is to tune MySQL (the main content of this article); the most common is architectural adjustment (read/write splitting, splitting databases/tables); and, depending on the actual situation, optimizing the business behavior itself (merging single-row DML operations, splitting DML statements that update rows in bulk, etc.).

The above methods cannot all be covered in depth here, but the information iostat provides is very useful when analyzing MySQL bottlenecks. This article only outlines some ideas for tuning MySQL according to the load characteristics of the hardware; in practice, a combination of approaches is needed to handle IO problems well.