MySQL 8.0 has been released, and the optimizer, long a target of criticism, has changed a great deal in the new version. Since I used to work with 5.6 and only recently switched to 8.0, this article covers the optimizer-related parts I find interesting: mainly the cost model introduced in MySQL 5.7 and the histogram feature of MySQL 8.0.
Based on the latest release, MySQL 8.0.12, this article introduces the usage and related code of the cost model and histograms.
Cost Model
Configurable cost constants
Why do we need configurable cost model constants? MySQL has been around for decades, yet the optimizer still used hard-coded weights to measure I/O, CPU, and so on. These weights were derived from years, even decades, of experience. Think about how fast hardware has evolved: servers with dozens or hundreds of cores are now common in large companies, SSDs have long been mainstream and NVMe is on the rise, and high-speed RDMA networks are spreading. All of this affects how database systems are designed and implemented. Clearly those hard-coded weights are out of date, and we need a way for users to define them themselves, or better yet, to set them automatically and intelligently based on the hardware environment.
MySQL 5.7 introduced two new system tables that are exposed to the user for updates:
root@(none) 04:05:24>select * from mysql.server_cost;
+------------------------------+------------+---------------------+---------+---------------+
| cost_name                    | cost_value | last_update         | comment | default_value |
+------------------------------+------------+---------------------+---------+---------------+
| disk_temptable_create_cost   |       NULL | 2018-04-23 13:55:20 | NULL    |            20 |
| disk_temptable_row_cost      |       NULL | 2018-04-23 13:55:20 | NULL    |           0.5 |
| key_compare_cost             |       NULL | 2018-04-23 13:55:20 | NULL    |          0.05 |
| memory_temptable_create_cost |       NULL | 2018-04-23 13:55:20 | NULL    |             1 |
| memory_temptable_row_cost    |       NULL | 2018-04-23 13:55:20 | NULL    |           0.1 |
| row_evaluate_cost            |       NULL | 2018-04-23 13:55:20 | NULL    |           0.1 |
+------------------------------+------------+---------------------+---------+---------------+
6 rows in set (0.00 sec)

The default_value column is a generated column:

`default_value` float GENERATED ALWAYS AS (
  (case `cost_name`
    when _utf8mb3'disk_temptable_create_cost'   then 20.0
    when _utf8mb3'disk_temptable_row_cost'      then 0.5
    when _utf8mb3'key_compare_cost'             then 0.05
    when _utf8mb3'memory_temptable_create_cost' then 1.0
    when _utf8mb3'memory_temptable_row_cost'    then 0.1
    when _utf8mb3'row_evaluate_cost'            then 0.1
    else NULL end)) VIRTUAL
root@(none) 04:05:35>select * from mysql.engine_cost;
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| engine_name | device_type | cost_name | cost_value | last_update | comment | default_value |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| default | 0 | io_block_read_cost | NULL | 2018-04-23 13:55:20 | NULL | 1 |
| default | 0 | memory_block_read_cost | NULL | 2018-04-23 13:55:20 | NULL | 0.25 |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
You can do this with an UPDATE statement, such as:
root@(none) 04:05:52>update mysql.server_cost set cost_value = 40 where cost_name = 'disk_temptable_create_cost';
Query OK, 1 row affected (0.05 sec)
Rows matched: 1  Changed: 1  Warnings: 0

root@(none) 04:07:13>select * from mysql.server_cost where cost_name = 'disk_temptable_create_cost';
+----------------------------+------------+---------------------+---------+---------------+
| cost_name                  | cost_value | last_update         | comment | default_value |
+----------------------------+------------+---------------------+---------+---------------+
| disk_temptable_create_cost |         40 | 2018-06-23 16:07:05 | NULL    |            20 |
+----------------------------+------------+---------------------+---------+---------------+
1 row in set (0.00 sec)

root@(none) 10:10:12>flush optimizer_costs;
Query OK, 0 rows affected (0.00 sec)
As you can see, usage is very simple. There are just two tables: server_cost and engine_cost, holding the constants for the server layer and the engine layer respectively.
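To make the role of these constants concrete, here is a rough sketch of how server_cost and engine_cost values could feed into a plan cost. This is a Python illustration with hypothetical function names, and the formula is a deliberate simplification of what the optimizer actually computes:

```python
# The constants mirror the mysql.server_cost / mysql.engine_cost defaults shown above.
SERVER_COST = {
    "row_evaluate_cost": 0.1,   # cost to evaluate the conditions on one row
    "key_compare_cost": 0.05,   # cost of one key comparison (e.g. during sorting)
}
ENGINE_COST = {
    "io_block_read_cost": 1.0,      # read one page from disk
    "memory_block_read_cost": 0.25, # read one page already in the buffer pool
}

def table_scan_cost(pages, rows, in_memory_ratio=0.0):
    """Rough cost of a full table scan: page reads weighted by where the
    pages live (disk vs buffer pool), plus a per-row evaluation cost."""
    io = pages * (1 - in_memory_ratio) * ENGINE_COST["io_block_read_cost"]
    mem = pages * in_memory_ratio * ENGINE_COST["memory_block_read_cost"]
    cpu = rows * SERVER_COST["row_evaluate_cost"]
    return io + mem + cpu
```

With the default weights, a 100-page, 1000-row scan costs far less when the pages are cached, which is exactly why the constants are worth tuning for modern hardware.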
Related code:
The global cache Cost_constant_cache
The global cache maintains the current cost model. At lex_start, each user thread checks whether its local pointer has been initialized; if not, it copies the pointer from the global cache.
Initialize global cache:
Cost_constant_cache::init creates a Cost_model_constants object, which contains two kinds of information: the server-layer cost model and the engine-layer cost model. The class structure is as follows:

Cost_constant_cache ----> Cost_model_constants
                           ---> Server_cost_constants
                                // server_cost
                           ---> Cost_model_se_info ---> SE_cost_constants
                                // engine_cost; values can come from the storage
                                // engine if it provides get_cost_constants
Both initialization and FLUSH OPTIMIZER_COSTS read the configuration from the system tables and update the cache:
read_cost_constants()
|--> read_server_cost_constants
|--> read_engine_cost_constants
Because users can update the system tables dynamically, some sessions may still be using the old version after FLUSH OPTIMIZER_COSTS is executed. The constants are therefore reference counted, and an old version is released only when its ref counter drops to 0.
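The lifecycle described above can be sketched as follows. This is a simplified Python model with hypothetical class names, not the actual C++ implementation; it only shows the swap-on-flush plus reference-counting idea:

```python
import threading

class CostConstants:
    """One immutable version of the cost constants, with an explicit
    reference count (in the spirit of Cost_model_constants)."""
    def __init__(self, values):
        self.values = dict(values)
        self._refs = 0
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            self._refs += 1
        return self

    def release(self):
        with self._lock:
            self._refs -= 1
            return self._refs == 0   # True -> last reference gone, safe to free

class CostConstantCache:
    """Sessions pin the current version at lex_start; a flush swaps in a new
    version, and an old one is freed only when its refcount reaches zero."""
    def __init__(self, values):
        self._current = CostConstants(values).acquire()  # the cache's own ref
        self._lock = threading.Lock()

    def get_cost_constants(self):            # what a session does at lex_start
        with self._lock:
            return self._current.acquire()

    def flush(self, new_values):             # FLUSH OPTIMIZER_COSTS
        with self._lock:
            old, self._current = self._current, CostConstants(new_values).acquire()
        old.release()  # drop the cache's ref; sessions may still hold theirs
```

A session that pinned the old constants keeps seeing them until it releases its reference, at which point the old version can actually be destroyed.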
Initializing the per-thread cost model
- Cost_model_server
The per-thread cost model is stored in THD::m_cost_model and is initialized at lex_start if it has not been initialized yet:
Cost_model_server::init()
  const Cost_model_constants *m_cost_constants = cost_constant_cache->get_cost_constants();
  // increments the reference count
  const Server_cost_constants *m_server_cost_constants = m_cost_constants->get_server_cost_constants();
  // gets the global pointer
A THD does not create its own copy of the cost model; it only references pointers stored in the cache.
Table Cost Model
The struct TABLE has a member m_cost_model of type Cost_model_table.
The value is taken from the Cost Model object stored in the above THD
Cost_estimate
The unified object type Cost_estimate stores the calculated cost result across four dimensions:
double io_cost; ///< cost of I/O operations
double cpu_cost; ///< cost of CPU operations
double import_cost; ///< cost of remote operations
double mem_cost; ///< memory used (bytes)
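The four dimensions above can be modeled as follows. This is an illustrative Python sketch of Cost_estimate, not the real C++ class; note that mem_cost is bookkeeping in bytes and is not summed into the comparable total:

```python
class CostEstimate:
    """Sketch of Cost_estimate: cost is kept per dimension so callers can
    reason about I/O vs CPU separately, while plan comparison uses a
    combined total."""
    def __init__(self, io_cost=0.0, cpu_cost=0.0, import_cost=0.0, mem_cost=0.0):
        self.io_cost = io_cost          # cost of I/O operations
        self.cpu_cost = cpu_cost        # cost of CPU operations
        self.import_cost = import_cost  # cost of remote operations
        self.mem_cost = mem_cost        # memory used (bytes), not part of the total

    def total_cost(self):
        return self.io_cost + self.cpu_cost + self.import_cost

    def __iadd__(self, other):
        self.io_cost += other.io_cost
        self.cpu_cost += other.cpu_cost
        self.import_cost += other.import_cost
        self.mem_cost += other.mem_cost
        return self
```

Keeping the dimensions separate lets later features (such as the in-memory page estimates discussed at the end of this article) adjust just the I/O component.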
In the future
Right now, you cannot arrive at reasonable configuration values without extensive testing against your own workload. How to configure the constants, and what values are reasonable, is something I believe could be adjusted automatically; the key is to figure out how the configuration maps to the hardware conditions. This is a direction worth working toward.
References:
1. The MySQL Optimizer Cost Model
3. Optimizer Cost Model Improvements in MySQL 5.7.5 DMR
4. Slide: MySQL Cost Model
Related Worklog:
WL#7182: Optimizer Cost Model API
WL#7209: Handler interface changes for new cost model
WL#7276: Configuration data base for Optimizer Cost Model
WL#7315 Optimizer cost model: main memory management of cost constants
WL#7316 Optimizer cost model: Command for online updating of cost model constants
Histogram
Histograms are another much-anticipated feature of MySQL; they are in fact common in other database products and give the optimizer good guidance for path selection. A histogram stores the data distribution of a specified column. MariaDB has supported this feature since the early 10.0.2 release, and MySQL supports it in the latest 8.0 release.
Usage
MySQL manages histograms through the ANALYZE TABLE syntax:
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL]
TABLE tbl_name
UPDATE HISTOGRAM ON col_name [, col_name] ...
[WITH N BUCKETS]
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL]
TABLE tbl_name
DROP HISTOGRAM ON col_name [, col_name] ...
Here’s a simple example:
Let's take an ordinary sysbench table as an example:

root@sb1 05:16:33>show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=200001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.01 sec)

Create a histogram; it is stored in the data dictionary:

root@sb1 05:16:38>ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k WITH 10 BUCKETS;
+-------------+-----------+----------+----------------------------------------------+
| Table       | Op        | Msg_type | Msg_text                                     |
+-------------+-----------+----------+----------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics created for column 'k'. |
+-------------+-----------+----------+----------------------------------------------+
1 row in set (0.55 sec)
root@sb1 05:17:03>ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k,pad with 10 BUCKETS;
+-------------+-----------+----------+------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-----------+----------+------------------------------------------------+
| sb1.sbtest1 | histogram | status | Histogram statistics created for column 'k'. |
| sb1.sbtest1 | histogram | status | Histogram statistics created for column 'pad'. |
+-------------+-----------+----------+------------------------------------------------+
2 rows in set (7.98 sec)

Delete the histogram from the pad column:

root@sb1 05:17:51>ANALYZE TABLE sbtest1 DROP HISTOGRAM ON pad;
+-------------+-----------+----------+------------------------------------------------+
| Table       | Op        | Msg_type | Msg_text                                       |
+-------------+-----------+----------+------------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics removed for column 'pad'. |
+-------------+-----------+----------+------------------------------------------------+
1 row in set (0.06 sec)

root@sb1 05:58:12>ANALYZE TABLE sbtest1 DROP HISTOGRAM ON k;
+-------------+-----------+----------+----------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics removed for column 'k'. |
+-------------+-----------+----------+----------------------------------------------+
1 row in set (0.08 sec)
The default number of buckets is 100:

root@sb1 05:58:27>ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k;
+-------------+-----------+----------+----------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics created for column 'k'. |
+-------------+-----------+----------+----------------------------------------------+
1 row in set (0.56 sec)
Histogram statistics are stored in the data dictionary and can be viewed through INFORMATION_SCHEMA.COLUMN_STATISTICS:
root@information_schema 05:34:49>SHOW CREATE TABLE INFORMATION_SCHEMA.COLUMN_STATISTICS\G
*************************** 1. row ***************************
                View: COLUMN_STATISTICS
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER VIEW `COLUMN_STATISTICS` AS select `mysql`.`column_statistics`.`schema_name` AS `SCHEMA_NAME`,`mysql`.`column_statistics`.`table_name` AS `TABLE_NAME`,`mysql`.`column_statistics`.`column_name` AS `COLUMN_NAME`,`mysql`.`column_statistics`.`histogram` AS `HISTOGRAM` from `mysql`.`column_statistics` where can_access_table(`mysql`.`column_statistics`.`schema_name`,`mysql`.`column_statistics`.`table_name`)
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.00 sec)
The underlying system table mysql.column_statistics is hidden and cannot be accessed directly.
Here’s a simple example:
root@sb1 05:58:55>ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k WITH 4 BUCKETS;
+-------------+-----------+----------+----------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics created for column 'k'. |
+-------------+-----------+----------+----------------------------------------------+
1 row in set (0.63 sec)
Query the histogram information on the table
root@sb1 06:00:43>SELECT JSON_PRETTY(HISTOGRAM) FROM INFORMATION_SCHEMA.COLUMN_STATISTICS WHERE SCHEMA_NAME='sb1' AND TABLE_NAME = 'sbtest1'\G
*************************** 1. row ***************************
JSON_PRETTY(HISTOGRAM): {
"buckets": [[38671, 99756, 0.249795, 17002], [99757, 100248, 0.500035, 492], [100249, 100743, 0.749945, 495], [100744, 172775, 1.0, 16630]],"data-type": "int"."null-values": 0.0."collation-id": 8,
"last-updated": "The 2018-09-22 09:59:30. 857797"."sampling-rate": 1.0."histogram-type": "equi-height"."number-of-buckets-specified": 4
}
1 row in set (0.00 sec)
As you can see from the JSON output, the histogram generated by the statement above has four buckets, the data type is int, and the type is equi-height (the other type is SINGLETON). For each bucket, the recorded information includes the lower and upper bounds of the values, the cumulative frequency, and the number of distinct values. From this information a fairly accurate data distribution can be obtained, allowing the optimizer to choose a better execution plan based on these statistics.
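The construction of such buckets can be sketched as follows. This is an illustrative Python function, not MySQL's actual builder (which works over a sampled value map and never lets equal values straddle a bucket boundary; that invariant is kept here too):

```python
def build_equi_height(values, num_buckets):
    """Build an equi-height histogram as a list of
    [lower, upper, cumulative_frequency, distinct_values] buckets,
    matching the shape of the JSON shown above."""
    values = sorted(values)
    n = len(values)
    per_bucket = n / num_buckets
    buckets, i = [], 0
    for b in range(1, num_buckets + 1):
        j = n if b == num_buckets else int(round(b * per_bucket))
        # grow the bucket so equal values never straddle a boundary
        while j < n and values[j] == values[j - 1]:
            j += 1
        if j <= i:
            continue  # earlier bucket already swallowed this range
        chunk = values[i:j]
        buckets.append([chunk[0], chunk[-1], j / n, len(set(chunk))])
        i = j
    return buckets
```

For 100 uniformly distributed values and 4 buckets, each bucket covers about a quarter of the rows; with heavy duplication, fewer buckets than requested may come out, which foreshadows the SINGLETON case below.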
If the column contains a large number of duplicates (fewer distinct values than buckets), MySQL may build a SINGLETON histogram instead. For example:
root@sb1 10:41:17>SELECT JSON_PRETTY(HISTOGRAM) FROM INFORMATION_SCHEMA.COLUMN_STATISTICS WHERE SCHEMA_NAME='sb1' AND TABLE_NAME = 'sbtest1'\G
*************************** 1. row ***************************
JSON_PRETTY(HISTOGRAM): {
"buckets": [[10, 0.499995], [20, 1.0]],"data-type": "int"."null-values": 0.0."collation-id": 8,
"last-updated": "The 2018-09-22 14:41:17. 312601"."sampling-rate": 1.0."histogram-type": "singleton"."number-of-buckets-specified": 100
}
1 row in set (0.00 sec)
As shown above, for the SINGLETON type each bucket contains only two values: the column value and the cumulative frequency (that is, the fraction of rows whose value is less than or equal to the value in the current bucket).
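Reading selectivity out of such cumulative buckets is straightforward. The following Python functions are an illustrative sketch (with made-up names), using the two-bucket histogram above as input:

```python
def singleton_leq_selectivity(buckets, value):
    """Estimate the selectivity of `col <= value` from a SINGLETON histogram.
    Each bucket is [value, cumulative_frequency], as in the JSON above."""
    selectivity = 0.0
    for bucket_value, cum_freq in buckets:
        if bucket_value > value:
            break
        selectivity = cum_freq  # cumulative: fraction of rows <= bucket_value
    return selectivity

def singleton_eq_selectivity(buckets, value):
    """Selectivity of `col = value`: this bucket's cumulative frequency
    minus the previous bucket's."""
    prev = 0.0
    for bucket_value, cum_freq in buckets:
        if bucket_value == value:
            return cum_freq - prev
        prev = cum_freq
    return 0.0
```

For the histogram above, `k = 20` comes out at roughly 0.5, i.e. half the table, which is exactly what a plan chooser needs to know before picking an index.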
Note the sampling-rate field. A value of 1.0 means all the data in the table was read to build the statistics. For large tables, however, we may not want to read that much data, since it could consume excessive memory, so MySQL provides the parameter histogram_generation_max_mem_size to limit memory usage.
If the table sees little DML, the histogram stays essentially accurate; but if the column is written frequently, the histogram may need to be rebuilt periodically. MySQL itself never updates histograms automatically.
The optimizer uses the histogram to calculate column predicate selectivity, and most predicate types are supported; see the official documentation for details.
For the impact of histograms on query plans, see this blog post and this one.
The relevant code
Code structure: taking MySQL 8.0.12 as an example, the main code is in the sql/histograms directory:
$ ls sql/histograms/
equi_height_bucket.cc  equi_height_bucket.h  equi_height.cc  equi_height.h
histogram.cc  histogram.h  singleton.cc  singleton.h
value_map.cc  value_map.h

namespace histograms
 |---> Histogram    // base class
 |---> Equi_height  // equi-height histogram; a template class parameterized by the
 |                  // data type, each type needs an explicit instantiation (see
 |                  // equi_height.cc)
 |---> Singleton    // singleton histogram; stores only each value and its frequency
Create and store histogram:
The relevant functions and stacks for processing histogram are as follows:
Sql_cmd_analyze_table::handle_histogram_command
 |--> update_histogram
   |--> histograms::update_histogram  // interface function in the namespace
        // a. Check the columns:
        //    histograms::field_type_to_value_map_type: is the column type supported?
        //    covered_by_single_part_index: if the column is a PK or UK, no histogram is built
        //    for a generated column, find the columns it depends on and add them to the set
        // b. Determine the sampling ratio, which is limited by the parameter
        //    histogram_generation_max_mem_size; if memory is sufficient, the whole
        //    table is read for analysis
   |--> fill_value_maps  // read the data of the columns to be analyzed from the table
     |--> ha_sample_init
     |--> ha_sample_next
       |--> handler::sample_next  // read a record, sampling via random numbers
            Value_map<T>::add_values  // add the read data to the map
     |--> ...
     |--> ha_sample_end
   |--> build_histogram  // create the histogram object
        // determine the histogram type: if the number of distinct values is less
        // than the number of buckets, use Singleton, otherwise Equi_height
     |--> Singleton<T>::build_histogram
     |--> Equi_height<T>::build_histogram
   |--> Histogram::store_histogram  // store the histogram in the column_statistics table
     |--> dd::cache::Dictionary_client::update<dd::Column_statistics>
 |--> drop_histogram  // delete the histogram
The use of the histogram
The way to use it is simpler:
First, TABLE_SHARE gains a member m_histograms, an unordered map whose key is the field index and whose value is the corresponding histogram object.
The correlation stack for obtaining column value filtering is as follows:
get_histogram_selectivity
 |--> Histogram::get_selectivity
   |--> get_equal_to_selectivity_dispatcher
   |--> get_greater_than_selectivity_dispatcher
   |--> get_less_than_selectivity_dispatcher
 |--> write_histogram_to_trace  // write the estimate to the optimizer trace
MySQL supports the use of histograms for a variety of operation types, including:
col_name = constant
col_name <> constant
col_name != constant
col_name > constant
col_name < constant
col_name >= constant
col_name <= constant
col_name IS NULL
col_name IS NOT NULL
col_name BETWEEN constant AND constant
col_name NOT BETWEEN constant AND constant
col_name IN (constant[, constant] ...)
col_name NOT IN (constant[, constant] ...)
Through the histogram, we can estimate the selectivity of conditions on a column, which helps select a better execution plan. Without histograms we would need an index on the column to obtain a reasonably accurate value distribution, but indexes are expensive to maintain, while histograms can be created flexibly and on demand.
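As an example of a range predicate on an equi-height histogram, the sketch below estimates `col < constant`. It is illustrative Python, not the dispatcher code above, and it assumes a uniform distribution within each bucket (a common simplification in selectivity estimation):

```python
def equi_height_less_than(buckets, value):
    """Estimate the selectivity of `col < value` from an equi-height
    histogram. Each bucket is [lower, upper, cumulative_frequency, ndv],
    as in the four-bucket JSON shown earlier."""
    prev_cum = 0.0
    for lower, upper, cum, _ndv in buckets:
        if value <= lower:
            return prev_cum           # entirely before this bucket
        if value <= upper:
            # interpolate inside the bucket, assuming uniform distribution
            frac = (value - lower) / (upper - lower) if upper > lower else 0.0
            return prev_cum + (cum - prev_cum) * frac
        prev_cum = cum
    return 1.0                        # value is beyond the last bucket
```

Multiplying the returned selectivity by the table row count gives the row estimate the optimizer feeds into its cost comparison.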
References
WL#5384 PERFORMANCE_SCHEMA, HISTOGRAMS
WL#8706 Persistent storage of Histogram data
WL#8707 Classes/structures for Histograms
WL#8943 Extend ANALYZE TABLE with histogram support
WL#9223 Using histogram statistics in the optimizer
other
Optimize rec_per_key
Relevant worklogs:
WL#7338: Interface for improved records per key estimates
WL#7339: Use improved records per key estimate interface in optimizer
MySQL uses rec_per_key (the number of records corresponding to each index key value) in its row estimates. In earlier versions this value was an integer, so fractional values were rounded and rec_per_key could not be represented accurately, which affected index selection. In version 5.7, the recorded value was changed to a float type.
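A tiny illustration of why the integer representation hurts (this is a made-up example, using plain truncation to stand in for the old integer storage):

```python
def expected_rows(keys, rec_per_key):
    """Estimated rows fetched when probing an index for `keys` key values."""
    return keys * rec_per_key

# Two candidate indexes on the same table: idx_a averages 1.2 rows per key,
# idx_b averages 1.9. Stored as integers, both collapse to the same value,
# so the optimizer cannot tell the indexes apart.
float_a, float_b = expected_rows(1000, 1.2), expected_rows(1000, 1.9)
int_a, int_b = expected_rows(1000, int(1.2)), expected_rows(1000, int(1.9))
```

With float storage the two estimates differ by 700 rows; with integer storage they are identical, and the cheaper-looking index may actually be the worse one.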
Taking the data cache state into account in cost calculation
Relevant worklog:
WL#7168 API for estimates for how much of table and index data that is in memory buffer
WL#7170: InnoDB buffer estimates for tables and indexes
WL#7340 IO aware cost estimate function for data access
In previous versions, the optimizer had no way of knowing the state of the data: whether it was cached in memory or had to be read from disk. Lacking this information, the optimizer simply assumed the data was on disk when calculating cost, which could lead to inefficient execution plans.
Related code:
Server layer API to obtain the percentage of data on a table or index stored in the cache
handler::table_in_memory_estimate
handler::index_in_memory_estimate
At the InnoDB layer, a global variable buf_stat_per_index (of type buf_stat_per_index_t) is added to maintain the number of in-memory leaf pages for each index; internally it implements a lock-free hash structure keyed by (m_space_id << 32 | m_index_id). When a page is read into the buffer pool or a new page is created, the counter is incremented if the page is a leaf page; when the page is removed from the page hash, the counter is decremented.
To reduce the performance impact, the counters are stored in a lock-free hash structure, corresponding to ut_lock_free_hash_t. The hash is a fixed-length array whose elements are (key, val) pairs; the slot is found from the hash of the key modulo the array size. When the array fills up, a new, larger array is created, and until all data has been migrated, searches must query both arrays. Once all records have been migrated to the new array and no thread is accessing the old one, the old hash can be removed.
The counters stored in the hash are themselves designed for multi-core and NUMA architectures, to avoid the CPU cache invalidation caused by concurrent updates of a single counter; this problem can be significant on machines with many cores. InnoDB encapsulates the counting operations in the ut_lock_free_cnt_t class: it maintains an array of counter slots, uses the CPU number as the index when updating, and sums the whole array whenever the counter value is needed.
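The sharded-counter idea can be sketched as follows. This Python model is only an analogy for ut_lock_free_cnt_t: the real code indexes by CPU number and uses atomic operations with no locks, while here threads hash to a shard and each shard has its own lock, since Python offers no CPU affinity:

```python
import threading

class ShardedCounter:
    """Keep one slot per shard so concurrent increments mostly touch
    different slots (different cache lines, in the C++ original), and sum
    all slots only when the value is actually read."""
    def __init__(self, num_shards=8):
        self._shards = [0] * num_shards
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def add(self, delta=1):
        # the real code uses the CPU number; here threads hash to a shard
        i = threading.get_ident() % len(self._shards)
        with self._locks[i]:
            self._shards[i] += delta

    def value(self):
        # reads are rare, so paying a full sum here is acceptable
        return sum(self._shards)
```

The trade-off is typical: updates become cheap and contention-free, while reads pay the cost of aggregating all slots, which fits a counter that is incremented on every page load but read only by the optimizer.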
The lock-free hash is not a general-purpose hash structure: for example, on a collision an entry may occupy another key's slot, and when the array is full the whole hash must be migrated to a new array. In fact, MySQL itself already implements LF_HASH, which does not need to migrate data when the hash grows.
You can read the number of cached pages for each index from information_schema.innodb_cached_indexes.
When the interface is defined and Innodb provides statistics, the optimizer can use this information to calculate overhead:
- Cost_model_table::page_read_cost
- Cost_model_table::page_read_cost_index