MySQL 8.0 has been released, and the optimizer, long a target of criticism, has changed a great deal in the new version. Since I used to work with 5.6 and only recently switched to 8.0, this article covers the optimizer-related parts I find interesting: mainly the cost model introduced in MySQL 5.7 and the histogram feature of MySQL 8.0.
Based on the latest release, MySQL 8.0.12, this article introduces the usage and related code of the cost model and histograms.
Cost Model
Configurable cost constants
Why do we need configurable cost model constants? MySQL has been around for decades, yet the optimizer still used hard-coded weights to measure I/O, CPU, and so on. These weights were derived from years, even decades, of experience. Think about how fast hardware has evolved: servers with dozens or hundreds of cores are now common in large companies, SSDs have long been mainstream and NVMe is on the rise, and high-speed RDMA networks are spreading. All of this affects how database systems are designed and implemented. Clearly those hard-coded weights are out of date, and we need a way for users to define them themselves, or better yet, to set them automatically and intelligently based on the hardware environment.
MySQL 5.7 introduced two new system tables that are exposed to the user for updates:
root@(none) 04:05:24>select * from mysql.server_cost;
+------------------------------+------------+---------------------+---------+---------------+
| cost_name                    | cost_value | last_update         | comment | default_value |
+------------------------------+------------+---------------------+---------+---------------+
| disk_temptable_create_cost   |       NULL | 2018-04-23 13:55:20 | NULL    |            20 |
| disk_temptable_row_cost      |       NULL | 2018-04-23 13:55:20 | NULL    |           0.5 |
| key_compare_cost             |       NULL | 2018-04-23 13:55:20 | NULL    |          0.05 |
| memory_temptable_create_cost |       NULL | 2018-04-23 13:55:20 | NULL    |             1 |
| memory_temptable_row_cost    |       NULL | 2018-04-23 13:55:20 | NULL    |           0.1 |
| row_evaluate_cost            |       NULL | 2018-04-23 13:55:20 | NULL    |           0.1 |
+------------------------------+------------+---------------------+---------+---------------+
6 rows in set (0.00 sec)

The default_value column is a generated column:

`default_value` float GENERATED ALWAYS AS (
  (case `cost_name`
    when _utf8mb3'disk_temptable_create_cost'   then 20.0
    when _utf8mb3'disk_temptable_row_cost'      then 0.5
    when _utf8mb3'key_compare_cost'             then 0.05
    when _utf8mb3'memory_temptable_create_cost' then 1.0
    when _utf8mb3'memory_temptable_row_cost'    then 0.1
    when _utf8mb3'row_evaluate_cost'            then 0.1
    else NULL end)) VIRTUAL
root@(none) 04:05:35>select * from mysql.engine_cost;
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| engine_name | device_type | cost_name | cost_value | last_update | comment | default_value |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| default | 0 | io_block_read_cost | NULL | 2018-04-23 13:55:20 | NULL | 1 |
| default | 0 | memory_block_read_cost | NULL | 2018-04-23 13:55:20 | NULL | 0.25 |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
You can do this with an UPDATE statement, such as:
root@(none) 04:05:52>update mysql.server_cost set cost_value = 40 where cost_name = 'disk_temptable_create_cost';
Query OK, 1 row affected (0.05 sec)
Rows matched: 1  Changed: 1  Warnings: 0

root@(none) 04:07:13>select * from mysql.server_cost where cost_name = 'disk_temptable_create_cost';
+----------------------------+------------+---------------------+---------+---------------+
| cost_name                  | cost_value | last_update         | comment | default_value |
+----------------------------+------------+---------------------+---------+---------------+
| disk_temptable_create_cost |         40 | 2018-06-23 16:07:05 | NULL    |            20 |
+----------------------------+------------+---------------------+---------+---------------+
1 row in set (0.00 sec)

root@(none) 10:10:12>flush optimizer_costs;
Query OK, 0 rows affected (0.00 sec)
As you can see, usage is very simple. There are just two tables: server_cost and engine_cost, holding the constants for the server layer and the engine layer respectively.
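To make the role of these constants concrete, here is a rough sketch of how server_cost and engine_cost values could feed into a plan cost. This is a Python illustration with hypothetical function names, and the formula is a deliberate simplification of what the optimizer actually computes:

```python
# The constants mirror the mysql.server_cost / mysql.engine_cost defaults shown above.
SERVER_COST = {
    "row_evaluate_cost": 0.1,   # cost to evaluate the conditions on one row
    "key_compare_cost": 0.05,   # cost of one key comparison (e.g. during sorting)
}
ENGINE_COST = {
    "io_block_read_cost": 1.0,      # read one page from disk
    "memory_block_read_cost": 0.25, # read one page already in the buffer pool
}

def table_scan_cost(pages, rows, in_memory_ratio=0.0):
    """Rough cost of a full table scan: page reads weighted by where the
    pages live (disk vs buffer pool), plus a per-row evaluation cost."""
    io = pages * (1 - in_memory_ratio) * ENGINE_COST["io_block_read_cost"]
    mem = pages * in_memory_ratio * ENGINE_COST["memory_block_read_cost"]
    cpu = rows * SERVER_COST["row_evaluate_cost"]
    return io + mem + cpu
```

With the default weights, a 100-page, 1000-row scan costs far less when the pages are cached, which is exactly why the constants are worth tuning for modern hardware.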
Related code:
The global cache Cost_constant_cache
The global cache maintains the current cost model. At lex_start, each user thread checks whether its local pointer has been initialized; if not, it copies the pointer from the global cache.
Initialize global cache:
Cost_constant_cache::init creates a Cost_model_constants object, which contains two kinds of information: the server-layer cost model and the engine-layer cost model. The class structure is as follows:

Cost_constant_cache ----> Cost_model_constants
                           ---> Server_cost_constants
                                // server_cost
                           ---> Cost_model_se_info ---> SE_cost_constants
                                // engine_cost; values can come from the storage
                                // engine if it provides get_cost_constants
Both initialization and FLUSH OPTIMIZER_COSTS read the configuration from the system tables and update the cache:
read_cost_constants()
|--> read_server_cost_constants
|--> read_engine_cost_constants
Because users can update the system tables dynamically, some sessions may still be using the old version after FLUSH OPTIMIZER_COSTS is executed. The constants are therefore reference counted, and an old version is released only when its ref counter drops to 0.
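The lifecycle described above can be sketched as follows. This is a simplified Python model with hypothetical class names, not the actual C++ implementation; it only shows the swap-on-flush plus reference-counting idea:

```python
import threading

class CostConstants:
    """One immutable version of the cost constants, with an explicit
    reference count (in the spirit of Cost_model_constants)."""
    def __init__(self, values):
        self.values = dict(values)
        self._refs = 0
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            self._refs += 1
        return self

    def release(self):
        with self._lock:
            self._refs -= 1
            return self._refs == 0   # True -> last reference gone, safe to free

class CostConstantCache:
    """Sessions pin the current version at lex_start; a flush swaps in a new
    version, and an old one is freed only when its refcount reaches zero."""
    def __init__(self, values):
        self._current = CostConstants(values).acquire()  # the cache's own ref
        self._lock = threading.Lock()

    def get_cost_constants(self):            # what a session does at lex_start
        with self._lock:
            return self._current.acquire()

    def flush(self, new_values):             # FLUSH OPTIMIZER_COSTS
        with self._lock:
            old, self._current = self._current, CostConstants(new_values).acquire()
        old.release()  # drop the cache's ref; sessions may still hold theirs
```

A session that pinned the old constants keeps seeing them until it releases its reference, at which point the old version can actually be destroyed.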
Initializing the per-thread cost model
- Cost_model_server
The per-thread cost model is stored in THD::m_cost_model and is initialized at lex_start if it has not been initialized yet:
Cost_model_server::init()
  const Cost_model_constants *m_cost_constants = cost_constant_cache->get_cost_constants();
  // increments the reference count
  const Server_cost_constants *m_server_cost_constants = m_cost_constants->get_server_cost_constants();
  // gets the global pointer
A THD does not create its own copy of the cost model; it only references pointers stored in the cache.
Table Cost Model
The struct TABLE has a member m_cost_model of type Cost_model_table.
The value is taken from the Cost Model object stored in the above THD
Cost_estimate
The unified object type Cost_estimate stores the calculated cost result across four dimensions:
double io_cost; ///< cost of I/O operations
double cpu_cost; ///< cost of CPU operations
double import_cost; ///< cost of remote operations
double mem_cost; ///< memory used (bytes)
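The four dimensions above can be modeled as follows. This is an illustrative Python sketch of Cost_estimate, not the real C++ class; note that mem_cost is bookkeeping in bytes and is not summed into the comparable total:

```python
class CostEstimate:
    """Sketch of Cost_estimate: cost is kept per dimension so callers can
    reason about I/O vs CPU separately, while plan comparison uses a
    combined total."""
    def __init__(self, io_cost=0.0, cpu_cost=0.0, import_cost=0.0, mem_cost=0.0):
        self.io_cost = io_cost          # cost of I/O operations
        self.cpu_cost = cpu_cost        # cost of CPU operations
        self.import_cost = import_cost  # cost of remote operations
        self.mem_cost = mem_cost        # memory used (bytes), not part of the total

    def total_cost(self):
        return self.io_cost + self.cpu_cost + self.import_cost

    def __iadd__(self, other):
        self.io_cost += other.io_cost
        self.cpu_cost += other.cpu_cost
        self.import_cost += other.import_cost
        self.mem_cost += other.mem_cost
        return self
```

Keeping the dimensions separate lets later features (such as the in-memory page estimates discussed at the end of this article) adjust just the I/O component.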
In the future
Right now, you cannot arrive at reasonable configuration values without extensive testing against your own workload. How to configure the constants, and what values are reasonable, is something I believe could be adjusted automatically; the key is to figure out how the configuration maps to the hardware conditions. This is a direction worth working toward.
References:
1. The MySQL Optimizer Cost Model
3. Optimizer Cost Model Improvements in MySQL 5.7.5 DMR
4. Slide: MySQL Cost Model
Related Worklog:
WL#7182: Optimizer Cost Model API
WL#7209: Handler interface changes for new cost model
WL#7276: Configuration data base for Optimizer Cost Model
WL#7315 Optimizer cost model: main memory management of cost constants
WL#7316 Optimizer cost model: Command for online updating of cost model constants
Histogram
Histograms are another much-anticipated feature of MySQL; they are in fact common in other database products and give the optimizer good guidance for path selection. A histogram stores the data distribution of a specified column. MariaDB has supported this feature since the early 10.0.2 release, and MySQL supports it in the latest 8.0 release.
Usage
MySQL manages histograms through the ANALYZE TABLE syntax:
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL]
TABLE tbl_name
UPDATE HISTOGRAM ON col_name [, col_name] ...
[WITH N BUCKETS]
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL]
TABLE tbl_name
DROP HISTOGRAM ON col_name [, col_name] ...
Here’s a simple example:
Let's take an ordinary sysbench table as an example:

root@sb1 05:16:33>show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=200001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.01 sec)

Create a histogram; it is stored in the data dictionary:

root@sb1 05:16:38>ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k WITH 10 BUCKETS;
+-------------+-----------+----------+----------------------------------------------+
| Table       | Op        | Msg_type | Msg_text                                     |
+-------------+-----------+----------+----------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics created for column 'k'. |
+-------------+-----------+----------+----------------------------------------------+
1 row in set (0.55 sec)
root@sb1 05:17:03>ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k,pad with 10 BUCKETS;
+-------------+-----------+----------+------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-----------+----------+------------------------------------------------+
| sb1.sbtest1 | histogram | status | Histogram statistics created for column 'k'. |
| sb1.sbtest1 | histogram | status | Histogram statistics created for column 'pad'. |
+-------------+-----------+----------+------------------------------------------------+
2 rows in set (7.98 sec)

Delete the histogram from the pad column:

root@sb1 05:17:51>ANALYZE TABLE sbtest1 DROP HISTOGRAM ON pad;
+-------------+-----------+----------+------------------------------------------------+
| Table       | Op        | Msg_type | Msg_text                                       |
+-------------+-----------+----------+------------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics removed for column 'pad'. |
+-------------+-----------+----------+------------------------------------------------+
1 row in set (0.06 sec)

root@sb1 05:58:12>ANALYZE TABLE sbtest1 DROP HISTOGRAM ON k;
+-------------+-----------+----------+----------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics removed for column 'k'. |
+-------------+-----------+----------+----------------------------------------------+
1 row in set (0.08 sec)
The default number of buckets is 100:

root@sb1 05:58:27>ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k;
+-------------+-----------+----------+----------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics created for column 'k'. |
+-------------+-----------+----------+----------------------------------------------+
1 row in set (0.56 sec)
Histogram statistics are stored in the data dictionary and can be viewed through INFORMATION_SCHEMA.COLUMN_STATISTICS:
root@information_schema 05:34:49>SHOW CREATE TABLE INFORMATION_SCHEMA.COLUMN_STATISTICS\G
*************************** 1. row ***************************
                View: COLUMN_STATISTICS
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER VIEW `COLUMN_STATISTICS` AS select `mysql`.`column_statistics`.`schema_name` AS `SCHEMA_NAME`,`mysql`.`column_statistics`.`table_name` AS `TABLE_NAME`,`mysql`.`column_statistics`.`column_name` AS `COLUMN_NAME`,`mysql`.`column_statistics`.`histogram` AS `HISTOGRAM` from `mysql`.`column_statistics` where can_access_table(`mysql`.`column_statistics`.`schema_name`,`mysql`.`column_statistics`.`table_name`)
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.00 sec)
The underlying system table mysql.column_statistics is hidden and cannot be accessed directly.
Here’s a simple example:
root@sb1 05:58:55>ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k WITH 4 BUCKETS;
+-------------+-----------+----------+----------------------------------------------+
| sb1.sbtest1 | histogram | status   | Histogram statistics created for column 'k'. |
+-------------+-----------+----------+----------------------------------------------+
1 row in set (0.63 sec)
Query the histogram information on the table
root@sb1 06:00:43>SELECT JSON_PRETTY(HISTOGRAM) FROM INFORMATION_SCHEMA.COLUMN_STATISTICS WHERE SCHEMA_NAME='sb1' AND TABLE_NAME = 'sbtest1'\G
*************************** 1. row ***************************
JSON_PRETTY(HISTOGRAM): {
"buckets": [[38671, 99756, 0.249795, 17002], [99757, 100248, 0.500035, 492], [100249, 100743, 0.749945, 495], [100744, 172775, 1.0, 16630]],"data-type": "int"."null-values": 0.0."collation-id": 8,
"last-updated": "The 2018-09-22 09:59:30. 857797"."sampling-rate": 1.0."histogram-type": "equi-height"."number-of-buckets-specified": 4
}
1 row in set (0.00 sec)
As you can see from the JSON output, the histogram generated by the statement above has four buckets, the data type is int, and the type is equi-height (the other type is SINGLETON). For each bucket, the recorded information includes the lower and upper bounds of the values, the cumulative frequency, and the number of distinct values. From this information a fairly accurate data distribution can be obtained, allowing the optimizer to choose a better execution plan based on these statistics.
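The construction of such buckets can be sketched as follows. This is an illustrative Python function, not MySQL's actual builder (which works over a sampled value map and never lets equal values straddle a bucket boundary; that invariant is kept here too):

```python
def build_equi_height(values, num_buckets):
    """Build an equi-height histogram as a list of
    [lower, upper, cumulative_frequency, distinct_values] buckets,
    matching the shape of the JSON shown above."""
    values = sorted(values)
    n = len(values)
    per_bucket = n / num_buckets
    buckets, i = [], 0
    for b in range(1, num_buckets + 1):
        j = n if b == num_buckets else int(round(b * per_bucket))
        # grow the bucket so equal values never straddle a boundary
        while j < n and values[j] == values[j - 1]:
            j += 1
        if j <= i:
            continue  # earlier bucket already swallowed this range
        chunk = values[i:j]
        buckets.append([chunk[0], chunk[-1], j / n, len(set(chunk))])
        i = j
    return buckets
```

For 100 uniformly distributed values and 4 buckets, each bucket covers about a quarter of the rows; with heavy duplication, fewer buckets than requested may come out, which foreshadows the SINGLETON case below.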
If the column contains a large number of duplicates (fewer distinct values than buckets), MySQL may build a SINGLETON histogram instead. For example:
root@sb1 10:41:17>SELECT JSON_PRETTY(HISTOGRAM) FROM INFORMATION_SCHEMA.COLUMN_STATISTICS WHERE SCHEMA_NAME='sb1' AND TABLE_NAME = 'sbtest1'\G
*************************** 1. row ***************************
JSON_PRETTY(HISTOGRAM): {
"buckets": [[10, 0.499995], [20, 1.0]],"data-type": "int"."null-values": 0.0."collation-id": 8,
"last-updated": "The 2018-09-22 14:41:17. 312601"."sampling-rate": 1.0."histogram-type": "singleton"."number-of-buckets-specified": 100
}
1 row in set (0.00 sec)
As shown above, for the SINGLETON type each bucket contains only two values: the column value and the cumulative frequency (that is, the fraction of rows whose value is less than or equal to the value in the current bucket).
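Reading selectivity out of such cumulative buckets is straightforward. The following Python functions are an illustrative sketch (with made-up names), using the two-bucket histogram above as input:

```python
def singleton_leq_selectivity(buckets, value):
    """Estimate the selectivity of `col <= value` from a SINGLETON histogram.
    Each bucket is [value, cumulative_frequency], as in the JSON above."""
    selectivity = 0.0
    for bucket_value, cum_freq in buckets:
        if bucket_value > value:
            break
        selectivity = cum_freq  # cumulative: fraction of rows <= bucket_value
    return selectivity

def singleton_eq_selectivity(buckets, value):
    """Selectivity of `col = value`: this bucket's cumulative frequency
    minus the previous bucket's."""
    prev = 0.0
    for bucket_value, cum_freq in buckets:
        if bucket_value == value:
            return cum_freq - prev
        prev = cum_freq
    return 0.0
```

For the histogram above, `k = 20` comes out at roughly 0.5, i.e. half the table, which is exactly what a plan chooser needs to know before picking an index.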
Note the sampling-rate field. A value of 1.0 means all the data in the table was read to build the statistics. For large tables, however, we may not want to read that much data, since it could consume excessive memory, so MySQL provides the parameter histogram_generation_max_mem_size to limit memory usage.
If the table sees little DML, the histogram stays essentially accurate; but if the column is written frequently, the histogram may need to be rebuilt periodically. MySQL itself never updates histograms automatically.
The optimizer uses the histogram to calculate column predicate selectivity, and most predicate types are supported; see the official documentation for details.
For the impact of histograms on query plans, see this blog post and this one.
The relevant code
Code structure: taking MySQL 8.0.12 as an example, the main code is in the sql/histograms directory:
$ ls sql/histograms/
equi_height_bucket.cc  equi_height_bucket.h  equi_height.cc  equi_height.h
histogram.cc  histogram.h  singleton.cc  singleton.h
value_map.cc  value_map.h

namespace histograms
 |---> Histogram    // base class
 |---> Equi_height  // equi-height histogram; a template class parameterized by the
 |                  // data type, each type needs an explicit instantiation (see
 |                  // equi_height.cc)
 |---> Singleton    // singleton histogram; stores only each value and its frequency
Create and store histogram:
The relevant functions and stacks for processing histogram are as follows:
Sql_cmd_analyze_table::handle_histogram_command
 |--> update_histogram
   |--> histograms::update_histogram  // interface function in the namespace
        // a. Check the columns:
        //    histograms::field_type_to_value_map_type: is the column type supported?
        //    covered_by_single_part_index: if the column is a PK or UK, no histogram is built
        //    for a generated column, find the columns it depends on and add them to the set
        // b. Determine the sampling ratio, which is limited by the parameter
        //    histogram_generation_max_mem_size; if memory is sufficient, the whole
        //    table is read for analysis
   |--> fill_value_maps  // read the data of the columns to be analyzed from the table
     |--> ha_sample_init
     |--> ha_sample_next
       |--> handler::sample_next  // read a record, sampling via random numbers
            Value_map<T>::add_values  // add the read data to the map
     |--> ...
     |--> ha_sample_end
   |--> build_histogram  // create the histogram object
        // determine the histogram type: if the number of distinct values is less
        // than the number of buckets, use Singleton, otherwise Equi_height
     |--> Singleton<T>::build_histogram
     |--> Equi_height<T>::build_histogram
   |--> Histogram::store_histogram  // store the histogram in the column_statistics table
     |--> dd::cache::Dictionary_client::update<dd::Column_statistics>
 |--> drop_histogram  // delete the histogram
The use of the histogram
The way to use it is simpler:
First, TABLE_SHARE gains a member m_histograms, an unordered map whose key is the field index and whose value is the corresponding histogram object.
The correlation stack for obtaining column value filtering is as follows:
get_histogram_selectivity
 |--> Histogram::get_selectivity
   |--> get_equal_to_selectivity_dispatcher
   |--> get_greater_than_selectivity_dispatcher
   |--> get_less_than_selectivity_dispatcher
 |--> write_histogram_to_trace  // write the estimate to the optimizer trace
MySQL supports the use of histograms for a variety of operation types, including:
col_name = constant
col_name <> constant
col_name != constant
col_name > constant
col_name < constant
col_name >= constant
col_name <= constant
col_name IS NULL
col_name IS NOT NULL
col_name BETWEEN constant AND constant
col_name NOT BETWEEN constant AND constant
col_name IN (constant[, constant] ...)
col_name NOT IN (constant[, constant] ...)
Through the histogram, we can estimate the selectivity of conditions on a column, which helps select a better execution plan. Without histograms we would need an index on the column to obtain a reasonably accurate value distribution, but indexes are expensive to maintain, while histograms can be created flexibly and on demand.
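As an example of a range predicate on an equi-height histogram, the sketch below estimates `col < constant`. It is illustrative Python, not the dispatcher code above, and it assumes a uniform distribution within each bucket (a common simplification in selectivity estimation):

```python
def equi_height_less_than(buckets, value):
    """Estimate the selectivity of `col < value` from an equi-height
    histogram. Each bucket is [lower, upper, cumulative_frequency, ndv],
    as in the four-bucket JSON shown earlier."""
    prev_cum = 0.0
    for lower, upper, cum, _ndv in buckets:
        if value <= lower:
            return prev_cum           # entirely before this bucket
        if value <= upper:
            # interpolate inside the bucket, assuming uniform distribution
            frac = (value - lower) / (upper - lower) if upper > lower else 0.0
            return prev_cum + (cum - prev_cum) * frac
        prev_cum = cum
    return 1.0                        # value is beyond the last bucket
```

Multiplying the returned selectivity by the table row count gives the row estimate the optimizer feeds into its cost comparison.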
References
WL#5384 PERFORMANCE_SCHEMA, HISTOGRAMS
WL#8706 Persistent storage of Histogram data
WL#8707 Classes/structures for Histograms
WL#8943 Extend ANALYZE TABLE with histogram support
WL#9223 Using histogram statistics in the optimizer
other
Optimize rec_per_key
Relevant worklogs:
WL#7338: Interface for improved records per key estimates
WL#7339: Use improved records per key estimate interface in optimizer
MySQL uses rec_per_key (the number of records corresponding to each index key value) in its row estimates. In earlier versions this value was an integer, so fractional values were rounded and rec_per_key could not be represented accurately, which affected index selection. In version 5.7, the recorded value was changed to a float type.
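A tiny illustration of why the integer representation hurts (this is a made-up example, using plain truncation to stand in for the old integer storage):

```python
def expected_rows(keys, rec_per_key):
    """Estimated rows fetched when probing an index for `keys` key values."""
    return keys * rec_per_key

# Two candidate indexes on the same table: idx_a averages 1.2 rows per key,
# idx_b averages 1.9. Stored as integers, both collapse to the same value,
# so the optimizer cannot tell the indexes apart.
float_a, float_b = expected_rows(1000, 1.2), expected_rows(1000, 1.9)
int_a, int_b = expected_rows(1000, int(1.2)), expected_rows(1000, int(1.9))
```

With float storage the two estimates differ by 700 rows; with integer storage they are identical, and the cheaper-looking index may actually be the worse one.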
Taking the data cache state into account in cost calculation
Relevant worklog:
WL#7168 API for estimates for how much of table and index data that is in memory buffer
WL#7170: InnoDB buffer estimates for tables and indexes
WL#7340 IO aware cost estimate function for data access
In previous versions, the optimizer had no way of knowing the state of the data: whether it was cached in memory or had to be read from disk. Lacking this information, the optimizer simply assumed the data was on disk when calculating cost, which could lead to inefficient execution plans.
Related code:
Server layer API to obtain the percentage of data on a table or index stored in the cache
handler::table_in_memory_estimate
handler::index_in_memory_estimate
At the InnoDB layer, a global variable buf_stat_per_index (of type buf_stat_per_index_t) is added to maintain the number of in-memory leaf pages for each index; internally it implements a lock-free hash structure keyed by (m_space_id << 32 | m_index_id). When a page is read into the buffer pool or a new page is created, the counter is incremented if the page is a leaf page; when the page is removed from the page hash, the counter is decremented.
To reduce the performance impact, the counters are stored in a lock-free hash structure, corresponding to ut_lock_free_hash_t. The hash is a fixed-length array whose elements are (key, val) pairs; the slot is found from the hash of the key modulo the array size. When the array fills up, a new, larger array is created, and until all data has been migrated, searches must query both arrays. Once all records have been migrated to the new array and no thread is accessing the old one, the old hash can be removed.
The counters stored in the hash are themselves designed for multi-core and NUMA architectures, to avoid the CPU cache invalidation caused by concurrent updates of a single counter; this problem can be significant on machines with many cores. InnoDB encapsulates the counting operations in the ut_lock_free_cnt_t class: it maintains an array of counter slots, uses the CPU number as the index when updating, and sums the whole array whenever the counter value is needed.
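The sharded-counter idea can be sketched as follows. This Python model is only an analogy for ut_lock_free_cnt_t: the real code indexes by CPU number and uses atomic operations with no locks, while here threads hash to a shard and each shard has its own lock, since Python offers no CPU affinity:

```python
import threading

class ShardedCounter:
    """Keep one slot per shard so concurrent increments mostly touch
    different slots (different cache lines, in the C++ original), and sum
    all slots only when the value is actually read."""
    def __init__(self, num_shards=8):
        self._shards = [0] * num_shards
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def add(self, delta=1):
        # the real code uses the CPU number; here threads hash to a shard
        i = threading.get_ident() % len(self._shards)
        with self._locks[i]:
            self._shards[i] += delta

    def value(self):
        # reads are rare, so paying a full sum here is acceptable
        return sum(self._shards)
```

The trade-off is typical: updates become cheap and contention-free, while reads pay the cost of aggregating all slots, which fits a counter that is incremented on every page load but read only by the optimizer.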
The lock-free hash is not a general-purpose hash structure: for example, on a collision an entry may occupy another key's slot, and when the array is full the whole hash must be migrated to a new array. In fact, MySQL itself already implements LF_HASH, which does not need to migrate data when the hash grows.
You can read the number of cached pages for each index from information_schema.innodb_cached_indexes.
When the interface is defined and Innodb provides statistics, the optimizer can use this information to calculate overhead:
- Cost_model_table::page_read_cost
- Cost_model_table::page_read_cost_index