MySQL query cache mystery

Have feelings, have dry goods, wechat search [three prince Ao Bing] pay attention to this different programmer.

This article has been included in GitHub github.com/JavaFamily, there are a line of large factory interview complete test sites, information and my series of articles.

preface

We know that cache design is ubiquitous in RDBMS databases, such as the bug-ridden Oracle database, which boasts 2500W lines of code. SQL execution plans can be cached in the library cache to avoid hard parsing (syntax analysis -> semantic analysis -> generate execution plans) of the same SQL. SQL execution results are cached in the RESULT cache component, which effectively converts physical IO into logical IO and improves SQL execution efficiency.

MySQL QueryCache is similar to Oracle. It caches SQL statement text and the corresponding result set. It seems to be a great Idea. Removed in 5.7 and Removed in version 8.0, MySQL QueryCache is Removed.

QueryCache introduction

MySQL query slow (QC: The QueryCache stores the text of the SELECT statement and the result set sent to the client. If the same SQL is executed again, the Server will retrieve the result from the QueryCache and return it to the client instead of parsing the SQL again. The query cache is shared between sessions, so that a cached result set generated by one client can respond to another client executing the same SQL.

Back to the question at the beginning, how do YOU tell if SQL is shared?

It can be judged by whether the SQL text is exactly the same. Only when all characters, including case and space, are exactly the same, can it be shared. The advantage of sharing is that hard parsing can be avoided and the result can be directly obtained from QC and returned to the client.

--SQL 1
select id, balance from account where id = 121;
--SQL 2
select id, balance From account where id = 121;
Copy the code

The following is an example of how an Oracle database generates sql_id from SQL_TEXT. If the SQL_id is different, the SQL is not shared, and hard parsing occurs.

#! /usr/bin/perl -w
use Digest::MD5  qw(md5 md5_hex md5_base64);
use Math::BigInt;
my $stmt = "select id, balance from account where id = 121\0"; 
my $hash = md5 $stmt; 
my($a,$b,$msb,$lsb) = unpack("V*",$hash);
my $sqln = $msb*(2支那32)+$lsb;
my $stop = log($sqln) / log(32) + 1;
my $sqlid = ' ';
my $charbase32 = '0123456789abcdfghjkmnpqrstuvwxyz';
my @chars = split ' ', $charbase32;
for($i=0; $i < $stop-1; $i++){
  my $x = Math::BigInt->new($sqln);
  my $seq = $x->bdiv(32**$i)->bmod(32);
  $sqlid = $chars[$seq].$sqlid;
}
print "SQL is:\n $stmt \nSQL_ID is\n $sqlid\n";
Copy the code

You can see that SQL 1 and SQL 2 generate different SQL_id values from the code, so they are not shared.

SQL is:    select id, balance from account where id = 121 
SQL_ID is  dm5c6ck1g7bds
SQL is:    select id, balance From account where id = 121 
SQL_ID is  6xb8gvs5cmc9b
Copy the code

If you are asked to compare the contents of two Java code files, all you need to do is understand the code thoroughly, and you can transform your own business logic.

QueryCache configuration

mysql> show variables like '%query_cache%';
+------------------------------+----------+
| Variable_name                | Value    |
+------------------------------+----------+
| have_query_cache             | YES      |
| query_cache_limit            | 1048576  |
| query_cache_min_res_unit     | 4096     |
| query_cache_size             | 16777216 |
| query_cache_type             | OFF      |
| query_cache_wlock_invalidate | OFF      |
Copy the code

Variable_name	Description
have_query_cache	Query whether cache is available, YES- YES; NO- Not available. If you use standard binary MySQL, the value is always YES.
query_cache_limit	Controls the maximum size of a single query result set. The default is 1MB.
query_cache_min_res_unit	Query the size of cache fragment data blocks. The default value is 4KB, which meets most service scenarios.
query_cache_size	Query the cache size, in Bytes, 0 disables QueryCache. Note: Do not set the cache size too large, since threads lock QueryCache during updates, so you may see lock contention issues for very large caches.
query_cache_type	When the query_cache_size > 0; This variable affects how qc works and has three values: 0,1,2,0: disables caching or retrieval of cached results;1: cache is enabled, except for the SELECT SQL_NO_CACHE statement;2: only statements starting with SELECT SQL_CACHE are cached.

Query_cache_min_res_unit instructions

The default size is 4KB, and if many of the query results are small, the default data block size may cause memory fragmentation, which may force the query cache to remove the query from the cache due to insufficient memory.

In this case, you can reduce the value of query_cache_MIN_res_unit. The number of free blocks and queries removed due to pruning is given by the values of the Qcache_free_blocks and Qcache_lowmem_prunes status variables. If a large number of queries have a large result set, You can increase the value of this parameter to improve performance.

QueryCache is usually enabled

#Modify the MySQL configuration file /etc/my.cnf, add the following configuration, and restart the MySQL server.
[mysqld]
query_cache_size = 32M
query_cache_type = 1
Copy the code

QueryCache use

Start with some test data and test the scenarios with QueryCache disabled and enabled.

Create table users and insert 100W data.
CREATE TABLE `users` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `name` varchar(20) NOT NULL DEFAULT ' ' COMMENT 'name',
  `age` tinyint NOT NULL DEFAULT '0' COMMENT 'age',
  `gender` char(1) NOT NULL DEFAULT 'M' COMMENT 'gender',
  `phone` varchar(16) NOT NULL DEFAULT ' ' COMMENT 'Mobile phone Number',
  `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
  `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Modification time'.PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='User Information Table';

select count(*) from users;
+----------+
| count(*) |
+----------+
|  1000000 |
Copy the code

Disable the queryCache scenario

Without QueryCache, hard parsing occurs every time the same query is executed, consuming a lot of resources.

#Disable QueryCache configuration
query_cache_size = 0
query_cache_type = 0
Copy the code

Repeat the following query to observe the execution time.

-- Execute the query statement for the first time
mysql> select * from users order by create_time desc limit 10;
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| id      | name       | age | gender | phone       | create_time         | update_time         |
+---------+------------+-----+--------+-------------+---------------------+---------------------+
|  997855 | User997854 |  54 | M      | 15240540354 | 2020- 12- 15 14:34:50 | 2020- 12- 15 14:34:50 |.10 rows in set (0.89 sec)
Execute the same query statement a second time
mysql> select * from users order by create_time desc limit 10;
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| id      | name       | age | gender | phone       | create_time         | update_time         |
+---------+------------+-----+--------+-------------+---------------------+---------------------+
|  997855 | User997854 |  54 | M      | 15240540354 | 2020- 12- 15 14:34:50 | 2020- 12- 15 14:34:50 |.10 rows in set (0.90 sec)
-- profile Tracking
mysql> show profile cpu,block io for query 1;  
+----------------------+----------+----------+------------+--------------+---------------+
| Status               | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+----------------------+----------+----------+------------+--------------+---------------+
| preparing            | 0.000022 | 0.000017 |   0.000004 |            0 |             0 |
| Sorting result       | 0.000014 | 0.000009 |   0.000005 |            0 |             0 |
| executing            | 0.000011 | 0.000007 |   0.000004 |            0 |             0 |
| Sending data         | 0.000021 | 0.000016 |   0.000004 |            0 |             0 |
| Creating sort index  | 0.906290 | 0.826584 |   0.000000 |            0 |             0 |
Copy the code

As you can see, the execution time of the same SQL query is approximately 0.89 seconds, and the time is mostly spent in the Creating sort Index phase.

Enable the queryCache scenario

When query caching is enabled, the SQL text and query results will be cached in QC when the query statement is executed for the first time, and the same SQL execution will be performed next time to obtain data from QC and return it to the client.

#Disable QueryCache configuration
query_cache_size = 32M
query_cache_type = 1
Copy the code

-- Execute the query statement for the first time
mysql> select * from users order by create_time desc limit 10;
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| id      | name       | age | gender | phone       | create_time         | update_time         |
+---------+------------+-----+--------+-------------+---------------------+---------------------+
|  997855 | User997854 |  54 | M      | 15240540354 | 2020- 12- 15 14:34:50 | 2020- 12- 15 14:34:50 |.10 rows in set (0.89 sec)
-- Execute the query statement for the second time
mysql> select * from users order by create_time desc limit 10;
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| id      | name       | age | gender | phone       | create_time         | update_time         |
+---------+------------+-----+--------+-------------+---------------------+---------------------+
|  997855 | User997854 |  54 | M      | 15240540354 | 2020- 12- 15 14:34:50 | 2020- 12- 15 14:34:50 |.10 rows in set (0.00 sec)
-- profile tracks data
mysql> show profile cpu,block io for query 3;
+--------------------------------+----------+----------+------------+--------------+---------------+
| Status                         | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+--------------------------------+----------+----------+------------+--------------+---------------+
| Waiting for query cache lock   | 0.000016 | 0.000015 |   0.000001 |            0 |             0 |
| checking query cache for query | 0.000007 | 0.000007 |   0.000000 |            0 |             0 |
| checking privileges on cached  | 0.000004 | 0.000003 |   0.000000 |            0 |             0 |
| checking permissions           | 0.000034 | 0.000033 |   0.000001 |            0 |             0 |
| sending cached result to clien | 0.000018 | 0.000017 |   0.000001 |            0 |             0 |
Copy the code

The first execution of QueryCache does not cache THE SQL text and data, and the execution time is 0.89s. Since QC is enabled, the SQL text and execution results are cached in QC. The second execution executes the same SQL query statement, directly hits QC and returns data, without hard parsing. Sending cached result to client Sends the QC data back to the client.

Query cache hit ratio

Query the cache-related status variable

mysql>SHOW GLOBAL STATUS LIKE 'QCache\_%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_free_blocks      | 1        |  Query the number of memory blocks available in the cache.
| Qcache_free_memory      | 33268592 |  Query the amount of memory available for the cache.
| Qcache_hits             | 121      |  -- The number of times the result set was obtained from QC.
| Qcache_inserts          | 91       |  -- The number of times the query result set is added to QC, meaning the query is no longer in QC.
| Qcache_lowmem_prunes    | 0        |  -- The number of queries that were removed from the query cache due to insufficient memory.
| Qcache_not_cached       | 0        |  -- The number of uncached queries.
| Qcache_queries_in_cache | 106      |  -- The number of queries registered in the query cache.
| Qcache_total_blocks     | 256      |  Query the total number of blocks in the cache.
Copy the code

Query cache hit ratio and average size

                                          Qcache_hits
Query cache hit rate = ------------------------------------------------ x 100%
                       Qcache_hits + Qcache_inserts + Qcache_not_cached
                       
                              query_cache_size = Qcache_free_memory
Query Cache Avg Query Size = --------------------------------------- 
                                     Qcache_queries_in_cache
Copy the code

Impact of update operation on QC

For example, in the transfer logic of the payment system, the account should be locked first and then the balance should be modified. The main steps are as follows:

Query_ID	Query	Description
1	reset query cache	Clear the query cache.
2	select balance from account where id = 121	First execution, missed QC, added to QC.
3	select balance from account where id = 121	Hit QC, return result directly.
4	update account set balance = balance – 1000 where id = 121	Update, lock query Cche for update, cache data invalid.
5	select balance from account where id = 121	Cache invalidated, missed, added to QC.
6	select balance from account where id = 121	Hit QC, return result directly.

In this case, QC is not suitable, because the first execution of the query SQL did not hit, return results to the client, after adding SQL text and result set to QC, the next execution of the same SQL directly return results from QC, no hard parsing operation, but each Update is the first Update data. Then lock QC and update the cache results, will lead to the previous cache results invalid, execute the same query SQL or not hit, have to add to QC again, such frequent lock QC-> check QC-> add QC-> update QC is very consumption of resources, reduce the database concurrent processing capacity.

Why was QueryCache abandoned

Common Service Scenarios

Service systems can be divided into OnLine Transaction Processing (OLTP) and OnLine Analysis Processing (OLAP) by operation types. It can also be divided into BOSS (Business Operation Support System) and BASS (Business Analysis Support System). To summarize the characteristics of these two types of systems.

Suitable for QueryCache scenarios

First, the size of the QueryCache QC is only a few megabytes, which is not a good place to set the cache too large, and because threads lock QueryCache during updates, you may see lock contention issues for very large caches. So what are the situations that help you benefit from query caching? Here are the ideal conditions:

The same query is issued repeatedly by the same or multiple clients.
The underlying data being accessed is static or semi-static in nature.
Queries can be resource intensive and/or build short but computationally complex result sets that are small.
Concurrency and query QPS are not high.

These four conditions is ideally, the actual business systems are CRUD operations, data update more frequently, query interface of QPS is higher, so can meet the above ideally business scenario is very few, I can think of is the configuration tables, these basic data dictionary table is static or half static, can through the QC to improve the query efficiency.

Not suitable for QueryCache scenarios

If the table data changes rapidly, the query cache will be invalidated, and the server will become overloaded and slower to process due to the constant removal of queries from the cache. If the data is updated every few seconds or more frequently, the query cache will be less likely to be appropriate.

At the same time, the query cache uses a single mutex to control access to the cache, effectively imposing a single-threaded gateway on the server SQL processing engine, which can become a performance bottleneck in the case of high query QPS, seriously slowing down query processing speed. Therefore, query caching is disabled by default in MySQL 5.6.

Delete QueryCache

The query cache is deprecated as of MySQL 5.7.20, and is removed in MySQL 8.0.deprecation includes query_cache_type, From default disabling MySQL 5.6, deprecating MySQL 5.7, and removing MySQL 8.0 completely, Oracle has taken a number of factors into consideration to make this choice.

The QueryCache feature is too demanding for business scenarios, and it is difficult to match the actual business. Moreover, after enabling it, the concurrency and processing capacity of the database will be reduced a lot. To summarize the main reasons why MySQL was Removed from Disabled->Deprecated->Removed QueryCache.

At the same time, QueryCache fragmentation can also cause server load increase, affecting the stability of the database, Oracle official search QueryCache, there are many bugs, which determines MySQL 8.0 directly Remove this feature decisively.

conclusion

MySQL QueryCache was originally designed to reduce the cost of hard parsing caused by repeated SQL queries, while converting physical IO to logical IO to improve the efficiency of SQL execution. However, MySQL has gone through multiple iterations. At the same time in the hardware storage development of fast today, QC almost no benefit, and will reduce the database concurrent processing capacity, and finally in version 8.0 directly removed Removd.

In fact, cache design ideas are ubiquitous in hardware and software. Hardware: RAID cards and CPUS all have their own caches, but there are too many software aspects, such as OS cache, database buffer pool and Java program cache. As a R&D engineer, it is very important to select appropriate cache schemes according to business scenarios. If none of them are suitable, customized development cache is required. To better Match their own business scene, today we talk about so much, I hope to help you.

I’m Aobing, the more you know, the more you don’t know, thank you for your talent: likes, favorites and comments, we’ll see you next time!

This article is constantly updated. You can search “Santaizi Aobing” on wechat and read it for the first time. Reply [Information] There are the interview materials and resume templates for first-line big factories prepared by me.