Foreword

MySQL version:

mysql> select version();
+------------+
| version()  |
+------------+
| 5.7.17     |
+------------+
1 row in set (0.00 sec)

Table structure:

mysql> desc test;
+--------+---------------------+------+-----+---------+----------------+
| Field  | Type                | Null | Key | Default | Extra          |
+--------+---------------------+------+-----+---------+----------------+
| id     | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
| val    | int(10) unsigned    | NO   | MUL | 0       |                |
| source | int(10) unsigned    | NO   |     | 0       |                |
+--------+---------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

id is the auto-increment primary key, and val has a non-unique secondary index.
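For reference, a table definition consistent with the desc output above looks roughly like this (a sketch: the column types and the index name val are taken from the output, while the InnoDB engine is an assumption that the buffer pool experiment below relies on):

-- Sketch of a schema matching the desc output above (engine assumed: InnoDB).
create table test (
  id     bigint(20) unsigned not null auto_increment,
  val    int(10) unsigned    not null default 0,
  source int(10) unsigned    not null default 0,
  primary key (id),
  key val (val)
) engine=InnoDB;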

The table holds a large amount of data, about 5.24 million rows in total:

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
|  5242882 |
+----------+
1 row in set (4.25 sec)

We know that queries of the form limit offset, rows become inefficient when the offset is large:

mysql> select * from test where val=4 limit 300000,5;
+---------+-----+--------+
| id      | val | source |
+---------+-----+--------+
| 3327622 |   4 |      4 |
| 3327632 |   4 |      4 |
| 3327642 |   4 |      4 |
| 3327652 |   4 |      4 |
| 3327662 |   4 |      4 |
+---------+-----+--------+
5 rows in set (15.98 sec)

To achieve the same goal, we usually rewrite it as:

mysql> select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id;
+---------+-----+--------+---------+
| id      | val | source | id      |
+---------+-----+--------+---------+
| 3327622 |   4 |      4 | 3327622 |
| 3327632 |   4 |      4 | 3327632 |
| 3327642 |   4 |      4 | 3327642 |
| 3327652 |   4 |      4 | 3327652 |
| 3327662 |   4 |      4 | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.38 sec)

The time difference is obvious.

Why such a difference? Let's look at how select * from test where val=4 limit 300000,5; is executed:

1. Scan the leaf nodes of the secondary index on val.
2. For each matching entry, use the primary key value stored in the leaf node to look up the clustered index and fetch all the required column values.

Something like this:

As shown above, MySQL has to visit the secondary index 300,005 times and perform 300,005 clustered-index lookups, only to discard the first 300,000 rows and return the last 5. MySQL spends an enormous amount of random I/O on clustered-index data, and 300,000 of those random reads fetch rows that never appear in the result set.
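The expensive part is the clustered-index lookups, not the index scan itself. An InnoDB secondary index leaf entry already stores the primary key value, so a query that selects only id can be answered entirely from the val index, with no clustered-index lookups at all; this is exactly what the subquery in the rewritten form exploits. A minimal illustration (timings omitted):

-- Covering scan: only the val index is read, because its entries already contain id.
select id from test where val=4 limit 300000,5;

-- The rewrite then fetches full rows for just those 5 ids via the primary key.
select a.* from test a
inner join (select id from test where val=4 limit 300000,5) b on a.id = b.id;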

This raises an obvious question: since the query already walks the val index, why not traverse the index leaf nodes to the last five entries first, and only then go to the clustered index for the actual rows? That would take only 5 random I/Os, a process similar to the one shown below:

Actually, that was exactly the question I wanted to ask, too.

Verification

Let’s verify the above inference in practice:

The conclusion to verify: select * from test where val=4 limit 300000,5; scans 300,005 entries of the val index and performs 300,005 clustered-index lookups. To confirm this, we need to know whether MySQL offers a way to count, for a single SQL statement, how many times data nodes are accessed through index nodes. I tried the Handler_read_* series of status variables first; unfortunately, none of them met the requirement.
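For reference, the kind of check attempted with the Handler_read_* series looks roughly like this (a sketch, not necessarily the exact commands used):

-- Reset the session status counters, run the query, then inspect the handler counters.
flush status;
select * from test where val=4 limit 300000,5;
show session status like 'Handler_read%';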

I can only confirm this indirectly:

InnoDB has a buffer pool that caches recently accessed pages, both data (clustered-index) pages and secondary-index pages. So we can run the two SQL statements separately, each starting from a cold buffer pool, and compare how many pages of the test table end up in the buffer pool afterwards. The prediction: select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id; should load far fewer data pages into the buffer pool than select * from test where val=4 limit 300000,5;, because the former accesses the data pages only 5 times while the latter accesses them 300,005 times.

First, test select * from test where val=4 limit 300000,5. Before executing it, check the buffer pool:

mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
Empty set (0.04 sec)

As you can see, there are no data pages for the test table in the buffer pool.

mysql> select * from test where val=4 limit 300000,5;
+---------+-----+--------+
| id      | val | source |
+---------+-----+--------+
| 3327622 |   4 |      4 |
| 3327632 |   4 |      4 |
| 3327642 |   4 |      4 |
| 3327652 |   4 |      4 |
| 3327662 |   4 |      4 |
+---------+-----+--------+
5 rows in set (26.19 sec)

mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY    |     4098 |
| val        |      208 |
+------------+----------+
2 rows in set (0.04 sec)

As you can see, there are 4098 data pages and 208 index pages in the buffer pool for the test table.

Next, test select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id. Restart MySQL first so that the buffer pool is empty again:

mysqladmin shutdown
/usr/local/bin/mysqld_safe &
mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
Empty set (0.03 sec)

Run the SQL:

mysql> select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id;
+---------+-----+--------+---------+
| id      | val | source | id      |
+---------+-----+--------+---------+
| 3327622 |   4 |      4 | 3327622 |
| 3327632 |   4 |      4 | 3327632 |
| 3327642 |   4 |      4 | 3327642 |
| 3327652 |   4 |      4 | 3327652 |
| 3327662 |   4 |      4 | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.09 sec)

mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY    |        5 |
| val        |      390 |
+------------+----------+
2 rows in set (0.03 sec)

The first SQL loaded 4098 data pages into the buffer pool, while the second loaded only 5, which matches our prediction. It also confirms why the first SQL is slow: it reads a huge number of useless rows (300,000) through the clustered index and then discards them.

In addition, the first SQL causes another problem: it loads a large number of data pages that are never actually needed into the buffer pool, polluting the buffer pool and occupying space that hotter data could use.
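If you want to gauge how much buffer pool space a table's pages occupy, a rough check looks like this (a sketch; it assumes the default 16 KB innodb_page_size):

-- Approximate buffer pool space used by the test table (16 KB pages assumed).
select TABLE_NAME, count(*) as pages, round(count(*) * 16 / 1024, 1) as approx_mb
from information_schema.INNODB_BUFFER_PAGE
where TABLE_NAME like '%test%'
group by TABLE_NAME;

Note that querying INNODB_BUFFER_PAGE is itself fairly expensive and is best avoided on busy production servers.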

Problems encountered

To ensure that the buffer pool is empty after every restart, we need to turn off innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup. These options control whether the buffer pool contents (a list of page IDs) are dumped to disk when the database shuts down and loaded back from disk into the buffer pool when it starts.
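For reference, the settings could be applied roughly as follows (a sketch for this experiment, not a production recommendation; innodb_buffer_pool_dump_at_shutdown is dynamic, while innodb_buffer_pool_load_at_startup can only be set at server startup):

-- Can be turned off at runtime:
set global innodb_buffer_pool_dump_at_shutdown = OFF;

-- Read-only at runtime; set it at startup, e.g. in the [mysqld] section of my.cnf:
--   innodb_buffer_pool_load_at_startup = OFF

-- Verify the current values:
show global variables like 'innodb_buffer_pool_dump_at_shutdown';
show global variables like 'innodb_buffer_pool_load_at_startup';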


