In fact, why mysql design prefetch mechanism is just to optimize performance. Now suppose you access data pages 1 through 56 in sequence. So mysql thinks you’re going to access page 60. It will of course load the data pages of the following data area into the buffer-pool. When you access page 60, you can perform the buffer-pool to get better performance.
LRU linked list is designed based on the idea of hot and cold data separation
The problem with our simple LRU list was that we mixed hot and cold data together, so now we separate the hot and cold data. I control it by configuring the innodb_old_blocks_pct parameter which defaults to 37. That means 37 percent of cold data.
So where was our data first loaded into the buffer-pool? It’s actually at the head of the cold data area. When will our cold data be transferred to the hot data? It’s accessed and used within 1 second of the data being loaded into the cold data area and then it goes to the head of the hot data. We can set innodb_old_blocks_time to 1000 by default which is our 1 second, we can also change it. Have we solved the problem with mysql prefetch? So let’s look at the picture and think about it. If the preread data loaded in all of a sudden is in the cold data area, then it must be accessed in the hot data area within 1 second. So the hot data area of our LRU linked list is the data that is often used and accessed. The cold data area stores cache pages that do not need to be paid attention to. When the Free linked list is not enough and some cache pages need to be eliminated, we directly go to the cold data area to find the tail node and start elimination.
Optimize the thermal data area
For the hot data area do I move to the head as soon as a cache is used I don’t think it should be. Because our hot data area is frequently accessed it is not appropriate to move list nodes frequently. So we have to move to the front for the bottom three quarters of the hot data area to be accessed. For the first quarter of the area used, it will not move. For example, let’s say our LRU linked list has 100 nodes in the hot data area. The first 25 nodes are not moved when they are used, and only the last 75 nodes are moved when they are used. This minimizes the movement of linked list nodes.
Let’s recall that when a cached page loads data into a buffer-pool, the first Free list removes the cached page, because Free holds Free cached pages that don’t load data. Then the flush linked list will record the dirty page and then the LRU will add you to the cold data area and if you are accessed within 1 second the cached page will go to the hot data area. All in all, Mysql needs to constantly operate these lists while doing CRUD. When is the data in the cache page flushed to disk?
- Periodically flush disk files (as shown above) it is not necessary to wait until there are no free cache pages to use before flushing them to disk files. It runs a scheduled task to flush cached pages from the cold data area to disk at regular intervals. These cached pages are then added back to the Free list, removed from the Flush list, and removed from the LRU list.
- There is no space to flush cached pages to disk. If only one data page is needed to load into the buffer-pool, it must find a node in the cold data area of the LRU list to flush that cache page to disk. Then the free list has one node free. Removed from LRU list and Flush list.
Here we share the mysql memory component buffer-pool and how we base our CRUD on the memory component. I’ve seen how three linked lists fit together.