We have already described the basic data storage unit of the Buffer Pool, the cache page. Cached pages store row after row of data, and each cached page has a description.

How do you initialize the Buffer Pool when MySQL starts? How do you load data files from disk into cached pages?

When MySQL is started, a Buffer Pool is allocated to the Buffer Pool based on the configuration. The default Buffer Pool size of 16K cache pages is then divided into each cache page and their corresponding description data blocks, which are relatively small.

The Buffer Pool will then create a free list and place the description blocks in the free list. The free list is common. If there are more than one Buffer Pool, they will share the free list.

As shown in the figure above, the free list is a bidirectional list, which contains Pointers to the description data blocks of each cache page, pointing to the description data in each Buffer Pool.

In addition, the free list has a base node that refers to the first and last nodes of the bidirectional list and keeps track of how many nodes there are in the free list, that is, how many free buffer pages.

Now with the free list, when we load the data page from disk to the Buffer Pool cache page, we can get a description block from the free list, and then we can get the corresponding free cache page of this description block.

After loading the data page from disk into the Buffer Pool, related description data will be written to the description block of the Buffer Pool, such as the tablespace that the data page belongs to, page number, and finally remove the node from free.

How do I know if the data page has been loaded into the Buffer Pool?

When we operate data, we must first check whether the data page corresponding to this data has been loaded into the cache page, if it has been loaded, it will be directly used. If it is not loaded, it finds a free cache page from the free list, writes the description data, and removes the node from the free list.

The basis for its judgment,isA hash table of a database.

This hash table uses table space number + data page number as key and buffer page address as value.

When you use a data page, check whether the page exists in the hash table. If it exists, the buffer page is found by value, if it does not exist, it is loaded, and the information about the page is written to the hash table.

When we load the data page to the cache page, we can manipulate the data. When you complete the update of the cache page data, the data on the cache page is inconsistent with the data on the disk. How to brush the data to the disk?

You can’t just flush all the cached pages to disk every time you change a piece of data, right? Some of them may have been loaded into the Buffer Pool for reading and not modified. There is no need to flush the Buffer.

InnoDB also has a Flush linked list, which is similar to the Free list. Each node in the Flush linked list holds a pointer to a block. When a page is modified, the corresponding block is added to the Flush linked list. As shown below:

When MySQL is started, a Buffer Pool will be allocated, and then a cache of pages will be divided. These pages are empty and will be added to the free list. When accessing data, check whether the cache page is in the data page cache hash table. If it is, use it directly. If it is not, load the data page from the disk file and remove the node from the free list. When the changes are complete, the cached page is added to the Flush list and is waiting to be flushed to disk.

Next, we need to consider a question: if we keep loading data pages from disk into the free list, the free list will keep removing free cache pages. There will always be no free cache pages in the free list, and then we need to load data pages into the free cache page.

The size of the Buffer Pool is fixed, and the number of free cache pages is also fixed when the free list is initialized. When all the cache pages are filled, some cache pages need to be eliminated.

Which ones to eliminate?

This brings in cache hit ratios.

If we have two cached pages, one that accesses 100 times in a minute and the other that accesses 10 times in a minute, we can say that the former has a higher cache hit ratio than the latter.

In this case, a cached page must be eliminated, which must be the one with a low cache hit ratio.

At this point, many students will certainly think of it, yes, is the LRU algorithm.

LRU, least recently used

InnoDB uses LRU linked lists to know which cached pages have been used least recently, and when you need to weed out some cached pages, you know which cached pages to weed out.

The feature of LRU list is that as long as you visit a node, it will move the node to the head of the list, that is to say, the most recently visited node must be at the head of the LRU list, the less visited node, naturally to the end of the list.

This way, when the cache page has not been accessed for a long time, it is at the end of the LRU linked list. When it is time to weed out the cache page, these infrequently accessed nodes can be weeded out and the disk file data page needed can be reloaded into the free cache page.

Conclusion:

This paper introduces three linked lists:

Free linked list, which manages idle cache pages in the Buffer Pool;

Flush linked list that manages cached pages that have been modified to flush files to disk;

Lru linked list, with LRU algorithm to manage which cache page access frequency is high, which cache page access frequency is low, when there is no idle cache page, you can put the low frequency of access to the buffer page expired, reload the cache page needed.

END