1. Buffer Pool Overview
What is a Buffer Pool? The name literally means a pool of buffers, in other words a cache pool, and that is exactly what it is. It is a critical component of MySQL: practically all of MySQL's insert, delete and update operations are performed in the Buffer Pool.
But isn't the data on disk? What does it have to do with a cache pool? If MySQL performed all of its operations directly on disk, it would obviously be inefficient: every access would require disk I/O, and since the rows being looked for can sit anywhere on disk, much of that I/O would be random I/O, which is painfully slow. Therefore, MySQL operates on data in memory, in the memory component called the Buffer Pool.
In this respect it is similar to Redis: Redis is an in-memory database, its operations are carried out in memory, and it has certain policies for persisting data to disk. So what does the memory structure of the Buffer Pool look like? Does it really keep all of those operations in memory? And since it behaves like a cache, does it also have an eviction strategy the way Redis does?
In this article we will look at the memory structure of the Buffer Pool in detail, so that you can thoroughly understand each step of the execution process: how MySQL loads data from a disk file, works on it, and eventually flushes it back to disk as a transaction completes. Take a look at the overall flow chart first; it does not matter if you do not fully understand it yet, because this article focuses on the Buffer Pool, and the chart is only meant to leave you with a rough impression of the whole process.
2. How big is the Buffer Pool
The Buffer Pool is an area of memory in InnoDB, so it naturally has a size; the default is 128MB. That tends to be a little small, and in a production environment you can adjust it according to the machine's actual memory, for example innodb_buffer_pool_size=2147483648 (the unit is bytes).
# View and adjust innodb_buffer_pool_size (the unit is bytes)
SELECT @@innodb_buffer_pool_size/1024/1024/1024;   -- size in GB
SELECT @@innodb_buffer_pool_size/1024/1024;        -- size in MB
SET GLOBAL innodb_buffer_pool_size = 4227858432;   -- it is a GLOBAL variable; about 4GB here
Its overall structure in InnoDB looks something like this
3. Data pages
When adding, deleting or modifying data, MySQL loads the data into the Buffer Pool. MySQL does not operate on individual rows the way we tend to imagine; instead it abstracts the concept of a data page. The default size of each page is 16KB, and this can be adjusted, but the default is recommended; after all, MySQL has already tuned it as well as it can. Each data page stores multiple rows. When adding, deleting or modifying a row, MySQL first locates the data page that contains the row and then loads that data page into the Buffer Pool.
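If you want to confirm the page size on your own server, here is a minimal check (a sketch; 16384 bytes, i.e. 16KB, is the default, and the value can only be chosen when the data directory is initialized):

SHOW VARIABLES LIKE 'innodb_page_size';   -- 16384 by default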
4. Cache pages
When data pages are loaded into the Buffer Pool, the in-memory copies are called cache pages (buffer pages), which are also 16KB in size. In addition, MySQL allocates a piece of description data for each cache page, recording information about it, such as which tablespace and data page it corresponds to and the address of the cache page in the Buffer Pool. The description data is about 5% of the size of a cache page, roughly 800 bytes.
So when are the cache pages created? When MySQL starts, it requests a memory area from the operating system as the Buffer Pool, according to the configured innodb_buffer_pool_size. Because the description data also occupies a certain amount of memory, the area actually allocated is a bit larger than the configured size. MySQL then divides this memory area into cache pages of the default size (16KB), each with its corresponding description data (about 5% of a cache page, roughly 800 bytes).
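As a rough sanity check, the number of cache pages is approximately the configured pool size divided by the page size (ignoring the extra room taken by the description data). A small sketch:

-- Roughly how many 16KB cache pages the configured Buffer Pool holds
SELECT @@innodb_buffer_pool_size / @@innodb_page_size AS approx_cache_pages;
-- Compare with the total that InnoDB actually reports
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';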
5. Free lists
Each data page is loaded into a cache page, but how does MySQL know which cache pages already hold data and which do not? In other words, how does MySQL tell which cache pages are free and can be used to hold new data pages?
To solve this problem, MySQL maintains a doubly linked list for the Buffer Pool called the free list. Each free cache page's description data is a node of this list, so the free list holds exactly the description data of the free cache pages. The free list also has a base node, which points to the head and tail of the list and records the number of nodes (i.e. the number of available free cache pages).
At this point, it can be described with the following picture:
When loading a data page into the Buffer Pool, MySQL takes a description-data node from the free list, locates the corresponding cache page from that node, writes the data page's contents into the cache page, and then removes the node from the free list. From then on, the data page can be read from that cache page in the Buffer Pool.
But how does MySQL know which data pages are already cached and which are not? There is also a hash table in the Buffer Pool for this: it uses "tablespace number + data page number" as the key and the address of the corresponding cache page as the value. Before loading a data page, MySQL looks the key up in this hash table to determine whether the page is already cached.
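To make the "tablespace number + data page number" key concrete, you can peek at which pages of a table are currently cached through information_schema. This is only an illustrative sketch (the table name students is an assumption, and this view can be expensive to query on a busy server):

SELECT SPACE, PAGE_NUMBER, PAGE_TYPE, IS_OLD
FROM information_schema.INNODB_BUFFER_PAGE
WHERE TABLE_NAME LIKE '%students%'
LIMIT 10;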
6. Flush list
MySQL always operates on data as pages in the Buffer Pool. The modifications happen in memory, and a background thread later flushes the dirty data to disk, but the background thread has to know which pages need flushing.
To solve this problem, MySQL designed another doubly linked list, the flush list, which records the description data of cache pages whose contents have been modified. When the data in memory differs from the data on disk, it is called dirty data; it is "dirty" precisely because the data in the Buffer Pool has been modified but has not yet been flushed to disk.
In other words, the flush list maintains the description data of every cache page whose data has been modified, so the dirty pages are tracked through the flush list.
In addition, when a dirty page is flushed to disk, its cache page becomes free again, and its description data goes back to the free list.
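One way to watch the free list and flush list at work on a running server is through the buffer pool page counters; a small sketch:

-- pages_total / pages_free / pages_data / pages_dirty show the total number of cache pages,
-- how many are still free (free list), how many hold data, and how many are dirty (flush list)
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';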
7. LRU linked list
As the system keeps inserting, deleting and updating data, the basic internal flow of the database is:
Let's again use Redis as an analogy to help understand how this works. Flushing is a bit like Redis key expiration: in general it keeps memory from running out, but there are always special cases where problems arise under extreme, marginal conditions.
If Redis runs out of memory, it has eviction strategies, the most basic of which is to evict the keys that are not used often. Similarly, the Buffer Pool can also run out of memory, and MySQL handles this through the LRU linked list. LRU stands for Least Recently Used.
MySQL flushes the least recently used cache pages to disk. So how does MySQL decide which data is least recently used? It designed the LRU list for exactly this purpose, which also introduces another concept: the cache hit ratio.
The cache hit ratio can be understood as how frequently a cache page is used. For example, suppose there are two cache pages: over 100 requests, cache page A is hit 20 times and cache page B only twice. Clearly A's hit ratio is higher, which means A is more likely to be used again in the future, while B will be considered by MySQL to be basically unused.
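As a rough sketch of measuring the overall Buffer Pool hit ratio on a running server (the SELECT assumes MySQL 5.7+ where performance_schema.global_status is available):

-- read_requests are logical reads; reads are the ones that had to go to disk
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- hit ratio ~= 1 - reads / read_requests
SELECT 1 - (
         (SELECT VARIABLE_VALUE FROM performance_schema.global_status
          WHERE VARIABLE_NAME = 'Innodb_buffer_pool_reads')
         /
         (SELECT VARIABLE_VALUE FROM performance_schema.global_status
          WHERE VARIABLE_NAME = 'Innodb_buffer_pool_read_requests')
       ) AS buffer_pool_hit_ratio;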
So how exactly does the LRU list work? When MySQL loads data pages into the Buffer Pool, it inserts the corresponding cache pages' description data into the head of the LRU list in the order they are loaded (head insertion). Suppose MySQL now loads data A, B and C into cache pages A, B and C respectively; the LRU list then looks roughly like this:
Now suppose a request accesses the data in cache page B; MySQL moves B's description data to the head of the LRU list, as shown in the following figure:
Then another request comes in whose data is already cached in cache page C, and the LRU list now looks something like this:
In short, every time data is queried: if the data is already in a cache page, that cache page's description data is moved to the head of the LRU list; if it is not cached, MySQL goes to disk to find it, loads it into a cache page, and inserts that cache page's description data at the head of the LRU list. This means the most recently used cache pages sit at the front of the list, and the description data towards the tail belongs to pages that are not used often.
Finally, if the Buffer Pool runs out of space, MySQL flushes the cache page of the last node of the LRU list to disk, and that memory in the Buffer Pool is freed up. Let's take a look at the overall flow chart:
8. Trouble with LRU lists
The trouble here refers to MySQL's own read-ahead mechanism.
When MySQL loads data from disk, it also loads adjacent data pages into the cache. Why? Because when you query some data, MySQL cannot assume the rows you want conveniently sit at the start of a single data page; the data may be large, and it may sit at the beginning of a data page or at its end. Therefore, to improve efficiency, MySQL loads the data pages adjacent to the requested data page into the Buffer Pool as well.
The figure above shows that the pages adjacent to B have also been loaded and now sit in front of C's description data, even though C's hit ratio is much higher than that of B's neighbours. This is the problem that read-ahead creates for the LRU list itself.
Linear read-ahead is controlled by the parameter innodb_read_ahead_threshold, whose default value is 56: if the number of pages accessed sequentially in one extent exceeds this threshold, the read-ahead mechanism is triggered and all data pages of the next adjacent extent are loaded into the cache. There is also random read-ahead: if 13 consecutive pages of one extent are already in the Buffer Pool and are being accessed frequently, InnoDB loads the remaining pages of that extent into the cache as well. Random read-ahead is controlled by the parameter innodb_random_read_ahead, which is OFF by default.
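A sketch of inspecting and adjusting the read-ahead behaviour, with the defaults noted in comments:

SHOW VARIABLES LIKE 'innodb_read_ahead_threshold';   -- 56 by default
SHOW VARIABLES LIKE 'innodb_random_read_ahead';      -- OFF by default
SET GLOBAL innodb_read_ahead_threshold = 56;
SET GLOBAL innodb_random_read_ahead = ON;
-- How much read-ahead actually happens, and how many read-ahead pages were
-- evicted without ever being accessed
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_ahead%';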
Another scenario is a full table scan, for example SELECT * FROM students with no WHERE condition at all, which loads every data page of the table into the Buffer Pool. Loading that much data at the head of the list pushes all of the other frequently hit cache pages straight towards the back.
All of the above suggests that read-ahead can cause real problems. So why did MySQL introduce read-ahead at all? Ultimately, to improve efficiency. Introducing a new technique often brings new challenges, so let's see how MySQL solves the problems that read-ahead brings.
9. LRU linked list based on hot and cold data separation
The so-called hot/cold separation divides the LRU list into two parts: one for hot data that is used often, and one for cold data that has been loaded in but is rarely used. The proportion of the list given to the cold data area is controlled by the parameter innodb_old_blocks_pct, which defaults to 37, i.e. 37%. Roughly as shown in the figure:
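A sketch of checking or changing the cold-area proportion, and of seeing how many pages currently sit in the cold ("old") area versus the whole LRU list:

SHOW VARIABLES LIKE 'innodb_old_blocks_pct';          -- 37 by default
SET GLOBAL innodb_old_blocks_pct = 37;
SELECT POOL_ID, DATABASE_PAGES, OLD_DATABASE_PAGES
FROM information_schema.INNODB_BUFFER_POOL_STATS;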
When data is loaded from disk into the Buffer Pool, the description data of its cache page is first placed at the head of the cold data area. Then, if the data is accessed again after a certain period of time, the description data of that cache page is moved to the head of the hot data area of the list.
Why "after a certain period of time"? If a page were moved to the head of the hot list the moment it is loaded and touched once, but then never used again, wouldn't we be back to the same problem?
MySQL controls this with the parameter innodb_old_blocks_time, which specifies how long data must have been in the Buffer Pool before a later access can move it to the head of the hot data area. The unit is milliseconds and the default is 1000, i.e. only if the data is accessed again more than one second after it was loaded will it be moved from the cold data area of the LRU list to the hot data area.
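And a sketch of the promotion delay itself:

SHOW VARIABLES LIKE 'innodb_old_blocks_time';   -- 1000 ms by default
SET GLOBAL innodb_old_blocks_time = 1000;       -- accesses within this window do not promote the page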
Now let’s go back to the question above
1. With read-ahead, the extra pages that are loaded in are placed in the cold data area of the LRU list (the nodes of the list are the description data of the corresponding cache pages). Only if one of those cache pages is accessed again after the specified time is its description data moved to the head of the hot data area.
2. A full table scan works the same way: the pages land in the cold data area first, and only those that are genuinely accessed again after a certain period of time are moved to the head of the hot data area of the list. So this nicely solves the problems caused by read-ahead and full table scans.
Consider the problem of not having enough memory in the Buffer Pool
The Buffer Pool not having enough space means there are no more free cache pages to use. With hot/cold separation this problem becomes very simple: flush the cache page corresponding to the description data of the last node of the cold data area to disk, and the freed cache page can then hold the new data page.
But that is still not perfect. Why? So far we have only talked about cold data being accessed and, under certain rules, being promoted to the head of the hot data list. But what if the data a request needs is already in the hot data area? Should its cache page's description data be moved straight to the head of the hot list every time?
Obviously not, because hot data is by definition accessed frequently; if the list were rearranged on every access, the constant movement would inevitably degrade performance (and in extreme cases the impact could be hard to control). So MySQL also has rules for when hot data is moved.
The rule is: if the accessed cache page is in the first 25% of the hot data area, its description data is not moved to the head of the hot data list; only when the accessed cache page is in the last 75% of the hot data area is its description data moved to the head of the hot data list.
For example, suppose the hot data area contains the description data of 100 cache pages. If the accessed page is among the first 25 nodes, the hot list does not change; if it is among nodes 26 to 100 (i.e. in the last 75% of the hot list), its description data is moved to the head.
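The effect of the promotion delay and this 25%/75% rule shows up in InnoDB's "made young" counters, which you can read from SHOW ENGINE INNODB STATUS (shown later) or, as a sketch, from information_schema:

SELECT POOL_ID,
       PAGES_MADE_YOUNG,                   -- pages promoted from the cold area to the hot area
       PAGES_NOT_MADE_YOUNG,               -- cold-area accesses that did not promote the page
       YOUNG_MAKE_PER_THOUSAND_GETS,       -- promotions per 1000 page accesses
       NOT_YOUNG_MAKE_PER_THOUSAND_GETS
FROM information_schema.INNODB_BUFFER_POOL_STATS;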
At this point, MySQL's optimization of the LRU list is essentially complete. Doesn't it all make a lot of sense when you look at it this way? Well, that's it for the LRU list.
10. Buffer Pool linked list summary
Free list: the description data of every free cache page sits in the free list; when a cache page is used, its description data is removed from the free list. When a dirty cache page is flushed to disk, its description data is removed from the flush list, and also from the LRU list (the data has been flushed to disk and no longer lives only in the Buffer Pool, so the LRU list no longer needs to record it), and it is added back to the free list, because the cache page has become free again.
LRU list: after a data page is loaded into a cache page in the Buffer Pool, the description data of that cache page is placed at the head of the cold data area of the LRU list. If the cold data is accessed again after the specified time, it is moved to the head of the hot data area. If the data being accessed is already in the hot data area, it is not moved when it is in the first 25%, and it is moved to the head of the hot data list when it is in the last 75%.
In addition, a background thread flushes the cache pages corresponding to the description data at the tail of the cold data area into the disk files.
11. Concurrency performance of the Buffer Pool
If multiple requests add, delete or modify data at the same time, do they operate on the linked lists in the Buffer Pool in parallel? And if so, is that a problem?
MySQL's answer to this is actually very simple: only one thread can operate on a Buffer Pool at a time. To ensure data consistency, MySQL locks the Buffer Pool during the operation, so these list manipulations are performed serially.
At this point you are probably wondering: how can serial execution be efficient? Remember, all of this is done in memory, so it is practically instantaneous, finishing at the millisecond level or even finer.
But then again, serial execution is still serial no matter how fast it is. Even if it is not the performance bottleneck, is there a better way to optimize it? Of course MySQL has already thought of this: there can be multiple Buffer Pools, which can be configured in the MySQL configuration file.
# Total size of the Buffer Pool: 8GB
innodb_buffer_pool_size=8589934592
# Number of Buffer Pool instances
innodb_buffer_pool_instances=4
In a production environment, if the hardware is not tight, you are advised to use multiple instances. At this point you should have a question (and if you don't, you are not thinking carefully enough):
Question: if multiple threads access different Buffer Pools, the data loaded by different threads may end up in different Buffer Pools. Suppose thread A loads data page a into Buffer Pool A and thread B loads data page b into Buffer Pool B. Now thread C arrives and is directed to Buffer Pool B, but the data page it needs, a, is in Buffer Pool A. Will C load data page a again? Can the same data page end up cached in different Buffer Pools?
Answer: no, a data page is always cached in one fixed Buffer Pool instance. We have in fact already seen the reason: the data page cache hash table, which maps "tablespace number + data page number" to the cached page's address. When loading a data page, MySQL uses this mapping to determine whether the page has already been loaded and into which cache page, so it can tell precisely where a page lives, and there is no possibility of loading it twice.
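A sketch of checking how many instances are configured and looking at per-instance statistics:

SHOW VARIABLES LIKE 'innodb_buffer_pool_instances';
SELECT POOL_ID, POOL_SIZE, FREE_BUFFERS, DATABASE_PAGES, MODIFIED_DATABASE_PAGES
FROM information_schema.INNODB_BUFFER_POOL_STATS;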
12. Dynamically adjust the size of the Buffer Pool
So far, this article has introduced the memory structure of the Buffer Pool in detail: how its data is stored, how it is flushed to disk, how it is loaded, and in what form it exists. Now let's keep digging and cover the remaining Buffer Pool topics in one go, starting with whether the size of the Buffer Pool can be adjusted dynamically.
Suppose our current Buffer Pool is 2GB and we want to expand it to 4GB. What would MySQL have to do to make this happen? Naively, it would first have to request a new contiguous 4GB memory region from the operating system, and then copy the data from the original Buffer Pool into the new one.
Is that feasible? If the original Buffer Pool were 8GB and the new one 16GB, copying the original Buffer Pool into the new one would be extremely time-consuming, so MySQL does not support doing it this way. Yet the requirement genuinely exists, so how does MySQL solve it?
To deal with this, MySQL designed a chunk mechanism (HTTP also uses a chunked transfer idea; good ideas are often borrowed back and forth) to solve the problem.
The chunk mechanism splits the Buffer Pool into chunks of equal size; the default size of each chunk is 128MB (adjustable with innodb_buffer_pool_chunk_size). In other words, the Buffer Pool is made up of chunks. With the default chunk size of 128MB, a 2GB Buffer Pool consists of 16 chunks. Each chunk has its own cache pages and description data, while the free list, flush list and LRU list are shared.
The same holds if there are multiple Buffer Pools:
We still haven't said how MySQL uses the chunk mechanism to adjust the size. In fact, if the current Buffer Pool is 2GB and contains 16 chunks, and you want to expand it to 4GB, MySQL only needs to request new 128MB chunks from the operating system one by one.
This way there is no need to request a huge contiguous memory region, and no need to copy data, which makes dynamic resizing possible (and before anyone asks: it is not only for growing, shrinking works the same way, chunk by chunk). You have to admit MySQL is really clever.
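A minimal sketch of an online resize (supported since MySQL 5.7), assuming the default 128MB chunk size:

SHOW VARIABLES LIKE 'innodb_buffer_pool_chunk_size';
-- The new size is rounded up to a multiple of chunk_size * instances if needed
SET GLOBAL innodb_buffer_pool_size = 4 * 1024 * 1024 * 1024;
-- Watch the progress of the resize operation
SHOW STATUS LIKE 'Innodb_buffer_pool_resize_status';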
13. How to set the Buffer Pool size in the production environment
Is a bigger Buffer Pool always better? Theoretically yes, but if a machine has 16GB of memory you obviously cannot give 15GB to the Buffer Pool, because the operating system and other processes need memory too. A reasonable value, drawn from a lot of production experience, is 50% to 60% of the machine's memory.
Finally, take a look at the InnoDB status output with SHOW ENGINE INNODB STATUS:
show engine innodb status;

----------------------
BUFFER POOL AND MEMORY
----------------------
-- Total memory finally allocated to the Buffer Pool
Total memory allocated ...
-- Buffer Pool size, in pages
Buffer pool size   ...
-- How many free cache pages are still available (free list)
Free buffers       ...
-- How many cache pages are in the LRU list
Database pages     ...
-- How many cache pages are in the cold data area of the LRU list
Old database pages ...
-- How many dirty cache pages are waiting to be flushed (flush list)
Modified db pages  ...
-- Pending reads, and pending flushes from the LRU list / flush list / single pages
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
-- How many pages have been moved from the cold area to the hot area, how many were accessed
-- in the cold area without being promoted, and the same two figures per second
Pages made young 260368814, not young 0
332.69 youngs/s, 0.00 non-youngs/s
-- How many cache pages have been read, created and written, and the per-second rates
Pages read 249280313, created 1075315, written 32924991
359.96 reads/s, 0.02 creates/s, 0.23 writes/s
-- Hits per 1000 accesses in the Buffer Pool, and how many of every 1000 accesses moved a
-- page from the cold area to the hot area
Buffer pool hit rate 867 / 1000, young-making rate 123 / 1000 not 0 / 1000
-- Length of the LRU list and I/O statistics
LRU len: ... I/O sum[5198]:cur[0]
14. Concluding remarks
In this article we discussed the memory structure of the Buffer Pool in detail: the free list, the flush list and the LRU list, the Buffer Pool and its chunks, and the whole journey of a data page from being loaded from disk into the Buffer Pool to finally being flushed back to disk, together with what happens at each step along the way.
Working through this article will not instantly make you feel you have seen through MySQL, but it lays the groundwork. The point of learning these internals is to understand MySQL better so that we can use it with more confidence at work, because only by knowing the underlying principles and how it works can we prescribe the right remedy when problems arise.
If you liked this article, you are welcome to follow the official account Programmer Xiao Grey for more great content.