My host has only 100GB of memory, but I need to scan a 200GB table. Will the DB host run out of memory?
A logical backup scans the entire database, doesn't it? If that ate up all the memory, wouldn't logical backups have died long ago?
So a full scan of a large table seems to be fine. But why?
Impact of full table scan on the Server Layer
Suppose we want to run a full table scan on db1.t, a 200GB InnoDB table, and save the scan result on the client side with a command like this:
mysql -h$host -P$port -u$user -p$pwd -e \
  "select * from db1.t" > $target_file
InnoDB stores its data in the primary key (clustered) index, so a full table scan actually scans the primary key index of table t directly. Since the query has no other conditions, every row read can be put straight into the result set and sent back to the client.
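As a quick sanity check (an illustrative aside, using the same table as the command above), you can ask the optimizer for the plan before running such a statement:

explain select * from db1.t;

With no WHERE clause, the type column shows ALL, i.e. a full scan, which InnoDB satisfies by walking the clustered (primary key) index from start to finish.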
So where is this result set stored?
The server does not need to save a complete result set. The process of fetching and sending data is as follows:
- Fetch a row and write it into net_buffer. The size of this buffer is defined by the parameter net_buffer_length, which defaults to 16KB (see the quick check after this list).
- Keep fetching rows until net_buffer is full, then call the network interface to send the data out.
- If the send succeeds, clear net_buffer, then fetch the next row and write it into net_buffer.
- If the send function returns EAGAIN or WSAEWOULDBLOCK, the local network stack's socket send buffer is full, so sending pauses; once the network stack becomes writable again, sending continues.
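To see the buffer sizes involved on your own instance, the relevant variables can be inspected from any session; a minimal check:

show variables like 'net_buffer_length';   -- per-connection buffer, default 16384 (16KB)
show variables like 'max_allowed_packet';  -- upper bound the buffer can grow to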
Process for sending query results:
From this we can see:
- The maximum memory a single query occupies inside the server is net_buffer_length, nowhere near 200GB.
- The socket send buffer cannot reach 200GB either (its default size is set by /proc/sys/net/core/wmem_default). If the socket send buffer fills up, the process of reading data pauses.
So MySQL actually "reads and sends as it goes". This means that if the client is slow to receive results, the MySQL server cannot send them out, and the query takes longer to execute.
For example, the following state appears in show processlist on the server when the client does not read the contents of its socket receive buffer.
The server blocked on sending:
If State is always Sending to client, it indicates that the network stack on the server is full.
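One way to spot such threads is to filter the process list by state, for example:

select id, user, host, db, time, state, info
from information_schema.processlist
where state = 'Sending to client';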
If the client uses the --quick parameter, it uses the mysql_use_result interface: read one row, process one row. If the business logic applied after each row is complex and slow, the client takes a long time before fetching the next row, and you end up with the state shown above.
Therefore, for normal online business, if a query returns only a small result set, it is recommended to use the mysql_store_result interface and save the whole result in client-side memory.
Of course, that assumes the result set is not too large. If it is, say a big query that would make the client consume close to 20GB of memory, then switch to the mysql_use_result interface instead.
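For reference, --quick is the switch that makes the mysql command-line client use mysql_use_result instead of mysql_store_result; reusing the placeholder connection variables from the command at the top:

mysql --quick -h$host -P$port -u$user -p$pwd -e \
  "select * from db1.t" > $target_file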
If you see many threads in the "Sending to client" state on a MySQL instance you maintain, ask the business developers to optimize their queries and evaluate whether returning that many rows is really necessary.
If you just want to quickly reduce the number of threads in this state, setting net_buffer_length to a larger value is an option.
Sometimes you see many queries on an instance in the "Sending data" state, yet the network looks fine. Why does "Sending data" take so long?
The state of a query statement changes like this:
- After a MySQL query statement enters the execution phase, its state is first set to "Sending data".
- Then, the metadata describing the columns of the result is sent to the client.
- Then, the rest of the statement's execution flow continues.
- When execution completes, the state is set to an empty string.
That is, "Sending data" does not necessarily mean "sending data to the client"; it can cover almost any stage of the executor's flow. For example, you can construct a lock-wait scenario and observe the Sending data state.
A lock-wait scenario and the resulting state:
Session 2 is actually waiting for a lock, yet its state shows as Sending data.
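A minimal way to reproduce this kind of lock wait (a sketch, assuming a small InnoDB table t with an integer primary key id and a column c):

-- session 1: take a row lock and keep the transaction open
begin;
update t set c = c + 1 where id = 1;

-- session 2: this locking read blocks on session 1's row lock
select * from t where id = 1 for update;

-- session 3: inspect the blocked thread
show processlist;   -- on 5.6/5.7, session 2 typically shows State: Sending data while it waits
                    -- (newer 8.0 releases report this state differently)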
- A thread shows "Sending to client" only when it is waiting for the client to receive the result.
- A thread showing "Sending data" simply means "the statement is executing".
In other words, query results are sent to the client in segments, so even a full table scan that returns a huge amount of data will not blow up the server's memory.
That is the processing logic at the server layer. How are things handled inside the InnoDB engine?
Impact of full table scanning on InnoDB
One role of InnoDB's memory is to hold the results of updates; together with the redo log, this avoids random writes to disk.
The in-memory data pages are managed in the Buffer Pool (BP for short), which is what lets WAL speed up updates.
BP can also speed up queries.
Because of WAL, when a transaction commits, the data page on disk may be stale. If a query then needs to read that page, must the redo log be applied to the disk page right away?
No. At this point the in-memory data page is already up to date, so it is enough to read the in-memory page. The query does not touch the disk at all; it fetches the result straight from memory, which is fast. So the Buffer Pool also speeds up queries.
How much the BP accelerates queries depends on an important metric: the memory hit ratio.
You can see a system's current BP hit ratio in the output of show engine innodb status. In general, for a stable online system, the memory hit ratio should stay above 99% to keep response times within requirements.
In the show engine innodb status output, the line "Buffer pool hit rate" shows the current hit ratio, for example a hit ratio of 100%.
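For example (the exact layout varies a little across versions), look in the BUFFER POOL AND MEMORY section of the output:

show engine innodb status\G
-- ...
-- Buffer pool hit rate 1000 / 1000
-- ...

Here 1000 / 1000 means every read request since the last sampling interval was served from memory, i.e. a 100% hit ratio.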
It is best if all the data pages required by the query can be obtained directly from memory, which corresponds to a 100% hit ratio.
The size of the InnoDB Buffer Pool is determined by the innodb_buffer_pool_size parameter. It is recommended to set it to 60% to 80% of the available physical memory.
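To check or change the current value (online resizing via SET GLOBAL requires MySQL 5.7 or later; the 96GB figure below is only an illustration for a 128GB host):

-- current size in GB
select @@innodb_buffer_pool_size / 1024 / 1024 / 1024 as buffer_pool_gb;

-- roughly 75% of a 128GB host; InnoDB rounds this to a multiple of its chunk size
set global innodb_buffer_pool_size = 96 * 1024 * 1024 * 1024;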
A decade or so ago, a single machine typically held a few hundred GB of data with only a few GB of physical memory. Today, although many servers have 128GB of memory or more, the data volume on a single machine has reached the terabyte level.
So it is common for innodb_buffer_pool_size to be smaller than the amount of data on disk. When the Buffer Pool is full and a new data page has to be read in from disk, an old data page must be evicted.
InnoDB memory management
InnoDB uses the Least Recently Used (LRU) algorithm to evict the data page that has gone unused for the longest time.
Basic LRU algorithm
The LRU algorithm InnoDB uses to manage the BP is implemented with a linked list:
- State 1: the head of the list is P1, meaning P1 is the most recently accessed data page.
- A read request then accesses P3, giving state 2: P3 is moved to the head of the list.
- State 3: the data page being accessed is not in the list, so a new data page Px must be allocated in the BP and put at the head of the list. But memory is full and no new memory can be allocated, so the data page Pm at the tail of the list is emptied, its memory is reused to hold Px's content, and it is placed at the head of the list.
Finally, the data page Pm that has not been accessed for the longest time is eliminated.
What happens if a full table scan comes along at this point? Say you scan a 200GB table that is a historical table, one the business does not normally access.
Following this algorithm, the scan would evict everything currently in the BP and fill it with the content of the data pages touched during the scan. In other words, the BP would end up holding mostly the data of this historical table.
For a database that is serving live business traffic, that is unacceptable: the BP hit ratio drops sharply, disk pressure rises, and SQL statements respond slowly.
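You can watch this happening by checking which tables the pages in the buffer pool currently belong to; a sketch using information_schema.INNODB_BUFFER_PAGE_LRU (this view is itself expensive to query, so use it sparingly on a busy instance):

select table_name, count(*) as pages
from information_schema.innodb_buffer_page_lru
group by table_name
order by pages desc
limit 10;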
Therefore, InnoDB cannot use the original LRU directly. InnoDB has optimized it.
Improved LRU algorithm
InnoDB divides the LRU list into a new region and an old region at a 5:3 ratio. LRU_old points to the first position of the old region, which sits 5/8 of the way down the list. That is, the 5/8 nearer the head of the list is the new region, and the 3/8 nearer the tail is the old region.
The improved LRU algorithm execution process is as follows:
- State 1: to access P3, since P3 is in the new region, it is moved to the head of the list just as in the unoptimized LRU, giving state 2.
- To access a new data page not currently in the list, the data page Pm is still evicted, but the newly inserted data page Px is placed at LRU_old.
- A data page in the old region has to pass the following check each time it is accessed:
  - If the page has existed in the LRU list for more than 1 second, move it to the head of the list.
  - If the page has existed in the LRU list for less than 1 second, its position stays unchanged. The 1-second threshold is controlled by the parameter innodb_old_blocks_time, whose default is 1000 (in ms), as shown below.
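Both thresholds are exposed as parameters: the split point via innodb_old_blocks_pct (default 37, roughly the 3/8 old region above) and the promotion window via innodb_old_blocks_time. A quick check, plus an illustrative change:

show variables like 'innodb_old_blocks%';
-- innodb_old_blocks_pct   37    (old region ≈ 3/8 of the LRU list)
-- innodb_old_blocks_time  1000  (ms a page must stay in the old region before promotion)

-- example only: make promotion even harder during a known large scan
set global innodb_old_blocks_time = 2000;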
This strategy is tailored to operations like full table scans. Back to scanning the 200GB history table:
- During the scan, data pages that need to be loaded are placed into the old region.
- A data page holds multiple records, so it is accessed several times, but because the scan is sequential, the interval between the first and the last access of a page is under 1 second, so the page stays in the old region.
- As the scan moves on, earlier pages are never accessed again, so they never get a chance to move to the head of the list (the new region) and are soon evicted.
The biggest benefit of this strategy is that although the scan of the large table does use the BP, it has no impact on the new (young) region at all, which preserves the Buffer Pool hit ratio for the queries of normal business traffic.
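If you want to verify that a big scan is not wrecking the hit ratio, the cumulative buffer pool counters give a rough picture; a sketch:

show global status like 'Innodb_buffer_pool_read%';
-- overall hit ratio ≈ 1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests
-- (Innodb_buffer_pool_reads counts the read requests that had to go to disk)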
Summary
MySQL reads and sends results at the same time, so for a query that returns a large amount of data, the complete result set is never held on the server. If the client does not read the result in time, the MySQL query process is blocked, but memory is not exhausted.
Inside the InnoDB engine, the eviction strategy means that large queries do not cause memory to balloon either. Moreover, thanks to InnoDB's improved LRU algorithm, the damage that a full table scan of cold data can do to the Buffer Pool is kept under control.
That said, a full table scan still consumes plenty of I/O resources, so you still should not run one directly on the primary database during business peak hours.