Send subtopic
Interviewer: Have you ever operated Linux?
Me: Yes
Interviewer: What command should I use to check memory usage
Me: Free or top
Interviewer: Tell me what you can see with the free command
Me: Well, you can see the memory and cache usage as shown below
- Total total memory
- Used Used memory
- Free Free memory
- Buff /cache Used cache
- Avaiable Available memory
Interviewer: Do you know how to clean the used cache (buff/cache)
I: em… I don’t know
Interviewer: Sync; Echo 3 > /proc/sys/vm/drop_caches buff/cache
I :(send points, heart great joy) benefits greatly have, clear out the cache we have more available memory space, like the PC above xx guard’s small rocket, click, release a lot of memory
Interviewer: em…. Go back and wait for notice
Talk about SQL Join
Interviewer: Change the subject and talk about your understanding of join
Me: Ok (get it wrong again and you’re done, take your chance)
review
A JOIN in SQL can join a specified table based on certain criteria and return the data to the client
There are two ways to join
inner join
In the connection
left join
Left connection
right join
The right connection
full join
All connection
The above pictures are from here
Interviewer: If you need to use join statement in project development, how can you optimize and improve performance?
Me: Divided into two cases, the data scale is small, the data scale is large.
Interviewer: And?
I:
-
Data size is small, all dry into memory is done
-
Large data scale
You can optimize the speed of the JOIN statement by adding indexes. You can reduce the number of joins by using redundant information. Minimize the number of table joins
Interviewer: It can be concluded that the JOIN statement is relatively performance-expensive, right?
I: yes,
Interviewer: Why?
The buffer
Me: There must be a comparison process when executing the JOIN statement
Interviewer: Yes
I: It is slow to compare two tables one by one, so we can read the data from two tables into a block of memory. For example, MySQL InnoDB engine uses the following statement to check the related memory area.
As shown in the figure below, the size of join_BUFFer_SIZE will affect the performance of our join statement
Interviewer: What else?
A big premise
Me: Any project will eventually come online and inevitably generate data on a scale that can’t be too small
Interviewer: That’s right
Me: Most of the data in the database is eventually saved to the hard disk and stored as files.
Take MySQL’s InnoDB engine as an example
- InnoDB to
page
(page) is the basic IO unit, and each page is 16KB in size - InnoDB creates a table for each table to store data
.ibd
file
validation
Me: That means we have to read as many files as we want to join, and while we can take advantage of indexes, we still have to constantly move the head of the hard disk
Interviewer: So frequent movement of the head will affect performance
Me: Yes, don’t open source frameworks like to say they have greatly improved performance with sequential reads and writes, such as hbase and Kafka
Interviewer: That’s right. Do you think Linux has optimized for that? Hint, you can run the free command again to see
Me: Wonder why the cache takes up 1.2 gigabytes
Image source here
Interviewer: Have you ever thought about it
buff/cache
What’s in it?- why
buff/cache
It takes up that much memory, so the available memory is 1, 2availlable
There are1.1 G
? - Why can you clean it up with two commands
buff/cache
Occupied memory that you want to releaseused
Can only be done by terminating the process?
Product, you fine product
After thinking about it for a few minutes
Me: The buff/cache is not important, so clearing it will not affect the operation of the system
Interviewer: Not exactly
Me: Is it? I remember a sentence from CSAPP (In-depth Understanding of computer Systems)
The essence of a storage hierarchy is that each tier of storage devices is a cache for the lower tier
Translation: Linux treats memory like a cache for your hard drive
Related information tldp.org/LDP/sag/htm…
Interviewer: Now do you know how to answer the scoring question
I: I…
The Join algorithm
Interviewer: Given another chance, what would you do if you were to implement the Join algorithm?
Me: Without an index, the nested loop is done. If there are indexes, they can be used to improve performance.
Interviewer: Back to join_buffer, what do you think is stored in join_buffer?
Me: During the scan, the database selects a table and puts the data it wants to return and compare with other tables into join_buffer
Interviewer: What happens when there is an index?
I: this is relatively simple, directly read the index tree of two tables for comparison on the completion of the ao, I introduced here no index processing mode
Nested Loop Join
The outerTable has 100,000 rows of data and the innerTable has 100 rows of data. The outerTable has 100 rows of data and the innerTable has 100 rows of data. It needs to be read 10,000 times.
Of course no database engine currently uses this algorithm (too slow)
Block nested loop
Block blocks, which means that one Block of data is fetched to memory at a time to reduce I/O overhead
MySQL InnoDB uses this algorithm when no indexes are available
Consider the following tables T_A and t_b
InnoDB automatically uses the Block nested loop algorithm when a join cannot be performed using an index
conclusion
When I was in school, the database teacher liked to test the database paradigm most. Until I went to work, I learned that everything should be based on performance. If redundancy is possible, then join if redundancy is impossible. Try increasing your JOIN_buffer_size or switching to a solid state drive.
The appendix
Next announcement => Self-description of the mechanical hard disk
Review of “Xiao Ke’s MySQL Learning Road”
- Why should we try to avoid FileSort?
The resources
- Delve into computer systems. chapter 6 memory hierarchies
- Experiments and Fun with the Linux Disk Cache are a few examples of how hard disk caching affects application performance
- Linux ATE My RAM Free parameter description
- How to clear the buffer/pagecache (disk cache) under Linux
- How does MySQL run: From the root understand MySQL
- Block Bested loop From the official Documentation of MariaDB explains the implementation of the block-nested-loop algorithm