preface
Recently in view of the Internet company interview asked knowledge points, summed up the Java programmer interview involves most of the interview questions and answers to share with you, I hope to help you review before the interview and find a good job, but also save you on the Internet to search for information time to learn.
Full version Java interview questions address: Java backend questions integration
What is an index?
An index is like the table of contents in a dictionary and we usually go to the table of contents and look up the key words or the letters and then look up it’s a lot faster than going through the dictionary
Second, why have index?
However, when we use mysql database, we also have an index like a dictionary to query, which is definitely much faster
2.1 question:
1. Where is mysql data stored?
disk
2. Slow data query, where is the general card?
IO
3. To read data from disk, do you read as much as you need?
Disk to proofread
Locality principle: Data and programs tend to cluster, and previously accessed data is likely to be queried again, spatial locality, temporal locality
Disk prefetch: Data interaction between memory and disk usually has a minimum logical unit, page. Pages are usually sized by the operating system, 4k or 8K, and we can take multiples of the page to read when we interact with the data.
The InnoDB storage engine reads 16K of data at a time
4. Where is the index stored?
Disk, when querying data, the index is loaded into memory first
5. What information does the index need when it is stored? What field values need to be stored?
Key: The value stored in the actual data row
Address of the file
Offset: indicates the offset
6. What data structure is used to store data in this format?
key-values
Hash table, tree (binary tree, red-black tree, AVL tree, B tree, B+ tree)
7. The mysql index is not stored in the format described above, why?
OLAP: Online analytical processing —- Analyzes massive historical data to generate decision making policies —- Data warehouse – Hive
OLTP: Online transaction processing —- requires very short time to return corresponding results —- database – relational database (mysql, Oracle)
Mysql index data structure
3.1 Hash Table:
A HashMap array with a linked list is not suitable for indexing.
1. Hash conflict will cause uneven data hashing, resulting in a large number of linear queries and a waste of time
2. Range query is not supported. When you perform range query, you must traverse one by one
3. High requirements on memory space
Advantages: If it is equivalent query, very fast
Is there a hash index in mysql?
1. The memory storage engine uses hash indexes
2. Innodb supports adaptive hash
Create table test(id int primary key,name varchar(30)) engine=' InnoDB /memory/myisam' -- default innoDB copy code after 5.1Copy the code
3.2 the tree:
Tree this data structure has many, we commonly have: binary tree, BST, AVL, red and black tree, B tree, B+ tree
① Binary tree: unordered insertion
So that’s the structure of our tree, but the binary tree inserts are unordered, which means that when you need to find something, you still have to walk through it one by one
②BST(binary search tree) :The inserted data is ordered, the left subtree must be smaller than the root node, and the right subtree must be larger than the root node ——– Use binary lookup to improve efficiency
In this way, if you want to query data, you can quickly narrow the scope through binary search, reducing the time complexityBut if the inserts were in ascending or descending order, the tree would look like this:
Then the binary search tree will degenerate into a linked list, and the time will become O(n) again.
③AVL: balanced binary treeIn order to solve the above problem, the shortest subtree can be no more than 1 height apart from the oldest tree by rotating the tree left or right
As we can see from the figure, when sequential inserts are performed, rotation is automatically performed to achieve balance, but the performance of the query is compensated by the loss of insert performance. When we insert a lot of data and query is very small, rotation of insert data also consumes a lot of time④ Red-black tree (solved as many read and write requests)The same way you balance the tree by turning it left and right, and changing colors, the oldest tree has to be no more than twice the size of the shortest subtree
The query performance and insert performance are approximately balanced. However, as data is inserted and the discovery tree depth becomes deeper, the tree depth becomes deeper, which means that the more I/O times, the data read efficiency is affected
5. B treeIn order to solve the problem of too much data insertion and tree depth becoming deeper, we use B-tree to change the original ordered binary tree into ordered multi-fork tree
Select * from table where id=14;
- Step 1, load disk 1 into memory, find 14<16, find address disk 2
- Step 2, load disk 2 into memory, find 14>11, find address disk 7
- Step 3: Load disk 7 into memory, find 14=14, read data, fetch data, end thinking: B tree is perfect?Question 1:B-tree does not support fast search of range query. If we query the data of a range and find a boundary of the range, we need to go back to the root node to search again. We need to traverse from the root node for many times, even if we find another boundary of the range, the query efficiency will be reduced.Question 2:If data stores row records, the size of rows will increase as the number of columns increases. In this case, the amount of data that can be stored in a page is reduced, the tree is correspondingly higher, and disk I/OS are increased. Think 2: How many records can a three-tier B-tree store?A:Assuming that a data is 1K, innoDB storage engine reads 16K data ata time, and the three layers are 161616 = 4096; But often in development, the data of a table is much larger than 4096, do you want to continue to add layers, which will not increase IO
Why use B+ tree?
When you actually store the table data, how do you store it?Key A complete row of dataB + tree
B+ tree improves B tree by placing all data in leaf nodes, which are connected by bidirectional Pointers, and the lowest leaf node forms a bidirectional ordered linked list. For example, select * from table where id between 11 and 35?
- Step 1, load disk 1 into memory, find 11<28, find address disk 2
- Step 2, load disk 2 into memory, find 10>11>17, find address disk 5
- Step 3: Load disk 5 into memory, find 11=11, read data
- In the fourth step, continue to search to the right, read disk 5, find 35=35, read data between 11 and 35, end it can be seen that, such range query speed is much higher than B tree
Comparing B trees and B+ trees?
-
Data is stored in the leaf node
-
No data is stored in non-leaf nodes
-
Each node of a B+ tree contains more nodes. The advantage of this is that the height of the tree can be reduced and the data range can be divided into multiple ranges. The more ranges, the faster the query
Question: Create index with int or varchar?
A: It depends, but keep the key as small as possible
Create index
Innodb: InnoDB data and index are stored in a single file. Idb myISam: MyISam index is stored in a single file. In MYI files, data is stored in. In the MYD
5.1 Clustered index and non-clustered Index
Innodb: InnoDB: innoDB: innoDB: innoDB: innoDB:
- There can only be one clustered index, but there are many non-clustered indexes
- When inserting data into InnoDB, you must include an index key
- The key of this index can be a primary key, or if there is no primary key, a unique key, or if there is no unique key, a self-generated 6-byte ROWId
Myisam: non-clustered index
MySQL innodb – B + treeIndexes and data are stored together, and the corresponding data can be read if the index is found
MySQL – myisam – B + treeIndex and the address of the storage data together, find the index to get the address value, and then find the corresponding data through the address
Back to the table 5.2
Next, I’m going to create a table of examples to show you
CREATE TABLE user_test(id INT PRIMARY KEY AUTO_INCREMENT,-- ID PRIMARY KEY uname VARCHAR(20), age INT, gender VARCHAR(10), KEY 'idx_uname' (' uname ')ENGINE = INNODB; INSERT INTO user_test VALUES(1,' zhang3 ',18,' male '); INSERT INTO user_test VALUES(NULL,' ma ',19,' ma '); INSERT INTO user_test VALUES(NULL,' user_test ',18,' male '); INSERT INTO user_test VALUES(NULL,'王老 7 ',22,'男'); INSERT INTO user_test VALUES(NULL,' user_test ',16,' female '); INSERT INTO user_test VALUES(NULL,' 10 ',26,' 10 '); Copy the codeCopy the code
Select * from user_test where uname = 'user_test '; SQL > alter table select * from uname; select * from uname; select * from unameCopy the code
First according to the uname query id, according to the id query to the information Such operations walked two B + tree, is back to the table When according to general index after access to the key value of the cluster index, according to the key value to get the data in the clustering index We can find that this operation is a waste of time, so we daily operation, Minimize the number of times you return to the table
5.3 Overwriting indexes
Select id,uname from table where uname = 'j3 '; Select * from uname; select * from uname; select * from uname; select * from unameCopy the code
5.4 Left-most match
Before said leftmost match, let’s chat a few nouns primary key (usually for one column) — — — — — — — — > joint primary key index (more than one column) — — — — — — — — > joint index (may contain multiple index column)
Select * from table where name =? Select * from table where name =? and age = ? ; Select * from table where name =? ; Select * from table where age =? ; Select * from table where age =? and name = ? ; There is an optimizer inside mysql that adjusts the corresponding sequential copy codeCopy the code
5.5 Index push down
An example of a feature supported by default after mysql5.7:
select * from table where name = ? and age = ? ; Mysql > select * from 'mysql'; -- Client :JDBC -- Server :server -- Storage engine: Before the data store does not have an index push down, the data store obtains data from the storage engine according to the name that meets the rules. After the index push down is filtered at the server layer, Obtain the corresponding data replication code from the storage engine according to the name and age conditionsCopy the code
Analysis: there is an index pushdown benefits, if we have 50 data, we get 10 data by filtering, without an index pushdown, will first obtain 50 get 10 to go out, and have pushed down, we will directly on storage engine is filtering into 10 full version Java interview questions address: Java end item after integration