Binary search method
Binary search is also called binary search. Used to find a record in an ordered set of records.
The basic idea is: the records are arranged in order (increasing or decreasing), and the skip method is used in the search process. That is, the midpoint position of the ordered sequence is taken as the comparison object first. If the element value to be found is less than the midpoint element, the column to be queried is reduced to the left half, otherwise it is the right half. The query range is halved by a single comparison.
If you have five,10,19,21,31,37,42,48,50,52 the number 10, to check the number of 48, the search process:
As you can see from the graph, it took 3 times to find 48. If the search is sequential, it takes 8 times, so binary search is more efficient than sequential search (average). But if you want to look up 5, you only need to do a sequential search once, and a binary search takes four times.
For the above 10 numbers, the average number of lookups is (1+2+3+4+5+6+7+8+9+10) /10=5.5. Binary lookup is (4+3+2+4+3+1+4+3+2+3) /10=2.9 times.
In the worst case, the number of sequential lookups is 10, while binary lookups are 4
Binary and binary search trees and balanced binary trees
B+ trees are derived from binary search trees, and then from balanced binary trees, B trees.
Binary search tree definition: the left subtree is always less than the root of the key value, the right subtree is always greater than the root of the key value. Therefore, the sorted output of the key value can be obtained through middle-order traversal
If you want to construct a binary search tree with maximum performance, you need the binary search tree to be balanced, which leads to a new definition —– balanced binary tree, or AVL tree.
Balanced binary tree definition: first compound the definition of binary search tree, secondly must satisfy the height of the two subtrees of any node maximum difference of 1.
Balanced binary trees are fast to query, but maintaining a balanced binary tree is expensive. Typically, one or more left and right rotations are required to achieve the balance of the inserted or updated tree. As follows:
Third, B + tree
B+ tree is a classical data structure like binary tree and balanced binary tree.
B+ trees evolved from B trees and indexed sequential access (ISAM, which was the data structure MyISAM engine originally referenced), and B trees are no longer used in practice.
B+ trees are a balanced lookup time designed for disks or other direct storage assistive devices.
In B+ tree, all record nodes are stored on leaf nodes of the same layer in order of key values, which are connected by Pointers of leaf nodes.
As follows: its height is 2, each page holds 4 records, fan out is 5. All records are stored on leaf nodes and in sequence.
B+ tree insert operation
Insertion into a B+ tree must ensure that the records in the leaf node remain sorted after insertion, and consider three cases of insertion into a B+ tree, each of which results in a different insertion algorithm. As follows:
1. For the B+ tree as shown in the following figure, if the user inserts the value 28 and finds that the current leafPage and IndexPage index pages are not full, insert them directly.
Figure (1)
Figure (2)
2. Insert the key value 70 from the figure above, then the original leafPage is full, but the IndexPage is not. At this point, the situation after inserting leafPage is 50, 55, 60, 65, and 70, and the leaf nodes are split according to the median value of 60, which can be obtained as below.
(3)
It may be necessary to do a lot of split pages for newly inserted keys to maintain balance. Because the B+ tree structure is primarily for disks, and splitting implies disk operations, page splitting should be minimized if possible. So the B+ tree will balance the binary tree’s Rotation.
Rotation occurs when the leafPage is full, but its left and right siblings are not. The B+ tree does not rush to split the page operation, but moves the record to the sibling node of the page. Normally, the left sibling is checked first for rotation. If so, the insertion of 70 should be left handed:
Figure (4)
3. Finally, insert 95, then compound the third case, that is, leafPage and IndexPage are full, and split twice
Figure (5)
Delete the B+ tree
B+ trees use fill factor to control tree deletion changes, and 50% is the minimum value that can be set for the fill factor.
The deletion operation of B+ tree must also ensure that the records in the leaf node are still sorted after deletion. Like insertion, the deletion operation of B+ tree also needs to consider the following three situations:
1. Delete according to the B+ tree in Figure (5). First delete the record whose key value is 70:
Then delete the record whose key value is 25, but the value is still in IndexPage. Therefore, after deleting 25 in LeafPage, update the right sibling node 28 of 25 to PageIndex, as shown in the figure:
Finally, delete the 60 key. After deleting records with key value of 60 in LeafPage, if the Fill Factor is less than 50%, the merge operation is required. Similarly, after deleting relevant records in IndexPage, the merge operation of IndexPage is required.