preface

I passed Ali last week as a client developer (yes, Java!). . During the interview, the interviewer asked about B+ tree, and the interviewer kept nodding his head when answering the question. Today, I will talk about B+ tree in detail.

Balanced binary trees

It is an empty tree or the absolute value of the difference in height between the left and right subtrees is no more than 1, and both subtrees are balanced binomial trees.

B tree (B-tree)

M order B tree definition

The m-order B-tree is a balanced M-way search tree, either empty or satisfying the following conditions:

  1. Each node in the tree has a maximum of m children

  2. All nodes except root and leaf nodes have at least (m+1)/2 children

Ceil of m over 2, which is m plus 1 over 2, rounded up

  1. If the root node is not a leaf node, the root node has at least 2 children

  2. All leaf nodes are in the same layer and carry no information

  3. In addition to leaf nodes, nodes contain keyword attributes, and the number range is [M/ 2-1, M-1], that is, the number of keywords = the number of children -1.

Non-leaf node

  • Key words: K[1], K[2]… , K[m-1], and K[I] < K[I +1], that is, the keyword is ordered.

  • Pointer of children: P[1], P[2]… , P[M]

  • Keyword is less than K P [1] to [1] the subtree, keyword is greater than K P [M] to [M – 1) subtree, other P [I] to keyword belongs to * * (K] [I – 1 K [I]) * * subtree.

Time complexity: O(nlogn).

advantages

The advantage of the B-tree over the B-+ tree is that if the frequently accessed data is close to the root node, the non-leaf nodes of the B-tree store the address of the keyword data, so the data retrieval will be faster than the B-+ tree.

B + tree

M order B+ tree definition

B+ tree is a deformed form of B tree, m order B+ tree satisfies the following conditions:

(1) Each node has at most M children.

(2) Except root node and leaf node, each node has at least (m+1) /2 children.

(3) If the root node is not empty, the root node has at least two children.

(4) Add a chain pointer to all leaf nodes, and all keywords appear in leaf nodes.

(5) Except for leaf nodes, the number of children of nodes is equal to the number of keywords. Note that the non-leaf node of the B+ tree does not store the address of the keyword data, but rather the index to the keyword in the leaf node. (So any keyword lookup must take a path from root to leaf.)

Nonleaf node subtree pointer P[I], pointing to the subtree whose keyword value belongs to [K[I], K[I +1]) (b-tree is the open interval)

advantages

  1. B+ tree has fewer levels: compared with B+ tree, EACH non-leaf node stores more key words, and the tree has fewer levels, so data query is faster;

  2. B+ tree is more stable in query speed: B+ all keyword data addresses are stored on leaf nodes, so the search times are the same, so the query speed is more stable than B tree.

  3. B+ tree has the natural sorting function: all the leaf node data of B+ tree forms an ordered linked list, which is more convenient for querying the data in the size range, with high data tightness and higher cache hit ratio than B tree.

  4. B+ tree traversal of all nodes is faster: A B+ tree traversal of the entire tree only requires traversal of all leaf nodes, rather than traversal of each layer like a B tree, which is conducive to the database to perform full table scan.

To adapt to the scene

Typically used in databases and operating system file systems.

Node splitting

  • The full node is split. After the full node, M/2 node generates a new node, and the first element of the new node points to the parent node.

  • The parent node is full, and the parent node is split.

  • If the root node is full, the root node needs to be classified, and the height of the tree increases.

advantages

It can keep data stable and orderly, and its insertion and modification have relatively stable logarithmic time complexity

conclusion

Let’s have fun. Let’s have fun. Don’t joke about the interview.

The B+ tree was pretty much busted in the interview. In addition to the balanced binary, B, and B+ trees mentioned in this article, the application scenarios of B+ trees are highly topical, such as the B+ tree structure used in MySQL and some file systems. This article is limited space, I hope you should remember the knowledge before the interview.