Do you really know binary trees?

Series article guide

Kiner algorithm brush topic (a) : linked list and linked list ideas
Kiner algorithm: Recursion and stack (solve the expression evaluation problem)
Kiner algorithm brush topic (3) : thread pool and task queue
Do you really know binary trees?
Do you really know binary trees?
Kiner algorithm: Heap and Priority queue
Kiner algorithm brush topic (five) : Heap (priority queue)

The open source project

All articles in this series will be collected and managed in GitHub. Welcome ISSUE and Star.

GitHub portal: Kiner algorithm

Tree structure base

Whereas each node in a linked list can only point to the next node (the linked list here is a one-way list), each node in a tree can have several child nodes. Therefore, a tree structure can be expressed as follows:

interface TreeNode {
  data: any;
  nodes: TreeNode[]
}
Copy the code

The tree of degrees

PS: In the graph structure, there is also the concept of degree, which is divided into out degree and in degree. If the tree is regarded as a part of the graph, then strictly speaking, the degree of the tree is actually out degree. However, in a tree structure, we usually use the concept of degree to describe how many children the current tree node has.

That is, each node has several children, so the degree of a binary tree is at most 2, and the degree of a linked list (which can be viewed as a tree with only one child) is at most 1.

Theorem: In a binary tree, nodes with degree 0 have one more node than nodes with degree 2

Proof:

If a tree has n nodes, then the tree must have n-1 edges, that is, the number of points = the number of edges +1(this is true for all trees, not just binary trees) :

This tree has seven nodes, and the lines between nodes, which are edges, only have six.

So, we assume that the number of nodes with degree 0 is N0, the number of nodes with degree 1 is N1, and the number of nodes with degree 2 is n2. Since it is a node with degree 0, the number of its edges is N0=0, the number of node edges with degree 1 is N1= N1 *1, and the number of node edges with degree 2 is n2 =n2*2, then the total number of nodes is:
# As can be seen from the above, the expression for the number of edges is as follows
N0=0;
N1=1*n1;
N2=2*n2;
# Number of nodes = number of edges +1
n0 + n1 + n2 = N0 + N1 + N2 + 1;
# Substitute N0,N1,N2 to get:
n0 + n1 + n2 = 0 + n1 + 2*n2 + 1;
# Simplification:
n0 = n2 +1;
The number of nodes with degree 0 is always one more than the number of nodes with degree 2
Copy the code
Thus, we have proved the above theorem, which may be easier to understand if described in a different way:

In a binary tree, as long as you know how many leaves there are, the number of leaves with degree 2 is the number of leaves minus 1, and conversely, the number of leaves with degree 2 is the number of leaves with degree 2 plus 1

Tree traversal

    5
   / \
  1   4
     / \
    3   6
Copy the code

The name of the	Traversal sequence
The former sequence traversal	Root node left subtree right subtree 5->1->4->3->6
In the sequence traversal	Left subtree root right subtree 1->5->3->4->6
After the sequence traversal	Left subtree Right subtree root 1->3-> 6->4->5
Sequence traversal	That is, from top to bottom, the root node and all the children of the root node are the children of the next layer 5->1->4->3->6

The idea of trekking

Trees are inherently good data structures for recursive traversal, because every time you process a left subtree or a right subtree, you’re actually doing recursive traversal.

Pre-order traversal: “root node” “Output result of recursive traversal of left subtree” “Output result of recursive traversal of right subtree”

Middle order traversal: “Recursively traverses the output of the left subtree” “root node” “recursively traverses the output of the right subtree”

Post-order traversal: “Recursively traverses the output of the left subtree” “recursively traverses the output of the right subtree” “root node”

Divergent thinking

See here, some of you may feel deja vu, is there some knowledge about trees. In fact, we discussed this topic earlier when we studied the stack data structure. We know that the stack is naturally suitable for expression evaluation, so what is the logical structure of the stack in the process of expression evaluation? For example, 3*(4+5). In fact, although we are using the idea of a stack in our solution, we are actually logically simulating the operation of a tree. Don’t believe me? Let’s see:

[*] [3] [+] [4] [5]Copy the code

Above, we borrow this expression into a tree structure. When we encounter () in the expression, it indicates that the subexpression in the expression needs to be processed first, so we regard it as a subtree of our binary tree. As we all know, the idea of tree traversal is recursive traversal, which is to solve the problem layer by layer. In this way, in the process of recursive call, it will solve the subproblem of the right subtree first, get the result, and then carry out the final calculation with the result calculated by the left subtree to get the final result. If you are interested in the results of stack data, you can refer to another article: 0003- Recursion and stack (Solving expression evaluation Problems).

Restore the binary tree

If we know any two of the pre-order traversal results, middle-order traversal results and subsequent traversal results, we can restore a binary tree completely. Such as:

Preorder traversal the outputOne, five, two, three, fourThe output is traversed in order5, 1, 3, 2, 4Copy the code

Above is the output of the two traversal methods. We know that the first node of the preceding traversal must be the root node, so at this point, we already know that the root node of the original binary tree is 1. Next, we take the node of 1 into the output of the middle traversal and find the position of 1. Since the output result of middle-order traversal is left root right, it is not difficult to know that the middle-order traversal output to the left of 1 is the middle-order traversal output of the left subtree of the original binary tree, and the middle-order traversal output to the right of the original binary tree is the middle-order traversal output of the right subtree of the original binary tree. Thus, we can divide the middle-order traversal output into the following blocks:

# Sequence traversal result in cutting5, 1, 3, 2, 4# left subtree root right subtree
The left subtree has only one node, 5, but the right subtree is still a sequence, so we continue to go down.

# From above, we already know the sequence of left and right subtrees of the original binary tree, so we also cut the following preorder traversal resultsOne, five, two, three, four# root left subtree right subtree

# after cutting the preorder traversal results, we find the right subtree of the sequence, the sequence of his first is right subtrees of root node, which is 2, after finding the root node, is very simple, repeat the above steps, the order in the binary tree traversal results in the right subtree of son can find the right subtree of the left tree and right subtree 3 and 4, respectively, to this, We've already restored the binary tree
    1
   / \
  5   2
     / \
    3   4
Copy the code

Wouldn’t it be easy to have a tree with five nodes? Now, let’s do a slightly harder thinking problem.

Given the pre-order traversal results and middle-order traversal results of the binary tree with 10 nodes, restore the binary tree

Preceding traversal output sequence: 1, 2, 4, 9, 5, 6, 10, 3, 7, 8

Middle-order traversal output sequence: 4 9 2 10 6 5 1 3 8 7

 
# from 2. We know that 1 is the root node, so the left subtree sequence is: 4 9 2 10 6 5; Right subtree sequence: 3, 8, 74. I like the way I like it1 is the root node2. 1 2 4 9 5 6 3 7 8According to 1.2, 2 is the root node, so the sequence of the left subtree is 4, 9; Right subtree sequence: 10, 6, 51.1 Middle sequence: 4 9 2 10 6 52 is the root node2 4 9 5 6 104 is the root node, so 9 is the right subtree1.1.1 Middle sequence: 4 and 94 is the root node1.2.1 Preceding sequence: 4 9# It can be seen from 1.2.2 that 5 is the root node, so the sequence of the left subtree is: 10, 61.1.2 Middle order: 10, 6, 55 is the root node1.2.2 Preface: 5, 6, 10As can be seen from 1.2.2.2, 6 is the root node, so 10 bits of the left subtree1.1.2.1 Middle sequence: 10 66 is the root node1.2.2.2 Prefixes: 6 10# It can be seen from 2.2 that there are three root nodes, so the sequence of the right subtree is: 8, 72.1 Middle sequence: 3 8 7# assert: 3 is the root node2.2 Preface: 3, 7, 8According to 2.2.2, 7 is the root node, so 8 is the left subtree2.1.1 Middle sequence: 8 77 is the root node2.2.2 Prefixes: 7 8# The binary tree looks like this1 / \ 2 3 / \ 4 5 7 / / 9 6 8/10Copy the code

Common classification of binary trees

Complete Binary Tree

Only a binary tree that lacks nodes to the right of the last level is called a complete binary tree, that is, the left side of a complete binary tree is full, and only the right side is allowed empty nodes

                  1
            /           \
           2             3
          /   \         /  \
         4     5       6
Copy the code

A complete binary tree is an excellent data structure that has two main characteristics that give us a better experience in terms of performance and implementation.

The node number can be calculated

From the complete binary tree above, we can see a rule:

For a node numbered n, its left child root node must be numbered 2n, and its right child’s root node must be numbered 2n+1. For example, the left child root node of figure 2 is numbered 4, which means 2*2=4. The right child root node is numbered 5, that is, 2*2+1=5.

So what can we do with this?

As we know, a normal binary tree, in addition to the data domain for storing data, also needs additional storage space for storing Pointers to the left and right subtrees, that is, pointer domain. If we can directly calculate the number of the left and right subtrees of the current node by the rule above, then there is no need to store the storage address of the left and right subtrees. When a tree is large enough, this can save us a considerable amount of storage space.

The above method of replacing the address of record storage by calculation extends an algorithm idea that we often use in our daily work:

Conversion of recording and calculating ideas

Record (save time, consume space, no calculation, direct value, namely: space for time)

Save information and use it when you need it
Calculation formula (space saving, time consuming, no need to store, calculate the value, namely: time for space)

The 2 in 1+1=2 is what we get by evaluating the expression 1+1

These two methods have their own advantages and disadvantages. It is meaningless to compare the advantages and disadvantages of these two methods without considering the problem itself. We should consider the specific problem and see which method can bring you more benefits.

Scenario 1: When memory space is limited and computation time is not required, such as running a program on a machine with small memory, we will choose the calculation formula and exchange time for space

Scene 2: when we memory space is enough big, and the computing speed is required, such as enterprise application server running on a real-time calculation data, we will choose the recording, in space, in time, because of the application of an enterprise, the general memory is large enough, can also be dynamic expansion, at that time, the time the benefits will far outweigh the benefits of space.

Continuous storage space can be used for storage

In addition to the node number (node address) to calculate the features, fully binary tree because his number is continuous, ascending and continuous sequence from top to bottom, so we can put the complete binary tree stored in a continuous storage area, such as: in the array, the array subscript zero deposit element node 1, deposit element node 2 for 1…

Using this feature, we in the implementation of a complete binary tree, can need not like realize common binary tree alone define a structure, and define the data and pointer domains respectively to store data and pointer respectively, we can use an array to store data directly, this also is we the most common form of full binary tree.

Let’s imagine this: you are in the program implementation using a one-dimensional linear structure, namely the array to say, but in your mind, should be to convert it into two-dimensional tree structure to thinking about the problem, it is also a relatively advanced programming logic thinking ability, let the data structures we see can in the heart of the mind “compiled” into its true runtime. Of course, it takes a lot of practice to acquire this kind of ability. At least, as I write about this trip, I can’t achieve this level.

Full Binary Tree

A binary tree with no nodes of degree 1 is called a full binary tree, meaning that all nodes have either no children or two children

PS: We often see many articles and blogs on the Internet that put the definition of perfect binary tree on the full binary tree, which is actually wrong. See the specific definition of perfect binary tree below.

                  1
            /            \
           2              3
         /   \           /  \
        4     5	        6    7
             / \
            8   9
Copy the code

Perfect Binary Tree

All nodes have a degree of 2. This shows that the definition of a perfect binary tree is different from that of a full binary tree. We can say that a perfect binary tree is a special full binary tree

                  1
            /           \
           2             3
         /   \          /  \
        4     5	       6    7
Copy the code

Deep understanding of tree structure

node

The nodes of a tree represent a set, and the child nodes represent disjoint subsets under the parent set. This may be difficult to understand, but let’s look at the following graph:

      5
  /       \
 2         3
# The top binary tree, 5 nodes, can be treated as a complete set, while the bottom two children, 2 and 3, are two disjoint subsets of the complete set. The sum of the two subsets should equal the complete set
Copy the code

From the graph above, we can draw a conclusion:

A node of the tree represents a set, while child nodes represent the disjoint subsets below the set. The sum of all subsets gives the set

edge

Each edge of the tree represents the relationship

The role of the tree

The main use is for search operations in various scenarios

Learn the role of binary trees

Helps understand the basics of high-level data structures

Complete binary tree (maintain set maximum value of the magic weapon)
- The heap
- Priority queue
Multi-fork tree/forest
- Solve string and related conversion problems of the magic weapon
  - The dictionary tree
  - AC automata
- A magic weapon to solve connectivity problems
  - Check and set
Binary sort tree
- An underlying implementation of an important data retrieval container in the language standard library
  - AVL tree (binary balanced tree)
  - 2-3 tree (binary equilibrium tree)
  - Red-black tree (binary balanced tree)
Important data structures underlying file systems and databases
- B tree /B+ tree (multi-fork equilibrium tree)

The best way to practice your recursive skills

Learn the top-level way of thinking recursively

Design/understand a recursive program:
Mathematical induction => structural induction

If k0 is true, assuming ki is true, then k of I plus 1 is also true. As in solving the Fibonacci sequence:
function fib(n) {
	// Make sure that k0 is correct, i.e. the preconditions (boundary conditions) are correct. In this case, k0 is 1 for n=1 and 2 for n=2
  if(n <= 2) return n;
  return fib(n - 1) + fib(n - 2);
}
Copy the code
Give recursive functions an explicit meaning

In the code above, fib(n) represents the value of the NTH Fibonacci sequence

Thinking about boundary conditions

In the above code, our boundary is given as 1 for n=1 and 2 for n=2, requiring special treatment for this boundary

Implement the recursive process

Once you’ve dealt with the boundary problem, you can continue recursively

If you were asked to design a binary tree traversal program, what would you do?

Traversal the binary tree with root as the root node
Boundary condition: if root is empty, root is directly returned without traversal
Recursive process: pre-traversal the left subtree and pre-traversal the right subtree respectively

// Traversal the binary tree with root as the root node
function pre_order(root) {
  // Boundary condition: if root is empty, return root directly
  if(! root)return root;
  console.log(root.val);
  // Recursive process: preorder traversal left subtree and preorder traversal right subtree respectively
  pre_order(root.left);
  pre_order(root.right);
}
Copy the code

Use the left child has brother method to save space

To convert any non-binary tree to a binary tree, such as a trinomial tree to a binary tree:

Always ensure that the left side of the binary tree is the child node and the right side is the sibling node

# original trigeminal tree1 / | \ 2, 3, 4 / \ 5 6Convert to binary tree in the same way as left child and right brother1/2\3/5\4\6# because 2 is the child of 1, we put it on the left subtree, because 3 is the brother of 2, we put it on the right subtree of 2, because 4 is the brother of 3, we put it on the right subtree of 3, and 5 is the child of 3, we put it on the left subtree of 3, and 6 is the brother of 5, so we put it on the right subtree of 5
Copy the code

As you can see, the right subtree of the root node is always empty when only a tree is converted into a binary tree by the left child and right brother method. Then, can we effectively use this right subtree to merge multiple trees into a binary tree? For example, the following example merges two binary trees together to form a forest.

# If you want to merge the following two trees into a binary tree1 7 / | \ \ 2, 3, 4, 8, 9, 5 6 7 1/2 \ \ / 3 8 / \ \ 5 4 9\6# Thus, we have merged the two trees into one tree, the forest. This tree looks like a binary tree, but it actually represents a forest of two trees
Copy the code

Known as the Alpha Go monte carlo algorithm source code in the implementation of the concrete implementation of the algorithm, tree search algorithm framework called confidence limit tree algorithm (UCT) is to use the left child right brother method to implement a search tree, used to represent the entire board, normally, if you want to store the situation of a chessboard, Will be stored in a tree structure, but because there are too many checkerboard situations, it is possible to form a tree with more than 100 forks. In Alpha Go, to avoid this situation, the tree with more than 100 forks is converted into a binary tree by the representation of the left child has a brother. Those of you who are interested can check out Pachi.

So why does this save space? You think about it, a fork tree, his each node has three hands domain is used to store the subtree, regardless of whether there is a subtree, all want to reserve the space, the trigeminal tree as those listed above, there are 6 nodes, a total of 18 pointer field, including five effective pointer field only (the so-called effective pointer is a pointer domains is not pointing to the empty, That is, number of edges = number of nodes -1), then there are 18-5=13 Pointers that are empty. If there are 12 valid pointer fields and 5 valid pointer fields, then only 12-5=7 pointer fields are empty, which obviously saves a lot of space compared to the previous 13.

A k-fork tree with n nodes will have at most K * N edges, and its edges will actually only have n-1, so it will waste: K *n – (n-1)=(k-1)*n+1 edges, which means that as we branch more, we waste more space, so we need to convert a k-tree into a binary tree, because the wasted edges of a binary tree are: n+1, depending on which node we actually store data on.

conclusion

At this point, we have talked about some basic knowledge of binary trees, in order to control the length and the acceptance of different bases of partners, I will not carry out a further discussion. I would have to brush the binary tree algorithm with you to consolidate some knowledge of the binary tree, but this will lead to a smelly and long article, so I will split it into two articles. The next article will skip the basics of trees and go straight to the binary tree algorithm problem, with the portal: hand rip binary tree algorithm problem