Algorithm in the tree related topics is relatively difficult, at the same time in the beginning of the brush algorithm, it is best to brush from the tree problem, because his problem solving ideas are helpful to all kinds of questions in the future. And as we’ve practiced on the tree-related problems we’ve found that there are frameworks that we can use. This article first briefly introduces the definition of tree data structure, and then explains the general idea of tree problem disintegration. When faced with a data structure, we need to look at its storage and traversal, after all, data structures exist to facilitate traversal.

Tree definition

Represents the logical one-to-many relationship, each node has zero or more children; Nodes that have no parents are called root nodes; Each non-root node has one and only one parent; In addition to the root, each child can be divided into multiple disjoint subtrees. And why is it called a tree? Because if you turn the tree upside down, that’s what you see all the time.


Common tree definitions

  • Multi-fork tree: A tree structure that does not limit the number of children per node.
  • Binary tree: Each node has a maximum of two child nodes. The above tree structure is binary tree. The one on the left is the left subtree, and the one on the right is the right subtree.

Binary trees are widely used, so the following storage and traversal uses binary trees as an example.

  • Full binary tree: A binary tree in which all nodes at each level have two children except the last level, which has no children.
  • Complete binary trees: leaf nodes only exist at the last or second-to-last level. And if the nodes in the lowest layer are not satisfied, then all the nodes are continuously arranged on the left, and the empty space is on the right.

  • Balanced tree: It is an empty tree or the absolute value of the difference in height between the left and right subtrees is no more than 1, and both subtrees are balanced binary trees. There are many common ways to use balanced tree, such as red black tree, AVL and so on.
  • Binary search tree: if its left subtree is not empty, the value of all nodes in the left subtree is less than the value of its root node. If its right subtree is not empty, the value of all nodes in the right subtree is greater than the value of its root node. Its left and right subtrees are also binary sort trees respectively.
  • AVL: binary search tree with balance condition, because the depth of balance is shallower, so the search performance is higher.

The tree’s store

So how do we store the tree structure? In computers, there are only two types of storage, an array and a linked list. Arrays are stored on contiguous memory, so we can find the address of memory directly based on the offset, but when we insert and delete elements, we need to move them forward or back to ensure continuity, which is inefficient. List, equivalent to a chain, do not need to be stored in contiguous memory, as long as the first node to save a reference, so that we can’t random access through the memory of the offset directly find the address, and need to traverse back from the first position, random access efficiency is lower, but the efficiency is higher, insert and delete because we don’t have to move the elements, Just change the reference to the next node. Tree structures can also be stored through arrays and linked lists.

An array of storage

How do you use array storage? Facing the above tree structure, it is intuitive to think that we can store ABCDEFG layer by layer, but find that this will lose the left subtree or the right subtree. So in the case of missing left and right nodes, we also insert null values to replace the missing nodes. So the storage is ABCD NULL EF NULL G. If there are many empty nodes, the storage efficiency is very low. So it’s very efficient to store a complete binary tree. When we learned about heap sorting, we learned that a heap is also a tree, and a complete binary tree, so a typical implementation of array storage is a heap. Array storage requires no node references and is simpler to operate. The implementation in Java is PriorityQueue.

Linked list to store

In the form of a linked list, we have a pretty good idea how to store it, as long as we store references to child nodes at each node. If we have a binary tree, we can store only the left and right nodes, and if we have a multi-tree, we can store each child node in an array. On the basis of linked list storage, there are many commonly used implementations, such as binary search tree, AVL tree, red black tree, interval tree, etc.

Tree traversal

We found a specified node in a tree, and obviously we need to walk through all the nodes in the tree. If it’s an array store, we just iterate through the array. In the case of linked list storage, there are two traversal modes: depth-first traversal DFS and breadth-first traversal BFS.

DFS depth-first search

Depth first, that is, go deep to find the leaves first. We output a tree. For the root node, the left subtree, and the right subtree, we have three output orders: root left, root right, and root left. Also called pre-order traversal, mid-order traversal, post-order traversal. For the tree above, we use depth-first, and the traversal results are as follows:

  • Preorder traversal: output the root node, then output its left node, then output its root node because it is a tree again. Loop until all the left subtrees of the root node are processed, and then output the right node, the same way. Is FCADBEHGM
  • Middle order traversal: go all the way down to the left subtree and find the output A. When the tree represented by A is output, return to the parent node C of A, because the left node has been output, and output the root node C. Let’s go to the right subtree D. Again, keep looking for his left subtree, repeating the steps above. Is ADBDFHEMG
  • After traversal: also find the left subtree, output A, A represents the tree output, back to C, because the right subtree output, to the right subtree D, also find the left subtree, then find the right subtree, and finally output the root. Is ABDCHMGEF

The language description is weak, so let’s go straight to the code

The former sequence traversal

class TreeNode {
    int val;
    TreeNode left, right;
}

void traverse(TreeNode root) {  if(root == null) return;  Log.d(root.val); Traverse by (root. Left);Traverse by (root. Right);} Copy the code

In the sequence traversal

class TreeNode {
    int val;
    TreeNode left, right;
}

void traverse(TreeNode root) {  if(root == null) return;  traverse(root.left);  Log.d(root.val);  traverse(root.right); } Copy the code

After the sequence traversal

class TreeNode {
    int val;
    TreeNode left, right;
}

void traverse(TreeNode root) {  if(root == null) return;  traverse(root.left);  traverse(root.right);  Log.d(root.val); } Copy the code

We’ve done it recursively, but we can also do it on a stack. You can figure out how to do it, and we’ll cover DFS using a stack in a later article.

BFS breadth first search

Breadth-first, that is, to traverse the broad direction first, that is, to visit all the children of each node first. For a tree, a very vivid interpretation is the sequential traversal, layer by layer.For the above tree, the BFS result is FCEADHG NULL NULL B NULL NULlNULL M. How do we do that in code

So if you think about it for a second we can use the attributes of the queue, first in, first out, put in the root node, take out, put in the left and right child nodes, loop out, put in the left and right nodes. This completes the sequence traversal.

Sequence traversal code

class TreeNode {
    int val;
    TreeNode left, right;
}

public ArrayList<Integer> printBFS(TreeNode root) {  ArrayList<Integer> list = new ArrayList<Integer>();  if(root == null) { return list;  }  // Use queues  Queue<TreeNode> queue = new LinkedList<TreeNode>();  queue.offer(root);  while(! queue.isEmpty()) { // Put it into the left and right subtrees  TreeNode treeNode = queue.poll();  if(treeNode.left ! =null) {  queue.offer(treeNode.left);  }  if(treeNode.right ! =null) {  queue.offer(treeNode.right);  }  list.add(treeNode.val);  }  return list; } Copy the code

Commonly used framework for solving problems

Most binary tree problems are traversal problems, that is, add, delete, change and check problems, the basic way is the above two DFS and BFS.

  • BFS is the same as above, using queues to manipulate data.
  • DFSRecursion is also commonly used to solve problems. Recursive operations are difficult to think aboutA trilogy:
    1. What are the subproblems to be dealt with? How do I break it down into subproblems?
    2. What is the return value, and what information should each level of recursion return upward?
    3. What are the termination conditions? When does recursion end?

In fact, recursion is a lot of repeated operations on itself, so we need to break down this big problem into subproblems, and then the call itself must have a header, can’t be infinite, what are the termination conditions. And how to merge the data for each subproblem, that is, how to deal with the data results for each subproblem.

The summary above may still be in a fog, in the future daily questions, we will start from the simple difficulty of binary tree questions, and then move to intermediate and advanced, and slowly you will understand this framework.