Basic knowledge

  • (1) Basic concept of tree
  • (2) binary tree
  1. The definition of binary tree and its main characteristics
  2. Sequential storage structure and chain storage structure of binary tree
  3. Traversal of binary trees
  4. The basic concept and construction of clue binary tree
  • (3) Trees and forests
  1. The storage structure of a tree
  2. Forest and binary tree conversion
  3. Traversal of trees and forests
  • (4) the application of tree and binary tree
  1. Binary search tree
  2. Balanced binary trees
  3. Huffman tree and Huffman coding tree

Knowledge framework

(3) Trees and forests

1. Storage structure of the tree

The storage structure of the tree can be sequential storage structure and chain storage structure, which is required to uniquely reflect the relationship between nodes in the tree.

Three commonly used storage structures

  1. Parent representation method

A set of continuous space is used to store each node, and a pseudo-pointer is added to each node to indicate the location of its parent node in the array. Where, the subscript of the root node is 0, and its pseudo-pointer field is -1.The storage structure of the parent representation:

#define MAX_TREE_SIZE {typedef struct{// ElemType data; // Data element (data field) int parent; // Parent position field (array subscript)}PTNode; PTNode nodes[MAX_TREE_SIZE]; // type pedef struct{PTNode nodes[MAX_TREE_SIZE]; // Parent represents array int n; }PTree; // Tree type definitionCopy the code

The memory structure of the parent representation takes advantage of the fact that each node (except the root node) has only one parent, which makes it possible to get the parents of each node quickly, but searching for the children of each node requires traversing the entire tree.

  1. Child notation

If the children of each node are linked together with a single linked list to form a linear structure, then n nodes have N child linked lists (the child linked list of leaf nodes is an empty list). This storage method is very straightforward in finding children, whereas finding parents requires traversing n child lists of n nodes to which the pointer field of the child list points.

  1. Child brother notation (binary tree notation)

The child sibling representation is also known as binary tree representation, which is a binary linked list as the storage structure of the tree. Each node consists of three parts: node value, pointer to the first child node of the node and pointer to the next sibling node of the node (all the siblings of the node can be found along this field).

The storage structure of the child brother representation:

typedef struct CSNode{ ElemType data; Struct CSNode *firstchild, *nextsibling; }CSNode, *CSTree;}CSNode, *CSTree;Copy the code

The biggest advantage of this storage method is that it can easily convert the tree into a binary tree and find the child node of the node. However, it is complicated to search the parent node from the current node and the time cost is high. Adding a parent field to each node points to its parent makes it easy to find the parent.

2. Conversion between forest and binary tree

  • Tree to binary tree:

The rule for converting a tree into a binary tree is that each node has a left pointer to its first child and a right pointer to its next sibling in the tree, namely, “left child and right brother”. Where, since the root node has no brothers, the binary tree converted from the tree has no right subtree.Tree to binary tree:

  1. Add a line between siblings
  2. For each node, only its connections to the first child are kept, and all connections to other children are erased
  3. With the root as the axis, the rotation is 45° clockwise
  • Forest to binary tree:

Forest to binary tree:

  1. Convert each tree in the forest to the corresponding binary tree
  2. The roots of each tree can also be seen as brothers, with a line added between each tree
  3. Rotate 45° clockwise with the root of the first tree as the axis

Forest is converted into a binary tree rules: first the forest of each tree is converted into a binary tree, then the first tree to the root of the binary tree root, after conversion to convert the first tree left subtree as the left subtree of the binary tree, after the second tree as the converted binary tree right subtree, transforming the third tree as a binary tree after the right subtree of the right subtree, and so on.

  • Binary tree to forest:

Rules for binary tree conversion to forest:

If the binary tree is not empty, the root of the binary tree and its left subtree are converted to the binary tree form of the first tree, the right subtree of the binary root is converted to the binary tree form of the second tree, and so on until finally a binary tree without the right subtree is produced.

Tree traversal: Each node in the tree is accessed in some way only once.

  • First root traversal: If the tree is not empty, the root node is first accessed and then each subtree of the root node is traversed from left to right in the same order as the binary tree.
  • Back-root traversal: If the tree is not empty, each subtree of the root node is traversed from left to right, and then the root node is accessed in the same order as the middle-order traversal of the corresponding binary tree.
  • Hierarchical traversal: Access each node in order, the same as the hierarchical traversal of the corresponding binary tree.

Forest traversal operation:

Sequential forest traversal: if the forest is not empty,

  • Access the root of the first tree in the forest;
  • The subtree forest of the root node in the first tree is traversed sequentially.
  • Walk through the forest after removing the first tree.

Middle-order traversal forest: if the forest is not empty,

  • A subtree that traverses the root of the first tree in a forest;
  • Access the root of the first tree;
  • The sequence traverses the forest after removing the first tree.

(4) the application of tree and binary tree

1. Binary search tree

Binary sort tree (binary lookup tree)

  • define

Binary sort tree (BST), also known as binary search tree. Binary sort trees are either empty trees or non-empty binary trees with the following properties:

  1. If the left subtree is not empty, the keyword values of all nodes in the left subtree are smaller than the keyword values of the root node.
  2. If the right subtree is not empty, the keyword values of all nodes in the right subtree are greater than the keyword values of the root node.
  3. The left and right subtrees themselves are also binary sort trees.

The value of the left subtree node < the value of the root node < the value of the right subtree node, so the middle order traversal of the binary sorting tree can get an increasing ordered sequence.

  • To find the

Non-recursive algorithm for binary sorting tree:

BSTNode *BST_Serach(BiTree T,ElemType key){ while(T! =NULL&&key! If (key<T->data) T=T->lchild; Else T=T->rchild; } return T; }Copy the code
  • insert

Insertion of a binary sort tree

If the original binary sorting tree is empty, the node is inserted directly. If the keyword value is less than the root value, the left subtree is inserted; otherwise, the right subtree is inserted.

Int BST_Insert(BiTree &T,KeyType k){if(T==NULL){T=(BiTree)malloc(sizzeof(BSTNode)) T->key=k; T->lchild=T->rchild=NULL; return 1; } else if(k==T->key) return 0; Else if(k<T->key) return BST_Insert(T->lchild,k); Return BST_Insert(T->rchild,k)}Copy the code

  • structure

Binary sorting tree construction algorithm:

Starting from an empty tree, enter the elements in turn and insert them into the appropriate positions in the binary sorting tree. If the binary sorting tree is not empty, the value of the new node is compared with the value of the root node. If the value is smaller than the value of the root node, the left subtree is inserted; otherwise, the right subtree is inserted. If the binary sort tree is empty, the new node is used as the root of the binary sort tree.

void Create_BST(BiTree &T,KeyType str[],int n){ T=NULL; Int I =0; BST_Insert(T, STR [I]); while(I <n){BST_Insert(T, STR [I]); i++; }}Copy the code
  • delete
  1. The deleted nodes are leaf nodes:

Delete it without breaking the binary tree

  1. The deleted node contains a left or right subtree:

Let the subtree of Z be the subtree of the parent, z, in place of z

  1. The deleted node contains left and right subtrees:

Replace z with a direct precursor (or direct successor) of Z, and then remove the direct precursor (or direct successor) from the binary sorting tree, thus switching to the first or second case.

  • Search efficiency analysis

The search efficiency of binary sorting tree mainly depends on the height of the tree.

Search length – The number of times a keyword is compared in a search operation is called the search length, which reflects the time complexity of the search operation

  • If the height of the tree is h, you need to compare h times to find the lowest node
  • Best case: minimum height of binary tree with n nodes is [log2n] + 1, average search length = O(log2n)
  • Worst case: Each node has only one branch, tree height H = node number n, average search length = O(n)

Average lookup length of successful lookups ASL: Length (depth) of each lookup times the number of lookups divided by the total number of lookups

2. Balanced binary trees

  • Definition: When inserting and deleting Binary Tree nodes, ensure that the absolute value of the height difference between the left and right subtrees of any node does not exceed 1. Such Binary Tree is called Balanced Binary Tree (AVL for short). The height difference between the left and right subtrees of the node is defined as the balance factor of the node, and the value of the balance factor of the balanced binary tree node can only be -1, 0, 1.

The average search length of a balanced binary tree is O(log2(n)), or O(n) if there is a single-branched tree with only one right or left child (similar to an ordered single-linked list).

  • Balanced binary tree insertion

Whenever in the binary sort tree to insert a node (or delete), first check whether the inserted into the path of nodes for the node insert lead to imbalance, if uneven find insert path first minimum unbalanced subtree (insert node nearest, balance factor, absolute value greater than 1 nodes to the node to the root of the subtree), On the premise of ensuring the characteristics of binary sorting tree, the position relation of each node is adjusted to achieve a new balance.

LL Balance rotation (right single rotation)

RR balanced rotation (left single rotation)

LR Balanced rotation (first left then right double rotation)

RL balanced rotation (first left then right double rotation)

  • Balanced binary tree lookup

The search process of balanced binary tree is the same as that of binary sorting tree. The number of keywords that can be compared to a given value during a lookup does not exceed the depth of the tree. The maximum depth of a balanced binary tree with n nodes is O(log2(n)), so the average search length of a balanced binary tree is O(log2(n)).

3. Huffman tree and Huffman coding tree

Huffman definition:

  • The value assigned to the node in the tree is called the weight of the node;
  • The length of the path from the root to any node (the number of edges passed), multiplied by the weight of that node, is called the weighted path length of that node.
  • The sum of the weighted path length of all leaf nodes in the tree is called the weighted path length of the tree, denoted as WPL=∑ (I =1–n) Wi *li. Where, wi is the weight of the ith leaf node, and L I is the path length from the root node to the ith leaf node.
  • In a binary tree with n weighted leaf nodes, the one with the least weighted path length (WPL) is called a Huffman tree, also known as an optimal binary tree.

Where, (c) has the smallest WPL and is a Haverman tree.

Structure:

  • Construction algorithm description:
  1. Given n nodes with weights w 1, W 2,.., w n respectively;
  2. The above nodes are respectively regarded as n binary trees containing only one node to form forest F.
  3. To construct a new node, choose two trees with the lowest weights from F as the left and right subtrees of the new node to construct a new tree, and set the weights of the new node to the sum of the weights of the left and right subtrees.
  4. Delete the two selected trees from F and add the new trees to forest F;
  5. Repeat steps 3 and 4 until only one tree remains in forest F.

According to the construction process, Huffman tree has the following characteristics:

  • Each initial node eventually becomes a leaf node, and the smaller the weight is, the larger the path length is from the node to the root node.
  • In the construction process, n-1 nodes are created, so the total number of nodes in Huffman tree is 2n-1.
  • Each time a new tree is constructed, two trees are selected as the left and right subtrees, so there is no node with degree 1 in Huffman tree.

Huffman coding

  • Prefix encoding: If no encoding is a prefix of another encoding, such encoding is called prefix encoding.
  • Fixed – length encoding: In data communication, if each character is represented in binary form of equal length, such encoding is called fixed – length encoding.
  • Variable-length encoding: If unequal binary representations of different characters are allowed, this encoding is called variable-length encoding.

It is not clear whether 0 and 1 represent left or right subtrees. Therefore, the order of left and right nodes is arbitrary, so the constructed Huffman tree is not unique, but the weighted path length of each Huffman tree is the same and optimal. (Weight: frequency or number of occurrences. Huffman tree is constructed according to the weight, small below and large above, making WPL minimum and weight path length minimum, which is optimal.)