This is the sixth day of my participation in the August More text Challenge. For details, see:August is more challenging
Huffman tree
Basic terminology
- Path and path length
- Path: A path in a tree between children or descendants accessible from one node down.
- Node path length: The number of branches on the path from one node to another.
- Node weight and weighted path length
- Node weight: Assigning a value to a node in a tree that has some meaning.
- Node weighted path length: the product of the length of the path from the root to the node and the weight of the node.
- The weighted path length of the tree
- The sum of the weighted path lengths of all leaves in the tree.
- Huffman Tree
- The binary tree with the least weighted path length is a Huffman tree.
Among all binary trees with n leaves and the same weight, there must be a tree whose weighted path length is the minimum value, which is called the “optimal binary tree”.
Construct the Huffman tree
Basic idea: make the node with big power close to the root
- According to the given n weights {w1, w2… , wn}, construct the set of n binary trees F = {T1, T2… Tn}, where each binary tree contains only one root node with weight value wi, and its left and right subtrees are empty trees;
- In F, two binary trees with the smallest weight of the root node are selected as the left,
The right subtree constructs a new binary tree, and the weight of the new binary root node is set as the sum of the weight of its left and right subtree roots.
- Delete these two trees from F, and add
A newly generated tree;
- Repeat these steps until F contains only one tree.
Realization of Huffman construction algorithm
A Huffman tree with n leaves has 2n-1 nodes
- Adopt sequential storage structure – one dimensional structure array
- Node type definition
typedef struct { ElemType elem; / / node values int weight; / / has a weight int parent, lch, rch; }HTNode, *HuffmanTree; Copy the code
- Tectonic HuffmanTree
- Enter the initial n leaves: set the weight value of HT[1..n]
- Make the following n-1 merges to produce HT[I], I =n+1.. 2 n – 1:
- Select HT[s1] and HT[s2] (parent = 0) from HT[1..i-1] (parent = 0)
- Change the parent value of HT[s1] and HT[s2] to parent= I
- HT[I] : weight=HT[s1]. Weight + HT[s2]. Weight, LCH =s1, RCH =s2
Status CreatHuffmanTree(HuffmanTree HT, int n){ if (n <= 1) // The number of nodes is invalid return ERROR; int m = 2 * n - 1; int i; int s1, s2; HT = new HTNode[m + 1]; // Element 0 is not used, HT[m] represents the root node for (i = 1; i <= m; ++i) { HT[i].lch =0; HT[i].rch = 0; HT[i].parent = 0; } for (i = 1; i <= n; ++i)// Enter weights cin >> HT[i].weight; for (i = n + 1; i <= m; ++i) {// Construct the Huffman tree Select(HT, i - 1, &s1, &s2); // In HT[k](1≤ K ≤i-1), select two nodes whose parent domain is 0 and whose weight is the smallest, and return their serial numbers S1 and S2 in HT if(s1 ! =0&& s2 ! =0) { HT[s1].parent = i; HT[s2].parent = i; // delete s1,s2 from F HT[i].lch = s1; HT[i].rch = s2; // S1 and S2 are the children of I respectively HT[i].weight = HT[s1].weight + HT[s2].weight; // The weight of I is the sum of the weights of the left and right children}}return OK; } Copy the code
The application of Huffman trees
Huffman code
- Algorithm implementation
Status CreatHuffmanCode(HuffmanTree HT, HuffmanCode& HC, int n) { if (n <= 1) return ERROR; int start, i; int f = 0, c; HC = new char* [n + 1]; char* cd = new char[n]; cd[n - 1] = '0'; for (i = 1; i <= n; ++i) { while(f ! =0) { // Start at the leaf and work backwards to the root start = n - 1; c = i; f = HT[i].parent; if (HT[f].lch == c) cd[start] = '0'; else cd[start] = '1'; c = f; f = HT[f].parent; } HC[i] = new char[n - start]; // Encode the array strcpy(HC[i], &cd[start]); } delete cd; cd = NULL; return OK; } Copy the code
- Important conclusions
- Huffman codes are unequal length codes
- Huffman encoding is prefix encoding, that is, the encoding of any character is not the prefix of another character encoding
- There is no node of degree 1 in Huffman code tree. If the number of leaf nodes is n, the total number of nodes in Huffman coding tree is 2n-1
- Sending process: Send character data according to the encoding table obtained by Huffman tree
- Receiving process: according to the provisions of left 0, right 1, from the root node to a leaf node, complete the decoding of a character. Repeat this process until the data is received