Writing in the front
- Full text data structure
Heap sort
- Select sort, unstable sort
- The value of each node is greater than or equal to the value of its left and right children, called
Big pile top
The ascending - The value of each node is less than or equal to the value of its left and right children, called
Small cap pile
The descending order - For an array, convert the coordinates of the array to
Sequential storage of binary trees
The format of the ongoingSize top heap
- Sorting,
arr.length / 2
Is the number of non-leaf nodes,- 1
The largest coordinate of a non-leaf node
Public static void heapSort(int[] arr) {for (int I = arr. Length / 2-1; i >= 0 ; i--) { adjustHeap(arr,i,arr.length); } // Swap the top element with the end element, and sink the largest element to the end of the array. for (int i = arr.length - 1; i > 0 ; i--) { temp = arr[i]; arr[i] = arr[0]; arr[0] = temp; I adjustHeap(arr,0, I); I adjustHeap(arr,0, I); I adjustHeap(arr,0, I) } System.out.println(Arrays.toString(arr)); }Copy the code
- Resize the array to a big top heap
/** * add I to the tree corresponding to non-leaf nodes, Public static void adjustHeap(int[] arr,); public static void adjustHeap(int[] arr, int i, int length) { int temp = arr[i]; For (int k = I * 2 + 1; k < length; K = k * 2 +1) {if (k +1 < length && arr[k] < arr[k+1]) {if (k +1 < length && arr[k] < arr[k+1]) {if (k +1 < length && arr[k] < arr[k+1]) {if (k +1 < length && arr[k] < arr[k+1]) {if (k +1 < length && arr[k] < arr[k+1]) { If (arr[k] > temp) {arr[I] = arr[k]; // assign the large value to the current node I = k; } else {break; // give the value of k to I, at the end of the loop, give the value of the parent to the child. Arr [I] = temp; arr[I] = temp; }Copy the code
- Small cap pile
public static void adjustHeap(int[] arr, int i, int length) { int temp = arr[i]; for (int k = i * 2 + 1; k < length; k = k * 2 + 1) { if (k + 1 < length && arr[k] > arr[k+1]) { k++; If (arr[k] < temp) {arr[I] = arr[k]; // assign the large value to the current node I = k; } else {break; // give the value of k to I, at the end of the loop, give the value of the parent to the child. } } arr[i] = temp; }Copy the code
Hoffman tree
- Given n weights, as n leaf nodes, if the tree
Weighted Path Length (WPL)
If it reaches the minimum, it is the optimal binary tree, also known as the Huffman tree. The node with the larger weight is closer to the root - A path, from one node down to the path between the child or grandchild nodes
- The number of layers of the root node is -1, and the length of the path to the node at layer L is
L-1
- Weighted path length, gives a node a weight value, the length of the path from the root node to the node multiplied by the weight of the node
- The weighted path length of the tree, the sum of the weighted paths of all leaf nodes, is denoted as
weighted path length
, the nodes with larger weights are closer to the root node, and the smallest is the Huffman tree/optimal binary tree
Public static HuffmanNode createHuffmanTree(int[] arr) {// Put list ArrayList<HuffmanNode> Nodes = new ArrayList<>(); for (int i : arr) { nodes.add(new HuffmanNode(i)); } while (nodes.size() > 1) {// Sort collections.sort (nodes); HuffmanNode left = nodes.get(0); HuffmanNode left = nodes.get(0); HuffmanNode right = nodes.get(1); HuffmanNode parent = new HuffmanNode(left.value + right.value); parent.left = left; parent.right = right; // Remove nodes.remove(left) from the list; nodes.remove(right); // Add the newly built nodes.add(parent); } // Return the head of the Huffman tree return nodes.get(0); }Copy the code
- node
Class HuffmanNode implements Comparable<HuffmanNode> {HuffmanNode implements Comparable; HuffmanNode left; HuffmanNode right; public HuffmanNode(int value) { this.value = value; } @Override public String toString() { return "HuffmanNode{" + "value=" + value + '}'; } @override public int compareTo(HuffmanNode o) {return this.value - o.value; } public static void preOrderList(HuffmanNode node) {system.out.println (node); if (node.left ! = null) { preOrderList(node.left); } if (node.right ! = null) { preOrderList(node.right); }}}Copy the code
Huffman coding
- Algorithm, data file compression, variable word length coding (VLC) a kind of
- Variable length encoding, statistics of the number of occurrences of each character, the more words, the less the corresponding binary bits, but there will be polysemy
- Huffman coding, lossless compression, the number of character occurrence is constructed into a Huffman tree, the number as the weight; This character is encoded by the path from the root node to the leaf node to avoid ambiguity
- We use a feature of the Huffman tree
The larger the weight, the closer it is to the root node
, so the more times the character appears, the smaller the encoding length - If there are multiple weights of the same value in the Huffman tree, the resulting tree structure may be different, but the WPL is the same, so the generated Huffman code is different, but the compressed size/length is the same
The compression
- The effect will be one
A byte array is converted to a Huffman encoded byte array
/** * Return huffmanZip(byte[] bytes) {private static byte[] huffmanZip(byte[] bytes) { List<HuffmanCodeNode> nodes = getNodes(bytes); HuffmanCodeNode node = createHuffmanTree(nodes); Map<Byte, String> huffmanCodes = getHuffmanCodes(node); return zip(bytes, huffmanCodes); }Copy the code
- First the
Bytes is converted to list
To generate a Huffman tree
public static List<HuffmanCodeNode> getNodes(byte[] bytes) { ArrayList<HuffmanCodeNode> nodes = new ArrayList<>(); HashMap<Byte, Integer> map = new HashMap<>(); Integer count = 0; For (byte b: bytes) {count = map.get(b); if (count == null) { map.put(b, 1); } else { map.put(b, count + 1); ForEach ((b, val) -> Nodes.add (new HuffmanCodeNode(b,val))); return nodes; }Copy the code
- Generate a Huffman tree and get the root node
public static HuffmanCodeNode createHuffmanTree(List<HuffmanCodeNode> nodes) { while (nodes.size() > 1) { Collections.sort(nodes); HuffmanCodeNode left = nodes.get(0); HuffmanCodeNode right = nodes.get(1); HuffmanCodeNode parent = new HuffmanCodeNode(null, left.weight + right.weight); HuffmanCodeNode parent = new HuffmanCodeNode(null, left.weight + right. parent.left = left; parent.right = right; nodes.remove(left); nodes.remove(right); nodes.add(parent); } return nodes.get(0); }}Copy the code
- Generate the Huffman code table
Static map<Byte, String> huffmanCodes = new HashMap<>(); Static StringBuilder StringBuilder = new StringBuilder(); / / overloading, Public static Map<Byte, String> getHuffmanCodes(HuffmanCodeNode node) {if (node == null) {return null; } getHuffmanCodes(node.left, "0", stringBuilder); getHuffmanCodes(node.right, "1", stringBuilder); return huffmanCodes; } /** * Get the Huffman encoding of all the leaf nodes of the passed node, * @param code path where the left child is 0 and the right child is 1 * @param stringBuilder path */ public static void GetHuffmanCodes (HuffmanCodeNode node, String code, StringBuilder StringBuilder) {// Generate a new StringBuilder because every time it encounters a non-leaf node, StringBuilder stringCode = new StringBuilder(StringBuilder); stringCode.append(code); if (node ! = null) {if (node.data == null) {// Non-leaf nodes, the processing of the left getHuffmanCodes(node.left, "0",stringCode); / / right getHuffmanCodes (node. Right, "1", stringCode); } else {// Find a leaf node huffmancodes.put (node.data, stringcode.tostring ()); }}}Copy the code
- Generate the corresponding Huffman encoded byte array, because when the byte array is converted to a binary string, if the end of the byte array starts with 0, the beginning will be omitted, so it is used separately
endString
record
Static String endString = ""; /** * Take an array of bytes corresponding to a string, pass the Huffman encoding table, Returns a compressed byte array of huffmanCodes * @param bytes original character array * @param huffmanCodes huffmanCodes * @return original character encoding array * Java numbers are in the form of complement, * Positive three digits in one * negative complement = Original code keeps symbol unchanged bit-by-bit + 1 * byte[] A byte saving 8-bit signed binary requires -> -1 inverse -> Reserved symbol is, -> Decimal * 10101000(complement) => 10101000-1 => 10100111 => 11011000 => -88 */ private static byte[] zip(byte[] bytes, Map<Byte, String> huffmanCodes) { StringBuilder stringBuilder = new StringBuilder(); For (byte b: bytes) {stringBuilder.append(huffmancodes.get (b)); } // if it is not divisible by 8, add 7 and it must be divisible by 8; //int len = (stringBuilder.length() + 7) / 8; int len = stringBuilder.length() % 8 == 0 ? stringBuilder.length() / 8 : stringBuilder.length() / 8 + 1; if (stringBuilder.length() - (len - 1) * 8 ! EndString = stringBuilder. Substring ((len-1) * 8, stringBuilder. Length ()); Byte [] huffmanCodeBytes = new byte[len]; String strByte; Int index = 0; for (int i = 0; i < stringBuilder.length(); I += 8) {if (I + 8 > stringBuilder.length() -1) {strByte = stringBuilder.substring(I); } else { strByte = stringBuilder.substring(i, i + 8); } huffmanCodeBytes[index] = (byte) Integer.parseInt(strByte,2); index ++; } return huffmanCodeBytes; }Copy the code
Unpack the
- call
byte[] source = decode(huffmanCodes, res);
Copy the code
- To complete the decoding of compressed data, the essence is to pass in the compressed byte array and the corresponding Huffman encoding table and decode it into the original byte array
/** * @param huffmanCodes * @param huffmanBytes Private static byte[] decode(Map< byte, String> huffmanCodes, byte[] huffmanBytes) { StringBuilder stringBuilder = new StringBuilder(); // Convert the byte array to a string for (int I = 0; i < huffmanBytes.length; i++) { boolean flag = i == huffmanBytes.length - 1; stringBuilder.append(byteToBitString(! flag,huffmanBytes[i])); HashMap<String, Byte> map = new HashMap<>(); huffmanCodes.forEach((b, s) -> map.put(s,b)); Byte ArrayList< byte > list = new ArrayList<>(); int count; for (int i = 0; i < stringBuilder.length(); I += count) {// Scan the corresponding binary string count = 1; boolean flag = true; Byte b = null; While (flag) {// fetch one bit // let count move until an existing character String key = stringBuilder.subString (I, I + count); b = map.get(key); if (b == null) { count ++; } else {// Match flag = false; } } list.add(b); // I += count; Byte [] bytes = new byte[list.size()]; byte[] bytes = new byte[list.size()]; for (int i = 0; i < bytes.length; i++) { bytes[i] = list.get(i); } return bytes; }Copy the code
- Converts a byte to a binary string
@param flag indicates whether the byte value needs to be added. If it is true, it needs to be added. If it is false, it does not. If it's the last byte and you don't need to fill in the higher order * @param b corresponds to a byte, a binary string, Private static String byteToBitString(Boolean flag, byte B) {private static String byteToBitString(Boolean flag, byte B) {// Use a variable to save b // convert b to int int temp = b; if (flag) { //2^8 temp |= 256; / / bitwise or} / / 11111111111111111111111110101000 / / is actually take after eight String s = Integer. ToBinaryString (temp); // Convert to b's complement. If (flag) {return s.substring(s.length() -8); } else {return endString; }}Copy the code
For file
- Compress the file
/** * private static void zipFile(String srcFile, String desFile) { FileInputStream fis = null; FileOutputStream ops = null; ObjectOutputStream oos = null; try { fis = new FileInputStream(srcFile); Byte [] bytes = new byte[fis.available()]; // Create an array of the same size as the original file. / / read fis. Read (bytes); Byte [] huffmanBytes = huffmanZip(bytes); Ops = new FileOutputStream(desFile); Oos = new ObjectOutputStream(ops); // Write the Huffman encoding and file compression bytes as a stream of objects, using oos. WriteObject (huffmanBytes) to facilitate source file recovery; oos.writeObject(huffmanCodes); oos.writeObject(endString); } catch (Exception e) { e.printStackTrace(); } finally { try { if (ops ! = null) { ops.close(); } if (fis ! = null) { fis.close(); } if (oos ! = null) { oos.close(); } } catch (IOException e) { e.printStackTrace(); }}}Copy the code
- Unzip the file
private static void unzipFile(String zipFile, String desFile) { FileInputStream fis = null; ObjectInputStream ois = null; FileOutputStream ops = null; try { fis = new FileInputStream(zipFile); ois = new ObjectInputStream(fis); Byte [] huffmanBytes = (byte[]) ois.readObject(); Map<Byte, String> huffmanCodes = (Map<Byte, String>) ois.readObject(); String endString = (String) ois.readObject(); System.out.println(endString); // decode byte[] bytes = decode(huffmanCodes, huffmanBytes); // Write ops = new FileOutputStream(desFile); ops.write(bytes); } catch (Exception e) { e.printStackTrace(); } finally { try { if (ops ! = null) { ops.close(); } if (ois ! = null) { ois.close(); } if (fis ! = null) { fis.close(); } } catch (IOException e) { e.printStackTrace(); }}}Copy the code
summary
- Not very effective for any file, if the file has a lot of duplicate elements, the compression rate is high
- The important thing is to get the mapping and the order,
Mapping of characters to degrees
->The corresponding Huffman tree
.Mapping of characters to encodings
->Huffman coding table
; In accordance with theThe original sequence is stored as a binary string
->Stored in bytes in a byte array
, will be decompressedThe key value swap of the Huffman code
->Encoding and character mapping
, in the transformed coding table,Read the corresponding characters in order of the retrieved binary string