0x01 Basic Concepts

A heap is a special kind of tree that meets the following conditions

  • Complete binary tree

The number of nodes in all layers is full except for the last layer, and the nodes in the last layer are concentrated in the continuous position of the left part

  • Each node in the heap must have a value greater than or equal to (or less than or equal to) the value of its left and right children

  • A heap where each node is greater than or equal to its children is called a “big top heap” and the root is the maximum

  • A heap where each node is less than or equal to its children is called a “small top heap” and the root is the minimum

Because heaps are not as strict as binary search trees. He only requires that the children of a node be larger or smaller than that node, so the same set of data can be used to build many different forms of the heap

The heap said

The heap is a full binary tree, and most of the time arrays are used to store the heap

regular

  • Par = floor((i-1)/2)

  • The left child of I is 2 times I plus 1

  • The right child of I is 2 times I plus 2

0x02 Heap operation

There are two main types of heap operations: insert and delete

Either insertion or deletion may no longer satisfy the definition of the heap:

  1. The heap is a complete binary tree

  2. Each node in the heap must be greater than or equal to (or less than or equal to) its left and right children

The adjustment required after an insert or delete operation to rematch the characteristics of the heap is called heapify

2.1 Stacking is divided into two types

  • From the bottom up (up)

  • The current element continually compares the size up to the parent node:

  • Big top heap: The current element is larger than the parent node. Swap and let the larger node go up

  • Small top heap: The current element is smaller than the parent node. Swap and let the smaller node go up

  • Go down from the top

  • The current element keeps comparing sizes down to the two child nodes

  • Big top heap: The ratio of the current element to the larger of the children, the smaller of the children, and the smaller of the children

  • Small top heap: The ratio of the current element to the smaller of the children, the larger than the children, let the larger node down

2.2 insert

Heap from the bottom up

Insert into the tail of the heap, meet the complete binary tree condition, then heap.

Heap from the bottom up

Compare the size of the newly inserted node with that of the parent node. If the newly inserted node is larger than the parent node, we swap the two nodes. This process is repeated until the heap condition is satisfied.

class Heap { var a[]; // Array, starting with index 1 var n; Var count; // Number of data stored in the Heap Heap(capacity) {a = new Array[capacity + 1]; n = capacity; count = 0; } // insert(data) {if (count >= n) return; ++count; a[count] = data; // insert const I = count; Swim (a, count, I)}} swim(a,n, I) {while (I /2 > 0 && a[I] > a[I /2]) {swap(a, I, I /2); // swap() : I = I /2; }}}Copy the code

2.3 Deleting the heaptop element

Heap from the top down

The top element stores the maximum or minimum value of data in the heap. After removing the top element of the heap, the last element can be moved to the root node to meet the condition of a complete binary tree, and then heaped

Heap from the top down

The last node is placed at the top of the heap, and the larger (big top heap) comparison is found among the children. When smaller than a child node, the two nodes are interchanged and the process is repeated. That’s how you stack it from the top down.

RemoveMax () {if (count == 0) return -1; A [1] = a[count]; // The last element moves to the root node --count; sink(a, count, 1); } sink(a, n, I) {while (true) {const maxPos = I; if (i*2 <= n && a[i] < a[i*2]) maxPos = i*2; if (i*2+1 <= n && a[maxPos] < a[i*2+1]) maxPos = i*2+1; If (maxPos == I) break; swap(a, i, maxPos); i = maxPos; }}Copy the code

2.4 Heap time complexity

The heapification process is commutated along the path of nodes, so the time complexity of heapification is proportional to the height of the tree, that is, O(logn).

2.5 Complexity of Insert and Delete Time

The primary logic for inserting data and removing heaptop elements is heapification, so the time complexity is O(logn).

0 x03 heap sort

Heap sort step

The “big top heap” is used for ascending order

The “small top heap” is used for descending order

Implementing heap sort can be broken down into two steps, build heap and sort

1, build the heap

The array is built into a heap in place. “In place” means operating on the original array without resorting to another array

Approach 1 (Process arrays from front to back)

Array elements are processed from front to back, and when data is inserted into the heap, it is heaped from bottom to top. Until the last element is processed.

Suppose that the heap initially contains only one piece of data, element 6 with index=1

  • Step1:6, there is only one element, there is no need to compare;

  • Step2: Call the insert operation mentioned above, insert element 8 with index=2 into the heap.

  • Heapified: here 8 is larger than 6 at the top of the heap. After heapified, 8 and 6 switch places

  • Step3: Insert element 3 with index=3 into the heap.

  • Heap: where 3 is less than the element 8 at index 3/2=1, there is no need to swap;

  • Step4: Insert element 10 with index=4 into the heap.

  • Heap: where 10 is greater than 6 on index 4/2=2, swap, and then swap with 8 on index=2/2=1

  • Step5:…

  • Step6: insert data from index 2 to n into the heap successively to complete the construction of the heap.

Except for the first node, all need to be heaped. That is, n minus 1 nodes are heaped

Approach 2 (process arrays from back to front)

Process arrays from back to front, using top down heap-down until the first element is processed.

For complete binary trees, all subscripts from n/2+1 to n are leaf nodes. So the first non-leaf node is n over 2

Leaf nodes are heaped down only to compare themselves, so the heap starts from the last non-leaf node index=4

  • Step1: The element 8 with index=4 is less than the child node 16, and the position is changed

  • Step2: the element 19 with index=3 is less than the child node 20, and the position is changed

  • Step3: the element 5 of index=2 is less than child node 16, and the position is changed. The element 5 is less than child node 13 again, and the position is changed

  • Step4: the element 7 of index=1 is less than child node 20, and the position is changed. The element 7 is less than child node 19 again, and the position is changed. Finish building the heap.

Nodes with subscripts from n/2 to 1 are heaped.

conclusion

  • Heap comparison method 1 n-1 nodes heap, using the array from back to front is better

Time complexity

Since leaf nodes do not need to be heaped, nodes to be heaped start from the penultimate layer. During the heaped process of each node, the number of nodes to be compared and exchanged is proportional to its height K

The heap time is order n.

2, sorting,

Sorting is based on a heap that has already been built. The first element in the array is the top of the heap, which is the largest or smallest element. We can use the idea of removing the top of the heap to sort the heap.

  • Step1: swap the top element 9 with the NTH element (index=5), where the maximum value is 5(the last bit of the array) and the element with subscript 5 is at the top of the heap

  • Heapified: 5 is less than 6, the larger of the child nodes, and switches places with 6

  • Step2: Take the top element 6 at this time and exchange it with n-1 element (index=4)

  • Heapified: 1 is less than 5, the larger of the children, and switches places with 5

  • Step3: Take the top element 5 and exchange it with n-2 elements (index=3)

  • Heapified: 3 is greater than child 1, not swapped

  • Step4: Take the top element 3 and exchange it with n-3 elements (index=2)

  • Step5: Take the top element 1 again, and find that there is no comparable child node, and the heap sort is finished

When you sort, n elements need to be heaped, and the heaped time is O(logn), so the sorting time is O(nlogn).

Sorting result

  • Big top heap: Ascending array

  • Small top heap: Descending array

Heap sort complexity

  • Time complexity

  • Heap sort consists of two operations: heap build and sort. The time complexity of heap build is O(n) and the time complexity of sort is O(nlogn). Therefore, the overall time complexity of heap sort is O(nlogn).

  • Spatial complexity

  • Sorting takes no extra space, just a temporary variable for swapping elements, so heapsort has O(1) space complexity.

0 x04 pile of application

Priority queue

A priority queue is also called a queue, but it is not the same as a normal queue. The order in which a normal queue is enqueued depends only on the order in which it is enqueued, whereas a priority queue is always enqueued according to the priority of the element itself. A priority queue is a sorted queue.

The heap is the most straightforward and efficient way to implement priority queues. Because a heap is so similar to a priority queue, a heap can be considered a priority queue. Adding an element to the priority queue is equivalent to adding an element to the heap; Fetching the highest priority element from the priority queue is equivalent to fetching the top of the heap element (big top heap — maximum; Small top heap – Minimum)

Usage scenarios

  • High performance timer

Suppose we have a timer that maintains a number of scheduled tasks, each of which has a trigger execution point.

Ordinary timer

The timer sets the polling time (say, 1 second) and scans the task to see if any task reaches the set execution time. When you get there, you take it out and execute it.

disadvantages

  • The execution time of the most recent task may be a long time away from the current time, and many of the previous scans are in vain.

  • Each poll scans the entire task list, which can be time-consuming if the task list is large.

Optimize with priority queues

  • Setp1: Stores the tasks in a priority queue according to their execution time. The head of the queue (the top of the heap) stores the tasks that are executed first, so a small top heap is created.

  • Setp2: Take the execution time of the first team task and subtract it from the current time point to obtain an interval T. Then the timer can be set to T seconds later to execute the first team task. So the timer doesn’t have to do anything between the current point in time and (t-1) seconds.

  • Setp3:T After seconds pass, the timer takes the priority of the first task in the queue and executes it without traversing the task list. Then calculate the difference between the execution time point of the new queue leader task and the current time point, and take this value as the waiting time for the next task of the timer.

The optimized

The timer does not poll every second, nor does it traverse the entire list of tasks

Let’s use the heap to find Top K

Top K: Find the number with the first K large (or the first K small) in a pile of data

Usage scenarios

  • Array [4,5,3,7,1,8], take the first three large elements

Let’s do the top of the heap

  • Setp1: The first 3 elements are in descending order, so select the small heap top. Take the first three arrays [4,5,3], build a small top heap, get [3,4,5];

  • Setp2: iterate through the group to element 7, which is larger than the top element 3, remove 3, put 7 into the heap, after heaping-small top heap becomes [4,5,7]

  • Setp3: Then iterate through the array to element 1, smaller than the top of the heap element 4, not processed

  • Setp4: then iterate through the set to element 8, which is larger than the top element 4, remove 4 and put 8 into the heap. After heapization, the small top heap becomes [5,8,7]. At this point, the traversal ends and the elements in the top heap are the first 3 elements

Time complexity

Let’s say the length of the array is, let’s take the first K element

Heap sort: time complexity O(n), the size of the heap is at most K, so one heap operation time complexity O(logK), the worst case each element needs to operate, time complexity O(nlogk)

Use heap to find the median of dynamic data set

The median is the number in the middle of an ordered array. If the number of numbers is odd, the 2nd /n+1 number is the median; If you have an even number of numbers, the median has two numbers, the 2nd /n number and the 2nd /n+1 number, either one

  • Maintain two heaps, one big top heap and one small top heap.

  • N numbers, n is even. The first N /2 data is stored in the large top heap and the last N /2 data is stored in the small top heap to ensure that the data in the small top heap is larger than the data in the large top heap.

  • N is odd. The big top heap stores N /2+1 data, and the small top heap stores N /2 data

  • When a new data is added, if the new data is less than or equal to the top element of the big top heap, the new data is inserted into the big top heap. Otherwise, the new data is inserted into the small top heap.

  • When the number of data in the two heaps does not match the agreed number, the smaller heap moves the top element to the other heap until equilibrium

  • The top element in the big top heap is the median we’re looking for

Time complexity

The time complexity of insert data is O(logn), and the median is directly taken from the top element of the large top heap, which is O(1).