This is the sixth day of my participation in the August Text Challenge.More challenges in August

Abstract

Heap sorting requires a data structure called the big top heap. The big top heap is a binary tree structure. In essence, the number of parent nodes is greater than the number of left and right child nodes. The order of size of left and right child nodes is not limited, that is, the root node is the largest value.

The idea here is to keep swapping elements from the root of the big top heap and elements from the tail of the big top heap until there are no elements to swap. We’ll talk about the logic of the big top heap later.

logic

The sequence is first sorted through the big top heap. It then keeps pulling the top element out of the heap and putting it on the tail until the big top element is empty.

process

  1. The sequence is heaped in place
  2. Repeat until the number of heap elements is 1
    1. Swap top and bottom heap elements
    2. The number of elements in the heap minus 1
    3. Perform a bottom-up filtering for position 0

The following code explains the logic of the terms in-place heap and bottom-up filter.

implementation

First, build the heap in situ. To build a heap in place, you first process the sequence according to the sorting logic of the big top heap.

The sequence logic of the big top heap is that the value of the parent node is greater than the value of its left and right children, which can be thought of as a binary tree. Here, siftDown is used for in-place sorting, and only half of the sequence is looped through. Why? This will be explored in more detail when we look at the siftDown method below.


// Build the heap in place
// Bottom-up filter
heapSize = array.length;
for (int i = (heapSize >> 1) - 1; i >= 0; i--) {
	siftDown(i);
}
Copy the code

The top and tail elements of the heap are swapped, and then the number of sequence elements to be compared is reduced by 1, and the sequences to be compared are filtered using the siftDown method, preserving the big-top heap nature of the sequences. The original exchange is then continued until the number of sequences that can be compared is 1.

while (heapSize > 1) {
	// Swap the top and tail elements
	swap(0, --heapSize);
		
	// siftDown on position 0
	siftDown(0);
}
Copy the code

Big heapsiftDownmethods

Here’s a look at siftDown.

The relationship between the parent and child nodes of a binary tree follows this formula

  • leftChilder = partner * 2 + 1
  • rightChilder = parnter * 2 + 1 + 1
  • The number of half (leaf) nodes is 1/2 of the total

The siftDown method basically puts the elements in the index position in the appropriate position. So what is the right location?

Since the parent of the big top heap is larger than the left and right children, it is good to ensure that the element at index is larger than its left and right children.

If index < half is used as a loop comparison, then the question is: why not compare when index >= half?

This will mention a very clever point, first look at the nature of the big top heap, left and right child nodes have no specific order requirements, second, the value of the child node is less than the parent node. Therefore, according to the leaf node nature of binary tree, if the index position is in the leaf node position, then it is smaller than its parent node, so there is no need to compare (this is based on the filtering process when the sequence originally corresponds to the order of the big top heap, and there is a change in the element of one position).

And that’s why in the same sort in place above, you only start at half position, because starting at this position, you’re definitely going to compare its children, filter out the big ones, and put them in the right place.

There are three clever points in the code

  1. The loop starts at half of the sequence and does not compare if the position is not in the first half, as analyzed above
  2. When comparing, the largest node comparison of its left and right children is obtained. Check to see if the right child exists while fetching the right childrightIndex<heapSize. Because the big top heap conforms to the full binary tree (try to arrange the elements as far as possible to the left subtree).
  3. It’s called a binary tree, but it has no actual nodes, it’s still a linear sequence, and it’s a formula to get the position of the left and right subtrees, so it’s a tree in mind, a tree without a tree
	
/* * filter the element at index position */
private void siftDown(int index) {
	E element = array[index];

	int half = heapSize >> 1; // Remove non-leaf nodes
	// Index of the first leaf == number of non-leaf nodes
	// Index must be a non-leaf node
	while (index < half) {
		// Index has two cases
		// there is only one left child
		// there are both left and right children
			
		// The default left child is compared to it
		int childIndex = (index << 1) + 1;
		E child = array[childIndex];
		// Right child node
		int rightIndex = childIndex + 1;
		if (rightIndex < heapSize && cmp(array[rightIndex], child) > 0) {
			child = array[ childIndex = rightIndex];
		}
			
		if (cmp(child, element) < 0) break;
			
		// Place the child node at index
		array[index] = child;
		// Reset index
		index = childIndex;
	}
	array[index] = element;
}

Copy the code

Time and space complexity

  • Best, average time complexity: O(nlogn)
  • Worst time complexity: O((nlogn)
  • Space complexity: O(1)
  • It’s an unstable sort

digression

This sort used some knowledge of binary tree and big top heap, may look down there are a lot of questions, here please first look at the officer have an impression, I will share the knowledge of binary tree, and then come back to look at heap sort, will let you think wide open.