Summary of Insertion Sort and Heap Sort Algorithms


Review of Basic Knowledge

  • Stable: if a = b and a is in front of b before sorting, a is still in front of b after sorting
  • Unstable: if a = b and a is in front of b before sorting, a may come after b after sorting
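To make this concrete, here is a small sketch using made-up sample records (the keys and names are invented for illustration). JavaScript's built-in `Array.prototype.sort` has been required to be stable since ES2019, so records with equal keys keep their original order:

```javascript
// Records 'a' and 'b' share key 2, with 'a' appearing first.
const records = [
  { key: 2, name: 'a' },
  { key: 1, name: 'x' },
  { key: 2, name: 'b' },
];

// A stable sort must keep 'a' before 'b' because their keys are equal.
const stableSorted = [...records].sort((p, q) => p.key - q.key);

console.log(stableSorted.map(r => r.name).join('')); // "xab"
```

An unstable algorithm (such as heap sort, covered below) may emit 'b' before 'a' instead.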

Insertion Sort

  • The basic idea

    First, we divide the array into two intervals: a sorted interval and an unsorted interval. Initially, the sorted interval contains only the first element of the array. We then repeatedly take an element from the unsorted interval, find the appropriate position for it in the sorted interval, and insert it there, so that the sorted interval always stays in order. This process repeats until the unsorted interval is empty, at which point the algorithm ends.

  • Dynamic graph display

    From the diagram above, we can easily see the idea behind insertion sort. It is a bit like picking up playing cards one at a time and inserting each card into its proper place among the cards already in your hand. Assume the first number is already in order; inserting the second number into the sorted interval in front of it makes those two numbers ordered. In general, if the first n-1 numbers are sorted, inserting the nth number into that sorted interval makes all n numbers the sorted interval.

  • Code implementation

    /**
     * Insertion sort idea
     * Step 1: find the position where the element should be inserted, then insert it
     * Step 2: move elements larger than the current element one position to the right
     * @param {*} arr
     */
    const insertSort = arr => {
      const len = arr.length;
      for (let i = 1; i < len; i++) {
        let preIndex = i - 1;
        let current = arr[i];
        while (preIndex >= 0 && arr[preIndex] > current) {
          arr[preIndex + 1] = arr[preIndex];
          preIndex--;
        }
        arr[preIndex + 1] = current;
      }
      return arr;
    };
  • Code reading

    First we record len, the length of the array, and then run the outer for loop. Why does i start at 1? Because by default the 0th element counts as already sorted, so we only need to start from the second element and compare it against the sorted interval. Should the condition instead be i <= len? No. Why not?

    Because we are looping over the array's indices, the last element is at index len - 1; index len is out of bounds, and that is a point to watch out for. Inside the loop we define two variables: preIndex, the index of the element just before the current loop element, and current, the current loop element itself.

    preIndex: used to walk left through the sorted interval, shifting each element larger than current one position to the right

    current: once all the right shifts are complete, the current loop element is inserted at the position the walk ended on

    Looking inside the while loop: the first condition is preIndex >= 0, which makes sense because we never compare against a negative index. Then, if the previous element is larger than the current element, we shift it one position to the right. We don't have to worry about overwriting the current element, since we saved it in the current variable. preIndex keeps moving left, comparing each element with current, looking for an element smaller than current; if none is found, current must be placed at the head of the sorted interval.

    After the while loop ends, the current element is inserted at position preIndex + 1. Why preIndex + 1? Because the loop stops when preIndex points at the first element (scanning right to left) that is not larger than current, so current must go just after preIndex, that is, at preIndex + 1.
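To tie the walkthrough together, here is a sketch that restates the insertSort function from above with a per-pass trace added (the trace log and the sample input are my additions):

```javascript
// insertSort restated so the snippet runs on its own, with a trace log.
const insertSort = arr => {
  const len = arr.length;
  for (let i = 1; i < len; i++) {
    let preIndex = i - 1;
    const current = arr[i];
    while (preIndex >= 0 && arr[preIndex] > current) {
      arr[preIndex + 1] = arr[preIndex]; // shift the larger element right
      preIndex--;
    }
    arr[preIndex + 1] = current; // insert just after the first non-larger element
    console.log(`after pass i=${i}:`, arr.join(','));
  }
  return arr;
};

const result = insertSort([4, 2, 5, 1]);
// passes: 2,4,5,1 → 2,4,5,1 → 1,2,4,5
console.log(result); // [1, 2, 4, 5]
```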

  • The code analysis

    • Spatial complexity analysis

      As we can see from the code above, no additional storage is required during the running of the program, so the space complexity is O(1).

    • Is it a stable sorting algorithm

      Yes! Think about it: in the implementation, we compare the current element against previous elements using arr[preIndex] > current, which means that as soon as arr[preIndex] === current we stop comparing and insert the current loop element after the equal element. Equal values therefore keep their original relative order, so insertion sort is a stable sorting algorithm.

    • Time complexity analysis

      If the array is already sorted, we never need to move anything; we only compare, scanning once from i = 1 to the end of the array, so the time complexity is O(n).

      If the array is in reverse order, each insertion is equivalent to inserting new data at the first position of the array, so a large amount of data must be moved, and the worst-case time complexity is O(n²).

      Since the average time to insert an element into an array is O(n), and insertion sort performs n such insertions into the sorted interval, the average time complexity is O(n²).
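A hedged sketch that makes these cases measurable: the same insertion sort, instrumented to count right-shifts (the counter is my addition and is not in the original code):

```javascript
// Counts how many element shifts insertion sort performs on a given input.
function insertSortCountingShifts(arr) {
  let shifts = 0;
  for (let i = 1; i < arr.length; i++) {
    let preIndex = i - 1;
    const current = arr[i];
    while (preIndex >= 0 && arr[preIndex] > current) {
      arr[preIndex + 1] = arr[preIndex]; // one right-shift
      shifts++;
      preIndex--;
    }
    arr[preIndex + 1] = current;
  }
  return shifts;
}

console.log(insertSortCountingShifts([1, 2, 3, 4, 5])); // 0  — sorted input, O(n)
console.log(insertSortCountingShifts([5, 4, 3, 2, 1])); // 10 — reversed input, n(n-1)/2 shifts
```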

Heap Sort

  • The basic concept

    Condition 1: The heap is actually a complete binary tree

    Condition 2: The value of every node in the heap must be greater than or equal to (or less than or equal to) the value of every node in its subtree

    Complete binary tree definition: every level is completely filled except possibly the last, and the nodes of the last level are packed to the left

    Looking at the picture below, can you tell which ones are heaps?

    We can easily see that 1 and 2 are max-heaps (big-top heaps), 3 is a min-heap (small-top heap), and 4 is not a heap at all

  • What data structure should we use to implement the heap

    The answer is an array, because an array saves space (no pointers are needed), and we can quickly find a node's left child, right child, and parent from its array subscript

    We can look at the following example

    We can see that the node with subscript i has its left child at subscript i * 2, its right child at i * 2 + 1, and its parent at Math.floor(i / 2) (the figure numbers subscripts from 1)
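As a sketch of this index arithmetic: the figure numbers subscripts from 1 (left = 2i, right = 2i + 1, parent = floor(i / 2)), while the implementation later in this article stores the heap 0-based, where each formula shifts by one, as below:

```javascript
// 0-based heap index helpers (the 1-based figure formulas, shifted by one).
const leftChild = i => 2 * i + 1;
const rightChild = i => 2 * i + 2;
const parent = i => Math.floor((i - 1) / 2);

console.log(leftChild(0), rightChild(0)); // 1 2  — the root's children
console.log(parent(4)); // 1  — the node at index 4 hangs under index 1
```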

  • The basic idea

    First, build the data into a heap. Then swap the first element (the heap top) with the last element; each such swap moves the largest element to the end of the heap, and the heap size shrinks by 1. Next, re-heapify the remaining n - 1 elements into a max-heap (or min-heap), swap the top and end again, and repeat this process until the heap size is 1, at which point the algorithm ends.

  • Dynamic graph display

    There is no better illustration of heap sort than this one: take an unsorted array, build it into a real heap structure, and then sort it by repeatedly swapping the top element with the last element of the heap.

  • Code implementation

    /**
     * Heap sort idea
     * Step 1: build a max-heap from H[0...n-1];
     * Step 2: swap the heap top (maximum) with the heap end;
     * Step 3: reduce the heap size by 1 and call heapify to sink the new top element to its proper position;
     * Step 4: repeat Step 2 until the heap size is 1.
     */

    let len = 0;

    function buildMaxHeap(arr) {
      len = arr.length;
      for (let i = Math.floor(len / 2) - 1; i >= 0; i--) {
        // Build the max-heap from the last non-leaf node up
        heapify(arr, i);
      }
    }

    function heapify(arr, i) {
      // Adjust the heap
      let left = 2 * i + 1;
      let right = 2 * i + 2;
      let largest = i;

      if (left < len && arr[left] > arr[largest]) {
        largest = left;
      }

      if (right < len && arr[right] > arr[largest]) {
        largest = right;
      }

      if (largest !== i) {
        [arr[i], arr[largest]] = [arr[largest], arr[i]];
        // Continue sinking the smaller element into the subtree
        heapify(arr, largest);
      }
    }

    const heapSort = arr => {
      buildMaxHeap(arr);
      for (let i = arr.length - 1; i > 0; i--) {
        // First swap the top element with the current last element of the array
        [arr[0], arr[i]] = [arr[i], arr[0]];
        // Then the remaining portion that needs re-heapifying shrinks by 1
        len--;
        // Rebuild the max-heap
        heapify(arr, 0);
      }
      return arr;
    };
  • Code reading

    buildMaxHeap() builds a max-heap. Then we loop, swapping the top element with the current last element of the array, shrinking the portion that still needs re-heapifying, and rebuilding the max-heap with heapify. After the loop has walked the entire array, the array is sorted.

    Let's start with how the max-heap is initialized. In buildMaxHeap we first record the length of arr and assign it to the global variable len. Then we loop from the middle of the array toward its head, calling heapify each time. Why start from the middle? Because leaf nodes have no children, heapifying them would only compare them with themselves, which is pointless; all elements to the right of the middle of the array are leaves, so we simply start from the last non-leaf node and heapify toward the root.

    The current array is not yet a heap, so it must be adjusted until it satisfies the definition of a heap; this process is called heapification.

    Let’s look at the following process

    Math.floor(len / 2) - 1: the subscripts in the figure above start at 1, but the code indexes from 0, so the last non-leaf element sits at index Math.floor(len / 2) - 1. With 9 elements this is Math.floor(9 / 2) - 1 = 3, which corresponds to the node 8 in the figure; every element after it in the array is a leaf. Now let's see what the heapify method called in the loop does.
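A trivial sanity check of that starting index, using the 9-element example:

```javascript
// Last non-leaf index in a 0-based heap of 9 elements.
const len = 9;
const lastNonLeaf = Math.floor(len / 2) - 1;
console.log(lastNonLeaf); // 3 — indices 4..8 are all leaves
```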

    Here is the key adjustment: by comparing parent and child nodes, the largest element keeps bubbling up toward the heap top. Because the code uses 0-based subscripts, the left child's index is 2 * i + 1 and the right child's is 2 * i + 2. A temporary variable, largest, records the index of the largest among the current parent and its children, and after a series of comparisons the larger element bubbles up, as shown in the figure below.

    Following the earlier figure: on the left we start with 8 and 16; once parent and children satisfy the heap condition, buildMaxHeap moves on and swaps 20 with 19, and then 16 with 5. After that last swap, however, the subtree 5, 13, 8 no longer satisfies the heap condition, so we keep adjusting and swap 5 with 13. That is what the line heapify(arr, largest) is for: it keeps sinking the smaller element down until the subtree is a heap again. Finally, at the top, 7 and 20 are swapped, and because the subtree below then fails the heap condition, 7 and 19 are swapped as well.

    That loop establishes the heap property and bubbles the largest element to the top. After that, all we need to do is repeatedly swap the heap top with the heap end, decrement the global variable len by one, and re-heapify the top. Why decrement len? Because each round places the current largest element at the end of the array; the elements already placed there are sorted and must not be touched again, so the heap we adjust shrinks by one each time. This also answers the earlier question of why len is a global variable: it is shared between buildMaxHeap, heapify, and the sorting loop. Finally, heapify is called again to re-adjust the heap whose top element was just swapped in.

    for (let i = arr.length - 1; i > 0; i--): why i > 0 rather than i >= 0? Because when i = 1, the element at the heap top (index 0) and the element at index 1 are swapped, so by the time i would reach 0 the array is already completely ordered and there is nothing left to swap.

  • The code analysis

    • Spatial complexity analysis

      Throughout the heap sort process, only a constant amount of temporary storage is required, so heap sort's space complexity is O(1).

    • Is it a stable sorting algorithm

      Unstable. Why? Because during sorting the element at the heap end is swapped with the element at the heap top, two equal elements can have their original relative order changed: one of them may rise to the heap top and be moved to the end of the array before the other, even though it originally came first.
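To see the instability concretely, here is a sketch that applies the same heap sort logic to tagged records (the records, the tags, and the key-comparing variant of heapify are my own illustration, not from the article):

```javascript
// Heap sort over records, comparing only the numeric key `k`.
let len = 0;

function heapify(arr, i) {
  const left = 2 * i + 1;
  const right = 2 * i + 2;
  let largest = i;
  if (left < len && arr[left].k > arr[largest].k) largest = left;
  if (right < len && arr[right].k > arr[largest].k) largest = right;
  if (largest !== i) {
    [arr[i], arr[largest]] = [arr[largest], arr[i]];
    heapify(arr, largest);
  }
}

function heapSortByKey(arr) {
  len = arr.length;
  for (let i = Math.floor(len / 2) - 1; i >= 0; i--) heapify(arr, i);
  for (let i = arr.length - 1; i > 0; i--) {
    [arr[0], arr[i]] = [arr[i], arr[0]];
    len--;
    heapify(arr, 0);
  }
  return arr;
}

// 'a' precedes 'b' and both have key 2.
const input = [{ k: 2, tag: 'a' }, { k: 2, tag: 'b' }, { k: 1, tag: 'c' }];
console.log(heapSortByKey(input).map(r => r.tag).join('')); // "cba"
```

The equal-keyed records come out with 'b' before 'a', reversing their original relative order, which a stable algorithm would never do.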

    • Time complexity analysis

      The time complexity of buildMaxHeap is O(n)

      Since leaf nodes do not need to be heapified, heapification starts from the second-to-last level. During heapification, the number of comparisons and swaps a node needs is proportional to its height k: the greater the height, the more comparisons. So we sum over the heights of all non-leaf nodes.


      S1 = 2^0 * h + 2^1 * (h - 1) + 2^2 * (h - 2) + ... + 2^k * (h - k) + ... + 2^{h-1} * 1

      Let’s multiply both sides of this equation by 2


      2S1 = 2^1 * h + 2^2 * (h - 1) + 2^3 * (h - 2) + ... + 2^k * (h - k + 1) + ... + 2^{h-1} * 2 + 2^h * 1

      Subtracting the first equation from the second gives


      S1 = 2S1 - S1 = -h + 2^1 + 2^2 + ... + 2^{h-1} + 2^h

      And then we can figure out


      S1 = 2^{h+1} - h - 2

      And what is h? That's easy: for a complete binary tree with n nodes, h = log₂n. Substituting it in, S1 = 2n - log₂n - 2, so the time to build the heap is O(n).

      And then we look at the time of sorting

      In the sorting phase, the outer loop runs n - 1 times, and each iteration calls heapify once at a cost of O(log n), so the sorting time is O(n log n). Adding the O(n) heap-building step, which is dominated, the overall time complexity of heap sort is O(n log n).

  • Conclusion

    If you are not familiar with calculating time complexity, you can refer to the article on complexity analysis

    I hope that, having read this far, you take away not just two algorithm implementations, but also the approach to solving the problem and some of my techniques for analyzing it

    Of course, these techniques cannot be fully mastered in one or two readings. Study and explore them repeatedly, and dare to raise your own questions; don't approach algorithms with a vague, hand-wavy attitude. There are many boundary conditions worth extra attention, and it is often those details that make you a stronger programmer
