• Introduction to sorting algorithms
  • Exchange function
  • Optimizing simple algorithms: bubble sort, selection sort, insertion sort
    • Optimized bubble sort
    • Optimized selection sort
    • Optimized insertion sort
    • Conclusion
  • Divide-and-conquer applications: merge sort, quicksort
    • Merge sort
    • Quick sort
  • Decision tree model
    • Proving log(n!) = Θ(n log n)
    • Conclusion

Introduction to sorting algorithms

There are roughly ten common sorting algorithms, and they fall into two broad categories: comparison-based sorts and non-comparison sorts. Comparison-based sorts include bubble sort, selection sort, insertion sort, Shell sort, heap sort, merge sort and quicksort; non-comparison sorts include counting sort, radix sort and bucket sort. This article covers a selection of the comparison-based algorithms: bubble sort, selection sort, insertion sort, merge sort and quicksort. The first three mainly illustrate ideas for optimizing an algorithm, while the last two reinforce the idea of divide-and-conquer. At the end of the article we also explain why any comparison-based sort requires Ω(n log n) comparisons in the worst case.

Exchange function

Let’s start by writing a general swap function that the following algorithms will reuse:

function exchange(arr, i, j) {
  const tmp = arr[i];
  arr[i] = arr[j];
  arr[j] = tmp; 
}

Optimizing simple algorithms: bubble sort, selection sort, insertion sort

These three sorts are relatively simple and easy to come up with, so I won’t spell out the basic versions; you most likely already know how to write them. None of the three needs extra memory proportional to n during sorting, that is, they use at most a constant amount of additional memory. On the other hand, all three leave room for improvement. What follows presents optimized versions of these algorithms, mainly to illustrate some ideas of algorithm optimization.
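For reference, here is a minimal sketch of the three unoptimized versions (the basic* names are mine, for illustration; the exchange helper is repeated so the snippet is self-contained):

```javascript
function exchange(arr, i, j) {
  const tmp = arr[i];
  arr[i] = arr[j];
  arr[j] = tmp;
}

// Bubble sort: repeatedly swap adjacent out-of-order pairs
function basicBubbleSort(arr) {
  for(let i = 0; i < arr.length - 1; i++)
    for(let j = 0; j < arr.length - 1 - i; j++)
      if(arr[j] > arr[j + 1]) exchange(arr, j, j + 1);
  return arr;
}

// Selection sort: each pass selects the minimum of the unsorted part
function basicSelectionSort(arr) {
  for(let i = 0; i < arr.length - 1; i++) {
    let minIndex = i;
    for(let j = i + 1; j < arr.length; j++)
      if(arr[j] < arr[minIndex]) minIndex = j;
    if(minIndex !== i) exchange(arr, i, minIndex);
  }
  return arr;
}

// Insertion sort: grow a sorted prefix, inserting each new item into place
function basicInsertSort(arr) {
  for(let i = 1; i < arr.length; i++) {
    const tmp = arr[i];
    let j = i - 1;
    while(j >= 0 && arr[j] > tmp) {
      arr[j + 1] = arr[j];
      j--;
    }
    arr[j + 1] = tmp;
  }
  return arr;
}
```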

Optimized bubble sort

By introducing a flag isExchange, we record whether the previous pass performed any swap. If a full pass makes no swap, the array is already in order and the loop can exit early.

function BubbleSort(arr) {
  for(let i = 0; i < arr.length - 1; i++) {
    let isExchange = false;
    // Bubble the largest remaining item to the end of the unsorted part
    for(let j = 0; j < arr.length - 1 - i; j++) {
      if(arr[j] > arr[j + 1]) {
        exchange(arr, j, j + 1);
        isExchange = true;
      }
    }
    // If the last pass did not change any positions, exit the loop early
    if(!isExchange) break;
  }
  return arr;
}
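To see the early exit pay off, here is an instrumented copy of the optimized sort; the passCount counter is added purely for illustration. On an already sorted array it finishes after a single pass, instead of the n − 1 passes the unoptimized version always makes:

```javascript
// Instrumented copy of the optimized bubble sort; returns the number of passes
function bubbleSortCountingPasses(arr) {
  let passCount = 0;
  for(let i = 0; i < arr.length - 1; i++) {
    passCount++;
    let isExchange = false;
    for(let j = 0; j < arr.length - 1 - i; j++) {
      if(arr[j] > arr[j + 1]) {
        const tmp = arr[j];      // inline swap so the snippet is self-contained
        arr[j] = arr[j + 1];
        arr[j + 1] = tmp;
        isExchange = true;
      }
    }
    if(!isExchange) break; // nothing moved: already sorted
  }
  return passCount;
}
```

On a reversed array, by contrast, every pass swaps something, so all n − 1 passes still run.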

Optimized selection sort

In each pass, instead of finding only one extreme value at a time, we find both the maximum and the minimum, which roughly halves the number of passes.

function SelectionSort(arr) {
  let maxIndex, minIndex;
  for(let i = 0; i < arr.length / 2; i++) {
    maxIndex = minIndex = i;
    // Find the maximum and minimum values in the unsorted section
    for(let j = i + 1; j < arr.length - i; j++) {
      if(arr[maxIndex] < arr[j]) maxIndex = j;
      else if(arr[minIndex] > arr[j]) minIndex = j;
    }
    // Swap the minimum into the first position of the unsorted part
    if(minIndex !== i) {
      exchange(arr, i, minIndex);
      // If the maximum was at position i, it has just moved to minIndex, so repoint maxIndex
      if(maxIndex === i) maxIndex = minIndex;
    }
    // Swap the maximum into the last position of the unsorted part
    if(maxIndex !== arr.length - 1 - i) exchange(arr, arr.length - 1 - i, maxIndex);
  }
  return arr;
}

Optimized insertion sort

When choosing the insertion point, the original method scans linearly from the start every time, which is costly for large inputs. The insertion point can instead be found with binary search.

function InsertSort(arr) {
  for(let i = 1; i < arr.length; i++) {
    const tmp = arr[i];
    if(tmp < arr[i - 1]) {
      const searchIndex = BinarySearch(arr, 0, i - 1, tmp);
      // Shift every item from searchIndex onwards one position to the right
      for(let j = i; j > searchIndex; j--) {
        arr[j] = arr[j - 1];
      }
      // Insert the saved item
      arr[searchIndex] = tmp;
    }
  }
  return arr;
}
/**
 * Binary search (upper bound)
 * lowIndex  - leftmost index of the search range
 * highIndex - rightmost index of the search range
 * target    - value to be inserted
 */
function BinarySearch(arr, lowIndex, highIndex, target) {
  while(lowIndex < highIndex) {
    const halfIndex = Math.floor((highIndex + lowIndex) / 2);
    if(arr[halfIndex] > target) {
      highIndex = halfIndex;
    } else {
      lowIndex = halfIndex + 1;
    }
  }
  return lowIndex;
}
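Note that BinarySearch returns an upper bound: the first index in the searched range whose value is strictly greater than target, so equal items keep their relative order. (It relies on the tmp < arr[i - 1] guard in InsertSort; if target were no smaller than every element in the range, it would return highIndex rather than highIndex + 1.) A standalone copy to experiment with:

```javascript
// Standalone copy of the upper-bound binary search used by InsertSort
function BinarySearch(arr, lowIndex, highIndex, target) {
  while(lowIndex < highIndex) {
    const halfIndex = Math.floor((highIndex + lowIndex) / 2);
    if(arr[halfIndex] > target) {
      highIndex = halfIndex;   // arr[halfIndex] stays a valid answer
    } else {
      lowIndex = halfIndex + 1; // everything up to halfIndex is <= target
    }
  }
  return lowIndex;
}
```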

Conclusion

The optimizations above extend the original algorithms, but the time complexity is still O(n²). The improvements amount to one of two things:

  1. Better time complexity in special cases, as with the optimized bubble sort on already sorted input
  2. A smaller coefficient on the highest-order (or lower-order) terms of the running-time function T(n), as with the other two

Divide-and-conquer applications: merge sort, quicksort

Merge sort

Merge sort is an efficient sorting algorithm based on the merge operation, and a very typical application of divide-and-conquer: ordered subsequences are combined to obtain one fully ordered sequence. That is, each subsequence is put in order first, and then the sorted subsequences are merged. Merging two ordered lists into a single ordered list is called a two-way merge. — Baidu Encyclopedia

Merge sort can be divided into the following steps:

  1. Split the array into two halves, sort each half recursively, then merge the two sorted halves
  2. Merge process: compare the two halves from front to back, always appending the smaller item to the result array
  3. Repeat the above procedure recursively

function MergeSort(arr) {
  if(arr.length < 2) return arr;
  const midIndex = Math.floor(arr.length / 2);
  const left = MergeSort(arr.slice(0, midIndex));
  const right = MergeSort(arr.slice(midIndex, arr.length));
  return Merge(left, right);
}
function Merge(left, right) {
  let result = [];
  let i = 0, j = 0;
  while(i < left.length && j < right.length) {
    if(left[i] <= right[j]) // <= keeps equal items in their original order (stable)
      result.push(left[i++]);
    else
      result.push(right[j++]);
  }
  if(i === left.length) result = result.concat(right.slice(j));
  if(j === right.length) result = result.concat(left.slice(i));
  return result;
}

Quick sort

Quicksort partitions the data into two independent parts such that every element of one part is smaller than every element of the other, then quicksorts each part in the same way; the whole process proceeds recursively until the entire sequence is ordered. — Baidu Encyclopedia

Quicksort, as its name suggests, is in practice the fastest of the common comparison-based sorts: its average running time is O(n log n), growing almost linearly with the data size.

Quicksort can be divided into the following steps:

  1. Pick a pivot value from the array; here we take the middle element
  2. Put everything smaller than the pivot into a left array and everything else into a right array
  3. Apply the same procedure recursively to the left and right arrays

function QuickSort(arr) {
  if(arr.length <= 1) return arr;
  const midIndex = Math.floor(arr.length / 2);
  const left = [];
  const right = [];
  for(let i = 0; i < arr.length; i++) {
    if(i === midIndex) continue;

    if(arr[i] < arr[midIndex]) left.push(arr[i]);
    else right.push(arr[i]);
  }
  return QuickSort(left).concat([arr[midIndex]], QuickSort(right));
}
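The version above is easy to follow, but it allocates new arrays at every level of recursion. Quicksort is usually implemented in place, partitioning within the original array; here is a sketch using the Lomuto partition scheme (one common variant among several):

```javascript
// In-place quicksort using the Lomuto partition scheme (sketch)
function QuickSortInPlace(arr, lo = 0, hi = arr.length - 1) {
  if(lo >= hi) return arr;
  const pivot = arr[hi]; // Lomuto: use the last element as the pivot
  let p = lo;            // boundary of the "less than pivot" region
  for(let i = lo; i < hi; i++) {
    if(arr[i] < pivot) {
      [arr[p], arr[i]] = [arr[i], arr[p]];
      p++;
    }
  }
  [arr[p], arr[hi]] = [arr[hi], arr[p]]; // place the pivot between the two regions
  QuickSortInPlace(arr, lo, p - 1);
  QuickSortInPlace(arr, p + 1, hi);
  return arr;
}
```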

Decision tree model

Comparison sorting can be abstracted as a decision tree: a binary tree that represents every comparison a particular sorting algorithm performs on inputs of a given size. Control flow, data movement and other operations are ignored.


In a decision tree, the length of the longest simple path from the root to any reachable leaf is the number of comparisons the corresponding sorting algorithm makes in the worst case. The worst-case number of comparisons of a comparison sort therefore equals the height of its decision tree.

Why worst case? Because only in the worst case, when every pair that needs comparing actually gets compared, does the execution run from the root all the way down to a leaf. In most runs we end the comparisons early, through optimizations like the ones described above, to improve performance.

Consider a decision tree in which every permutation of the input is a reachable leaf, corresponding to a comparison sort of n elements, and let its height be h and its number of reachable leaves be l. Since all n! permutations must appear as leaves, n! <= l; and since any binary tree of height h has at most 2^h leaves (a full binary tree of height h has exactly 2^h), l <= 2^h.

So n! <= l <= 2^h

Taking the logarithm of both sides gives h >= log(n!)

And since log(n!) = Θ(n log n) (a theorem, proved in the next section)

we get h = Ω(n log n)

That is, in the worst case, any comparison sort requires Ω(n log n) comparisons.
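As a concrete check: for n = 3 there are 3! = 6 permutations, so the decision tree needs at least 6 leaves and height at least ⌈log2 6⌉ = 3; no algorithm can always sort 3 items in fewer than 3 comparisons. A small script computing this lower bound ⌈log2(n!)⌉ (floating-point log2 is accurate enough for small n):

```javascript
// Worst-case comparison lower bound for sorting n elements: ceil(log2(n!))
function comparisonLowerBound(n) {
  let logFactorial = 0;
  for(let k = 2; k <= n; k++) logFactorial += Math.log2(k); // log2(n!) as a sum
  return Math.ceil(logFactorial);
}
```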

Proving log(n!) = Θ(n log n)

log(n!) = log 1 + log 2 + ... + log n <= log n + log n + ... + log n = n log n

so log(n!) = O(n log n).

For the lower bound, keep only the larger half of the terms of n!, those from n/2 to n:

log(n!) = log 1 + log 2 + ... + log n >= log(n/2 + 1) + log(n/2 + 2) + ... + log(n - 1) + log n >= log(n/2) + log(n/2) + ... + log(n/2) = (n/2) log(n/2)

so log(n!) = Ω(n log n).

Therefore log(n!) = Θ(n log n).
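The sandwich (n/2)·log(n/2) <= log(n!) <= n·log n from the proof can also be checked numerically (a quick sanity script, not part of the proof):

```javascript
// Numerically verify (n/2)*log2(n/2) <= log2(n!) <= n*log2(n)
function checkLogFactorialBounds(n) {
  let logFactorial = 0;
  for(let k = 2; k <= n; k++) logFactorial += Math.log2(k);
  const lower = (n / 2) * Math.log2(n / 2);
  const upper = n * Math.log2(n);
  return lower <= logFactorial && logFactorial <= upper;
}
```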

Conclusion

Translating the conclusion above into plain English: for any comparison-based sorting algorithm, in the case where comparisons are not ended early (the optimized bubble sort, for example, can end early; on the decision tree model that corresponds to walking only part of a root-to-leaf path rather than its full length), at least on the order of n log n comparisons are required.

From this principle we can conclude that heap sort and merge sort are asymptotically optimal comparison sorts: both run in O(n log n) time in the best, worst and average cases, so their worst-case running time matches the Ω(n log n) lower bound. Quicksort, by contrast, degrades toward O(n²) in the worst case, when the partitions are maximally unbalanced because the chosen pivot happens to be the maximum or minimum. So why is quicksort used more often in practice? First, it can sort in place on the original array, with no extra memory overhead; second, the coefficient of its highest-order term is small (the expected number of comparisons is roughly 1.39 n log n; look it up if you are interested). In practice, production sorts often combine several algorithms: for example, V8’s older Array.prototype.sort used insertion sort for small arrays and quicksort for large ones (newer versions use TimSort), and introsort starts with quicksort and switches to heap sort once the recursion depth exceeds a limit.

That concludes this chapter. Next up: basic data structures.