Array.prototype.sort() uses array.prototype.sort ()

Array.prototype.sort()

The sort() method sorts the elements of an array using an in-place algorithm and returns the array. The default sort order is built when converting elements to strings and then comparing their UTF-16 code unit value sequences

– MDN

const array = [1.30.4.21.100000];
array.sort();
console.log(array);
// [1, 100000, 21, 30, 4]

const numbers = [4.2.5.1.3];
numbers.sort((a, b) = > a - b);
console.log(numbers)
// [1, 2, 3, 4, 5]
Copy the code

V8 kind of Array. The prototype. The sort ()

The ES specification does not specify an algorithm for array.prototype.sort (). In V8, prior to version 7.0, array.prototype.sort () used insertion sort when the Array length was less than 10, otherwise it used quicksort.

Quicksort was abandoned after V8 7.0 because it was not a stable sorting algorithm and in the worst case time complexity was degraded to O(n2).

So a hybrid sorting algorithm called TimSort is adopted.

This algorithm was originally used in Python, and it is not strictly one of the above 10 sorting algorithms, but a hybrid sorting algorithm:

Insert sort is used in small subarrays, and then merge sort is used to merge the ordered subarrays, with O(nlogn) time complexity.

What is TimSort?

Before addressing the V8 Sort source code, let’s look at how TimSort is implemented to help us read the source code

Timsort was created specifically for Python by Tim Peter in 2001 and is based on the large number of ordered elements that exist in real data sets (without reordering). Timsort iterates through all the data, finds all the ordered partitions (Runs) in the data, and then merges them into one according to certain rules.

The specific process is:

  • Scan the array and look for what’s called_runs_A run can be considered a small sorted array, as well as a small sorted array, because these arrays can be simply reversed to become a run
  • Determine the minimum run length. The smaller runs are merged by insertion sort with the larger runs
  • Repeatedly merge some adjacent runs to avoid merging fragments with very different lengths until the whole sort is complete

How do you avoid merging with very different lengths run? In Timsort, there is A stack that records the starting index position and length of each run, and then pushes the runs onto the stack, such as the lengths of A, B, and C at the top of the stack

  • |C| > |B| + |A|
  • |B| > |A|

In the above example, because | A | > | | B, so B be merged into it before and after the two runs (A, C), the smaller of A | A |, then | A | | | C again. Following this rule, it is possible to merge runs of the same size to improve performance. Note that Timsort is a stable sort so only adjacent runs can merge.

So, sorted arrays will be sorted in O(n) time, because such arrays will produce only a single run and no merge operations will be required. The worst case is order n log n. Such algorithm performance parameters, along with Timsort’s inherent stability, are several reasons why we ultimately chose Timsort over Quicksort.

Write an array.prototype.sort () implementation

Now that we know the basic idea of Timsort and the sorting process, let’s write a simple version of Timsort:

// Merge two small arrays left and right into result
function merge(left, right) {
  let result = [],
      ileft = 0,
      iright = 0
  while(ileft < left.length && iright < right.length) {
    if(left[ileft] < right[iright]){
      result.push(left[ileft ++])
    } else {
      result.push(right[iright ++])
    }
  }
  while(ileft < left.length) {
    result.push(left[ileft ++])
  }
  while(iright < right.length) {
    result.push(right[iright ++])
  }
  return result
}

// Insert sort
function insertionSort(arr) {
    let n = arr.length;
    let preIndex, current;
    for (let i = 1; i < n; i++) {
        preIndex = i - 1;
        current = arr[i];
        while (preIndex >= 0 && arr[preIndex] > current) {
            arr[preIndex + 1] = arr[preIndex];
            preIndex--;
        }
        arr[preIndex + 1] = current;
    }
    return arr;
}

// timsort
function timsort(arr) {
    // An empty array or an array length less than 2 is returned
    if(! arr || arr.length <2) return arr
    let runs = [], 
        sortedRuns = [],
        newRun = [arr[0]],
        length = arr.length
    // Divide the run area and store it in the runs area, which is simply divided in ascending order and does not consider descending runs
    for(let i = 1; i < length; i++) {
        if(arr[i] < arr[i - 1]) {
            runs.push(newRun)
            newRun = [arr[i]]
        } else {
            newRun.push(arr[i])
        }
        if(length - 1 === i) {
            runs.push(newRun)
            break}}// Since it is only an ascending run, there is no need to use insertionSort to sort the runs themselves
    for(let run of runs) {
        insertionSort(run)
    }
    / / merge runs
    sortedRuns = []
    for(let run of runs) {
        sortedRuns = merge(sortedRuns, run)
    }
    return sortedRuns
}

/ / test
var numbers = [4.2.5.1.3]
timsort(numbers)
// [1, 2, 3, 4, 5]
Copy the code

A simple, complete implementation can be seen in the V8 array-sort implementation, which we’ll take a look at below

Array.prototype.sort() in v8

That is, TimSort is implemented in V8, and the specific implementation steps are as follows:

  1. If the array size is less than 2, return the array size
  2. Start cycle
  3. Find an ordered subarray called “run” with a currentRunLength
  4. Calculate the minimum combined sequence length minRunLength (this value dynamically varies from 32 to 64 depending on the array length)
  5. CurrentRunLength = minRunLength; currentRunLength = minRunLength; minRunLength = minRunLength; Push run into the stack pendingRuns
  6. PendingRuns ensures that any three consecutive runs (run0, run1, run2) in the stack meet the requirements of run0 > run1 + run2 && run1 > run2
  7. If the remaining subarray is 0, end the loop
  8. Merge all runs in the stack, and the sort is finished

Core source code interpretation

Here are three core functions:

  • ComputeMinRunLength: used to calculateminRunLength
  • CountAndMakeRun: Calculate the first onerunThe length of the
  • MergeCollapseAdjustment:pendingRunsTo make the stack length greater than3When allrunAll meetrun[n]>run[n+1]+run[n+2]run[n+1]>run2[n+2]
// Calculate the minimum combined sequence length minRunLength
macro ComputeMinRunLength(nArg: Smi): Smi {
  let n: Smi = nArg;
  let r: Smi = 0;  // Becomes 1 if any 1 bits are shifted off.

  assert(n >= 0);
  // If less than 64, return n (too small to disturb strange things)
  Otherwise, divide by 2 to get a result between 32 and 64
  while (n >= 64) {
    r = r | (n & 1);
    n = n >> 1;
  }

  const minRunLength: Smi = n + r;
  assert(nArg < 64| | -32 <= minRunLength && minRunLength <= 64));
  return minRunLength;
}
Copy the code
// Calculate the length of the first run
macro CountAndMakeRun(implicit context: Context, sortState: SortState)(
    lowArg: Smi, high: Smi): Smi {
  assert(lowArg < high);
  // This is where the array data is passed in
  const workArray = sortState.workArray;

  const low: Smi = lowArg + 1;
  if (low == high) return 1;

  let runLength: Smi = 2;

  const elementLow = UnsafeCast<JSAny>(workArray.objects[low]);
  const elementLowPred = UnsafeCast<JSAny>(workArray.objects[low - 1]);
  // Call the compare function to compare data
  let order = sortState.Compare(elementLow, elementLowPred);

  // TODO(szuend): Replace with "order < 0" once Torque supports it.
  // Currently the operator<(Number, Number) has return type
  // 'never' and uses two labels to branch.
  const isDescending: bool = order < 0 ? true : false;

  let previousElement: JSAny = elementLow;
  // Iterate over the subarray and calculate the length of run
  for (let idx: Smi = low + 1; idx < high; ++idx) {
    const currentElement = UnsafeCast<JSAny>(workArray.objects[idx]);
    order = sortState.Compare(currentElement, previousElement);

    if (isDescending) {
      if (order >= 0) break;
    } else {
      if (order < 0) break;
    }

    previousElement = currentElement;
    ++runLength;
  }

  if (isDescending) {
    ReverseRange(workArray, lowArg, lowArg + runLength);
  }

  return runLength;
}
Copy the code
PendingRuns: Run [n]>run[n+1]+run[n+2] and run[n+1]>run2[n+2]
transitioning macro MergeCollapse(context: Context, sortState: SortState) {
  const pendingRuns: FixedArray = sortState.pendingRuns;

  // Reload the stack size because MergeAt might change it.
  while (GetPendingRunsSize(sortState) > 1) {
    let n: Smi = GetPendingRunsSize(sortState) - 2;

    if(! RunInvariantEstablished(pendingRuns, n +1) | |! RunInvariantEstablished(pendingRuns, n)) {if (GetPendingRunLength(pendingRuns, n - 1) <
          GetPendingRunLength(pendingRuns, n + 1)) {
        --n;
      }

      MergeAt(n); // Merge the NTH run with the NTH +1 run
    } else if (
        GetPendingRunLength(pendingRuns, n) <=
        GetPendingRunLength(pendingRuns, n + 1)) {
      MergeAt(n); // Merge the NTH run with the NTH +1 run
    } else {
      break; }}}Copy the code

Three minutes a day