Timsort is a practical algorithm: it adapts the merge strategy by combining insertion sort and merge sort with the characteristics of real-world data, yielding an efficient and stable algorithm. This kind of engineering thinking is worth learning from.

In addition to the applications mentioned below, Timsort has also been adopted by Chrome's V8 engine as the default Array.prototype.sort algorithm. It is also worth noting that Timsort requires up to O(n) extra memory, so the machine's memory and the data size need to be taken into account when it is actually used.

What is Timsort?

Timsort is a data sorting algorithm. It is based on the idea that real-world data sets almost always contain ordered subsequences, so its strategy is to identify these ordered runs and then combine them using merge sort and insertion sort. In terms of complexity and stability, Timsort is one of the best sorting algorithms available.

Unlike bubble sort or insertion sort, Timsort is fairly new: it was invented in 2002 by Tim Peters (and named after him). Since then it has become the standard sorting algorithm in Python, OpenJDK 7, and the Android SDK. One need only look at the comparison table on Wikipedia to see why.

Of the many algorithms in that comparison, only a handful achieve O(n log n) average and worst-case complexity, and of those only a couple are both stable and have a best case of O(n). Timsort is one of them. The algorithm is built on the observation that real-world data sets to be sorted usually contain ordered (non-descending or strictly descending) subarrays, and that is very often the case. For data with this characteristic, Timsort performs more efficiently than just about any other algorithm used in practice.

To the point

This algorithm involves no complicated mathematics. The truth is that Timsort is not an entirely original algorithm, but rather a hybrid that carefully orchestrates several well-known algorithms into an effective whole. Its operation can be summarized as follows:

  1. Split the input array into multiple subarrays using a specific rule.
  2. Sort each subarray with a simple insertion sort.
  3. Merge the sorted subarrays with a merge sort.

As with other algorithms, the subtlety of the design is hidden in the details: here, the splitting rule and the modifications to merge sort.

Algorithm

Definitions

  • N: the length of the input array
  • run: an ordered subarray, either non-descending or strictly descending, that is, “a0 ≤ a1 ≤ a2 ≤ …” or “a0 > a1 > a2 > …”
  • minrun: the minimum length of a run. Its value is computed from N by the logic below.

Step 1: Calculating minrun

The value of minrun is chosen based on N, following these principles:

  1. It should not be too large, because the subarrays will be sorted with insertion sort, which is efficient only for short arrays.
  2. It should not be too small either, or the subsequent merge sort will have to perform more merges.
  3. Ideally, N / minrun is a power of 2 (or close to one), since merge sort performs best in that case.

Here the author cites his own experiments: when minrun > 256 the first principle is violated, when minrun < 8 the second is violated, and the algorithm runs most efficiently when minrun is between 32 and 64 (inclusive). There is one exception: when N < 64, minrun = N and Timsort degenerates into a simple insertion sort. As you can see, computing minrun is straightforward: take the six most significant bits of N, and add 1 if any of the remaining bits are set. The code looks something like this:

int GetMinrun(int n)
{
    int r = 0;  /* becomes 1 if the shifted-out bits contain at least one set bit */
    while (n >= 64) {
        r |= n & 1;
        n >>= 1;
    }
    return n + r;
}

Step 2: Splitting the array into runs and sorting them

At this point you have an input array of size N and a computed minrun. The steps are as follows:

  1. Set the start of the input array as the current position.
  2. Starting from the current position, find a run (an ordered subarray). By definition this run includes the current element and some number of subsequent elements; how many depends on the data. If the run is strictly descending, convert it to non-descending (a simple in-place reversal).
  3. If the run is shorter than minrun, take the (minrun − size(run)) elements that follow it and add them to the run. The run thus ends up at least minrun long, with part of it (ideally all of it) already ordered.
  4. Sort this partially ordered run with insertion sort. Since the run is short, insertion sort is fast and efficient here.
  5. Set the current position to the element following the run.
  6. If the end of the input array has not been reached, go to step 2; otherwise this step is done.

Step 3: Merging the runs

At this stage the input array has been split into runs. If the data in the input array was random, the runs will be close to minrun in size; if the data was partially ordered (which is what the algorithm is counting on), the runs will be larger than minrun. Now these runs must be merged to obtain the final fully ordered array, and two conditions should be met along the way:

  1. You should merge runs that are close in size (this is more efficient).
  2. The stability of the algorithm needs to be maintained, that is, no useless transpositions (for example, two adjacent equal numbers should not be swapped).

This can be done in the following ways:

  1. Create an empty stack.
  2. Push the current run onto the stack.
  3. Check whether the current run should be merged with its predecessors. To do this, verify two invariants, where X, Y, and Z are the lengths of the three runs at the top of the stack (Z topmost):
  • X > Y + Z
  • Y > Z
  4. If either invariant is violated, merge run Y with the smaller of X and Z, and repeat the check until both invariants hold or the stack has too few runs left to check.
  5. If there are still runs to process, go back to step 2 with the next run. Otherwise, merge whatever remains on the stack; the step ends when all data is sorted.

The purpose of this process is to keep the sizes of the merges balanced.

As a result, the runs on the stack always have sizes well suited for the next merge. Imagine an ideal case in which the runs have sizes 128, 64, 32, 16, 8, 4, 2, and 2 (ignoring the minrun constraint for a moment). Here none of the earlier runs triggers a merge until the last two arrive, after which seven perfectly balanced merges are performed.

Merging two runs

In step 3 of the algorithm we merge two consecutive runs into one ordered run, which requires extra memory. The procedure is as follows:

  1. Create a temporary array the size of the smaller of the two runs being merged.
  2. Copy the smaller run into the temporary array (assume here that the smaller run is the left one).
  3. Mark the first element of the temporary array and of the larger run as the current positions.
  4. Compare the elements at the two current positions, move the smaller one into the target array, and advance that position.
  5. Repeat step 4 until one of the arrays is exhausted.
  6. Append all remaining elements of the unexhausted run to the end of the target array.

Note: if the smaller run is the right one, the merge proceeds symmetrically from right to left.

Changes to merge sort

In the merge procedure above, everything seems to work out nicely, except in one case. Imagine merging these two arrays:

A = {1, 2, 3, …, 9999, 10000}
B = {20000, 20001, …, 29999, 30000}

The procedure still works, but only after 10,000 comparisons and 10,000 moves. For this case Timsort provides a modification called “galloping” mode. It works as follows:

  1. The merge begins as described above.
  2. Each time an element is moved from the temporary run or the larger run to the target array, the run it came from is recorded.
  3. If some number of consecutive elements (7 in the reference implementation) come from the same run, it is assumed that the next elements will also come from that run. The merge switches into galloping mode: that run is searched with a galloping (exponential, then binary) search to locate the boundary for the comparison. Far fewer comparisons are needed than with a linear scan.
  4. Finally, when the elements in that run stop satisfying the criterion (or its end is reached), the located elements can be moved in one batch, which is more efficient than moving them one by one.

This explanation may be a little abstract, so let's look at an example:

A = {1, 2, 3, …, 9999, 10000}
B = {20000, 20001, …, 29999, 30000}

  1. In the first seven comparisons, elements 1, 2, 3, 4, 5, 6, and 7 of run A are each compared with element 20000. Since 20000 is larger every time, those elements are moved from run A to the target array.
  2. Starting with the next comparison, galloping mode is switched on: element 20000 is compared with the elements of run A at positions 8, 10, 14, 22, 38, … (the offset from the current position grows as 2^k − 1), and finally with 10000. As you can see, the number of such comparisons is far less than 10,000.
  3. When run A is exhausted, all of its elements are known to be smaller than those in run B (galloping may also stop earlier, partway through run A). The located block of run A's data is then moved to the target array in one batch.

That’s the end of the algorithm.