Time complexity gap

When you get a lot of data you can see that n log of n and n^2, n^ 3,2 ^n are almost vertical. So in front of big data are you still writing a brute force algorithm for two for’s in n squared time?

Arrays.sort()

Since the next two-pointer algorithm might have to sort the data first, let’s see what’s new here.

The first layer

/** * Sorts the specified array in numeric ascending order. The sorting algorithm is a two-pivot quicksort by Vladimir Yaroslavskiy, Jon Bentley, and * Joshua Bloch. This algorithm provides O(n log(n)) performance over many data sets, * causing other quicksorts to degrade to quadratic performance, and is generally faster than traditional (uniaxial) quicksort implementations. * *@paramA The array to sort */
public static void sort(int[] a) {
    DualPivotQuicksort.sort(a, 0, a.length - 1.null.0.0);
}
Copy the code

Guys, DualPivotQuicksort, on official documentation:

Cdn.rawchen.com/2021/10/tim…

I will not give the translation, a random such as NetEase Youdao ORC translation. What are you talking about? We’ll see later.

The second floor

/** * sorts the specified range using the given array * if possible, workspace array slices are used to merge **@paramA The array to sort *@paramLeave the index of the first element (including the first element) for sorting *@paramOn the right is the index * of the last element (inclusive) to be sorted@paramWorkspace array (slice) *@paramThe starting point of the available space in the working array *@paramWorkLen The available size of the working array */
static void sort(int[] a, int left, int right,
                 int[] work, int workBase, int workLen) {
    // Use quicksort for small arrays
    if (right - left < QUICKSORT_THRESHOLD) {
        sort(a, left, right, true);
        return; }... }Copy the code

If the length of the array to be sorted is less than the constant QUICKSORT_THRESHOLD of 286, quicksort is preferred over merge sort. Then go in.

The third layer

/** * Sort a range of arrays by double-pivot quicksort. * *@paramA The array to sort *@paramLeft The index * of the first element to be sorted, including the first element@paramRight The index * of the last element (inclusive) to sort@paramLeftmost indicates whether the portion is at the far left of the range */
private static void sort(int[] a, int left, int right, boolean leftmost) {
    int length = right - left + 1;
    // Use insertion sort on tiny arrays
    if (length < INSERTION_SORT_THRESHOLD) {
        if (leftmost) {
            /* * Traditional (unmarked) insertion sort, * optimized for server virtual machines, for the leftmost part. * /
            for (int i = left, j = i; i < right; j = ++i) {
                int ai = a[i + 1];
                while (ai < a[j]) {
                    a[j + 1] = a[j];
                    if (j-- == left) {
                        break;
                    }
                }
                a[j + 1] = ai; }}else{... }... }... }Copy the code

If the length of the array to be sorted is less than this constant INSERTION_SORT_THRESHOLD is 47, insert sort is preferred over quicksort.

For array lengths ranging from 47 to 286, the array picks five pivot numbers (pivot numbers).

int[] arr = new int[200];
int length = arr.length;
int left = 0;
int right = arr.length - 1;
// Cheap approximate length / 7
int seventh = (length >> 3) + (length >> 6) + 1;

int e3 = (left + right) >>> 1; / / midpoint
int e2 = e3 - seventh;
int e1 = e2 - seventh;
int e4 = e3 + seventh;
int e5 = e4 + seventh;
Copy the code

Try the benchmark of 200 bits: 41-70-99-128-157

Hardcore sort of the five base elements using insert sort.

if (a[e2] < a[e1]) { int t = a[e2]; a[e2] = a[e1]; a[e1] = t; }
if (a[e3] < a[e2]) { int t = a[e3]; a[e3] = a[e2]; a[e2] = t;
    if(t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }}if (a[e4] < a[e3]) { int t = a[e4]; a[e4] = a[e3]; a[e3] = t;
    if (t < a[e2]) { a[e3] = a[e2]; a[e2] = t;
        if(t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }}}if (a[e5] < a[e4]) { int t = a[e5]; a[e5] = a[e4]; a[e4] = t;
    if (t < a[e3]) { a[e4] = a[e3]; a[e3] = t;
        if (t < a[e2]) { a[e3] = a[e2]; a[e2] = t;
            if(t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }}}}Copy the code

Quicksort algorithm

Let’s review our quicksort algorithm before we move on.

public static void quickSort(int[] arr, int low, int high) {
	int l, h, temp, t;
	if (low > high) return;
	l = low;
	h = high;
	// Temp is the base bit, the first digit of the array
	temp = arr[l];
	while (l < h) {
		// Look to the right first, descending to the left
		while (temp <= arr[h] && l < h) h--;
		// Look at the left side, and increase in order to the right
		while (temp >= arr[l] && l < h) l++;
		// Swap if the condition is met
		if(l < h) { t = arr[h]; arr[h] = arr[l]; arr[l] = t; }}// Finally the benchmark is the swap of numbers equal to I and j
	arr[low] = arr[l];
	arr[l] = temp;
	// Recursively call the left half array
	quickSort(arr, low, h - 1);
	// Recursively call the right half of the array
	quickSort(arr, h + 1, high);
}
Copy the code

Quicksort separates the records to be sorted into two independent parts through a sort. One part is smaller than the benchmark, and the other part is larger than the benchmark. Then the two parts are sorted in this way, and the whole sequence is finally ordered through recursive operation. Use dichotomy and divide-and-conquer. Compared to a regular bubble, it jumps and compares and swaps.

Double pivot quicksort

And then we go back to the quicksort algorithm, and then we go to the double-pivot quicksort. The five datum bits take the second and fourth as pivots, while the ordinary quickrow is a datum.

/* * Use the second and fourth of the five sorted elements as pivots. * These values are cheap approximations of the first and second and third cycles of the array. Note that pivot1 <= pivot2. * /
int pivot1 = a[e2];
int pivot2 = a[e4];
Copy the code
/* * The first and last elements to be sorted are moved to positions previously occupied by pivots. * After partitioning is complete, the pivot is swapped back to its final position and excluded from subsequent sorting. * /
a[e2] = a[left];
a[e4] = a[right];
Copy the code

So you can see that the two pivots, the two nodes, divide the data into three parts, which are recursed separately, as shown in the figure below.

/* * Partitioning: * * left part center part right part * +--------------------------------------------------------------+ * | < pivot1 | pivot1 <= && <= pivot2 | ? | > pivot2 | * + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + * ^ ^ ^ * | | | * great less k * * invariance: * * all in (left, less) < pivot1 * pivot1 <= all in [less, k) <= pivot2 * all in (great, right) > pivot2 * * Pointer k is the first index of ?-part. */
Copy the code

Divided into three parts: leftPart 😡 < P1, centerPart: P1 <=x<=p2, rightPart :x>p2 p1

According to the figure, less points to the next position < P1, and great points to the previous position > P2. K is the current position of traversal. P1 <= x <= p2, the following three situations occur for the current k position of the variable:

  • Ak < p1 swap the number of k and less positions, less++
  • Ak < p1 && ak > p2

So you move a[k] to the right.

while (a[great] > pivot2) {
    if (great-- == k) {
        breakouter; }}Copy the code

Then judge: A[great] < pivot1. If true, the number of great positions should be in the part < P1.

if (a[great] < pivot1) {  // a[great] <= pivot2
    a[k] = a[less];       // less is placed in position k. The number of elements in position K is stored in ak
    a[less] = a[great];   // Place great in less
    ++less;               / / update the less
} else {                  // pivot1 <= a[great] <= pivot2
    a[k] = a[great];
}
Copy the code

The last

/* * a[I] = b; I --;" Instead of "A [I --] = b;" Due to performance issues. * /
a[great] = ak; // ak goes to great
--great;
Copy the code

The above process looks more complicated, in fact, is a process of switching numbers. If: A[great] < pivot1, the position of A[less] should be in the middle part, which can be placed at the position of K. A[Great] should be placed at the position of LESS, and ak should be placed at the corresponding position of Great. A[k] = A[great]; b [k] = A[great]; A[great] = ak

After the end: form the top form, and recurse the three parts separately.

DualQuickSort:

public class DualQuickSort {
	public void dualQuickSort(int[] arr, int left, int right) {
		if (left >= right) return;
		if (arr[left] > arr[right]) {
			swap(arr, left, right);
		}
		int less = left;
		int great = right;
		int pivot1 = arr[left];
		int pivot2 = arr[right];
		while (arr[++less] < pivot1) ;
		while (arr[--great] > pivot2) ;

		/* * Partition: * * left part center part right part * +--------------------------------------------------------------+ * | < pivot1 | pivot1 <= && <= pivot2 | ? | > pivot2 | * + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + * ^ ^ ^ * | | | * great * less k invariance: * all in (left, less) < pivot1 * pivot1 <= all in [less, k) <= pivot2 * all in (great, right) > pivot2 * * Pointer k is the first index of ?-part. */
		outer:
		for (int k = less - 1; ++k <= great; ) {
			int ak = arr[k];
			if (ak < pivot1) { // ak is less than p1
				swap(arr, k, less); / / exchange
				less++;
			} else if (ak > pivot2) { // ak > p2
				while (arr[great] > pivot2) { // Find the position that does not satisfy the condition
					if (great-- == k) {
						System.out.println("outer");
						breakouter; }}if (arr[great] < pivot1) { // arr[great] <= pivot1,

					arr[k] = arr[less];  // less is placed in position k. The number of elements in position K is stored in ak
					arr[less] = arr[great]; // Place great in less
					++less;  / / update the less
				} else { // pivot1 <= arr[great] <= pivot2
					arr[k] = arr[great];
				}
				/* * a[I] = b; I --;" Instead of "A [I --] = b;" Due to performance issues. * /
				arr[great] = ak; // ak goes to great
				--great;
			} // The other case is the middle position, do not worry about
		}

		System.out.println("left: " + left + " less: " + less + " great: " + great + " right: " + right);
		for (int t : arr) {
			System.out.print(t + "\t");
		}
		System.out.println("\n");

		dualQuickSort(arr, left, less - 1);
		dualQuickSort(arr, less, great);
		dualQuickSort(arr, great + 1, right);
	}

	public void swap(int[] arr, int i, int j) {
		int t = arr[i];
		arr[i] = arr[j];
		arr[j] = t;
	}

	public static void main(String[] args) {
		int[] arr = new int[] {13.3.65.97.76.10.35.71.5.7.3.27.49};
		for (int t : arr) {
			System.out.print(t + "\t");
		}
		System.out.println("\n");

		int l = 0;
		int r = arr.length - 1;
		new DualQuickSort().dualQuickSort(arr, l, r);
		for (int t : arr) {
			System.out.print(t + "\t"); }}}Copy the code

If the array is in the range 47-286, it enters the Merge Sort. If the array is in the range 47-286, it enters the Merge Sort. If the array is in the range 47-286, it enters the Merge Sort.

/* * Index run[I] is the start of the i-th run (ascending or descending). * /
int[] run = new int[MAX_RUN_COUNT + 1];
int count = 0; run[0] = left;

// Check if the array is close to sort
for (int k = left; k < right; run[count] = k) {
    if (a[k] < a[k + 1]) { // ascending
        while (++k <= right && a[k - 1] <= a[k]);
    } else if (a[k] > a[k + 1]) { // descending
        while (++k <= right && a[k - 1] >= a[k]);
        for (int lo = run[count] - 1, hi = k; ++lo < --hi; ) {
            intt = a[lo]; a[lo] = a[hi]; a[hi] = t; }}else { // equal
        for (int m = MAX_RUN_LENGTH; ++k <= right && a[k - 1] == a[k]; ) {
            if (--m == 0) {
                sort(a, left, right, true);
                return; }}}/* * Arrays are not highly structured, use quicksort instead of merge sort. * /
    if (++count == MAX_RUN_COUNT) {
        sort(a, left, right, true);
        return; }}Copy the code

See if the array has any structure: the actual logic is grouping sort, each descending order into one group, like 1,9,8,7,6,8. 9 to 6 are in descending order, as a group, and then put the descending group into ascending order: 1,6,7,8,9,8. And then we’re going to keep looking after the last eight. For each of these descending groups, ++count is used to quickly sort the array if there are at least 67 descending groups in the array. Otherwise, there are not many descending groups in the array.

Triplet quicksort

There is also a faster C/C++ library based Triple State QuickSort

Cdn.rawchen.com/2021/10/tim…

conclusion

The average time complexity and the worst time complexity in fast sorting are O(nlog(n)) and O(n^2) respectively. Ordinary bubble sort is O(n^2). Double-pivot Quicksort provides O (nlog(n)) performance on many datasets, causing other quicksorts to degrade to quadratic performance, and is generally faster than traditional (single-pivot) Quicksort.

reference

What sort algorithm does Java arrays.sort () use