Sorting

Generally speaking, the purpose of sorting is to make lookups fast

How to measure the quality of a sorting algorithm

  1. Time complexity: the number of key comparisons and record moves
  2. Space complexity: how much auxiliary memory the sorting algorithm needs
  3. Stability: if a is in front of b and a == b, a is still in front of b after sorting

For example, a stable sort of [2, 3, 2, 1, 5] (where the two 2s are distinct records with equal keys) yields [1, 2, 2, 3, 5] with the two 2s in their original relative order.

Classification of sorting algorithms

  1. Internal sorting: the entire sorting process needs no external storage (such as disk); all sorting operations are completed in memory
  2. External sorting: the amount of data involved in the sort is so large that it cannot all be loaded into memory at once, so external storage (such as disk) must be used; the most common external sort is the multi-way merge sort, which can be seen as consisting of multiple internal sorts

Five properties of an algorithm

  1. Input: zero or more inputs, which must be clearly described and defined
  2. Output: at least one output result
  3. Finiteness: the algorithm terminates after a finite number of steps rather than looping forever, and each step completes in an acceptable amount of time
  4. Determinism: each step of the algorithm has a definite meaning, with no ambiguity
  5. Feasibility: each step of the algorithm is clear and feasible, such that the answer could in principle be worked out with pen and paper

Common sorting algorithms and their complexity

(Figure: classification and time complexity of the common sorting algorithms)
Comparative analysis of algorithms

  1. In terms of average time, quicksort is the best, but in the worst case its time performance is inferior to heap sort and merge sort
  2. In terms of algorithmic simplicity: selection sort, insertion sort, and bubble sort are relatively simple and are regarded as simple sorts; Shell sort, heap sort, quicksort, and merge sort are more complex and are regarded as complex sorts
  3. In terms of stability: insertion sort, bubble sort, and merge sort are stable; selection sort, quicksort, Shell sort, and heap sort are unstable
  4. In terms of the number n of records to be sorted: when n is small, a simple sort should be used; when n is large, quicksort should be used

Algorithm selection

  1. If n is small (e.g., n ≤ 50), insertion sort or selection sort can be used; when the records themselves are small, insertion sort is better, otherwise selection sort is better because it moves fewer records than insertion sort
  2. If the initial state of the file is basically ordered (ascending), insertion sort, bubble sort, or quicksort should be chosen
  3. If n is large, an O(n log n) algorithm should be used: quicksort, heap sort, or merge sort

Implementation of common algorithms

Bubble Sort

Compare adjacent elements in turn; whenever a pair is in the wrong order, swap them, so that the larger element is swapped toward the end

Animated demonstration

Code implementation

public static int[] maoPaoSort() {
	// The end conditions of the two for loops matter:
	// each outer pass shortens the range that still needs comparing
	int[] arrMaoPao = new int[] { 46, 89, 23, 45, 12, 99, 55, 44, 43 };

	for (int i = 0; i < arrMaoPao.length - 1; i++) {
		for (int j = 0; j < arrMaoPao.length - i - 1; j++) {
			if (arrMaoPao[j] > arrMaoPao[j + 1]) {
				int tempNum = arrMaoPao[j + 1];
				arrMaoPao[j + 1] = arrMaoPao[j];
				arrMaoPao[j] = tempNum;
			}
		}
	}
	return arrMaoPao;
}


Selection Sort

It works by first finding the smallest (or largest) element in the unsorted sequence and placing it at the start of the sorted sequence, then repeatedly finding the smallest (or largest) element among the remaining unsorted elements and appending it to the end of the sorted sequence, and so on until all elements are sorted

  • Initial state: the unordered region is R[1..n], and the ordered region is empty
  • On the i-th pass (i = 1, 2, …, n-1), the current ordered and unordered regions are R[1..i-1] and R[i..n]. This pass selects the record R[k] with the smallest key from the current unordered region and swaps it with R[i], the first record of that region, so that R[1..i] and R[i+1..n] become the new ordered region (one record longer) and the new unordered region (one record shorter)
  • After n-1 passes, the array is sorted

Selection sort is one of the most consistent sorting algorithms: whatever data goes in, it takes O(n²) time, so keep the data size small when using it. Its only advantage may be that it uses no extra memory. Arguably it is also the sorting method ordinary people would most likely think of.

Animated demonstration

Code implementation

public static int[] selectionSort() {
	int[] arrSelect = new int[] { 46, 89, 23, 45, 12, 99, 55, 44, 43 };

	for (int i = 0; i < arrSelect.length - 1; i++) {
		int minIndex = i;
		for (int j = i + 1; j < arrSelect.length; j++) {
			if (arrSelect[j] < arrSelect[minIndex]) { // Find the smallest number
				minIndex = j; // Save the index of the smallest number
			}
		}
		int temp = arrSelect[i];
		arrSelect[i] = arrSelect[minIndex];
		arrSelect[minIndex] = temp;
	}
	return arrSelect;
}


Insertion Sort

Insertion sort is a simple and intuitive sorting algorithm. It works by building an ordered sequence and, for each unsorted element, scanning the sorted sequence from back to front to find the appropriate position and insert it.

Insertion sort is usually implemented as an in-place sort (i.e., it needs only O(1) extra space). During the back-to-front scan, the sorted elements are therefore repeatedly shifted one position backward to make room for the new element.

In general, insertion sort is implemented in place on arrays. The specific algorithm is described as follows:

  • Start with the first element, which can be considered sorted
  • Take the next element and scan the sorted sequence from back to front
  • If the (sorted) element is larger than the new element, move it one position back
  • Repeat step 3 until a sorted element less than or equal to the new element is found
  • Insert the new element after that position
  • Repeat steps 2–5

Animated demonstration

Code implementation

public static int[] insertSort() {
	int[] arrInsert = new int[] { 46, 89, 23, 45, 12, 99, 55, 44, 43 };

	for (int i = 1; i < arrInsert.length; i++) {
		int preIndex = i - 1;
		int current = arrInsert[i];
		// Shift sorted elements one step back until the slot for current is found
		while (preIndex >= 0 && arrInsert[preIndex] > current) {
			arrInsert[preIndex + 1] = arrInsert[preIndex];
			preIndex--;
		}
		arrInsert[preIndex + 1] = current;
	}
	return arrInsert;
}


Shell Sort

Invented by Donald Shell in 1959, Shell sort was the first algorithm to break through the O(n²) barrier. It is an improved version of simple insertion sort; it differs from insertion sort in that it compares distant elements first. Shell sort is also called diminishing increment sort.

The core of Shell sort is the choice of the gap (increment) sequence. The gap sequence can be fixed in advance or defined dynamically. A well-known algorithm for dynamically defining the gap sequence was proposed by Robert Sedgewick, co-author of Algorithms (4th Edition).

First, the whole sequence of records to be sorted is split into several subsequences, each sorted by direct insertion sort. The specific algorithm is described as follows:

  • Choose an increment sequence t1, t2, …, tk, where ti > tj for i < j and tk = 1
  • Sort the sequence k times, once for each increment in the sequence
  • In each pass, split the sequence to be sorted into subsequences according to the current increment ti, and sort each subsequence by direct insertion sort. Only when the increment factor is 1 is the whole sequence treated as a single table, whose length is the length of the whole sequence

Animated demonstration

Code implementation

public static void shellSort(int[] arrays) {
    // Halve the increment on every pass
    for (int step = arrays.length / 2; step > 0; step /= 2) {

        // Insertion sort within each increment group, starting from index `step`
        for (int i = step; i < arrays.length; i++) {

            int j = i;
            int temp = arrays[j];

            // j - step is the neighbouring element within the same group
            while (j - step >= 0 && arrays[j - step] > temp) {
                arrays[j] = arrays[j - step];
                j = j - step;
            }
            arrays[j] = temp;
        }
    }
}


Merge Sort

Merge sort is an efficient sorting algorithm based on the merge operation and a very typical application of divide and conquer: first make each subsequence ordered, then merge the ordered subsequences to obtain a fully ordered sequence. Merging two ordered lists into one ordered list is called a 2-way merge.

Merge sort is a stable sorting method. Like selection sort, its performance is independent of the input data, but it performs much better, because it always runs in O(n log n) time. The price it pays is extra memory.

  • Split the input sequence of length n into two subsequences of length n/2
  • Apply merge sort to each of the two subsequences
  • Merge the two sorted subsequences into the final sorted sequence

Animated demonstration
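
Following the three steps above, here is a minimal top-down Java sketch in the style of the earlier snippets (the method names mergeSort and merge are my own, for illustration):

public static int[] mergeSort(int[] arr) {
    if (arr.length < 2) {
        return arr; // A sequence of length 0 or 1 is already sorted
    }
    int mid = arr.length / 2;
    // Sort each half recursively
    int[] left = mergeSort(java.util.Arrays.copyOfRange(arr, 0, mid));
    int[] right = mergeSort(java.util.Arrays.copyOfRange(arr, mid, arr.length));
    return merge(left, right);
}

private static int[] merge(int[] left, int[] right) {
    int[] result = new int[left.length + right.length];
    int i = 0, j = 0, k = 0;
    // Take the smaller head element each time; <= keeps equal keys in order (stable)
    while (i < left.length && j < right.length) {
        result[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
    }
    while (i < left.length) result[k++] = left[i++];   // Copy the rest of the left half
    while (j < right.length) result[k++] = right[j++]; // Copy the rest of the right half
    return result;
}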

Quick Sort

Quicksort was invented by Turing Award winner Tony Hoare and is listed among the top ten algorithms of the 20th century. So far it is, on average, the fastest of all internal sorting algorithms. It can be seen as an upgraded version of bubble sort: its average time complexity is O(n log n), and in practice it is usually noticeably faster than the other O(n log n) algorithms, so it is used very often. Quicksort applies the idea of divide and conquer.

The basic idea of quicksort is that one partitioning pass splits the records to be sorted into two independent parts, such that the keys of one part are all smaller than the keys of the other; the two parts can then be sorted separately, making the whole sequence ordered.

Quicksort uses divide-and-conquer to divide a list into two sub-lists. The specific algorithm is described as follows:

  • Pick an element from the sequence, called the pivot
  • Reorder the array so that all elements smaller than the pivot come before it and all elements larger than the pivot come after it (equal elements can go to either side). After this partition the pivot is in its final position. This is called the partition operation
  • Recursively sort the subsequence of elements smaller than the pivot and the subsequence of elements greater than the pivot

Pivot selection methods:

  1. Fixed pivot
  2. Random pivot
  3. Median-of-three

Animated demonstration

Code implementation

public class ArraySort {

    private static void printArr(int[] arr) {
        for (int anArr : arr) {
            System.out.print(anArr + " "); // Print with a space separator
        }
    }

    private static int partition(int[] arr, int left, int right) {
        int temp = arr[left];
        while (right > left) {
            // Scan from the right for an element smaller than the pivot (temp)
            while (temp <= arr[right] && left < right) {
                --right;
            }
            // arr[right] is smaller than the pivot, so it fills the pit on the left
            if (left < right) {
                arr[left] = arr[right];
                ++left;
            }
            // Now the pit is at arr[right]; scan from the left for an element greater than the pivot
            while (temp >= arr[left] && left < right) {
                ++left;
            }
            if (left < right) {
                arr[right] = arr[left];
                --right;
            }
        }
        arr[left] = temp;
        return left;
    }

    private static void quickSort(int[] arr, int left, int right) {
        if (arr == null || left >= right || arr.length <= 1)
            return;
        int mid = partition(arr, left, right);
        quickSort(arr, left, mid - 1); // The pivot at index mid is already in its final position
        quickSort(arr, mid + 1, right);
    }


    public static void main(String[] args) {
        int[] arr = {6, 4, 3, 2, 7, 9, 1, 8, 5};
        quickSort(arr, 0, arr.length - 1);
        printArr(arr);
    }
}


Heap Sort

Heapsort is a sorting algorithm designed around the heap data structure. A heap is a nearly complete binary tree that satisfies the heap property: the key of each child node is always smaller than (or always larger than) that of its parent node.

  • Build the initial key sequence to be sorted (R1, R2, …, Rn) into a max-heap; this heap is the initial unordered region
  • Swap the top element R[1] with the last element R[n]; this yields a new unordered region (R1, R2, …, Rn-1) and a new ordered region (Rn), with R[1..n-1] <= R[n]
  • Because the new top R[1] may violate the heap property after the swap, re-adjust the current unordered region (R1, R2, …, Rn-1) into a heap, then swap R[1] with the last element of the unordered region again, yielding a new unordered region (R1, R2, …, Rn-2) and a new ordered region (Rn-1, Rn). Repeat this process until the ordered region contains n-1 elements; the whole sort is then complete

Animated demonstration
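
A compact Java sketch of this procedure, using a max-heap for ascending order (the method names heapSort and siftDown are my own):

public static void heapSort(int[] arr) {
    int n = arr.length;
    // Build the initial max-heap: sift down every non-leaf node, bottom-up
    for (int i = n / 2 - 1; i >= 0; i--) {
        siftDown(arr, i, n);
    }
    // Repeatedly swap the heap top into the ordered region and re-heapify the rest
    for (int end = n - 1; end > 0; end--) {
        int temp = arr[0];
        arr[0] = arr[end];
        arr[end] = temp;
        siftDown(arr, 0, end);
    }
}

private static void siftDown(int[] arr, int parent, int size) {
    int largest = parent;
    int left = 2 * parent + 1;
    int right = 2 * parent + 2;
    if (left < size && arr[left] > arr[largest]) largest = left;
    if (right < size && arr[right] > arr[largest]) largest = right;
    if (largest != parent) {
        int temp = arr[parent];
        arr[parent] = arr[largest];
        arr[largest] = temp;
        siftDown(arr, largest, size); // Continue sifting down the affected subtree
    }
}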

Counting Sort

Counting sort is not a comparison-based sorting algorithm. Its core idea is to convert the input values into keys stored in extra array space. As a linear-time sort, counting sort requires the input to be integers within a definite range.

Counting sort is a stable sorting algorithm. When the input consists of n integers between 0 and k, its time complexity is O(n+k) and its space complexity is also O(n+k), which is faster than any comparison-based sort. Counting sort is very efficient when k is not very large and the values are relatively concentrated.

  • Find the largest and smallest elements in the array to be sorted
  • Count the number of occurrences of each value i in the array and store it in entry i of a count array C
  • Accumulate the counts (starting from the first entry of C, add each entry to the previous one)
  • Fill the target array in reverse: place each element i at position C(i) of the new array, subtracting 1 from C(i) for each element placed

Animated demonstration
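
A Java sketch of these four steps (the method name countingSort is my own; offsetting by the minimum is an added assumption so the value range need not start at 0):

public static int[] countingSort(int[] arr) {
    if (arr.length == 0) return arr;
    // Step 1: find the smallest and largest elements
    int min = arr[0], max = arr[0];
    for (int v : arr) {
        if (v < min) min = v;
        if (v > max) max = v;
    }
    // Step 2: count how many times each value occurs (offset by min)
    int[] count = new int[max - min + 1];
    for (int v : arr) {
        count[v - min]++;
    }
    // Step 3: accumulate the counts; count[i] now holds the number of elements <= i + min
    for (int i = 1; i < count.length; i++) {
        count[i] += count[i - 1];
    }
    // Step 4: fill the target array in reverse so equal keys keep their order (stable)
    int[] result = new int[arr.length];
    for (int i = arr.length - 1; i >= 0; i--) {
        result[--count[arr[i] - min]] = arr[i];
    }
    return result;
}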

Bucket Sort

Bucket sort is an upgraded version of counting sort. It makes use of a mapping function, and the key to its efficiency lies in choosing that mapping function well. Bucket sort works by distributing the data into a finite number of buckets, assuming the input is evenly distributed, and then sorting each bucket separately (possibly with another sorting algorithm, or by applying bucket sort recursively).

In the best case, bucket sort runs in linear O(n) time. Its overall time complexity depends on the complexity of sorting the data within the buckets, since every other part costs O(n). Clearly, the smaller the buckets, the less data each one holds and the less time in-bucket sorting takes, but the space consumption rises correspondingly.

  • Set up an array of empty buckets
  • Iterate over the input data, placing each item into its corresponding bucket
  • Sort each bucket that is not empty
  • Splice the sorted data from the non-empty buckets back together

Image demonstration
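
A Java sketch of these steps, assuming the java.util collection classes (List, ArrayList, Collections) are imported; the parameter bucketSize, which fixes the value range covered by each bucket, is my own addition:

public static void bucketSort(int[] arr, int bucketSize) {
    if (arr.length < 2) return;
    // Find the value range of the input
    int min = arr[0], max = arr[0];
    for (int v : arr) {
        if (v < min) min = v;
        if (v > max) max = v;
    }
    // Step 1: set up the empty buckets; each covers bucketSize consecutive values
    int bucketCount = (max - min) / bucketSize + 1;
    List<List<Integer>> buckets = new ArrayList<>(bucketCount);
    for (int i = 0; i < bucketCount; i++) {
        buckets.add(new ArrayList<>());
    }
    // Step 2: map each element into its bucket
    for (int v : arr) {
        buckets.get((v - min) / bucketSize).add(v);
    }
    // Steps 3 and 4: sort each non-empty bucket and splice the results back in order
    int index = 0;
    for (List<Integer> bucket : buckets) {
        Collections.sort(bucket); // Any sorting algorithm can be used inside a bucket
        for (int v : bucket) {
            arr[index++] = v;
        }
    }
}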

Radix Sort

Radix sort sorts by the least significant digit first and collects the results; then sorts by the next digit and collects again; and so on, up to the most significant digit. Sometimes the attributes have priorities: sort first by the low-priority key, then by the high-priority key. The final order is: records with a higher high-priority key come first, and records with equal high-priority keys are ordered by the low-priority key.

Radix sort distributes and collects digit by digit, so it is stable. However, its performance is slightly worse than bucket sort's: each distribution pass over the keys takes O(n), and collecting the new sequence after distribution also takes O(n). If the data to be sorted can be split into d keys, the time complexity of radix sort is O(d × 2n); of course, d is usually much smaller than n, so radix sort is basically linear.

The space complexity of radix sort is O(n + k), where k is the number of buckets. Generally n ≫ k, so roughly n extra slots are needed.

  • Get the largest number in the array and its number of digits
  • Take each digit of the original array arr, starting from the lowest digit, to form the radix array
  • Apply counting sort to the radix array (taking advantage of counting sort's efficiency on small ranges)

Image demonstration
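
A Java sketch of these steps, as a least-significant-digit radix sort with one stable counting-sort pass per decimal digit (the method name radixSort is my own; it assumes non-negative integers):

public static void radixSort(int[] arr) {
    if (arr.length < 2) return;
    // Step 1: find the maximum, which determines the number of digit passes
    int max = arr[0];
    for (int v : arr) {
        if (v > max) max = v;
    }
    int[] output = new int[arr.length];
    // One counting-sort pass per decimal digit, lowest digit first
    for (int exp = 1; max / exp > 0; exp *= 10) {
        int[] count = new int[10]; // Ten buckets, one per digit 0..9
        for (int v : arr) {
            count[(v / exp) % 10]++;
        }
        for (int i = 1; i < 10; i++) {
            count[i] += count[i - 1];
        }
        // Fill from the back so equal digits keep their order (stability)
        for (int i = arr.length - 1; i >= 0; i--) {
            output[--count[(arr[i] / exp) % 10]] = arr[i];
        }
        System.arraycopy(output, 0, arr, 0, arr.length);
    }
}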