Author: Guo Yaohua

cnblogs.com/guoyaohua/p/8600214.html

In recent days, I studied sorting algorithm and read many blogs. I found that some articles on the Internet did not explain sorting algorithm thoroughly and many codes were wrong. For example, in some articles, collection.sort () function was directly used to sort each bucket in the “bucket sorting” algorithm. But not for algorithm research.

So I read these days according to the article, sorted out a more complete sorting algorithm summary, all the algorithms in this article have JAVA implementation, after I debug without error before issued, if there is any error, please point out your predecessors.

0, sorting algorithm description

0.1 Definition of sorting

Sort a sequence of objects by a keyword.

0.2 Glossary

  • Stability: if a is in front of B, and a= B, a is still in front of B after sorting;

  • Unstable: If a is in front of B and a= B, a may appear behind B after sorting;

  • Internal sort: All sort operations are done in memory;

  • External sort: Because the data is too large, the data is placed on disk, and the sort can only be carried out by disk and memory data transfer;

  • Time complexity: The amount of time an algorithm takes to execute.

  • Space complexity: The amount of memory required to run a program.

Reference: Stable and unstable sort

0.3 Algorithm Summary

Picture noun explanation:

  • N: Data scale

  • K: Number of buckets

  • In-place: occupies constant memory and no extra memory

  • Out-place: takes up extra memory

0.4 Algorithm Classification

0.5 The difference between comparative and non-comparative

Common quick sort, merge sort, heap sort, bubble sort and so on belong to comparison sort. In the end result of sorting, the order between elements depends on how they are compared. Each number must be compared with the others to determine its position.

In a sort like bubble sort, the problem size is N, and the average time complexity is O(n²) because you need to compare n times. In merge sort, quicksort and so on, the problem size is reduced to logN times by divide-and-conquer, so the average time complexity is O(nlogn).

The advantage of comparative sorting is that it is applicable to all sizes of data, regardless of the distribution of data, can be sorted. You can say that comparison sort applies to everything that needs sorting.

Counting sort, radix sort and bucket sort belong to non-comparison sort. Non-comparative sorting is sorting by determining how many elements should precede each element. The position of arr[I] in the sorted array is uniquely determined by counting the number of elements before arr[I].

Non-comparison sort can be solved only by determining the number of existing elements before each element. The algorithm time complexity O(n).

The time complexity of non-comparative sort is low, but since non-comparative sort needs to take up space to determine the unique position. Therefore, there are certain requirements for data size and data distribution.

1, Bubble Sort

Bubble sort is a simple sorting algorithm. It repeatedly walks through the sequence to be sorted, comparing two elements at a time and swapping them if they are in the wrong order. The task of visiting an array is repeated until no more exchanges are needed, that is, the array has been sorted.

The algorithm gets its name because smaller elements slowly “float” to the top of the list through swapping. Bubble sort introduction: Bubble sort

1.1 Algorithm Description

  • Compare adjacent elements. If the first one is larger than the second, swap them both;

  • Do the same for each pair of adjacent elements, starting with the first pair and ending with the last pair, so that the last element should be the largest;

  • Repeat for all elements except the last one;

  • Repeat steps 1 to 3 until the sorting is complete.

1.2 GIF demonstration

1.3 Code Implementation

/** * bubble sort ** @param array * @return
     */
    public static int[] bubbleSort(int[] array) {
        if (array.length == 0)
            return array;
        for (int i = 0; i < array.length; i++)
            for (int j = 0; j < array.length - 1 - i; j++)
                if (array[j + 1] < array[j]) {
                    int temp = array[j + 1];
                    array[j + 1] = array[j];
                    array[j] = temp;
                }
        return array;
    }Copy the code

1.4 Algorithm Analysis

Best case: T(n) = O(n) Worst case: T(n) = O(n2) Average case: T(n) = O(n2)

2, Selection Sort

One of the most stable sorting algorithms, because whatever data goes in is O(n2) time, so when you use it, the smaller the data, the better. The only benefit might be that it doesn’t take up any extra memory. Theoretically speaking, the choice sort may also be the usual sort ordinary people think of the most sorting method.

Selection-sort is a simple and intuitive sorting algorithm. It works by first finding the smallest (largest) element in the unsorted sequence and storing it at the beginning of the sorted sequence, then continuing to find the smallest (largest) element from the remaining unsorted elements and placing it at the end of the sorted sequence. And so on until all the elements are sorted. Selection sort introduction: Selection sort

2.1 Algorithm Description

Direct selection sort of N records can get ordered results after n-1 direct selection sort. The specific algorithm is described as follows:

  • Initial state: the disordered region is R[1..n], and the ordered region is empty.

  • Th order (I =1,2,3… N-1), the current ordered region and disordered region are R[1..i-1] and R(I.. N). This sorting selects the record R[k] with the smallest keyword from the current unordered region and exchanges it with the first record R in the unordered region, so that R[1.. I] and R[I +1..n) become the new ordered region with 1 more records and the new unordered region with 1 less records, respectively.

  • N minus one is done, and the array is sorted.

2.2 GIF demonstration

2.3 Code Implementation

/** * select sort * @param array * @return
     */
    public static int[] selectionSort(int[] array) {
        if (array.length == 0)
            return array;
        for (int i = 0; i < array.length; i++) {
            int minIndex = i;
            for (int j = i; j < array.length; j++) {
                if(array[j] < array[minIndex]) // Select minIndex = j; } int temp = array[minIndex]; array[minIndex] = array[i]; array[i] = temp; }return array;
    }Copy the code

2.4 Algorithm Analysis

Best case: T(n) = O(n2) Worst case: T(n) = O(n2) Average case: T(n) = O(n2)

1, Insertion Sort

Insertion Sort algorithm is a simple and intuitive sorting algorithm. It works by building an ordered sequence, scanning back to front in the sorted sequence for unsorted data, finding the appropriate location and inserting it. In the implementation of insertion sort, in-place sort is usually adopted (that is, the sort only needs to use O(1) extra space). Therefore, in the process of scanning from back to front, the sorted elements need to be moved backward step by step repeatedly to provide insertion space for the latest elements. Insert sort: Directly insert sort

3.1 Algorithm Description

In general, insert sorts are implemented on arrays using in-place. The specific algorithm is described as follows:

  • Starting with the first element, the element can be considered sorted;

  • Fetch the next element and scan back to front in the sorted element sequence;

  • If the element (sorted) is larger than the new element, move the element to the next position;

  • Repeat step 3 until you find a place where the sorted element is less than or equal to the new element;

  • After inserting a new element into that position;

  • Repeat steps 2 to 5.

3.2 GIF demonstration

3.2 Code Implementation

/** * insert sort * @param array * @return
     */
    public static int[] insertionSort(int[] array) {
        if (array.length == 0)
            return array;
        int current;
        for (int i = 0; i < array.length - 1; i++) {
            current = array[i + 1];
            int preIndex = i;
            while (preIndex >= 0 && current < array[preIndex]) {
                array[preIndex + 1] = array[preIndex];
                preIndex--;
            }
            array[preIndex + 1] = current;
        }
        return array;
    }Copy the code

3.4 Algorithm Analysis

Best case: T(n) = O(n) Worst case: T(n) = O(n2) Average case: T(n) = O(n2)

4, Shell Sort

Hill sort is a sort algorithm proposed by Donald Shell in 1959. Hill sort is also an insertion sort, which is a more efficient version of simple insertion sort, also known as reduced incremental sort, and was one of the first algorithms to break out of O(n2). It differs from insertion sort in that it compares distant elements first. Hill sort is also called reduced increment sort. Hill sort: Hill sort

Hill sort is to group the records in a certain increment of the following table, and sort each group using the direct insert sort algorithm.
As the increments decrease, each group contains more and more keywords, and when the increments decrease to 1, the entire file is exactly grouped and the algorithm terminates.

4.1 Algorithm Description

Let’s take a look at the basic steps of Hill sorting. Here we select the increments gap=length/2, and continue to narrow the increments by gap= gap/2. The increments can be represented by a sequence {n/2,(n/2)/2… 1}, called an incremental sequence. It is a mathematical problem to select and prove the delta sequence of Hill sort. The delta sequence we choose is more commonly used and also the delta suggested by Hill, called Hill delta, but in fact this delta sequence is not optimal. Here we do an example using Hill increments.

Firstly, the whole record sequence to be sorted is divided into several sub-sequences for direct insertion sorting. The specific algorithm is described as follows:

  • Select a delta sequence T1, T2… , where Ti >tj, tk=1;

  • Sort the sequence k times according to the number of incremental sequences k;

  • In each sort, the sequence to be sorted is divided into several sub-sequences of length M according to the corresponding increment TI, and each sub-table is sorted by direct insertion. Only when the increment factor is 1, the whole sequence is treated as a table, and the length of the table is the length of the whole sequence.

4.2 Process Demonstration

4.3 Code Implementation

/** * @param array * @return
     */
    public static int[] ShellSort(int[] array) {
        int len = array.length;
        int temp, gap = len / 2;
        while (gap > 0) {
            for (int i = gap; i < len; i++) {
                temp = array[i];
                int preIndex = i - gap;
                while (preIndex >= 0 && array[preIndex] > temp) {
                    array[preIndex + gap] = array[preIndex];
                    preIndex -= gap;
                }
                array[preIndex + gap] = temp;
            }
            gap /= 2;
        }
        return array;
    }Copy the code

4.4 Algorithm Analysis

Best case: T(n) =O(nlog2n) Worst case: T(n) =O(nlog2n) Average case: T(n) =O(nlog2n)

5, Merge Sort (Merge Sort)

Like selection sort, merge sort performs independently of the input data, but performs much better than selection sort because it is always O(n log n) time. The trade-off is extra memory. Merge sort: Merge sort

Merge sort is an efficient sorting algorithm based on merge operation.
This algorithm is a very typical application of Divide and Conquer.
Merge sort is a stable sorting method.
The ordered subsequences are combined to obtain a fully ordered sequence.
That is, each subsequence is ordered first, and then the subsequence segments are ordered.
If two ordered tables are joined into one ordered table, it is called 2-way merge.

5.1 Algorithm Description

  • The input sequence of length N is divided into two subsequences of length N /2.

  • Merge sort is used for these two subsequences respectively.

  • Combine two sorted subsequences into a final sorted sequence.

5.2 GIF demonstration

5.3 Code Implementation

/** * merge sort ** @param array * @return
     */
    public static int[] MergeSort(int[] array) {
        if (array.length < 2) return array;
        int mid = array.length / 2;
        int[] left = Arrays.copyOfRange(array, 0, mid);
        int[] right = Arrays.copyOfRange(array, mid, array.length);
        returnmerge(MergeSort(left), MergeSort(right)); } /** * merge sort -- combine two sorted arrays into a sorted array ** @param left * @param right * @return
     */
    public static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        for (int index = 0, i = 0, j = 0; index < result.length; index++) {
            if (i >= left.length)
                result[index] = right[j++];
            else if (j >= right.length)
                result[index] = left[i++];
            else if (left[i] > right[j])
                result[index] = right[j++];
            else
                result[index] = left[i++];
        }
        return result;
    }Copy the code

5.4 Algorithm Analysis

Best case: T(n) = O(n) Worst case: T(n) = O(nlogn) Average case: T(n) = O(nlogn)

6. Quick Sort

The basic idea of quicksort is that the records to be sorted are separated into two independent parts through a sort. If the keywords of one part of the records are smaller than those of the other part, the two parts of the records can be sorted separately to achieve the order of the whole sequence. Quicksort: Quicksort

6.1 Algorithm Description

Quicksort uses divide-and-conquer to divide a list into two sub-lists. The specific algorithm is described as follows:

  • Pick an element from the sequence, called pivot;

  • Reorder the array so that all elements smaller than the base value are placed in front of the base value and all elements larger than the base value are placed behind the base value (the same number can go to either side). After the partition exits, the benchmark is in the middle of the array. This is called a partition operation;

  • Recursively sorts subsequences of elements less than a reference and subsequences of elements greater than a reference.

6.2 GIF demonstration

6.3 Code Implementation

/** * quicksort * @param array * @param start * @param end * @return
     */
    public static int[] QuickSort(int[] array, int start, int end) {
        if (array.length < 1 || start < 0 || end >= array.length || start > end) return null;
        int smallIndex = partition(array, start, end);
        if (smallIndex > start)
            QuickSort(array, start, smallIndex - 1);
        if (smallIndex < end)
            QuickSort(array, smallIndex + 1, end);
        returnarray; } /** * quicksort algorithm -- partition * @param array * @param start * @param end * @return
     */
    public static int partition(int[] array, int start, int end) {
        int pivot = (int) (start + Math.random() * (end - start + 1));
        int smallIndex = start - 1;
        swap(array, pivot, end);
        for (int i = start; i <= end; i++)
            if (array[i] <= array[end]) {
                smallIndex++;
                if (i > smallIndex)
                    swap(array, i, smallIndex);
            }
        returnsmallIndex; } @param array * @param I * @param j */ public static void swap(int[] array, int I, int j) { int temp = array[i]; array[i] = array[j]; array[j] = temp; }Copy the code

6.4 Algorithm Analysis

Best case: T(n) = O(nlogn) Worst case: T(n) = O(n2) Average case: T(n) = O(nlogn)

7. Heap Sort

Heapsort refers to a sort algorithm designed by using heap data structure. Heap is a nearly complete binary tree structure, and also satisfies the property of heap: that is, the key value or index of the child node is always smaller (or larger) than its parent node. Heap sort: Heap sort

7.1 Algorithm Description

  • The initial sequence of keywords to be sorted (R1,R2… .rn) is constructed into the big top heap, which is the initial unordered region;

  • By exchanging the top element R[1] with the last element R[n], a new unordered region (R1,R2… Rn-1) and a new ordered region (Rn), and R[1,2… n-1]<=R[n];

  • Since the new heap top R[1] after the swap may violate the heap properties, it is necessary to apply the current unordered region (R1,R2… Rn-1) adjusts to the new heap, and then swaps R[1] again with the last element of the unordered region to obtain the new unordered region (R1,R2… .Rn-2) and the new ordered region (Rn-1,Rn). This process is repeated until the number of elements in the ordered region is N-1, and the whole sorting process is complete.

7.2 GIF demonstration

7.3 Code Implementation

Note that some of the properties of complete binary trees are used here:

http://www.cnblogs.com/guoyaohua/p/8595289.html

// Declare a global variable to record the length of the array; static int len; /** * heapsort algorithm ** @param array * @return
     */
    public static int[] HeapSort(int[] array) {
        len = array.length;
        if (len < 1) returnarray; //1. Build a maximum buildMaxHeap(array); //2. The loop swaps the first (maximum) bit of the heap with the last, and then readjust the maximum heapwhile (len > 0) {
            swap(array, 0, len - 1);
            len--;
            adjustHeap(array, 0);
        }
        returnarray; Public static void buildMaxHeap(int[] array) {// Build a heap from the last non-leaf nodefor(int i = (len/2 - 1); i >= 0; I = (len/2-1) adjustHeap(array, I); I = (len/2-1) adjustHeap(array, I); Public static void adjustHeap(int[] array, int I) {int maxIndex = I;  // If there is a left subtree and the left subtree is larger than the parent, then the maximum pointer is to the left subtreeif(i * 2 < len && array[i * 2] > array[maxIndex]) maxIndex = i * 2; // If there is a right subtree and the right subtree is larger than the parent, then the maximum pointer is to the right subtreeif(i * 2 + 1 < len && array[i * 2 + 1] > array[maxIndex]) maxIndex = i * 2 + 1; // If the parent is not the maximum, swap the parent with the maximum and recursively adjust the swap position with the parent.if (maxIndex != i) {
            swap(array, maxIndex, i);
            adjustHeap(array, maxIndex);
        }
    }Copy the code

7.4 Algorithm Analysis

Best case: T(n) = O(nlogn) Worst case: T(n) = O(nlogn) Average case: T(n) = O(nlogn)

8, Counting Sort

The core of counting sort is to convert input data values into keys and store them in extra array space. As a sort of linear time complexity, counting sort requires that the input data must be integers with a definite range.

Counting sort is a stable sorting algorithm. Counting sort uses an extra array C, where the ith element is the number of elements in array A whose value is equal to I. We then arrange the elements in A into the correct position according to array C. It can only sort integers.

8.1 Algorithm Description

  • Find the largest and smallest elements in the array to be sorted;

  • Count the number of occurrences of each element of value I in the array and store it in the i-th entry of array C;

  • Add up all the counts (starting with the first element in C, adding each term to the previous one);

  • Reverse populating the target array: place each element I in the C(I) item of the new array, subtracting C(I) by 1 for each element.

8.2 GIF demonstration

8.3 Code Implementation

/** * count sort ** @param array * @return
     */
    public static int[] CountingSort(int[] array) {
        if (array.length == 0) return array;
        int bias, min = array[0], max = array[0];
        for (int i = 1; i < array.length; i++) {
            if (array[i] > max)
                max = array[i];
            if (array[i] < min)
                min = array[i];
        }
        bias = 0 - min;
        int[] bucket = new int[max - min + 1];
        Arrays.fill(bucket, 0);
        for (int i = 0; i < array.length; i++) {
            bucket[array[i] + bias]++;
        }
        int index = 0, i = 0;
        while (index < array.length) {
            if(bucket[i] ! = 0) { array[index] = i - bias; bucket[i]--; index++; }else
                i++;
        }
        return array;
    }Copy the code

8.4 Algorithm Analysis

When the input element is n integers between 0 and k, it runs O(n + k). Counting sort is not comparison sort, which is faster than any comparison sort algorithm. Since the length of the array C used to count depends on the range of data in the array to be sorted (equal to the difference between the maximum and minimum value of the array to be sorted plus 1), counting sort takes a lot of time and memory for arrays with a large range of data.

Best case: T (n) = O (n + k) worst case: T (n) = O (n + k) average: T (n) = O (n + k)

9, Bucket Sort

Bucket sort is an updated version of counting sort. It makes use of the mapping relation of functions, and the key of efficiency lies in the determination of the mapping function.

Bucket sort works by dividing the data into a finite number of buckets, assuming that the input data is evenly distributed. Each Bucket is sorted separately (it is possible to use another sorting algorithm or to continue sorting using Bucket sort recursively

9.1 Algorithm Description

  • Set a BucketSize as the number of different values that can be stored in each bucket (for example, BucketSize==5 can store {1,2,3,4,5}, but there is no limit to the BucketSize, that is, 100 3’s can be stored).

  • Iterate over the input data and put the data one by one into the corresponding bucket;

  • Sort each bucket that is not empty, either by other sorting methods or by recursive bucket sorting;

  • Splicing together sorted data from never-empty buckets.

Note that if you recursively use bucket sorting to sort buckets, when the number of buckets is 1, you need to manually decrease BucketSize and increase the number of buckets in the next loop. Otherwise, you will fall into an infinite loop, resulting in memory overflow.

9.2 Picture Demonstration

9.3 Code Implementation

/** * @param array * @param bucketSize * @return
     */
    public static ArrayList<Integer> BucketSort(ArrayList<Integer> array, int bucketSize) {
        if (array == null || array.size() < 2)
            returnarray; int max = array.get(0), min = array.get(0); // Find the maximum and minimum valuefor (int i = 0; i < array.size(); i++) {
            if (array.get(i) > max)
                max = array.get(i);
            if (array.get(i) < min)
                min = array.get(i);
        }
        int bucketCount = (max - min) / bucketSize + 1;
        ArrayList<ArrayList<Integer>> bucketArr = new ArrayList<>(bucketCount);
        ArrayList<Integer> resultArr = new ArrayList<>();
        for (int i = 0; i < bucketCount; i++) {
            bucketArr.add(new ArrayList<Integer>());
        }
        for (int i = 0; i < array.size(); i++) {
            bucketArr.get((array.get(i) - min) / bucketSize).add(array.get(i));
        }
        for (int i = 0; i < bucketCount; i++) {
            if(bucketSize == 1) {// If there are duplicate numbers in the sorted array, thank @bucketSizefor (int j = 0; j < bucketArr.get(i).size(); j++)
                    resultArr.add(bucketArr.get(i).get(j));
            } else {
                if (bucketCount == 1)
                    bucketSize--;
                ArrayList<Integer> temp = BucketSort(bucketArr.get(i), bucketSize);
                for(int j = 0; j < temp.size(); j++) resultArr.add(temp.get(j)); }}return resultArr;
    }Copy the code

9.4 Algorithm Analysis

The optimal linear time for bucket sorting is O(n). The time complexity of bucket sorting depends on the time complexity of sorting data between buckets, because the time complexity of all other parts is O(n). Obviously, the smaller the bucket, the less data there is between the buckets, and the less time it takes to sort. But the corresponding space consumption will increase.

Best case: T (n) = O (n + k) worst case: T (n) = O (n + k) average: T (n) = O (n2)

10, Radix Sort

Radix sort is also a non-comparative sort algorithm, sorting each bit, sorting from the least, complexity is O(kN), is the length of the array, k is the largest number of the number in the array;

Radix sort is sorted by low order and then collected; And then sort it in order, and then collect it; And so on, all the way to the highest place. Sometimes some attributes are prioritized, first by low priority, then by high priority. The final order is that the higher-priority ones come first, and the lower-priority ones come first. Radix sort is based on sorting separately, collecting separately, so it’s stable. Radix sort: radix sort

10.1 Algorithm Description

  • Get the largest number in the array and get the number of digits;

  • Arr is the original array, and each bit is taken from the lowest level to form the RADIX array.

  • The radix is sorted by counting (taking advantage of the fact that counting sorting is suitable for small range numbers);

10.2 GIF demonstration

10.3 Code Implementation

/** * base sort * @param array * @return
     */
    public static int[] RadixSort(int[] array) {
        if (array == null || array.length < 2)
            returnarray; // 1. Calculate the largest number of digits; int max = array[0];for (int i = 1; i < array.length; i++) {
            max = Math.max(max, array[i]);
        }
        int maxDigit = 0;
        while(max ! = 0) { max /= 10; maxDigit++; } int mod = 10, div = 1; ArrayList<ArrayList<Integer>> bucketList = new ArrayList<ArrayList<Integer>>();for (int i = 0; i < 10; i++)
            bucketList.add(new ArrayList<Integer>());
        for (int i = 0; i < maxDigit; i++, mod *= 10, div *= 10) {
            for (int j = 0; j < array.length; j++) {
                int num = (array[j] % mod) / div;
                bucketList.get(num).add(array[j]);
            }
            int index = 0;
            for (int j = 0; j < bucketList.size(); j++) {
                for(int k = 0; k < bucketList.get(j).size(); k++) array[index++] = bucketList.get(j).get(k); bucketList.get(j).clear(); }}return array;
    }Copy the code

10.4 Algorithm Analysis

Best case: T(n) = O(n * k) Worst case: T(n) = O(n * k) Average case: T(n) = O(n * k)

There are two ways to sort radix:

  • MSD sorts from the top

  • LSD sorts from the low end

Radix sort vs counting sort vs bucket sort

These three sorting algorithms all use the concept of buckets, but there are obvious differences in the way buckets are used:

  • Radix sort: Buckets are allocated based on each digit of the key value

  • Counting sort: Each bucket stores a single key value

  • Bucket sort: Each bucket stores a range of values


Recommended reading (Click to skip to reading)

1. SpringBoot content aggregation

2. Assemble interview questions

3. Design pattern content aggregation

4. Mybatis content aggregation

5. Multi-threaded content aggregation