
Summary of nine sorting algorithms

Bubble sort

Core idea

Each bubbling pass compares adjacent pairs of elements and checks whether they satisfy the required ordering; if they do not, the two are swapped. One pass moves at least one element to its final position, so after n passes, n numbers are sorted.

Sorting process

The bubbling process can also be optimized: if no elements are swapped during a pass, the array is already sorted and the remaining passes can be skipped.

Code implementation

func BubbleSort(arr []int) {
	n := len(arr)
	for i := 0; i < n; i++ {
		flag := false // records whether this pass swapped anything
		for j := 0; j < n-i-1; j++ {
			if arr[j] > arr[j+1] {
				arr[j], arr[j+1] = arr[j+1], arr[j]
				flag = true
			}
		}
		if !flag { // no swaps: already sorted, stop early
			break
		}
	}
	fmt.Println(arr)
}

Analysis of bubble sort algorithm

First, bubble sort is an in-place sorting algorithm: it only swaps adjacent elements and needs a constant amount of temporary space, so its space complexity is O(1).

In bubble sort, only a swap can change the relative order of two elements. To keep the algorithm stable, we simply do not swap two adjacent elements that are equal; equal elements then keep their original relative order, so bubble sort is a stable sorting algorithm.

In the best case, where the data is already fully sorted, only one pass is needed, so the best-case time complexity is O(n).

In the worst case, the data to be sorted is completely unordered and n passes are needed, so the worst-case time complexity is O(n^2).

Insertion sort

Core idea

Divide the array to be sorted into two regions, an ordered region and an unordered region. Initially, the ordered region contains only the first element. Insertion sort repeatedly takes one element from the unordered region and inserts it into its correct position in the ordered region, keeping the ordered region ordered after every insertion. This repeats until the unordered region is empty.

Sorting process

i points to the current element of the unordered region, and j scans the ordered region: i is responsible for visiting each element of the unordered region in turn, and j is responsible for finding that element's correct position within the ordered region.

Code implementation

func InsertSort(arr []int) {
	n := len(arr)
	for i := 1; i < n; i++ {
		value := arr[i]
		j := i - 1
		for ; j >= 0; j-- {
			if arr[j] > value {
				arr[j+1] = arr[j] // shift larger elements one slot to the right
			} else {
				break
			}
		}
		arr[j+1] = value // insert into the freed slot
	}
	fmt.Println(arr)
}

Analysis of insertion sort algorithm

Insertion sort also requires no extra storage; its space complexity is O(1), so it is an in-place sorting algorithm.

In insertion sort, for elements with equal values, we insert the element that appears later after the one that appears earlier, so their original relative order is preserved; insertion sort is therefore a stable sorting algorithm.

If the data to be sorted is already fully ordered, no elements need to be moved. Since we search for the insertion position from the back of the ordered region to the front, each insertion needs only one comparison to find its position. So in this case the best-case time complexity is O(n). Note that this relies on scanning the ordered region from back to front.

If the array is in reverse order, each insertion places the new element at the first position of the array, so a large amount of data must be shifted; the worst-case time complexity is O(n^2), and the average is also O(n^2).

Selection sort

Core idea

The idea of selection sort is somewhat similar to insertion sort: it also maintains an ordered region and an unordered region, but each step selects the smallest element from the unordered region and appends it to the end of the ordered region.

Sorting process

To find the smallest element in the unordered region, take its first element and compare it with each of the remaining elements in the unordered region, swapping whenever a smaller one is found.

Code implementation

func SelectSort(arr []int) {
	n := len(arr)

	for i := 0; i < n-1; i++ {
		// arr[:i] is the ordered region; scan the unordered region
		// and pull the smallest remaining element into position i
		for j := i + 1; j < n; j++ {
			if arr[i] > arr[j] {
				arr[i], arr[j] = arr[j], arr[i]
			}
		}
	}

	fmt.Println(arr)
}

Analysis of selection sort algorithm

The space complexity of selection sort is also O(1), so it is an in-place sorting algorithm. Its best-case, worst-case, and average time complexity are all O(n^2): the inner loop always scans the entire unordered region regardless of the input order.

Selection sort is not a stable sort: swapping the minimum of the remaining unsorted elements with the first element of the unordered region can move an element past equal elements, which breaks stability.

For example, when sorting the sequence 7, 3, 5, 7, 1, 9 with selection sort, the first pass finds the smallest element 1 and swaps it with the first 7; the relative order of the first 7 and the middle 7 then changes, so the sort is unstable.

Merge sort

Core idea

To sort an array, we divide the array into two parts, sort the two parts separately, and merge the sorted parts together so that the entire array is sorted

Merge sort uses the divide-and-conquer idea: solve a big problem by breaking it down into smaller subproblems; once the small subproblems are solved, the big problem is solved.

Sorting process

Each time, take the middle element of the array as the split point and divide the array into a front half and a back half. When each part contains only one element, start merging the split arrays back together, keeping them ordered during each merge.

The merging of ordered arrays is very simple and I won’t draw it here

Code implementation

func MergeSort(arr []int) []int {
	if len(arr) < 2 {
		return arr
	}

	medium := len(arr) / 2
	leftArr := MergeSort(arr[:medium])
	rightArr := MergeSort(arr[medium:])

	res := merge(leftArr, rightArr)

	return res
}

func merge(leftArr, rightArr []int) []int {
	var mergeRes []int
	leftN := len(leftArr)
	rightN := len(rightArr)

	leftIndex := 0
	rightIndex := 0

	for leftIndex < leftN && rightIndex < rightN {
		if leftArr[leftIndex] <= rightArr[rightIndex] { // take the left element on ties to keep the sort stable
			mergeRes = append(mergeRes, leftArr[leftIndex])
			leftIndex++
		} else {
			mergeRes = append(mergeRes, rightArr[rightIndex])
			rightIndex++
		}
	}

	if leftIndex == leftN {
		mergeRes = append(mergeRes, rightArr[rightIndex:]...)
	}

	if rightIndex == rightN {
		mergeRes = append(mergeRes, leftArr[leftIndex:]...)
	}

	return mergeRes
}

Analysis of merge sort algorithm

Merge sort is a stable sorting algorithm. This depends on the merge operation, which combines two ordered arrays into one. When the two arrays contain equal elements, the element from the front (left) array is appended to the result first, so equal elements keep their relative order across the merge.

As the merge-sort procedure shows, the running time of merge sort does not depend on how ordered the original array is, so its time complexity is very stable: in the best case, the worst case, and on average, it is O(nlogn).
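
A quick sketch of why (this recurrence is standard and not spelled out above): the running time satisfies

$$T(1) = C, \qquad T(n) = 2\,T(n/2) + n \;\; (n > 1) \quad\Longrightarrow\quad T(n) = O(n \log n)$$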

Although its time complexity is O(nlogn), merge sort is not as widely used as quicksort, because the merge step needs extra storage: its space complexity is O(n), so it is not an in-place algorithm.

Quick sort

Core idea

Suppose we want to sort the data in an array between subscripts p and r. We pick any element between p and r as the pivot.

Iterate over the data between p and r, placing everything less than the pivot on its left and everything greater on its right, with the pivot in the middle. After this step, the data from p to r is divided into three parts: the elements from p to q-1 are less than the pivot, the pivot sits at position q, and the elements from q+1 to r are greater than the pivot.

Following the divide-and-conquer idea, we then recursively sort the data with subscripts from p to q-1 and from q+1 to r, until each interval shrinks to size 1, at which point all the data is in order.

Sorting process

Just as merge sort relies on a merge() function that combines two ordered arrays, quicksort relies on a partition() function. partition() selects an element of A[start...end] as the pivot (usually the last element), rearranges the range around it, and returns the pivot's final subscript.

The partition() function implementation process

Code implementation

func QuickSort(arr []int, start, end int)  {
	if start >= end {
		return
	}

	pivot := partition(arr, start, end)
	QuickSort(arr, start, pivot-1)
	QuickSort(arr, pivot+1, end)
}

func partition(arr []int, start, end int) int {
	pivotValue := arr[end] // use the last element as the pivot

	i := start // arr[start:i] holds the elements smaller than the pivot
	for j := start; j < end; j++ {
		if arr[j] < pivotValue {
			arr[i], arr[j] = arr[j], arr[i]
			i++
		}
	}

	// put the pivot between the smaller and larger parts
	arr[end], arr[i] = arr[i], arr[end]

	return i
}
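
A quick usage check, to be run inside a main function; the sample sequence is the one reused in the stability discussion below:

arr := []int{6, 8, 7, 6, 3, 5, 9, 4} // sample data
QuickSort(arr, 0, len(arr)-1)
fmt.Println(arr) // [3 4 5 6 6 7 8 9]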

Analysis of quicksort algorithm

The above uses a clever method to partition the array in place with O(1) extra space, so "quicksort is an in-place sorting algorithm".

Since partitioning involves swapping, if there are two identical elements in an array, such as the sequence 6, 8, 7, 6, 3, 5, 9, 4, the relative order of the two sixes changes after the first partitioning (which can be easily deduced from the partitioning algorithm above). So, quicksort is not a stable sorting algorithm.

Heap sort

To understand heap sorting, you need to understand the heap data structure, including building a heap, removing an element from the heap, and inserting an element into the heap

What is a heap

  • A heap is a complete binary tree;
  • The value of each node in the heap must be greater than or equal to (or less than or equal to) the value of every node in its subtrees

A heap in which each node's value is greater than or equal to every value in its subtrees is called a max-heap ("big top heap"); one in which each node's value is less than or equal to every value in its subtrees is called a min-heap ("small top heap").

How to implement a heap

A complete binary tree is best stored in an array, which is very storage-efficient: we don't need pointers to the left and right children, since a node's children and parent can be found directly from its array index.

With 1-based indexing, the left child of the node at subscript i is the node at subscript 2i, the right child is at subscript 2i+1, and the parent is at subscript i/2.
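
Expressed as code, a trivial sketch of the index arithmetic above (1-based indexing, with index 0 unused):

// index relations for a 1-based array-backed heap
func leftChild(i int) int  { return 2 * i }
func rightChild(i int) int { return 2*i + 1 }
func parent(i int) int     { return i / 2 }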

Inserting an element into the heap

After inserting an element into the heap, we need to ensure that the two properties of the heap still hold.

We first place the newly inserted element at the end of the heap. It then needs to be adjusted so that the heap properties hold again, a process known as heapification (heapify).

Heapification is simple: follow a path of nodes upward or downward, comparing and swapping as you go.

package heap

import "fmt"

var a []int   // the array backing the heap; index 0 is unused
var count int // the number of elements currently stored in the heap
var n int     // the capacity of the heap

func NewHeap(capacity int) {
	a = make([]int, capacity+1) // one extra slot because index 0 is unused
	n = capacity
	count = 0
}

func swap(a []int, i, j int) {
	a[i], a[j] = a[j], a[i]
}

// InsertHeap appends data at the end of the heap, then sifts it up
func InsertHeap(data int) {
	if count == n {
		return // the heap is full
	}
	count++
	a[count] = data
	i := count
	for i/2 > 0 && a[i] > a[i/2] { // sift up while larger than the parent
		swap(a, i, i/2)
		i = i / 2
	}
	fmt.Println("after insert:", a)
}

// heapify sifts the element at index i down within a[1..n]
func heapify(a []int, n, i int) {
	for {
		maxPos := i
		if 2*i <= n && a[i] < a[2*i] {
			maxPos = 2 * i
		}
		if 2*i+1 <= n && a[maxPos] < a[2*i+1] {
			maxPos = 2*i + 1
		}
		if maxPos == i {
			break
		}
		swap(a, i, maxPos)
		i = maxPos
	}
}

Deleting the heap-top element

We place the last node at the top of the heap (overwriting the deleted heap-top element) and apply the same parent-child comparisons: whenever a parent and child violate the required size relationship, the two nodes are swapped, and the process repeats until the parent-child relationship holds everywhere.

// RemoveHeap removes the top element of the heap
func RemoveHeap() {
	if count == 0 {
		return // the heap is empty
	}
	a[1] = a[count] // move the last element to the top
	count--
	heapify(a, count, 1) // sift the new top down
	fmt.Println(a)
}

A complete binary tree with n nodes has height no greater than log2(n). Heapification compares and swaps along a path of nodes, so its time complexity is proportional to the height of the tree, that is, O(logn). The main work of inserting an element and removing the heap-top element is heapification, so both operations take O(logn) time.

Heap sort

Building the heap

Heapifying a leaf node compares it only with itself and does nothing, so we simply heapify downward starting from the last non-leaf node.

// BuildHeap turns a[1..n] into a max-heap in place.
// In a complete binary tree stored in an array, the nodes at
// subscripts n/2+1 through n are leaves, so we heapify from the
// last non-leaf node (subscript n/2) down to the root.
// heapify is the sift-down function defined earlier.
func BuildHeap(a []int, n int) {
	for i := n / 2; i > 0; i-- {
		heapify(a, n, i)
	}
	fmt.Println(a)
}

Building the heap takes O(n) time.

Sorting

After the heap is built, the array is organized as a max-heap: the first element is the heap top, i.e., the largest element. Swap it with the last element, and the largest element moves to subscript n.

This process is similar to the "delete the heap-top element" operation described above: after the top element is moved out, the element at subscript n is placed on the top, and the remaining n-1 elements are restored to a heap by heapification. We then take the new heap top, place it at subscript n-1, and repeat until only the element at subscript 1 remains; the sort is then complete.

// HeapSort sorts a[1..n] in ascending order using a max-heap
func HeapSort(a []int, n int) {
	BuildHeap(a, n)
	k := n
	for k > 1 {
		swap(a, 1, k) // move the current maximum to the end
		k--
		heapify(a, k, 1) // restore the heap on the remaining k elements
	}
}
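
A usage sketch with hypothetical sample data; remember that this layout is 1-based, so index 0 is a placeholder:

a := []int{0, 4, 9, 5, 3, 7, 6, 8} // a[1..7] holds the data; a[0] is unused
HeapSort(a, 7)
fmt.Println(a[1:]) // [3 4 5 6 7 8 9]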

Heapsort algorithm analysis

The whole process of heap sort requires very little temporary storage, so heap sort is an in-place sort algorithm. Heap sort consists of two operations: heap build and sort. The time complexity of heap build is O(n) and the time complexity of sort is O(nlogn). Therefore, the overall time complexity of heap sort is O(nlogn).

Heap sort is not a stable sorting algorithm, because swapping the last node with the heap-top node during sorting can change the original relative order of equal values.

Bucket sort

Core idea

Distribute the data to be sorted into several naturally ordered buckets, sort the data within each bucket separately, and then take the data out of the buckets in order to obtain a fully sorted sequence.

Sorting process
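
A minimal sketch of the process in Go, assuming non-negative integers; the bucket-mapping formula and the use of the standard library's sort.Ints inside each bucket are illustrative choices, not prescribed by the text above:

import (
	"fmt"
	"sort"
)

// BucketSort is a sketch: non-negative ints, range-based bucket mapping
func BucketSort(arr []int, bucketCount int) {
	if len(arr) == 0 || bucketCount <= 0 {
		return
	}
	maxVal := arr[0]
	for _, v := range arr {
		if v > maxVal {
			maxVal = v
		}
	}
	if maxVal == 0 {
		return // all zeros: already sorted
	}
	buckets := make([][]int, bucketCount)
	for _, v := range arr {
		idx := v * (bucketCount - 1) / maxVal // map each value into a bucket by range
		buckets[idx] = append(buckets[idx], v)
	}
	pos := 0
	for _, b := range buckets {
		sort.Ints(b) // sort each bucket internally
		for _, v := range b {
			arr[pos] = v
			pos++
		}
	}
	fmt.Println(arr)
}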

Analysis of bucket sorting algorithm

Suppose there are n data items to sort, divided evenly among m buckets, so each bucket holds k = n/m elements. Sorting one bucket internally with quicksort takes O(k log k) time, so sorting all m buckets takes O(m · k log k). Since k = n/m, this is O(n log(n/m)). When the number of buckets m approaches the number of elements n, log(n/m) becomes a very small constant, and the time complexity of bucket sort approaches O(n).

Although bucket sort's time complexity is lower than that of the six algorithms above, it cannot replace them, because it is very demanding about the data to be sorted:

  1. The data must divide easily into m buckets, and the buckets must have a natural order among them, so that after the data within each bucket is sorted, no further sorting is needed across buckets
  2. The data must be distributed evenly among the buckets (if it is not, the time complexity may degenerate to O(nlogn), for example when all the data lands in a single bucket)

Applicable scenario: external sorting, where the data set is large and memory is limited, so the data cannot all be loaded into memory at once

Counting sort

Core idea

Counting sort is actually a special case of bucket sort. When the n numbers to sort have a small range, say a maximum of k, we can divide the data into k buckets, each holding only equal values; this saves the time of sorting within each bucket.

For example, in the college entrance examination, if there are 500,000 examinees, the full score is 900 and the minimum score is 0, the data range is very small, so we can divide it into 901 buckets, and the corresponding scores are from 0 to 900

According to their scores, the 500,000 examinees are distributed into these 901 buckets. The data in each bucket all have the same score, so no sorting within buckets is needed. Just scan the buckets in turn and output the examinees in each bucket to an array sequentially to sort all 500,000 examinees. Because only scanning and traversal are involved, the time complexity is O(n).

Sorting process

Why is this algorithm called counting sort? Where does the "counting" come from? Consider the following example.

Suppose there are only eight examinees, with scores between 0 and 5. Their scores are stored in an array A[8]: 2, 5, 3, 0, 2, 3, 0, 3.

Scores range from 0 to 5, so we use an array C[6] of size 6 to represent the buckets, where the subscript corresponds to the score. C[6] does not store the examinees themselves but the number of examinees with each score; a single pass over the scores fills in C.

There are 3 examinees with a score of 3 and 4 examinees with scores below 3, so after sorting, the score-3 examinees occupy subscripts 4, 5, and 6 of the ordered array R[8].

How do we find the right position for examinees with the same score? The trick is to turn C[6] into prefix sums: after summing sequentially, C becomes [2, 2, 4, 7, 7, 8], where C[k] stores the number of examinees with a score less than or equal to k.

Now scan array A from back to front (A holds the unsorted scores). When a 3 is scanned, take the value 7 at subscript 3 of C (the subscript in C is the score): so far there are seven examinees, including this one, with scores of 3 or less. That means this 3 is the seventh element of R (the position with subscript 6). Once it is placed in R, only 6 remaining elements are less than or equal to 3, so C[3] is decremented to 6.

Similarly, when the second examinee with a score of 3 is scanned, it is placed at the sixth position of R (subscript 5). When the whole of array A has been scanned, the data in R is sorted by score from smallest to largest.
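
A sketch of the procedure above in Go; the array names follow the walkthrough, and the function assumes non-negative integer scores no larger than maxScore:

// CountingSort is a sketch following the walkthrough above
func CountingSort(a []int, maxScore int) []int {
	c := make([]int, maxScore+1)
	for _, score := range a {
		c[score]++ // count how many examinees got each score
	}
	for k := 1; k <= maxScore; k++ {
		c[k] += c[k-1] // prefix sums: c[k] = number of scores <= k
	}
	r := make([]int, len(a))
	for i := len(a) - 1; i >= 0; i-- { // back-to-front keeps the sort stable
		c[a[i]]--
		r[c[a[i]]] = a[i]
	}
	return r
}

For the example above, CountingSort([]int{2, 5, 3, 0, 2, 3, 0, 3}, 5) returns [0 0 2 2 3 3 3 5].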

Counting sort algorithm analysis

Counting sort can only be used in scenarios where the data range is small; if the data range k is much larger than the number of items n, counting sort is not suitable. Moreover, counting sort can only sort non-negative integers; data of other types must be converted to non-negative integers without changing their relative sizes.

Radix sort

Core idea

Split each data item into small units ("digits") and sort the whole data set by one digit at a time; once every digit has been processed, the data is in order.

For example, suppose you have 100,000 mobile phone numbers and want to sort them from smallest to largest. An 11-digit phone number's range is far too large for the sorting algorithms mentioned earlier to be applied directly. But note that when comparing two phone numbers a and b, if a is already larger than b in the first few digits, the remaining digits need not be examined.

Sort the phone numbers by the last digit, then by the second-to-last, and so on, finishing with the first digit. After 11 passes, the phone numbers are in order.

Sorting process

Let’s use string sorting as an example

Note that the sorting algorithm used for each digit must be stable. If it were unstable, the final pass would order the data only by the highest digit, discarding the work done on the lower digits, and the lower-digit passes would be meaningless.
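
A sketch of least-significant-digit radix sort for fixed-length digit strings such as phone numbers; the helper name stableSortByDigit is illustrative (it is a stable counting sort on one character position):

func RadixSort(nums []string) []string {
	if len(nums) == 0 {
		return nums
	}
	for d := len(nums[0]) - 1; d >= 0; d-- { // lowest digit first
		nums = stableSortByDigit(nums, d)
	}
	return nums
}

// stableSortByDigit is a stable counting sort on the character at position d
func stableSortByDigit(nums []string, d int) []string {
	c := make([]int, 10)
	for _, s := range nums {
		c[s[d]-'0']++
	}
	for k := 1; k < 10; k++ {
		c[k] += c[k-1]
	}
	out := make([]string, len(nums))
	for i := len(nums) - 1; i >= 0; i-- { // back-to-front preserves stability
		digit := nums[i][d] - '0'
		c[digit]--
		out[c[digit]] = nums[i]
	}
	return out
}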

Analysis of radix sort algorithm

To sort by each digit, we can use the bucket sort or counting sort described above, which runs in O(n) time. If the data has k digits, we need k passes of bucket sort or counting sort, so the total time complexity is O(k·n). When k is small, as with mobile phone numbers where k is at most 11, the time complexity of radix sort is approximately O(n).

Radix sort requires that the data can be split into independent "digits" to compare, with a clear high-to-low significance between digits: if the higher digits of data a already exceed those of data b, the lower digits need not be compared. In addition, the range of each digit must be small enough for a linear-time sorting algorithm to handle it; otherwise, the overall time complexity of radix sort cannot reach O(n).

Comparison of sorting algorithms

| Algorithm | Degradation condition | Applicable scenario | Best time | Worst time | Average time | Space | Stable? | In-place? |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Bubble sort | Data in reverse order | | O(n) (data already ordered) | O(n^2) | O(n^2) | O(1) | Yes | Yes |
| Insertion sort | Data in reverse order | | O(n) (data already ordered) | O(n^2) | O(n^2) | O(1) | Yes | Yes |
| Selection sort | | | O(n^2) | O(n^2) | O(n^2) | O(1) | No | Yes |
| Merge sort | | | O(nlogn) | O(nlogn) | O(nlogn) | O(n) | Yes | No |
| Quick sort | Data already ordered | | O(nlogn) | O(n^2) | O(nlogn) | O(1) | No | Yes |
| Heap sort | | | O(nlogn) | O(nlogn) | O(nlogn) | O(1) | No | Yes |
| Bucket sort | In the extreme case, all data falls into one bucket | Large data sets with limited memory that cannot be loaded into memory at once (external sorting) | O(n) | O(nlogn) | O(n) | | | |
| Counting sort | Data range k much larger than n | Small data range; non-negative integers only (other types must be converted without changing relative order) | O(n) | O(nlogn) | O(n) | | | |
| Radix sort | Digit ranges too large for a linear per-digit sort | Data splits into independent "digits" with high-to-low significance; each digit's range is small | O(n) | O(n) | | | | |