Share how to gradually optimize the time complexity of an algorithm, starting with a relatively simple and common TopK problem. USES the example of LeetCode215

Finds the KTH largest element in the unsorted array. Note that you need to find the KTH largest element of the sorted array, not the KTH distinct element. Example 1: inputs: [3,2,1,5,6,4] and k = 2 output: 5 example 2: inputs: [3,2,3,1,2,4,5,5,6] and k = 4 output: 4 note: you can assume that k is always valid and that 1 ≤ k ≤ the length of the array.Copy the code

Use the sorting method

To find the KTH largest element in an array, the straightforward idea is to sort the array and take the KTH largest element.


    func findKthLargest(_ nums: [Int], _ k: Int) -> Int {
       return nums.sorted()[nums.count-k]
    }

Copy the code

The solution is to sort the entire array. It’s O(n*lg(n)), and what we really want is to get the KTH largest element, but we sort the entire array, some elements that we don’t need to sort, so let’s just sort the whole array, let’s just sort locally, pick the first K largest elements.

We use the bubble method to sort the first K elements for example

Assume that arr = [3,6,9,7,8,1,2,3,5,6] and k = 4Copy the code

     func findKthLargest(_ nums: [Int], _ k: Int) -> Int {
        var nums = nums
        let count = nums.count
        for i in0.. <k {for j in0.. <count-i-1 {if nums[j] > nums[j+1] {
                    (nums[j] ,nums[j+1]) = (nums[j+1] ,nums[j])
                }
            }
        }
        return nums[count-k]
    }
Copy the code

So the KTH largest number is 6. The time complexity of this notation is O(K*N), depending on the size of K. If K is too big this is still a pretty complicated way to write it. Because it’s finding the KTH largest element. We don’t have to sort the first K elements, we try not to sort them.

Use the heap

Getting the largest or smallest element in an array is reminiscent of a heap. To get a better idea of what we’re going to do let’s briefly review some of the main concepts of the heap.

The concept of the heap

A heap is a binary tree implemented in arrays, so it has no parent or child Pointers. Heaps are sorted according to “heap properties,” which determine the location of nodes in the tree.

Heap properties

There are two types of heap: maximum heap and minimum heap. The difference between the two is how the nodes are sorted.

In the maximum heap, the value of the parent node is larger than the value of each child node. In the minimum heap, the value of the parent node is smaller than the value of each child node. This is known as a “heap property” and is true for every node in the heap. Maximum heap sample graph

Minimum heap sample diagram

How the heap is stored in an array

Array index formula for parent and child nodes:

parent(i) = (i-1)/2
left(i) = 2i + 1
right(i) = 2i + 2
Copy the code

figure

The height of the heap

The height of the heap with n nodes is h = log2(n).

Basic operations of the heap

  • ShiftUp () : If an element is larger (max-heap) or smaller (min-heap) than its parent, it needs to be swapped with the parent, which moves the element up.
  • ShiftDown (). This operation moves the element down if it is smaller (max-heap) or larger (min-heap) than its child, also known as “heapify.”
  • Insert (value) : Add new elements to the end of the heap, then use shiftUp() to repair the heap.
  • Remove () : deletes and returns the maximum value (max-heap) or minimum value (min-heap). To fill the position left after the element is deleted, let the last element move to the root, and then use shiftDown() to restore the heap. (Sometimes called “extract minimum” or “extract maximum.”)
  • RemoveAtIndex (index) : Similar to remove(), you can remove not only the root node, but also any node from the heap. If the new element is not neatly aligned with its children, shiftDown() is called; If the element is not neatly aligned with its parent, shiftUp() is called.
  • Replace (index, value) : Assigns a smaller (min-heap) or larger (max-heap) value to a node. Because this invalidates the heap property, it uses shiftUp() to fix it. (Also known as “decrease key” and “increase key.”)

Answer:

  1. When we use minimal heap. We take the first K bits of the supplied array and generate it into a minimum heap. The element at the top of the heap is the KTH element out of the first K elements.

The rest of the array is then compared to the top of the heap one by one. If it is larger than the top of the heap, it should be swapped and then shiftdown. The top of the heap is the KTH largest element.

Func findKthLargest(_ nums: [Int], _ k: Int) -> Int {var array: array <Int>! var heap :Heap<Int>!if nums.count > k {
        array = Array.init(nums[0..<k])
        heap = Heap.init(array: array, sort: <)
        for item in nums[k...nums.count-1] {
            if item > heap.peek()! {
                heap.replace(index: 0, value: item)
            }
        }
    }else{
        array = nums
        heap = Heap.init(array: array, sort: <)
    }
    return heap.peek()!
}
Copy the code

Time complexity analysis: this is an array generated by the heap, the entire array needs to be traversed, assuming that each element is unlucky, each time into the heap adjustment, adjustment time complexity is the height of the heap, namely lg(N), so the overall time complexity is O(N*lgN). 2. Using the Max heap solution, generate a Max heap with an array and remove the top of the heap k-1 times. That top is the KTH largest element

Func findKthLargest(_ nums: [Int], _ k: Int) -> Int {var heap = heap.init (array: nums, sort: >)for _ in0.. <k-1 { heap.remove() }return heap.peek()!
}
Copy the code

With the maximum number of solutions, the idea is relatively simple, but the complexity is higher. O(NlgN + KlgN)

Use binary search and quicksort ideas to reach O(n) solutions

First, select a random base in the array, and then partition the array around that base. The values greater than or equal to the base value are placed to the left of the base value, and the values less than the base value are placed to the right. It then returns the index of the base value, which is the KTH largest number we are looking for if the index is equal to K. If K is less than the index of the base value, then the KTH largest value is greater than the base value, and we simply repeat the process to the left of the base value until we find the base at exactly the KTH position. If k is greater than the base value then the KTH largest number is less than the base value, and we just repeat the process to the right of the base value until we find the base value at exactly the KTH position. For example:

[1,-2,-4,-7,3,4,11,23,14,27]
Copy the code

Let’s say K is equal to 4, then we need to pick the fourth largest element.

  1. So the first step is we pick a random number from the array, let’s say 14.
  2. We partition the array (not sort, just walk through the array once) and place elements greater than or equal to 14 to the left of 14, and elements less than 14 to the right of 14.
[23, 27, 14, -7, 3, 4, 11, 1, -2, -4]
 <---- larger  smaller  -->
Copy the code

All values greater than or equal to 14 are on the left, and all values less than 14 are on the right. The index of base value 14 is 2(the third largest), so the fourth largest element must be to the right of 14. We can ignore the rest of the array

[x, x, x, -7, 3, 4, 11, 1, -2, -4]
Copy the code

Again, we randomly select a base number of 11 from the rest of the array and partition the array.

[x, x, x, 11,-7, 3, 4, 1, -2, -4]
Copy the code

The partition results in the following array, with the index of 11 being 3, the fourth largest element, which is exactly what we’re looking for. Swift code implementation

func randomizedPartition(_ a: inout [Int], _ low: Int, _ high: Int) -> Int {
    let pivotIndex = Int.random(in: low... high) a.swapAt(pivotIndex, high)let pivot = a[high]
    var store_index = low
    for j inlow.. <high {if a[j] <= pivot {
            a.swapAt(store_index, j)
            store_index += 1
        }
    }
    a.swapAt(store_index, high)
    return store_index
}


func randomizedSelect<T: Comparable>(_ a: inout [T], _ low: Int, _ high: Int, _ k: Int) -> T {
  if low < high {
    let p = randomizedPartition(&a, low, high)
    if k == p {
      return a[p]
    } else if k < p {
      return randomizedSelect(&a, low, p - 1, k)
    } else {
      return randomizedSelect(&a, p + 1, high, k)
    }
  } else {
    return a[low]
  }
}
Copy the code

Complexity analysis:

  • Time complexity The average complexity is O(n) and the worst case is O(n 2).
  • The space complexity is order 1.

In conclusion, although TopK is a relatively simple algorithm problem, its idea of gradually optimizing complexity is very worthy of our reference.