A little digression

Last time in my public number to you do a small survey “cast you want to solve the problem programming language ~”. Here are the results of the survey:

For the rest, it’s mostly Go.

Since Java and Python are already over 60%, I’ll try both this time thanks to @CaptainZ for the Java code. In order not to make this article too long and smelly, I have put all the Java code (Java and Python) of this article on the official website of Licoojiajia: leetcode-solution.cn/solution-co…

If the Internet is not scientific, it may be very slow to open.

The body of the

Hi, I’m Lucifer. Today we bring you the “pile” special topic. First of all, the outline of this paper is a brain map I drew with MindMap. Later, I will continue to improve and gradually improve other topics.

You can also use vscode blink-mind to open the source file and view some notes in it. The source files can be obtained by replying to the brain map on my public account “Likou Jiajia”, and the brain map will continue to update more content in the future. Vscode plug-in address: marketplace.visualstudio.com/items?itemN…

This series includes the following topics:

  • After almost finishing all the links, I found these things…
  • Almost finished brushing all the tree buttons, I found these things…
  • After almost finishing all the questions, I found these things… (First shot)

This is the next chapter, students who have not read the previous chapter strongly suggest reading the previous chapter almost finished the force buckle all the pile questions, I found these things… (First shot)

This is part two, and it’s going to be a little bit more concrete, three tips and four applications. These are two topics that teach you how to do it. With this knowledge, most of the heap titles in the leecker are no problem (of course, I only mean the part of the title that refers to the heap).

Warning: Most of the questions in this chapter are hard, because many of them are marked hard.

note

We’ll have an appetizer before the main course.

Here we introduce two concepts, namely tuple and simulated big top heap. And the whole point of doing this is just in case you don’t understand it.

tuples

It’s not just possible to store a single value with a heap; for example, 1,2,3,4 of [1,2,3,4] are all single values, respectively. In addition to single values, you can also store compound values, such as objects or tuples.

Here we introduce a way to store tuples, a technique that will be widely used later, so be sure to master it. For example [(1,2,3), (4,5,6), (2,1,3),(4,2,8)].

h = [(1.2.3), (4.5.6), (2.1.3), (4.2.8)]
heapq.heappify(h) # heap (small top heap)

heapq.heappop() Popup (# 1, 2, 3)
heapq.heappop() Popup (2 #,1,3)
heapq.heappop() Popup (# 4,2,8)
heapq.heappop() Popup (# 4 and 6)
Copy the code

A diagram of the heap structure looks like this:

Briefly explain the result of the above code.

Using tuples, the first value of a tuple is compared as a key by default. If the first one is the same, continue to compare the second one. For example, (4,5,6) and (4,2,8) above, since the first value is the same, we continue to compare the next one, and since 5 is larger than 2, (4,2,8) goes first.

Using this technique serves two purposes:

  1. Bring some extra information. Let’s say I want to find the KTH decimal in a two-dimensional matrix, with the value as the key, of course. However, the processing needs its row and column information, so tuples such as (val, row, col) are appropriate.

  2. You want to sort by two keys, a primary key and a secondary key. There are two more typical uses,

    2.1 One is that both are in the same order, for example, both are in order or both are in reverse order.

    2.2 The other is two different order sorting, namely, one is the reverse order and the other is the order.

Due to the length, I will not expand the details here, you can pay attention to it in the usual process of doing problems, and I will open a separate article to explain.

If your programming language does not have a heap or the implementation of a heap does not support tuples, you can also make simple changes to support them, mainly by customalizing the comparison logic.

Simulate the big top reactor

Because Python doesn’t have a big top heap. So I’m using the small top heap for simulation. You take all the negatives of the original number, say 5, and you put -5 in the heap. After this treatment, the small top heap can be used as the large top heap. But remember, when you pop it out, take it upside down and put it back in.

Code examples:

h = []
A = [1.2.3.4.5]
for a in A:
    heapq.heappush(h, -a)
-1 * heapq.heappop(h) # 5
-1 * heapq.heappop(h) # 4
-1 * heapq.heappop(h) # 3
-1 * heapq.heappop(h) # 2
-1 * heapq.heappop(h) # 1
Copy the code

It is shown as follows:

That’s all for foreshadowing, and then we’ll get to the point.

Three techniques

Tip 1 – Fix the heap

The trick is to keep the size of the heap constant, k, and code can be implemented by pushing one in for every pop out. And since the initial heap might be 0, we initially need to push into the heap one by one to get the size of the heap to be K, so strictly speaking, the size of the heap should not be greater than K.

A typical application of a fixed heap is to find the KTH smallest number. The easiest way to figure out the KTH smallest number is to build a small top heap, put all the numbers in the heap first, and then get out of the heap one by one, k times. The last thing that goes out of the heap is the KTH smallest number.

However, instead of going all the way to the heap first, we build the big top heap (note not the small top heap) and keep the heap size at K. If the heap size is greater than k after the new number is added to the heap, you need to compare the number on top of the heap with the new number and remove the larger number. This guarantees that the number in the heap is the smallest k of the total number, and that the largest of the smallest k is the KTH smallest. This is why you choose to build a big top heap rather than a small top heap.

A simple summary is that fixing a big top heap of size K can quickly solve for the KTH smallest number, and vice versa fixing a small top heap of size K can quickly solve for the KTH largest number. For example, to find the KTH largest xor coordinate value can be used to fix the small top heap technique to achieve (this problem asks you to find the KTH largest number).

So you may not feel strongly about it, but let me give you two examples to help you understand.

295. The median of data streams

Topic describes
The median is the number in the middle of an ordered list. If the list length is even, the median is the average of the middle two numbers. For example, the median of [2,3,4] is 3 and the median of [2,3] is (2 + 3) / 2 = 2.5 Design a data structure that supports two operations: void addNum(int num) - Add an integer from the data stream to the data structure. Double findMedian() - Returns the current median for all elements. Example: addNum(1) addNum(2) findMedian() -> 1.5 addNum(3) findMedian() -> 2 Advanced: How would you optimize your algorithm if all integers in the data stream were in the range 0 to 100? How would you optimize your algorithm if 99% of the integers in your data stream were in the range 0 to 100?Copy the code
Train of thought

This is actually a special case of finding the KTH smallest number.

  • If the list length is odd, then k is (n + 1) / 2, and the median is the KTH number. So if n is 5, k is 5 plus 1 over 2, which is 3.
  • If the list length is even, then k is (n + 1) / 2 and (n + 1) / 2 + 1, and the median is the average of the two numbers. For example, if n is 6, k is 6 + 1/2 = 3 and 6 + 1/2 + 1 = 4.

Therefore, we can maintain two fixed heaps, the size of the fixed heap is (N +1)÷2(n +1) \div2 (n+1)÷2 and N −(n+1)÷2n – (n+1) \div2n−(n+1)÷2, that is, the size difference between the two heaps is at most 1. More specific is 0 < = (n + 1) present 2 – (n – (n + 1) present 2) < = 1 0 < = (n + 1)/div 2 – (n – (n + 1)/div 2) < = 10 < = (n + 1) present 2 – (n – (n + 1) present 2) < = 1.

Based on the knowledge mentioned above, we can:

  • Build a big top heap and store the smallest (n+1)÷2(n +1) \div 2(n+1)÷2 number, so that the number at the top of the heap is the smallest number (n+1)÷2(n +1) \div 2(n+1)÷2, which is the median of the odd case.
  • Build a small top heap, and store the largest n – (n+1)÷2(n +1) \div 2(n+1)÷2 number, so that the number at the top of the heap is the n – (n+1)÷2(n +1) \div 2(n+1)÷2 number, combined with the big top heap above, the median of the even case can be obtained.

With this knowledge, all that remains is how to maintain the size of the two heaps.

  • If there are fewer large top heaps than small top heaps, then the smallest of the small top heaps is transferred to the large top heap. Since the small top heap maintains the largest number of K and the large top heap maintains the smallest number of K, the small top heap must be greater than or equal to the large top heap, and these two heap tops are the median at this time.
  • If the number of large top heaps is 2 more than the number of small top heaps, then the largest of the large top heaps is transferred to the small top heap for the same reason.

At this point, you probably understand why you have two separate heaps, and you need a big top heap and a small top heap. The reason for this is as described above.

Fixed heap is more common than that, so let’s do another problem.

code
class MedianFinder:
    def __init__(self) :
        self.min_heap = []
        self.max_heap = []
    def addNum(self, num: int) - >None:
        if not self.max_heap or num < -self.max_heap[0]:
            heapq.heappush(self.max_heap, -num)
        else:
            heapq.heappush(self.min_heap, num)
        if len(self.max_heap) > len(self.min_heap) + 1:
            heappush(self.min_heap, -heappop(self.max_heap))
        elif len(self.min_heap) > len(self.max_heap):
            heappush(self.max_heap, -heappop(self.min_heap))
    def findMedian(self) - >float:
        if len(self.min_heap) == len(self.max_heap): return (self.min_heap[0] - self.max_heap[0) /2
        return -self.max_heap[0]
Copy the code

1.3.1 (code)

857. Minimum cost of employing K workers

Topic describes
There are N workers. The ith worker's work quality is quality[I], and his minimum expected wage is wage[I]. Now we want to hire K workers to form a payroll. When we employ a group of K workers, we must pay them according to the following rule: Each worker in the wage group shall be paid in proportion to the quality of his work as compared with that of the other workers in the group. Each worker in the wage group should receive at least their minimum expected wage. Returns the minimum amount of money required to form a salary set that meets the above criteria. Example 1: input: quality = [10,20,5], wage = [70,50,30], K = 2 output: 105.00000 explanation: we pay 70 to worker 0 and 35 to worker 2. Example 2: Input: quality = [3,1,10,10,1], wage = [4,8,2,2,7], K = 3 Output: 30.66667 Explanation: We pay 4 to worker 0 and 13.33333 to worker 2 and 3 respectively. Tip: 1 <= K <= N <= 10000, N = quality. Length = wage. Length 1 <= quality[I] <= 10000 1 <= wage[I] <= 10000 and the correct answer within 10^-5 will be regarded as correct.Copy the code
Train of thought

They ask us to choose K individuals and pay them in proportion to the quality of their work to the quality of the work of the other workers in the group, and that each worker in the wage group should receive at least their minimum expected wage.

In other words, k people in the same group have a fixed ratio of quality of work to pay the least. Please understand this sentence first, the rest of the content is based on this premise.

So let’s say we have a metric for efficiency, which is equal to q over W. If the q/W of k individuals is the same, the minimum salary can be guaranteed, and the Q/W must be the lowest of the K individuals, otherwise someone will not get the minimum expected salary.

So we can write the following code:

class Solution:
    def mincostToHireWorkers(self, quality: List[int], wage: List[int], K: int) - >float:
        eff = [(q / w, q, w) for a, b in zip(quality, wage)]
        eff.sort(key=lambda a: -a[0])
        ans = float('inf')
        for i in range(K-1.len(eff)):
            h = []
            k = K - 1
            rate, _, total = eff[i]
            # Find k people who are more productive than that, and those K people are paid as little as possible.
            # since the work efficiency is already in reverse order, so the front is higher than it, then use the heap to get k lowest wages.
            for j in range(i):
                heapq.heappush(h, eff[j][1] / rate)
            while k > 0:
                total += heapq.heappop(h)
                k -= 1
            ans = min(ans, total)
        return ans
Copy the code

1.3.2 (code)

This approach, which pushes a lot and pops K times at a time, doesn’t take advantage of the dynamic nature of the heap, but only its extremal nature.

A better approach is to use the fixed-heap technique.

Another way to think about it is. In fact, the problem is to choose K people, take the lowest efficiency ratio of them, and calculate the total wage based on the lowest efficiency, to find the lowest total wage. So this problem can fix a big top heap of size K, and do something to ensure that the top heap is the KTH smallest.

And in the previous solution, the heap uses triples (q/w, q, w), which is not really necessary. Since we know two of them, we can deduce the other one, so we can store two of them, and since we need to make the keys of the heap based on the efficiency ratio, we can just pick either Q or W, so I chose Q, which is to store the (q/2, q) binary.

To be specific: Sum n=1kqn/rate\ DisplayStyle \sum_{n=1}^{k}{q}_{n}/raten=1∑ KQN /rate, where rate is the current Q/W, It’s also the minimum q/W for k individuals.

code
class Solution:
    def mincostToHireWorkers(self, quality: List[int], wage: List[int], K: int) - >float:
        effs = [(q / w, q) for q, w in zip(quality, wage)]
        effs.sort(key=lambda a: -a[0])
        ans = float('inf')
        h = []
        total = 0
        for rate, q in effs:
            heapq.heappush(h, -q)
            total += q
            if len(h) > K:
                total += heapq.heappop(h)
            if len(h) == K:
                ans = min(ans, total / rate)
        return ans
Copy the code

(code 1.3.3)

Tip 2 – multiple merge

This technique was actually mentioned in the previous lecture on super ugly numbers, but there is no name for this type of problem.

This technique, called multi-pointer optimization, would have been more appropriate, but it’s too simplistic and confusing with double-pointers, so I gave it the fancy name multi-merge.

  • Multiple routes: there are multiple alternative routes. In code, we can use multiple Pointers.
  • The result may be the longest or shortest of multiple alternative routes, the KTH, etc. Therefore, we need to compare the results of multiple routes and discard or select one or more routes according to the description of the topic.

This is an abstract description, but let’s do a couple of examples to help you understand.

Here I have prepared four hard questions for you. Master this routine can go to the happy AC these four questions.

1439. The KTH minimum array sum in ordered matrices

Topic describes
You are given an m by n matrix mat, and an integer k, each row of which is arranged in non-decreasing order. You can select 1 element from each row to form an array. Returns the KTH smallest array sum of all possible arrays. Example 1: input: mat = [[1,3,11],[2,4,6]], k = 5 output: 7 explanation: select an element from each row, the first k and the smallest array are: [1,2], [1,4], [3,4], [1,6]. The sum of the fifth one is 7. Example 2: input: mat = [,3,11 [1], [2 minus 2]], k = 9 output: 17 example 3: input: mat = [].enlighened,10,10 [1], [2,3,6]], k = 7 output: 9: Choose one from each row element, k and the smallest array respectively is: before,1,2 [1], [1,1,3],,4,2 [1], [1, 3],,1,6 [1], [1,5,2],,5,3 [1]. The sum of the seventh one is 9. Example 4: input: mat = [[1,1,10],[2,2,9]], k = 7 output: 12 hint: M == mat.length n == mat.length[I] 1 <= m, n <= 40 1 <= k <= min(200, n ^ m) 1 <= mat[I][j] <= 5000 MAT [I] is a non-decreasing arrayCopy the code
Train of thought

In fact, the problem is to give you m one-dimensional arrays of the same length, let’s pick a number from each of the m arrays, that is, to pick a total of m numbers, the sum of these m numbers is all the possibilities and the KTH smallest.

A naive idea is to use multiple Pointers. For this problem, m Pointers are used to point to m one-dimensional arrays, and the position of the Pointers indicates the number of digits in the array.

Take mat = [[1,3,11],[2,4,6]], k = 5 for example.

  • Initialize two Pointers, p1 and p2, to the beginning of two one-dimensional arrays.
  • The sum of the two Pointers is 1 + 2 = 3, so that’s the first smallest sum.
  • Next, we move one of the Pointers. So we can move P1, we can move P2.
  • So the second smallest must be the smaller value in the case of moving P1 and moving P2. And if you move p1 and p2 here you actually get 5, so the second and third smallest sums are both 5.

At this point it has bifurcated and two things happen (note the position in bold, which indicates the position of the pointer) :

  1. ,3,11 [1], [2 minus 2] and as 5
  2. ,3,11 [1], [2 minus 2] and as 5

Next, the two should go hand in hand.

For case 1, there are two more scenarios for the next move.

  1. ,3,11 [1], [2 minus 2] and 13
  2. ,3,11 [1], [2 minus 2] and for 7

For case 2, there are also two scenarios for the next move.

  1. ,3,11 [1], [2 minus 2] and for 7
  2. ,3,11 [1], [2 minus 2] and for 7

By comparing these four cases, we came to the conclusion that the fourth, fifth, and sixth smallest numbers are all 7. But the seventh smallest number doesn’t have to be 13. For a similar reason, perhaps the 7th smallest is hidden in the new situation after the previous 7 split, and indeed it is. So we need to continue with this logic.

Further, we can extend the above idea to the general case.

I mentioned above that what they’re looking for is the KTH smallest sum, and the smallest one we can easily figure out, which is the sum of all the first entries in a one-dimensional array. We found again, according to the minimum, we can deduce the 2 small, is to move one of the pointer is derived, which were split out of the n kinds of circumstances, where n is a one-dimensional array length, small in the division of n 2 kinds of cases, and screening the way is this kind of circumstance of n and the smallest, the latter is also similar. It’s not hard to see that the extremum changes after each split, so this is an obvious signal to evaluate the extremum dynamically, and using a heap is a good choice.

So how do you write the code?

As stated above, we first initialize m Pointers and assign a value of 0. Corresponding pseudocode:

Initialize the heap
h = []
# sum(vec[0] for vec in mat) is the first sum of m one-dimensional arrays
# [0] * m initializes an array of length m filled with zeros.
# We assembled the above two messages into yuanzu cur for easy use
cur = (sum(vec[0] for vec in mat), [0] * m)
# add it to the heap
heapq.heappush(h, cur)
Copy the code

Next, we move the pointer one at a time to create a new branch. Every time you pop the smallest one out of the heap, k pops is the KTH smallest one. Pseudo code:

for 1 to K:
    # acc current and Pointers are pointer cases.
    acc, pointers = heapq.heappop(h)
    Move one pointer in the pointer array roughly at a time. Forked for every pointer moved, the total number of possible moves is n, where n is the length of the one-dimensional array.
    for i, pointer in enumerate(pointers):
        # if pointer == len(mat[0]) -1
        ifpointer ! =len(mat[0]) - 1:
            = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
            new_pointers = pointers.copy()
            new_pointers[i] += 1
            # reheap the updated ACC and pointer array
            heapq.heappush(h, (acc + mat[i][pointer + 1] - mat[i][pointer], new_pointers))
Copy the code

This is the core code for the multiway merge problem, so keep it in mind.

The code looks a lot, but it’s only seven lines without the comments.

There is a problem with the pseudocode above. Let’s say we have two one-dimensional arrays with Pointers initialized to 0. Move the pointer of the first one-dimensional array for the first time, and move the pointer of the second array for the second time. In this case, the pointer array is [1, 1], that is, all Pointers point to the element with the subscript 1. If the pointer of the second one-dimensional array is moved the first time and the pointer of the first array is moved the second time, the array of Pointers is still [1, 1]. This is actually a case where if left unchecked it will be computed twice and cause an error.

One possible solution is to use a Hashset to record all pointer cases, thus avoiding the problem of having the same pointer evaluated more than once. To do this, we need to tweak our use of pointer arrays by using tuples instead of arrays. The reason is that arrays cannot be hashed directly. See the code section for details.

The topic of multiple merge, the idea and the code are relatively similar. Please make sure that this problem is solved for the sake of a better understanding of the following problems, which we will not go into in such detail.

code
class Solution:
    def kthSmallest(self, mat, k: int) - >int:
        h = []
        cur = (sum(vec[0] for vec in mat), tuple([0] * len(mat)))
        heapq.heappush(h, cur)
        seen = set(cur)

        for _ in range(k):
            acc, pointers = heapq.heappop(h)
            for i, pointer in enumerate(pointers):
                ifpointer ! =len(mat[0]) - 1:
                    t = list(pointers)
                    t[i] = pointer + 1
                    tt = tuple(t)
                    if tt not in seen:
                        seen.add(tt)
                        heapq.heappush(h, (acc + mat[i][pointer + 1] - mat[i][pointer], tt))
        return acc
Copy the code

1.3.4 (code)

Find the KTH smallest pair of distances

Topic describes
Given an array of integers, returns the KTH minimum distance between all pairs. The distance between A pair of (A, B) is defined as the absolute difference between A and B. Example 1: input: nums = [1,3,1] k = 1 output: 0 explanation: all number pairs are as follows: (1,3) -> 2 (1,1) -> 0 (3,1) -> 2 so the first least distant number pair is (1,1) and the distance between them is 0. 2 <= len(nums) <= 10000. 0 <= nums[I] < 1000000. 1 <= k <= len(nums) * (nums) -1) / 2Copy the code
Train of thought

It is not difficult to see that all pairs may be Cn2C_n^2Cn2, that is, N ×(n−1)÷2n\times(n-1)\div2n×(n−1)÷2.

So we can loop twice to find all the pairs, sort them in ascending order, and then take the KTH pair.

In fact, we can use the fixed heap technique to maintain a large top heap of size K, such that the top element of the heap is the KTH smallest, as described in the previous fixed heap.

class Solution:
    def smallestDistancePair(self, nums: List[int], k: int) - >int:
        h = []
        for i in range(len(nums)):
            for j in range(i + 1.len(nums)):
                a, b = nums[i], nums[j]
                Keep the heap size up to K
                if len(h) == k and -abs(a - b) > h[0]:
                    heapq.heappop(h)
                if len(h) < k:
                    heapq.heappush(h, -abs(a - b))

        return -h[0]
Copy the code

1.3.5 (code)

However, this optimization doesn’t make much sense because the bottleneck of the algorithm is the enumeration of the N2N^2N2 part, which we should try to optimize.

If we sort pairs, the smallest distance must be nums[I] -nums [i-1], where I is an integer from 1 to n, depending on which is smaller. Then you can use the idea of multiple merge above to solve the problem.

If the difference between nums[I] -nums [i-1] is the smallest, then the second smallest must be the remaining n-1 case and the new case of nums[I] -NUMs [i-1] splitting. In terms of splitting, just like above, we just move the pointer of I to I + 1. Here the pointer array is fixed at length 2, not m as in the above problem. I’ve named the two Pointers FR and TO, for from and to.

code
class Solution(object) :
    def smallestDistancePair(self, nums, k) :
        nums.sort()
        # n possible answers
        h = [(nums[i+1] - nums[i], i, i+1) for i in range(len(nums) - 1)]
        heapq.heapify(h)

        for _ in range(k):
            diff, fr, to = heapq.heappop(h)
            if to + 1 < len(nums):
                heapq.heappush((nums[to + 1] - nums[fr], fr, to + 1))

        return diff
Copy the code

Code (1.3.6)

Since the time complexity is related to k, which can be on the order of N2N^2N2 at most, this method actually times out as well. But it turns out that the idea is right, and if you change the problem a little bit, it might work.

This problem can be solved by dichotomy, and since it deviates from the heap topic, it is briefly covered here.

The easiest thing to think about for the KTH smallest number is the heap sum dichotomy. The reason for this dichotomy is that finding the KTH small is essentially finding the number that is no bigger than itself and has k minus 1. This problem is often monotonic, so dichotomy can be used to solve it.

In this case, the maximum number of pairs is the maximum and minimum of the array, which is called max_diff. We can ask:

  • How many pairs are less than max_diff?
  • How many pairs of numbers are less than max_diff-1?
  • How many pairs of numbers are less than max_diff-2?
  • How many pairs of numbers are less than max_diff-3?
  • How many pairs of numbers are less than max_diff-4?
  • .

And we know that the answers to the question are not strictly decreasing, so the use of dichotomy should be thought of. We keep asking until we get k minus 1 less than x. But there are problems with such a question. There are two reasons:

  1. There might be more than one k minus 1 less than x
  2. We don’t know for sure that k-1 numbers less than x exist. If the pair difference is [1,1,1,1,2] and you take the third largest number, then there is no number with two numbers less than x.

We can adjust our thinking to find k numbers less than or equal to x, and then we can use the leftmost template of dichotomy. For left-most templates, see my binary search topic

Code:

class Solution:
    def smallestDistancePair(self, A: List[int], K: int) - >int:
        A.sort()
        l, r = 0, A[-1] - A[0]

        def count_ngt(mid) :
            slow = 0
            ans = 0
            for fast in range(len(A)):
                while A[fast] - A[slow] > mid:
                    slow += 1
                ans += fast - slow
            return ans

        while l <= r:
            mid = (l + r) // 2
            if count_ngt(mid) >= K:
                r = mid - 1
            else:
                l = mid + 1
        return l
Copy the code

1.3.7 (code)

632. Minimum interval

Topic describes
You have a list of k integers that are not decremented. Find a minimum interval such that each of k lists contains at least one number. We define the interval [a,b] to be smaller than [c,d] if b-a < d-c or a < C when b-a == d-c. Example 1: input: nums = [[4,10,15,24,26], [0,9,12,20], [5,18,22,30]] output: [20,24] explanation: listing 1: [4,10,15,24,26], 24 is in the range [20,24]. Listing 2: [0, 9, 12, 20], 20 is in the interval [20,24]. Listing 3: [5, 18, 22, 30], 22 in the interval [20,24]. Example 2: input: nums = [[1, 2, 3], [1, 2, 3], [1, 2, 3]] output: [1, 1] example 3: input: nums = [[10, 10], [11] 11] output: [10, 11] example 4: input: Nums = [[10], [11]] output: [10, 11] example 5: input: nums = [[1], [2], [3], [4], [5], [6], [7]] output: [1, 7] tip: Nums. length == K 1 <= K <= 3500 1 <= NUMs [I]. Length <= 50-105 <= NUMs [I][j] <= 105 NUMs [I] in non-descending orderCopy the code
Train of thought

In essence, this problem is to take A number from each of m one-dimensional arrays to form A new array A, so that the difference between the maximum value and minimum value (DIff) in the new array A is minimized.

This is a little bit similar to the problem above, but a little bit different. This is a matrix, and the top one is a one-dimensional array. But we can think of a two-dimensional matrix as a one-dimensional array, so we can use the same idea.

The minimum diff must occur between adjacent elements after sorting. In this case, we can’t sort a two-dimensional array directly, and even if we do, we can’t figure out how to sort it.

We can actually continue with the idea of the last two problems. In particular, the small top heap is used to obtain the minimum value in the heap, and then the maximum value in the heap is recorded by a variable, so that the DIFF is known. Every time the pointer is updated, a new DIFF is generated. This process is repeated and the global minimum diFF is maintained.

The premise of this algorithm is that k lists are arranged in ascending order. Here, the principle of array ascending order is the same as the above problem. After ordering, you can maintain a pointer for each list, and then use the above idea to solve the problem.

Nums = [[1,2,3],[1,2,3]

  • [1, 2, 3]
  • [1, 2, 3]
  • [1, 2, 3]

We first select the minimum value of all rows, i.e. [1,1,1], in which case the diff is 0, the global maximum value is 1, and the minimum value is also 1. Next, continue to look for a backup, to see if there is a better backup for us to choose.

The next spare tire may arise from case 1:

  • [1, 2, 3]
  • [1, 2, 3]
  • [1,2,3] moves the pointer to this row, moving it one unit from 0 to 1.

Or case 2:

  • [1, 2, 3]
  • [1,2,3] moves the pointer to this row, moving it one unit from 0 to 1.
  • [1, 2, 3]

.

These cases continue to split into more cases, and this is the same as the problem above.

code
class Solution:
    def smallestRange(self, martrix: List[List[int]]) - >List[int] :
        l, r = -10**9.10**9
        Add the smallest row to the heap and record the row number and column number of each row
        h = [(row[0], i, 0) for i, row in enumerate(martrix)]
        heapq.heapify(h)
        # Maximum maintenance
        max_v = max(row[0] for row in martrix)

        while True:
            min_v, row, col = heapq.heappop(h)
            # max_v-min_v is the current maximum and minimum difference, and r-l is the global maximum and minimum difference. Because if the current is smaller, we update the global result
            if max_v - min_v < r - l:
                l, r = min_v, max_v
            if col == len(martrix[row]) - 1: return [l, r]
            # update the pointer, move it one bit further back
            heapq.heappush(h, (martrix[row][col + 1], row, col + 1))
            max_v = max(max_v, martrix[row][col + 1])
Copy the code

(code 1.3.8)

1675. Minimum offset of array

Topic describes
You are given an array of n positive integers, nums. You can perform two types of operations any number of times on any element of the array: For example, if the array is [1,2,3,4], then you can do this for the last element to make it [1,2,3,2]. If the elements are odd, multiply by 2. For example, if the array is [1,2,3,4], Then you can do this to the first element, making it [2,2,3,4]. The offset of the array is the maximum difference between any two elements in the array. Returns the minimum offset that an array can have after performing some operation. Example 1: input: nums = [1,2,3,4] output: 1 explanation: you can convert the array to [1,2,3,2] and then to [2,2,3,2] with an offset of 3-2 = 1 example 2: input: nums = [4,1,5,20,3] output: Example 3: Input: nums = [2,10,8] Output: 3 Hint:  n == nums.length 2 <= n <= 105 1 <= nums[i] <= 109Copy the code
Train of thought

They say you can do any number of operations on each item in the array, but there’s a finite number of operations.

  • We can only double an odd number once, because then it becomes even.
  • We can divide an even number of times until we get to an odd number, which is a finite number of times.

Take [1,2,3,4] in the question. We can:

  • Change 1 to 2.
  • Change 2 to 1.
  • Change 3 to 6.
  • Change the 4 to a 2 or a 1.

It looks like this:

Isn’t this the same as taking a number from each row of a two-dimensional array like [[1,2], [1,2], [3,6], [1,2,4] and minimizing the difference? Isn’t this exactly the same as the question above?

I’m going to wrap the above problem solution directly into an API call, depending on the code.

code
class Solution:
    def smallestRange(self, martrix: List[List[int]]) - >List[int] :
        l, r = -10**9.10**9
        Add the smallest row to the heap and record the row number and column number of each row
        h = [(row[0], i, 0) for i, row in enumerate(martrix)]
        heapq.heapify(h)
        # Maximum maintenance
        max_v = max(row[0] for row in martrix)

        while True:
            min_v, row, col = heapq.heappop(h)
            # max_v-min_v is the current maximum and minimum difference, and r-l is the global maximum and minimum difference. Because if the current is smaller, we update the global result
            if max_v - min_v < r - l:
                l, r = min_v, max_v
            if col == len(martrix[row]) - 1: return [l, r]
            # update the pointer, move it one bit further back
            heapq.heappush(h, (martrix[row][col + 1], row, col + 1))
            max_v = max(max_v, martrix[row][col + 1])
    def minimumDeviation(self, nums: List[int]) - >int:
        matrix = [[] for _ in range(len(nums))]
        for i, num in enumerate(nums):
            if num & 1= =1:
                matrix[i] += [num, num * 2]
            else:
                temp = []
                while num and num & 1= =0:
                    temp += [num]
                    num //= 2
                temp += [num]
                matrix[i] += temp[::-1]
        a, b = self.smallestRange(matrix)
        return b - a

Copy the code

1.3.9 (code)

Tip # 3 – Hindsight

The trick is that when you walk from left to right, you don’t know what’s on the right until you get to the right.

And if you want to know what’s on the right hand side, an easy way to do it is to do it twice, the first time you write down the data, and when you do it the second time, you use the data that you wrote down the last time. That’s the way we use it the most. Sometimes, though, we can go back after we’ve traversed the specified element, so we can store it as we go, just once. Specifically, it is to collect all the data from left to right, and when you need to use it, choose one from it. If we’re both going to Max or min and the extreme values change, we can use heap acceleration. Intuitively, it is the use of time machine to go back to the past, to achieve the purpose of hindsight.

You don’t know what that means. That’s okay. Let’s do it with a couple of examples. When you’re done with these examples, go back to this sentence.

871. Minimum refueling

Topic describes
From the starting point, the car drives to its destination, which is target miles east of the starting point. There are gas stations along the route, with each station[I] representing a gas station miles east of the starting position, station[I][0], and station[I][1] liters of gas. Suppose the car's fuel tank has an infinite capacity, with startFuel initially as a liter of fuel. It uses a litre of petrol for every mile. When the car reaches the gas station, it may stop to fill up, transferring all the gasoline from the station to the car. What is the minimum number of refuels necessary to get the car to its destination? If the destination cannot be reached, -1 is returned. Note: If the car arrives at a gas station with 0 fuel left, it can still fill up there. If the car arrives at its destination with zero fuel remaining, it is still considered to have reached its destination. Example 1: Inputs: target = 1, startFuel = 1, Stations = [] Output: 0 Explanation: We can get to our destination without refueling. Example 2: inputs: target = 100, startFuel = 1, stations = [[10,100]] output: -1 explanation: we were unable to reach our destination or even the first petrol station. Example 3: inputs: target = 100, startFuel = 10, stations = [[10,60],[20,30],[30,30],[60,40] output: 2 explanation: we started with 10 litres of fuel. We drove to a petrol station 10 miles from the start and consumed 10 litres of fuel. Increase petrol from 0 litres to 60 litres. Then, we drive from a gas station 10 miles away to a gas station 60 miles away (consuming 50 liters of fuel) and fill it up from 10 liters to 50 liters. Then we drove to our destination. We stop at one or two gas stations along the way, so we return to two. Tip:  1 <= target, startFuel, stations[i][1] <= 10^9 0 <= stations.length <= 500 0 < stations[0][0] < stations[1][0] < ... < stations[stations.length-1][0] < targetCopy the code
Train of thought

In order to get the minimum number of refuels, we definitely want to be able to skip refueling. When do you have to fill up? The answer should be when you can’t get to your next destination if you don’t refuel.

The pseudocode description is:

cur = startFuel # Start with startFuel
last = 0 # Last position
for i, fuel in stations:
    cur -= i - last # The distance between two Statons is two stations, i.e. I-last
    if cur < 0:
        We must refuel in front, or we won't get here
        But which station should I fill up at?
        # Intuition tells us to be greedy and choose the station I with the most gas to fill. If the gas with I is still cur < 0, continue to fill the next larger station J until there is no more gas to fill or cur > 0
Copy the code

It says to choose the station I that can fill up with the most gas. If that doesn’t work, continue to choose the station with the second most gas. This dynamic minimization scenario is ideal for using heap.

To be specific:

  • At each station it passes, its fuel quantity is added to the heap.
  • Keep going as far as you can, as long as the oil is not less than zero.
  • If the amount of fuel is less than zero, take the largest amount of fuel from the heap and add it to the tank. If the amount of fuel is still less than zero, repeat the maximum amount of fuel in the heap.
  • If the oil level is greater than 0 after filling, continue driving and repeat the above procedure. Otherwise, -1 is returned, indicating that the destination cannot be reached.

So how does this algorithm reflect hindsight? You can put yourself in the problem and simulate it. Imagine yourself driving a car. Your goal is the minimum number of refuels. When you get to a stop, you don’t know if you have enough gas to last you to the next stop, and even if you don’t, it might be better to fill up at the last stop. So in reality you have no way of knowing whether I should go for gas or not at the current station, because there is so little information.

So what would I do? If I were driving, I would have to fill up the gas every time and never get to my destination, then I would definitely never get to my destination. But if this can get us to the destination, THEN I can say that if we refueled at that station, this station chose not to refuel and could get us to the destination with the minimum number of refueling. Why didn’t you say so? Isn’t it wise after the event?

This hindsight is reflected in the fact that we wait until we run out of gas to think about filling up at a previous station.

So what this hindsight essentially solves is that we can’t get an optimal solution based on the current information, so we have to go backwards with all the information. For this problem, we can first traverse one station, and then record the amount of fuel at each station into an array. Each time we “foresee” that we will not reach the next station, we can take the largest from the array… Based on this, we can consider using heap optimization to get the maximum value rather than using an array approach.

code
class Solution:
    def minRefuelStops(self, target: int, startFuel: int, stations: List[List[int]]) - >int:
        stations += [(target, 0)]
        cur = startFuel
        ans = 0

        h = []
        last = 0
        for i, fuel in stations:
            cur -= i - last
            while cur < 0 and h:
                cur -= heapq.heappop(h)
                ans += 1
            if cur < 0:
                return -1
            heappush(h, -fuel)

            last = i
        return ans
Copy the code

(code 1.3.10)

1488. Avoid flooding

Topic describes
Your country has an infinite number of lakes, all of which start out empty. When it rains in the NTH lake, if the NTH lake is empty, it will fill with water, otherwise the lake will flood. Your goal is to avoid flooding in any of the lakes. Give you an integer array rains, where: Rains [I] > 0 means that rains[I] lakes will rain on the i-th day. Rains [I] == 0 means that no lake will rain on day I, and you can pick a lake and drain the lake. If rains[I] > 0, then ans[I] == -1. If rains[I] == 0, ans[I] is the lake you choose to drain on day I. If there are more than one possible solution, return any of them. If you can't stop the flood, return an empty array. Note that if you choose to drain a lake full of water, it will become an empty lake. But if you choose to drain an empty lake, nothing will happen (see Example 4 for details). Example 1: Input: Rains = [1,2,3,4] Output: [-1,-1,-1,-1] Explanation: After the first day, lakes that are filled with water include [1] After the second day, lakes that are filled with water include [1,2] After the third day, lakes that are filled with water include [1,2,3] After the fourth day, lakes that are filled with water include [1,2,3,4] There is no day that you can drain any lake and no lake will flood. Example 2: Input: Rains = [1,2,0,0,2,1] Output: [-1,-1,2,1,-1] Explanation: After the first day, lakes filled with water include [1] And after the second day, lakes filled with water include [1,2] and after the third day, we drain lake 2. So the remaining lakes filled with water include [1] on the fourth day, we drained lake 1. So there are no lakes full of water for the time being. On the fifth day, lakes filled with water include [2]. On the sixth day, lakes filled with water include [1,2]. As you can see, there would be no flooding under this scheme. At the same time, [-1,-1,1,2,-1,-1] is another possible solution without flooding. Example 3: input: rains = [1,2,0,1,2] output: [] explanation: after the second day, lakes filled with water include [1,2]. We can drain a lake on the third day. But on the third day, it will rain again in lakes 1 and 2, so no matter which lake we drain on the third day, the other lake will flood. Example 4: input: rains = [69,0,0,0,69] output: [-1,69,1,1,-1] explanation: ,69 any form such as [- 1, x, y, 1], [69-1, x, y, 1] or [- 1, x, y, 69, 1] is feasible solution, including 1 < = x, y < = 10 ^ 9 example 5: input: rains =,20,20 [10] output: [] interpretation: Since the lake will rain for two days in a row, there is no way to stop the flood. 1 <= rains. Length <= 10^ 50 0 <= rains[I] <= 10^9Copy the code
Train of thought

If the above question seems far-fetched to describe with hindsight, the latter two are apt.

We can drain a lake when it is not raining. If there are multiple lakes full of rain, which lake should we drain? The obvious answer is to drain the lake that is about to be flooded recently. But in reality it’s impossible to know which lake will rain on any given day, even if there is a weather forecast, so it’s not 100% reliable.

But the code can. So we can go through the rain array and see which lake it rained on which day. With this information, we can be wise after the event.

“The weather is good today, I opened the Eye of heaven, tomorrow lake 2 will flood, we need to drain it today, otherwise it will flood.” .

As in the above problem, we can also simulate the daily changes without going through the rain array first. Instead, we can simulate directly, even if the current day is sunny, we do not drain any lakes. Then we record the sunny days in the simulation, and when the flood happens, we think about which sunny days to drain which lakes. So the hindsight is that we wait for the flood to come before we think about what we should have done on the previous day.

Algorithm:

  • Walk through rain to simulate daily changes
  • If rain is currently 0, that means it’s sunny, and we’re not draining any lakes. But we record the current day into the Sunny array.
  • If rain is greater than zero, it’s raining in one of the lakes, and we’re going to see if the lake that’s raining is flooding. In fact, it is to see whether there is water before the rain. This prompts us to use a data structure, lakes, to record the status of each lake. We can use 0 for no water and 1 for water. So when it rains on lake I and lakes[I] = 1 there will be flooding.
  • If the lake is currently flooded, go to sunny and find a sunny day to drain it so it doesn’t flood, then just keep the lakes[I] = 1.

I didn’t use the heap in this problem, and I did that on purpose. The reason I did this is to show you that hindsight is not a heap trick, it’s actually a common algorithmic idea, like going backwards. However, most of the time, we need to be wise after the event, dynamic maximum and minimum values, this time should consider using heap, this is actually back to the beginning of a central article, so we must use these skills flexibly, not mechanically.

The next problem is a pure case of hindsight + pile optimization.

code
class Solution:
    def avoidFlood(self, rains: List[int]) - >List[int] :
        ans = [1] * len(rains)
        lakes = collections.defaultdict(int)
        sunny = []

        for i, rain in enumerate(rains):
            if rain > 0:
                ans[i] = -1
                if lakes[rain - 1] = =1:
                    if 0= =len(sunny):
                        return []
                    ans[sunny.pop()] = rain
                lakes[rain - 1] = 1
            else:
                sunny.append(i)
        return ans
Copy the code

(code 1.3.11)

1642. Farthest building accessible

Topic describes
You are given an integer array heights, which represents the height of the building. There are also bricks and ladders. You start at building 0 and move towards the next building, using bricks or ladders. When moving from building I to building I +1 (subscript starts at 0) : If the height of the current building is greater than or equal to the height of the next building, no ladder or bricks are needed If the height of the current building is less than the height of the next building, you can use a ladder or (H [I +1] -h [I]) bricks if the given ladder and bricks are used in the best way, Returns the subscript of the farthest building you can reach (subscripts start at 0).Copy the code

Example 1: Input: heights = [4,2,7,6,9,14,12], bricks = 5, ladders = 1 Output: 4 - Don't use bricks or ladders to reach building 1, because 4 >= 2 - use 5 bricks to reach building 2. You must use bricks or ladders as 2 < 7 - do not use bricks or ladders to reach building 3 as 7 >= 6 - use the only ladder to reach building 4. You have to use bricks or ladders because 6 < 9 cannot go past building 4 because there are no more bricks or ladders. Example 2: input: heights = [4,12,2,7,3,18,20,3,19], bricks = 10, ladders = 2 output: 7 example 3: input: Heights = [14,3,19,3], bricks = 17, ladders = 0  1 <= heights.length <= 105 1 <= heights[i] <= 106 0 <= bricks <= 109 0 <= ladders <= heights.lengthCopy the code
Train of thought

We can see a ladder as an infinite block, but we can only use it once, and we want to use the ladder well. Again, in real life, we wouldn’t know when to use a ladder and when to use a brick.

It doesn’t matter. Let’s go back to hindsight and do it in one walk. Similar to the previous thought, that is, I have no brain to use the ladder, when the ladder is not enough, we will start to hindsight, if the brick in front of the good. So when is it okay to use a brick? Obviously, the difference in height when we used the ladder was smaller than the difference in height now.

Frankly, I used a ladder to climb a five-meter wall, and now I have a ten-meter wall, and I don’t have a ladder, so I can only use ten bricks. If you used 5 bricks before, now you can use a ladder and save 5 bricks.

This prompts us to save the height difference of the building that we cross with the ladder in front. When the ladder behind is used up, we “exchange” the ladder used in front into bricks to continue to use. In the example above, we can exchange 10 bricks and then use 5 bricks, which is equivalent to adding 5 bricks.

If the ladder has been used several times before, which one should we “exchange” first? Obviously is the priority exchange height difference, so the exchange of bricks just the most. The prompt selects the maximum height difference from the previously stored height difference each time and removes it after “exchange”. What data structure is appropriate for this dynamic extremum scenario? Pile, of course.

code
class Solution:
    def furthestBuilding(self, heights: List[int], bricks: int, ladders: int) - >int:
        h = []
        for i in range(1.len(heights)):
            diff = heights[i] - heights[i - 1]
            if diff <= 0:
                continue
            if bricks < diff and ladders > 0:
                ladders -= 1
                if h and -h[0] > diff:
                    bricks -= heapq.heappop(h)
                else:
                    continue
            bricks -= diff
            if bricks < 0:
                return i - 1
            heapq.heappush(h, -diff)
        return len(heights) - 1
Copy the code

(code 1.3.12)

The four applications

Next comes the final part of this article, “The Four Applications”, which aims to reinforce the previous knowledge with a few examples.

1. topK

Solving topK is a very important function of the heap. This was actually introduced earlier in the fixed heap section.

Here’s a direct quote:

“Well, the easiest way to find the KTH smallest number is to build a small top heap, put all the numbers in the heap first, and then get out of the heap one by one, k times. The last thing that goes out of the heap is the KTH smallest number. However, instead of going all the way to the heap first, we build the big top heap (note not the small top heap) and keep the heap size at K. If the heap size is greater than k after the new number is added to the heap, you need to compare the number on top of the heap with the new number and remove the larger number. This guarantees that the number in the heap is the smallest k of the total number, and that the largest of the smallest k is the KTH smallest. That’s why you choose to build a big top heap rather than a small top heap.”

In fact, in addition to the KTH smallest number, we can also collect all the middle numbers, and that will solve for the smallest k number. The only difference from the KTH smallest number above is that you need to collect all the numbers that popp comes out of.

It is important to note that sometimes the weight is not the size of the original array value, but also the distance, frequency, etc.

Related Topics:

  • Interview question 17.14. Minimum K number
  • 347. The first K high frequency elements
  • 973. K points nearest the origin

A lot of problems about the k in the force link are heap. In addition to heap, there will be some problems to find rules for the KTH problem, for this kind of problem can be solved by divide-and-conquer + recursion, the specific will not be developed here, interested can leave a message with me for discussion.

2. Shortest distance with weight

In fact, I mentioned this in the previous section, but only in passing. But is BFS really not implemented with priority queues? Of course not! For example, the shortest path problem of weighted graph, if you use queue to do BFS, you need priority queue, because there are weight differences between paths, which is the original design of priority queue. The BFS implementation using priority queues is typical of Dijkstra.”

DIJKSTRA algorithm mainly solves the shortest distance between any two points in a graph.

The basic idea of the algorithm is greedy, traversing all the neighbors each time and finding the smallest distance among them, which is essentially a breadth-first traversal. Here we use the data structure of the heap to find the point where cost is the smallest in logNlogNlogN time, where N is the size of the heap.

Code template:

def dijkstra(graph, start, end) :
    The data in the heap is the binary ancestor of (cost, I), which means "the distance from start to I is cost".
    heap = [(0, start)]
    visited = set(a)while heap:
        (cost, u) = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            return cost
        for v, c in graph[u]:
            if v in visited:
                continue
            next = cost + c
            heapq.heappush(heap, (next, v))
    return -1
Copy the code

1.4.1 (code)

You can see that code templates are basically similar to BFS. BFS can also be simulated if you set the heap key to steps yourself, which has already been covered and won’t be repeated here.

For example, a graph looks like this:

E -- 1 --> B -- 1 --> C -- 1 --> D -- 1 --> F
 \                                         /\
  \                                        ||
    -------- 2 ---------> G ------- 1 ------
Copy the code

We use the adjacency matrix to construct:

G = {
    "B": [["C".1]],
    "C": [["D".1]],
    "D": [["F".1]],
    "E": [["B".1], ["G".2]],
    "F": []."G": [["F".1]],
}

shortDistance = dijkstra(G, "E"."C")
print(shortDistance)  # E -- 3 --> F -- 3 --> C == 6
Copy the code

With this algorithm template, you can go to AC 743. Network latency.

Complete code:

class Solution:
    def dijkstra(self, graph, start, end) :
        heap = [(0, start)]
        visited = set(a)while heap:
            (cost, u) = heapq.heappop(heap)
            if u in visited:
                continue
            visited.add(u)
            if u == end:
                return cost
            for v, c in graph[u]:
                if v in visited:
                    continue
                next = cost + c
                heapq.heappush(heap, (next, v))
        return -1
    def networkDelayTime(self, times: List[List[int]], N: int, K: int) - >int:
        graph = collections.defaultdict(list)
        for fr, to, w in times:
            graph[fr - 1].append((to - 1, w))
        ans = -1
        for to in range(N):
            Call the encapsulated Dijkstra method
            dist = self.dijkstra(graph, K - 1, to)
            if dist == -1: return -1
            ans = max(ans, dist)
        return ans
Copy the code

1.4.2 (code)

Have you learned?

The algorithm above is not optimal, I just want to illustrate the idea of encapsulating Dijkstra as an API call. A better practice is to record all the distance information once, rather than double-counting each time. Time complexity is greatly reduced. This makes a lot of sense in calculating the distance from one point to all of the points on the graph. What adjustments will our algorithm make to achieve this goal?

Tip: You can do this by using a DIST hash table that records the shortest distance from the starting point to each point. If you want to come out, you can buckle 882 to verify oh ~

You only need to make a small adjustment, and since the adjustment is small, it’s better to just look at the code.

Code:

class Solution:
    def dijkstra(self, graph, start, end) :
        heap = [(0, start)]  # cost from start node,end node
        dist = {}
        while heap:
            (cost, u) = heapq.heappop(heap)
            if u in dist:
                continue
            dist[u] = cost
            for v, c in graph[u]:
                if v in dist:
                    continue
                next = cost + c
                heapq.heappush(heap, (next, v))
        return dist
    def networkDelayTime(self, times: List[List[int]], N: int, K: int) - >int:
        graph = collections.defaultdict(list)
        for fr, to, w in times:
            graph[fr - 1].append((to - 1, w))
        ans = -1
        dist = self.dijkstra(graph, K - 1, to)
        return -1 if len(dist) ! = Nelse max(dist.values())
Copy the code

1.4.3 (code)

You can see we’re just replacing visitd with DIST, nothing else. In addition, Dist is actually only the visited with key, which also plays the role of visitd.

If you need to calculate the shortest path from one node to all other nodes, you can use a dist (a Hashmap) to record the shortest path information from the starting point to all points instead of using Visited (a Hashset).

There are many similar problems, so LET me give you another one of the cheapest flights within the 787.K station. Title description:

There are n cities connected by m flights. Each flight starts at city U and arrives at v at price W. Now given all the cities and flights, as well as the departure city SRC and the destination DST, your task is to find the cheapest price from SRC to DST with a maximum transit through K station. If there is no such route, the output is -1. Example 1: Input: n = 3, edges = [[0,1,100],[1,2,100],[0,2,500]] SRC = 0, DST = 2, k = 1 Output: 200 Explanation: The city flight diagram is shown belowCopy the code

The cheapest price from City 0 to city 2 within transit station 1 is 200, as shown in red in the figure. Example 2: input: n = 3, edges = [[0,1,100],[1,2,100],[0,2,500]] SRC = 0, DST = 2, k = 0 output: 500 explanation: the city flight diagram is shown belowCopy the code

The cheapest price from City 0 to City 2 within transit station 0 is 500, as shown in blue. Tip: N range is [1, 100], City tag from 0 to n-1 number of flights range is [0, n * (n-1) / 2] Format for each flight (SRC, DST, price) Price range for each flight is [1, 10000] k range is [0, n-1] Flights are not repeated, And there is no self-loopCopy the code

This problem is not fundamentally different from the above, but I still use it as an API, depending on the code.

The only special thing about this problem is that if the transit times are greater than K, it is also considered unreachable. This is actually quite easy, we just need to use a tuple in the heap to carry one more step, which is the distance in the weightless BFS. If the pop steps are greater than K, it is considered illegal and we can skip and continue processing.

class Solution:
    Add K to add steps to the heap
    def dijkstra(self, graph, start, end, K) :
        heap = [(0, start, 0)]
        visited = set(a)while heap:
            (cost, u, steps) = heapq.heappop(heap)
            if u in visited:
                continue
            visited.add((u, steps))
            if steps > K: continue
            if u == end:
                return cost
            for v, c in graph[u]:
                if (v, steps) in visited:
                    continue
                next = cost + c
                heapq.heappush(heap, (next, v, steps + 1))
        return -1
    def findCheapestPrice(self, n: int, flights: List[List[int]], src: int, dst: int, K: int) - >int:
        graph = collections.defaultdict(list)
        for fr, to, price in flights:
            graph[fr].append((to, price))
         Call the encapsulated Dijkstra method
        return self.dijkstra(graph, src, dst, K + 1)
Copy the code

1.4.4 (code)

3. Factorization

Apply this to the above two, which I mentioned earlier in 313. Super Ugly Numbers.

An ugly number is a positive integer whose prime factors include only 2, 3, and 5. So an ugly number is essentially an integer that has been factored down to 2, 3, and 5, with no other factors.

There are many problems about ugly numbers, and most of them can also be solved from the heap point of view. But sometimes the number of factors is limited and it’s easy to solve without using a heap. For example: 264. Ugly number II can be recorded using three Pointers, this technique has been described in the previous, not repeated.

Some questions are not ugly numbers, but explicitly mention information about similar factors and ask you to find xx of the KTH largest, in which case the heap is preferred. Dichotomies can also be considered if the problem contains some other information, such as order. Which method to use will be a case-by-case matter, but you need to be familiar with both methods before doing so.

4. The heap sort

The three applications mentioned above were all more or less mentioned above. Heap sorting was not mentioned earlier.

There are few problems that directly examine heap sorting. But the interview might, and learning heap sort is important for understanding divide-and-conquer and other important algorithmic thinking. Personal feeling, heap sorting, construction of binary tree, construction of line segment tree and other algorithms are very similar, master one, the other can be understood by analogy.

In fact, after learning from the previous heap, we can encapsulate a heap sort in a very simple way.

Here I put a simple example code that implements heap sort using the heap API:

h = [9.5.2.7]
heapq.heapify(h)
ans = []

while h:
    ans.append(heapq.heappop(h))
print(ans) # 2,5,7,9
Copy the code

Once you understand the example, it’s not hard to encapsulate generic heap sort.

def heap_sort(h) :
    heapq.heapify(h)
    ans = []
    while h:
        ans.append(heapq.heappop(h))
    return ans

Copy the code

This method is simple enough, if you understand the principle of the previous heap, it is not difficult to hand out a heap sort. There is a drawback to this method, however, because it is not an in situ algorithm, which means that you have to use extra space to carry on the results, which is O(N)O(N)O(N). However, after the heapsort method is called, the original array memory can be freed, so theoretically there is no waste of space, but we calculate the space complexity at the time of the most memory, so the in-place algorithm is definitely better. If you really don’t like the implementation, you can change it in situ. This is not too difficult, just a slight modification of the previous heap implementation, which is not covered here due to space constraints.

conclusion

The heap and queue are inextricably linked. A lot of problems I think about first and use the heap to do it. Then you find that every time you get to the heap, it’s + 1 instead of jumping to update, like the next one is + 2, +3, etc., so it’s better to use queues to do it. For example, 649. Dota2 senate and 1654. The minimum number of jumps home.

The center of the heap is one, and that is dynamic minima.

And the extremum is nothing more than a maximum or a minimum, and that’s not hard to see. If we want to maximize, we can use the big top heap, and if we want to minimize, we can use the minimal heap. In fact, without the word dynamic, the heap is not necessary in many cases. For example, you can just go through it once and find the largest one. The dynamic point is not easy to see, and that’s the difficulty of the problem. This requires you to analyze the problem first, the analysis of the problem is actually dynamic optimization, then the heap optimization should be thought of.

There are many implementations of the heap. For example, skip table based on linked list, binary heap based on array and red-black tree based implementation. Here we introduce two main implementations and detail the implementation of binary heap, not only its implementation is simple, and its performance in many cases is good, recommend you focus on mastering binary heap implementation.

For binary heap implementation, the core point is always to maintain the same properties of the heap, what are the properties? That is, the weight of the parent node is not greater than the weight of the child (small top heap). To do this, we need to float up and sink, and swap elements properly, both in and out of the heap. In particular, the floating process swaps with its larger parent, and the sinking process swaps with the smaller of the two children, provided that it has children that are smaller than it.

We didn’t do a detailed analysis of heapification. However, if you understand the heap-loading in this article, it’s actually quite easy. So heapification itself is a process of continuous heap entry, but it turns discrete operations in time into one-off operations.

In addition, I have introduced three problem-solving skills for you, which are as follows:

  • Fixed reactor can not only solve the k problem, but also effectively use the calculated results to avoid double calculation.
  • Multiple merge is essentially a violent solution, no different from violent recursion. If you turn it into a recursion, it’s a recursion that can’t be memorized. So more like a backtracking algorithm.
  • After the event. Some information that we don’t have access to at the moment can be stored in a data structure for later retrieval. This kind of data solution can be many, the common ones are hash table and heap. You can also think of this technique as an afterthought. Some people are more comfortable with it, but whatever it is, that’s what it means.

Finally, I will introduce you to four applications, all of which have been more or less covered in the previous section except heap sort. They are:

  • topK
  • Weighted shortest path
  • factorization
  • Heap sort

These four applications actually dynamically take extreme values around a center of the heap, which is just a flexible use of this feature. So when you do the problem, you just memorize the dynamic and take the extreme. If you can figure out that this problem has something to do with dynamic maxima, make sure to consider the heap. Now we’re going to have a little bit of complexity in mind, and we’re going to look at the range of data in the question and we’re going to be able to sort of estimate whether it’s feasible or not.

If you have any opinion on this, please leave a message to me. I will check the answer one by one when I have time. More algorithms can be found in my LeetCode repository: github.com/azl39798585… . Currently, it has 39K star. You can also pay attention to my public account “Li Kou Jia Jia” take you to bite the hard bone of the algorithm.