Greedy algorithm

Theoretical basis

The greedy algorithm, also known as the greedy method, is a common approach to finding an optimal solution to a problem, and it is also an algorithmic idea that Internet companies often examine in interviews. The greedy algorithm has many classic applications, such as Huffman coding, Prim's algorithm, Kruskal's algorithm, and Dijkstra's algorithm.

Basic idea of greedy algorithm

A greedy algorithm is one that, when solving a problem, always makes the choice that looks best at the moment. Simply put, it chooses the local optimum at each stage in the hope of reaching the global optimum.

When to use a greedy algorithm

The greedy algorithm has no fixed framework, which makes it hard to judge, when working on a problem, whether it can be solved greedily; the only remedy is more practice.

To determine whether a problem can be solved by a greedy algorithm, we must prove that the greedy choice made at each step leads to the global optimal solution. There are two common proof techniques: mathematical induction and proof by contradiction. If you are interested in the details, you can consult the relevant materials; I will not repeat them here.

In practice, when we encounter a problem that asks for an optimal solution, if we feel that the global optimum can be derived from local optima and we cannot think of a counterexample that overturns this intuition, we can try the greedy method and run the test cases. If it fails, we switch to dynamic programming or another approach.

To put it simply, problems solvable by the greedy method generally have the following two properties.

  • Greedy-choice property

The global optimal solution of the problem can be reached through a series of locally optimal choices, and each choice may depend on previous choices but not on future choices (this is also known as having no aftereffect).

  • Optimal substructure property

When the optimal solution of a problem contains the optimal solutions of its subproblems, the problem is said to have the optimal substructure property. This property is the key to solving a problem with the greedy method.

Greedy algorithm problem solving steps

The theory is rather abstract, so let's take the classic knapsack problem as an example to get a feel for how a greedy algorithm is applied.

Let's say we have a backpack that can hold 10 kilograms, and items can be packed in arbitrary fractional amounts. We have the following 5 items, each with a different weight and total value. Which items should we carry in the backpack, and how much of each, to maximize the total value of the items carried?

Perhaps the solution occurs to you at once: first compute the unit price of each item, then pack items from the highest unit price to the lowest. In this question the unit prices, in descending order, are durian, apple, orange, pear, banana. Therefore, we carry 5 kg of durian, 3 kg of apples, and 2 kg of oranges in the backpack.

This solution is essentially the greedy method. Let's use this example to see how a greedy algorithm works.

  1. When we encounter the following kind of problem, we should first think of the greedy algorithm

    Given a set of data, pick some of it so that the expected value is maximized under the given constraint. In this example, the constraint is the 10 kg the backpack can hold, and the expected value is the total value of the items.

  2. Check whether the problem can be solved by the greedy method

    Each time, make the choice that, for the same contribution to the constraint, contributes the most to the expected value. In this example, we pick from the remaining items the one with the highest unit price.

  3. Work through a few examples to check whether the greedy strategy produces the optimal result.

    Rigorously proving the correctness of a greedy algorithm is complicated and requires a fair amount of mathematical reasoning. Here we just work through a couple of examples as we go, and then run test cases to verify the result. A code sketch of this strategy is shown right after these steps.
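The following is a minimal sketch of the fractional-knapsack greedy strategy described above. The item weights and values are hypothetical placeholders (the original item table is not reproduced here); only the descending unit-price order durian, apple, orange, pear, banana is taken from the text.

def fractional_knapsack(items, capacity):
    # greedy: sort items by unit price (value per kilogram), highest first
    items = sorted(items, key=lambda it: it[2] / it[1], reverse=True)
    plan, total_value = [], 0.0
    for name, weight, value in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)           # take as much of this item as fits
        plan.append((name, take))
        total_value += take * (value / weight)
        capacity -= take
    return plan, total_value

# hypothetical data: (name, weight in kg, total value of the whole item)
items = [("durian", 5, 50), ("apple", 3, 24), ("orange", 2, 12),
         ("pear", 4, 16), ("banana", 6, 18)]
print(fractional_knapsack(items, 10))

With these made-up numbers the greedy plan is 5 kg of durian, 3 kg of apples and 2 kg of oranges, matching the answer given above.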

Jump Game

55. Jump Game

Problem description

Given an array of non-negative integers nums, you start at the first index of the array. Each element in the array represents the maximum jump length from that position. Determine whether you can reach the last index.

Example:

Input: nums = [2, 3, 1, 1, 4]

Output: true

Explanation: Jump 1 step from index 0 to index 1, then jump 3 steps from index 1 to the last index.

Problem analysis

Asking whether we can reach the last index is, in other words, asking whether the farthest position we can jump to is greater than or equal to the last index of the array (if we can reach a position, we can also reach every position before it).

We can solve this with the greedy method. Remember the steps of our greedy algorithm? Let's apply the template directly.

  1. When we encounter the following kind of problem, we should first think of the greedy algorithm

    Given a set of data, pick some of it so that the expected value is maximized under the given constraint. In this example there is essentially no constraint, and the expected value is the farthest index we can reach.

  2. Check whether the problem can be solved by the greedy method

    Each time, make the choice that, for the same contribution to the constraint, contributes the most to the expected value. In this example, at each step we extend our reach as far as possible (greedy, locally optimal).

  3. Work through a few examples to check whether the greedy strategy produces the optimal result.

    We cannot think of a counterexample here, so let's try to solve it greedily.

Let’s look directly at the code implementation.

class Solution:
    def canJump(self, nums):
        n = len(nums)
        max_num = 0  # farthest index reachable so far
        for i in range(n):
            if i <= max_num:                        # index i is reachable
                max_num = max(max_num, i + nums[i])
                if max_num >= n - 1:                # the last index is reachable
                    return True
        return False
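As a quick sanity check (a usage sketch that assumes the Solution class above; the second input is an extra unreachable case, not from the problem statement):

print(Solution().canJump([2, 3, 1, 1, 4]))   # expected: True
print(Solution().canJump([3, 2, 1, 0, 4]))   # expected: False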

The time complexity of the algorithm is O(n) and the space complexity is O(1).

Jump Game II

45. Jump Game II

Problem description

Given an array of non-negative integers nums, you start at the first position of the array. Each element in the array represents the maximum jump length from that position. Your goal is to reach the last position of the array using the minimum number of jumps. Assume that you can always reach the last position.

Example:

Input: nums = [2, 3, 1, 1, 4]

Output: 2

Explanation: The minimum number of jumps to reach the last position is 2: jump one step from index 0 to index 1, then three steps to the last position of the array.

Problem analysis

This problem is an upgraded version of the previous one, and we can also solve it with a greedy algorithm: at each step we look for the farthest position we can reach (locally optimal).

For example, for the array nums = [2, 3, 1, 1, 4], the initial position is index 0. Starting from index 0, the farthest we can reach is index 2, so the positions we can jump to are indices 1 and 2. Because the value at index 1 is 3, we can reach farther from index 1, so the first jump goes to index 1.

In the concrete implementation, we maintain the farthest index that can currently be reached, called the boundary. We iterate through the array from left to right, and whenever we reach the boundary, we update the boundary to the new farthest reachable index and increase the jump count by one.

One small detail: while iterating through the array, we do not visit the last element. This is because if the last position is reachable at all, the boundary must reach or pass it before we arrive there; and if the boundary were exactly the last position, visiting the last element would add an unnecessary extra jump. So we stop the loop before the last element.

Let’s look at the implementation of the code.

class Solution:
    def jump(self, nums):
        n = len(nums)
        end = 0      # boundary of the current jump
        max_num = 0  # farthest index reachable so far
        step = 0
        for i in range(n - 1):   # the last element is not visited (see above)
            if max_num >= i:
                max_num = max(max_num, i + nums[i])
                if i == end:     # reached the boundary: jump once and extend it
                    end = max_num
                    step += 1
        return step
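A quick check on the sample input (a usage sketch that assumes the Solution class above):

print(Solution().jump([2, 3, 1, 1, 4]))   # expected: 2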

The time complexity of the algorithm is O(n) and the space complexity is O(1).

Gas Station

Problem description

There are N gas stations on a circular route, and the ith gas station has gas[i] liters of gas.

You have a car with an unlimited tank, and it costs cost[i] liters of gas to drive from the ith gas station to the (i+1)th gas station. You start at one of the gas stations with an empty tank. If you can travel around the circuit once, return the index of the starting gas station; otherwise return -1.

Notes:

  • If the problem has a solution, the answer is guaranteed to be unique.
  • Both input arrays are non-empty and have the same length.
  • The elements of the input arrays are non-negative.

Example:

Input: gas = [1,2,3,4,5], cost = [3,4,5,1,2]

Output: 3

Explanation: Start at station 3 (index 3) and fill up 4 liters of gas, so the tank has 0 + 4 = 4 liters. Drive to station 4: the tank has 4 - 1 + 5 = 8 liters. Drive to station 0: the tank has 8 - 2 + 1 = 7 liters. Drive to station 1: the tank has 7 - 3 + 2 = 6 liters. Drive to station 2: the tank has 6 - 4 + 3 = 5 liters. Driving back to station 3 costs 5 liters, which is exactly enough. Therefore, 3 is a valid starting index.

Problem analysis

We can model the problem as a ring of nodes: each node represents a gas station (the node's value is the amount of gas that can be added there), and the value of each edge is the amount of gas consumed to drive from one station to the next.

The most straightforward approach is brute force: try every station as the starting point and check whether we can drive all the way around and return to it. This uses two nested for loops, so its time complexity is O(N^2).

class Solution:
    def canCompleteCircuit(self, gas, cost):
        n = len(gas)
        for start in range(n):            # try every station as the starting point
            j = start
            remain = gas[start]           # fuel in the tank after filling up at start
            # keep driving while the remaining fuel can cover the next leg
            while remain - cost[j] >= 0:
                remain = remain - cost[j] + gas[(j + 1) % n]
                j = (j + 1) % n
                if j == start:            # back at the starting station: full loop done
                    return start
        return -1

Let's look at how to optimize this.

Optimization

When the car arrives at station i, it can fill up gas[i] liters; when it leaves station i for station i+1, it consumes cost[i] liters. We can therefore treat each station together with the road leaving it as a single unit: gas[i] - cost[i] is the net change in fuel after passing station i.

This gives us a circular array whose ith element is gas[i] - cost[i].

The problem then becomes: can we find a starting point start in this circular array such that, going around from start, the running sum is always greater than or equal to 0?

While computing the running sum, if it drops below 0, it means that the current station cannot serve as the starting point of a full loop, so we need to try another station as the starting point.

But which station should we switch to as the new starting point? This is the key to optimizing the solution, and it is the key conclusion obtained by greedy reasoning.

The conclusion is: if we choose station i as the starting point and "just" fail to reach station j, then no station k between station i and station j can be the starting point either.

Now let’s try to prove that this is true.

Suppose sum records the amount of fuel currently in the tank. If we start from station i (with sum = 0) and sum first becomes negative when we try to reach station j, then sum >= 0 held at every station k between station i and station j.

If we instead take k as the starting point, that is equivalent to having sum = 0 at station k (the tank is empty at departure), which is no more fuel than we had when passing k on the way from i. So by the time we reach station j, sum must again be < 0, which shows that k cannot be the starting point.

This proves the conclusion.

So if we find that we cannot get from station i to station j, then station i and all the stations between i and j can be skipped as candidate starting points, which removes a lot of redundant computation; the next candidate is the station right after j.

The specific code is as follows:

class Solution:
    def canCompleteCircuit(self, gas, cost):
        n = len(gas)
        start = 0
        while start < n:
            total = 0            # running sum of gas[j] - cost[j] from `start`
            step = 0
            while step < n:
                j = (start + step) % n
                total += gas[j] - cost[j]
                if total < 0:    # cannot reach station j + 1 from `start`
                    break
                step = step + 1
            if step == n:        # completed a full loop
                return start
            else:                # skip start .. start + step as starting points
                start = start + step + 1
        return -1
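A quick check on the sample input (a usage sketch that assumes the Solution class above):

print(Solution().canCompleteCircuit([1, 2, 3, 4, 5], [3, 4, 5, 1, 2]))   # expected: 3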

Distribute Candies

575. Distribute Candies

Problem description

Given an integer array of even length, where different numbers represent different kinds of candy and each number represents one candy, divide the candies equally between a brother and a sister and return the maximum number of kinds of candy the sister can get.

Example:

Input: candies = [2,2,3,3,4,4]

Output: 3

Explanation: There are three different numbers in the array, 2, 3 and 4, representing three different kinds of candy, two of each kind. The optimal split is for the sister to get [2,3,4] and the brother to get [2,3,4]; this gives the sister the largest number of kinds of candy.

Problem analysis

Since the candies are divided equally between the brother and the sister, the sister gets n/2 candies, where n is the length of the array. In the best case these n/2 candies are all of different kinds, so the sister can get at most n/2 kinds. On the other hand, if the number of kinds of candy in the array is less than n/2, we can give the sister one candy of each kind, so the number of kinds she gets equals the number of kinds in the array.

The problem therefore reduces to counting the number of kinds of candy in the array, say count; the maximum number of kinds the sister can get is min(n/2, count). To count the kinds, the most intuitive approach is to iterate over the array and put the elements into a set; the set's de-duplication gives us the number of kinds directly.

def distributeCandies(candies):
    candies_set = set()            # deduplicate to count the kinds of candy
    for candy in candies:
        candies_set.add(candy)
    return min(len(candies) // 2, len(candies_set))

candies = [1, 1, 2, 2, 3, 3]
print(distributeCandies(candies))  # expected: 3

Distribute Candies II

1103. Distribute Candies to People

Problem description

We have some number of candies to distribute to n children standing in a row, using the following strategy: give 1 candy to the first child, 2 to the second, and so on, until the last child gets n candies. Then we return to the front of the row and give n + 1 candies to the first child, n + 2 to the second, and so on, until the last child gets 2 * n candies. We repeat this process (giving one more candy each time, and returning to the front of the row when we reach the end) until we run out of candies. Note that if the remaining candies are not enough (fewer than the current child would otherwise receive), all of the remaining candies go to the current child. Return an array of length n describing the final distribution of candies (that is, ans[i] is the number of candies the ith child receives).

Example:

Input: candies = 9, n = 4

Output: [1,2,3,3]

Explanation: In the first round, ans[0] += 1 and the array becomes [1,0,0,0]. In the second round, ans[1] += 2 and the array becomes [1,2,0,0]. In the third round, ans[2] += 3 and the array becomes [1,2,3,0]. In the fourth round, ans[3] += 3 (because only three candies are left) and the array becomes [1,2,3,3].

Problem analysis

The most intuitive approach is simply to simulate the process: keep handing out candies to the n children, round after round, until all the candies are distributed. Let's look at the code.

def distributeCandies(candies, n):
    ans = [0] * n
    i = 0
    while candies > 0:
        index = i % n                          # position of the current child
        ans[index] = ans[index] + min(i + 1, candies)
        candies = candies - min(i + 1, candies)
        i += 1
    return ans

candies = 9
n = 4
print(distributeCandies(candies, n))           # expected: [1, 2, 3, 3]

Distribute Candies III

135. Candy

Problem description

A group of children is playing a game, and we hand out candies according to their game scores, with the following requirements:

  1. Every child gets at least one candy, regardless of score.

  2. Between any two adjacent children, the one with the higher score must get more candies. (There is no such restriction if their scores are equal.)

Given an array ratings representing the scores, what is the minimum number of candies we need to hand out according to these rules?

Problem analysis

First, we can split the second rule into two one-sided rules. For any pair of adjacent children, we consider the left neighbor and the right neighbor separately:

  1. Left rule: when ratings[i] > ratings[i - 1], the ith child must get more candies than the (i - 1)th child.
  2. Right rule: when ratings[i] > ratings[i + 1], the ith child must get more candies than the (i + 1)th child.

We traverse the array twice to compute, for each child, the minimum number of candies needed to satisfy the left rule and the right rule, call them left and right. The number of candies each child finally gets is the larger of the two, i.e. max(left, right). Let's look at the code implementation.

def candy_count(ratings):
    n = len(ratings)
    # left[i]: minimum candies for child i under the left rule
    left = [1] * n
    for i in range(1, n):
        if ratings[i] > ratings[i - 1]:
            left[i] = left[i - 1] + 1
    # right: minimum candies for the current child under the right rule,
    # computed on the fly while scanning from right to left
    right = 1
    result = max(left[n - 1], 1)
    for i in range(n - 2, -1, -1):
        if ratings[i] > ratings[i + 1]:
            right += 1
        else:
            right = 1
        result += max(left[i], right)
    return result

s = [1, 3, 5, 3, 2, 1]
print(candy_count(s))   # expected: 13

Let’s look at the complexity of the algorithm.

  1. Time complexity: O(n), where n is the number of children. We traverse the array twice to compute the minimum number of candies that satisfies the rules.
  2. Space complexity: O(n).

So what can we do to reduce space complexity?

Optimization

According to the problem, we want to hand out as few candies as possible, while each child gets at least one candy.

We traverse the students from left to right, and use pre to denote the number of candies given to the previous student.

  1. If the current student's score is higher than the previous student's, we are in an increasing run, so we give this student pre + 1 candies.
  2. If the current student's score equals the previous student's, giving this student one candy does not break any rule.
  3. If the current student's score is lower than the previous student's, we give this student one candy. To keep the distribution valid, we also have to give one more candy to every other student in the current decreasing run (this may not be obvious, so we illustrate it with the example below), so we use a variable dec to record the length of the decreasing run. One detail: when the decreasing run becomes as long as the preceding increasing run, the last student of that increasing run has to be folded into the decreasing run.

Let's look at an example. Suppose there are six children whose scores are 1, 3, 5, 3, 2, 1. The candies handed out are 1, 2, 4, 3, 2, 1, for a total of 13: the third child needs 4 rather than 3 candies because the decreasing run 3, 2, 1 that follows the peak grows to the same length as the increasing run 1, 3, 5 before it, so the peak child must be folded into the decreasing run and receive one extra candy.

Let’s take a look at how the code works.

def candy_count(ratings):
    n = len(ratings)
    ret = 1    # total candies handed out so far (the first child gets 1)
    inc = 1    # length of the latest increasing run
    dec = 0    # length of the current decreasing run
    pre = 1    # candies given to the previous child
    for i in range(1, n):
        if ratings[i] > ratings[i - 1]:
            dec = 0
            pre = pre + 1
            ret = ret + pre
            inc = pre
        elif ratings[i] == ratings[i - 1]:
            dec = 0
            pre = 1
            ret = ret + pre
            inc = 1
        else:
            dec += 1
            if dec == inc:
                # the decreasing run is as long as the previous increasing run:
                # fold the peak child into the decreasing run
                dec += 1
            ret += dec
            pre = 1
    return ret

s = [1, 3, 5, 3, 2, 1]
print(candy_count(s))   # expected: 13

Let’s look at the complexity of the algorithm.

  1. Time complexity: O(n).
  2. Space complexity: O(1).

Merge Intervals

56. Merge Intervals

Problem description

The array intervals is a collection of intervals, where a single interval is intervals[i] = [starti, endi]. Merge all overlapping intervals and return a non-overlapping array of intervals that exactly covers all the intervals in the input.

Example:

Input: intervals = [[1,3],[2,6],[8,10],[15,18]]

Output: [[1,6],[8,10],[15,18]]

Explanation: Intervals [1,3] and [2,6] overlap, so they are merged into [1,6].

Problem analysis

For any two intervals A and B, there are six possible relative positions: A entirely before B, A entirely after B, A overlapping the left part of B, A overlapping the right part of B, A containing B, and B containing A.

If we compare the two intervals and swap them when necessary so that the first interval's starting position is less than or equal to the second's, these six cases reduce to three: the two intervals are disjoint, they partially overlap, or the first contains the second.

Therefore, if we sort all intervals by their left endpoints, we can guarantee that for any two consecutive intervals, the first interval's starting position is less than or equal to the second's, so their relationship must be one of the three cases above.

Algorithm

For the three cases above, we can use the following algorithm.

First, we use an array merged to store the final answer. We add the first interval to merged, then consider each subsequent interval in turn:

  • If the left endpoint of the current interval is greater than the right endpoint of the last interval in merged (the disjoint case), the two do not overlap, and we append the current interval directly to the end of merged;

  • Otherwise (the overlapping and containing cases), we update the right endpoint of the last interval in merged, setting it to the larger of its current right endpoint and the right endpoint of the current interval.

Now that we can handle all three cases, let's look at the code implementation.

class Solution:
    def merge(self, intervals):
        # sort by the left endpoint of each interval
        intervals.sort(key=lambda x: x[0])
        merged = []
        for interval in intervals:
            # no overlap with the last merged interval: append it directly
            if not merged or merged[-1][1] < interval[0]:
                merged.append(interval)
            else:
                # overlap: extend the right endpoint of the last merged interval
                merged[-1][1] = max(merged[-1][1], interval[1])
        return merged
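A quick check on the sample input (a usage sketch that assumes the Solution class above):

print(Solution().merge([[1, 3], [2, 6], [8, 10], [15, 18]]))   # expected: [[1, 6], [8, 10], [15, 18]]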

The time complexity of the algorithm is O(n log n), where n is the number of intervals. Apart from sorting, we only need one linear scan, so the main cost is the O(n log n) of sorting.

The space complexity is O(log n), the extra space used by sorting.