This article originally appeared on my personal WeChat public account, TechFlow. Original writing is not easy; please consider following.


Today's piece is the 12th article in the Wednesday algorithms and data structures series: the zero-one knapsack problem in dynamic programming.

In previous articles we explored binary search, greedy, sorting, and search algorithms; today we'll look at another very classic technique: dynamic programming.

In ACM-ICPC competition, dynamic programming is a very large category with many variations, and many of them are extremely difficult. For example, dynamic programming over trees, graphs, and other data structures can make a problem very complicated. The good news is that if you're not competing, you don't need to know that much: generally speaking, a solid grasp of the classic "Nine Lectures on Knapsack" is enough for almost any interview. So this Wednesday's algorithm topic opens a new chapter, the knapsack series. Today I'll share the first of the nine knapsack lectures, which is also the simplest: the zero-one knapsack problem.

Knapsack and the zero-one knapsack

If you've never competed before, the title may confuse you: what does dynamic programming have to do with backpacks? And no, it's not because I'm an Eason Chan fan. It's just that the most classic dynamic programming problem was posed in terms of a knapsack, and it spawned all sorts of variations. Later teaching kept the name for convenience, so we follow the predecessors' naming here.

The problem itself is very simple. We have a knapsack with capacity V and n items, where the i-th item has volume v[i] and value w[i]. What is the maximum total value of items we can acquire within the capacity of the knapsack?

Since there is only one copy of each item, i.e. each item can only be taken or left behind, this problem is called the zero-one knapsack problem.

Greed and counterexamples

The first idea that comes to mind is greedy: prioritize high-value items, or items with the best value per unit of volume. But it's easy to find counterexamples to both.

For example, suppose the knapsack has a capacity of 10, and we have three items with volumes 6, 5, 5 and values 10, 8, 8. This counterexample defeats both greedy strategies. The most valuable item is worth 10 with volume 6; once we take it, there is no room left for any other item, yet taking the two items of volume 5 is clearly better. Likewise, the item of volume 6 also has the best value-to-volume ratio, so the greedy strategy of prioritizing cost performance fails as well.
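To make the counterexample concrete, here is a small sketch (the `greedy` helper and its parameter names are mine, not the article's) that runs both greedy strategies on this exact instance:

```python
def greedy(items, capacity, key):
    """Take items in descending order of `key` while they still fit."""
    total = 0
    for v, w in sorted(items, key=key, reverse=True):
        if v <= capacity:
            capacity -= v
            total += w
    return total

# Capacity 10; volumes 6, 5, 5; values 10, 8, 8.
items = [(6, 10), (5, 8), (5, 8)]
by_value = greedy(items, 10, key=lambda it: it[1])            # highest value first
by_density = greedy(items, 10, key=lambda it: it[1] / it[0])  # best value/volume first

print(by_value, by_density)  # both give 10, while taking the two 5s gives 16
```

Both strategies grab the volume-6 item first and then have no room left, so each returns 10 instead of the optimal 16.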

In fact, not only do these two greedy strategies fail; every greedy strategy you can think of fails. The problem looks simple, but it is not so easy to solve. Indeed, it puzzled computer scientists until the mid-20th century, when dynamic programming algorithms came along and solved it cleanly.

Dynamic programming

So, dynamic programming. The "programming" part is easy enough to understand, but what about "dynamic"? How do you program dynamically? Thinking about this question leads directly to the essence of the algorithm.

The essence of dynamic programming is the recording and transfer of states. Coming back to the question above: have you wondered why greedy algorithms fail here? It's simple: there is no way to know in advance what the best final situation looks like. Although we know the knapsack has capacity V, we don't know how much of it we use in the optimal case, i.e. what the optimal end state is. If we simply make greedy choices toward some amount of used space, the result may be the best achievable for the state the greedy process happens to end in, but that is only the optimum of that one state, not the global optimum. Due to the capacity constraint, a greedy strategy may never reach the truly optimal state at all.

To illustrate with the earlier example: under the greedy strategy we choose the item with volume 6 and value 10. After taking it, 6 units of the knapsack are used and the value obtained is 10. This is the final state greedy can reach, and for that particular state it is optimal, but it is not the global optimum, because the greedy strategy can never reach the state where all 10 units of capacity are used.

Once the problem is understood, the solution suggests itself. Greedy can find the optimum for some particular state; so why not record the best value achievable for every state, and at the end pick the best among all of them? Isn't that exactly the global optimum?

Dynamic programming is built on exactly this idea: instead of solving for the optimum of one state, it solves for the optimum of all states.

States and transitions

You probably don't fully understand dynamic programming yet, but you should be getting a feel for it. That's fine: the right intuition is a prerequisite for the right understanding. So let's go back to the idea of states.

What exactly is this "state" we keep mentioning? It is an abstract concept, and it means different things in different problems. In the knapsack problem, the state is the amount of knapsack capacity used. Since item volumes in this problem are integers, the number of possible states is finite, which sounds trivial but is important. If the state were not an integer, dynamic programming would still be possible, but the implementation would be harder.

Having understood that the used capacity is the state, we can go one step further: this state changes. It changes because we put something in; when we do, the knapsack moves from one state to another. A state transition is driven by a decision: the items we could put in are not fixed, there are many choices, and deciding to put in A rather than B or C is a decision. A decision causes a transition, and different decisions cause different transitions.

For example, we have a knapsack with capacity 10 that already holds an item of volume 3 and value 7. If at this point we choose to put in an item of volume 4 and value 5, then clearly the used capacity becomes 7 and the value becomes 12. This is a classic state transition, and transitions are the core of the whole dynamic programming algorithm.
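Written out as a tiny sketch (representing a state as a `(used_volume, value)` pair is my notation, not the article's):

```python
# A state is (used volume, total value); a decision adds an item (v, w).
state = (3, 7)      # an item of volume 3, value 7 is already in the knapsack
item = (4, 5)       # decision: put in an item of volume 4, value 5

new_state = (state[0] + item[0], state[1] + item[1])
print(new_state)    # (7, 12) -- 7 units used, value 12
```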

Now that the basic concepts and ideas have been introduced, it’s time to apply them to practical problems.

The optimal state

As we said before, dynamic programming ultimately computes the optimal solution for every state and then picks the global optimum among them. So how does it obtain each local optimum?

Before we answer that question, let’s consider two questions.

First, suppose we already know that the maximum value achievable with 3 units of volume used is 5, and we decide to add an item with volume 4 and value 5, so the used volume increases to 7. Is the value obtained this way the optimal solution for volume 7?

That's not a hard question to answer; with a little thought, the answer is clearly no. Simplest example: suppose there is an item of volume 7 and value 20. Taking it alone is obviously better than taking those two items. Although state 3 can achieve at most value 5, and state 7 can indeed be reached from state 3, that transition is not necessarily optimal. In other words, transferring out of an optimal state does not necessarily produce the optimal value of the state it leads to.

Now flip the question around: if we know the optimal solution for volume 6, and we know it was reached by a transition out of volume 4, can we be sure the value it came from is the optimal solution for volume 4?

This time the answer is yes, because if there were a better solution at volume 4, then volume 6 could be improved as well, which contradicts our assumption that it was optimal.

Summarizing the two conclusions: a transition out of a locally optimal state does not necessarily yield an optimum, but every local optimum must be reached by a transition from some other locally optimal state. It's a bit of a tongue twister, but not hard to explain. For example, a student with excellent grades may not come first in an exam, but whoever comes first must be a student with excellent grades. The local optimum is the excellent student; the transition is the exam. A transition out of a local optimum is not necessarily optimal for the state it reaches, since some other transition may be better, but any optimal state must itself have been reached from some earlier optimal state.

In addition, state transitions are ordered. In this problem, the total volume of items in the knapsack can only increase, never decrease, which means states can only be transferred from small to large.

So, to straighten out the ideas: states can be transferred; transitions happen in order; and every local optimum must be derived from another local optimum. Since we cannot know whether the current transition is the optimal one, we use an array (or other container) to record the maximum value each state has ever reached. At the end, we pick the best among all these recorded optima, and that is the answer to the problem.

At this point, for a typical dynamic programming problem we would be done, but there is one more detail to consider.

No aftereffect

Let's look at the whole process. First we need an initial state, and that's easy: when the knapsack is empty, its value is 0 and its used volume is 0, and that is its optimal state. This makes sense, because we can't make something out of nothing.

From state 0 we start transferring, and transitions are tied to decisions, which in this problem means choosing items. We iterate over the items as decisions and over the states those decisions can apply to, which gives us all the state transitions. Finally, we use a container to record the local optimum achieved at every state, and we're done.

This process looks perfectly fine, but there is in fact a problem hiding in it.

Let's reuse the earlier setup: the knapsack has capacity 10 and there are three items, with volumes 6, 5, 5 and values 10, 8, 9. Starting from state 0 we take the third item and move to state 5, with value 9. If we now continue from state 5, we must remember it has already taken item 3: since each item can be taken only once, we must not use item 3 again to transfer out of state 5, or we would violate the problem statement.

You might say that's not hard: record in each state which decisions were already made, and add a check before applying a decision.

It looks like the issue is merely the restriction that an item cannot be taken twice, but the real cause is that states influence each other: a decision made earlier can constrain the decisions available in later states. This back-and-forth influence between states is called an aftereffect, and dynamic programming cannot be applied in scenarios with aftereffects. Not every problem can be patched by adding checks; we need to address the aftereffect itself, that is, find a way to eliminate it.

In this problem, that's easy to do: we simply control the order in which decisions and states are traversed, so that earlier and later stages cannot affect each other. If we iterate over the states first and, for each state, over the decisions it can take, aftereffects are unavoidable, because a state cannot know which decisions were already used to reach it. So instead we iterate over the decisions (items) in the outer loop and, for each decision, over the states it can apply to. And to keep one application of a decision from feeding into another, we traverse the states in reverse order.

For example, say the knapsack capacity is still 10, and the first item we enumerate has volume 3 and value 5. We traverse the states from 7 down to 0; for states greater than 7 this decision cannot be taken, because the total volume would exceed the capacity. For state 7, applying the decision moves us to state 10. We don't know whether this transfer is optimal, so we record it like this: dp[10] = max(dp[10], dp[7] + 5).

Then we reach state 6, which can transfer to state 9, and so on down to state 0.

Because we traverse backwards, when we update state 10 from state 7, state 7 itself has not yet been updated by this decision. Even if state 7 is later updated from state 4 as the traversal continues, the result already recorded for state 10 is unaffected: traversing in reverse, we will never use this same item again to reach state 10. With a forward traversal this cannot be avoided: for the same item, we might update state 4 from state 1, then state 7 from state 4, then state 10 from state 7. That corresponds to using several copies of the same item, which contradicts the problem.

For example, suppose we have one item of volume 2 and value 5. Traversing forwards, at state 0 we update state 2 to value 5; then at state 2 we update state 4 with the same item, to value 10. So state 4 has essentially taken two copies of the item, having been updated twice by the same decision, but we only have one copy, which is obviously wrong.
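The difference between the two traversal orders can be seen directly in code (a sketch using the same single item of volume 2 and value 5):

```python
V = 10
v, w = 2, 5  # one item: volume 2, value 5

# Forward traversal: WRONG for the 0-1 knapsack -- the item gets reused.
dp_fwd = [0] * (V + 1)
for vp in range(0, V - v + 1):
    dp_fwd[vp + v] = max(dp_fwd[vp + v], dp_fwd[vp] + w)

# Reverse traversal: correct -- every update reads a pre-decision value.
dp_rev = [0] * (V + 1)
for vp in range(V - v, -1, -1):
    dp_rev[vp + v] = max(dp_rev[vp + v], dp_rev[vp] + w)

print(max(dp_fwd))  # 25: the single item was effectively taken 5 times
print(max(dp_rev))  # 5: taken once, as the problem requires
```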


In dynamic programming, since we cannot determine where the currently enumerated state came from, aftereffects are not allowed; if they cannot be eliminated, dynamic programming cannot be used. This is the most basic principle of dynamic programming. In this problem we cleverly rearranged the enumeration of decisions and states to eliminate the aftereffect; in other problems the fix may differ, and we must judge each case on its own.

If, while thinking through a problem, you forget this prerequisite of dynamic programming, think back to taking items out of a knapsack: each item can be taken at most once. If an item taken earlier can be taken again later, the no-aftereffect requirement is violated.

State transition equation

Let's rearrange our thoughts on state transitions. There are a few key points:

  1. We start from state 0, whose optimal value is 0.
  2. Check for aftereffects and make sure there are none.
  3. Executing a decision causes a state transition; record the optimal solution for each state as we go.

In this problem, the decision is taking an item, and the state is the used knapsack capacity. Since taking an item changes the used capacity, and there is only one copy of each item, to avoid aftereffects we enumerate decisions first and then states, ensuring each decision is applied at most once per state. Throughout this process, we keep recording the optimal solution for every state.

Since the knapsack has capacity V, an array of size V + 1 is enough to keep track of all the states.

dp = [0 for _ in range(V + 1)]

for i in items:
    for v from V - i.v down to 0:
        dp[v + i.v] = max(dp[v + i.v], dp[v] + i.w)

return max(dp)


dp records all the states, and we use max(dp[v + i.v], dp[v] + i.w) to update the state dp[v + i.v]. Since the current decision is not necessarily better than what was recorded before, we need the max operation to ensure each state keeps its best result. Once we have the optimal solutions for all the states, the optimal solution for the whole problem is obviously among them.

This equation, which records the state transition process, is called the state transition equation, and it is the core concept of dynamic programming. Often, when solving a dynamic programming problem, we first derive the state transition equation on scratch paper; once it is laid out clearly, you're not far from working code.

Code

The transition equation above is so close to the final code that the full solution is only a few lines:

dp = [0 for _ in range(11)]

items = [[6, 10], [5, 8], [5, 9]]

# Iterate over the items (the decisions)
for v, w in items:
    # Traverse the knapsack space (the states) from 10 - v down to 0,
    # because we need enough room left to place the item.
    # Updating dp[vp + v] records the state after the item is put
    # into a knapsack that currently uses vp units of space.
    for vp in range(10 - v, -1, -1):
        dp[vp + v] = max(dp[vp + v], dp[vp] + w)

print(max(dp))

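As a sanity check (my addition, not part of the original solution), brute-forcing all subsets of the three items confirms the answer:

```python
from itertools import combinations

items = [(6, 10), (5, 8), (5, 9)]
best = 0
for k in range(len(items) + 1):
    for subset in combinations(items, k):
        if sum(v for v, w in subset) <= 10:
            best = max(best, sum(w for v, w in subset))

print(best)  # 17: the two items of volume 5, values 8 + 9
```

This exhaustive check is fine for 3 items but grows as 2^n, which is exactly why the dynamic programming solution matters.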

Conclusion

That covers the derivation of the zero-one knapsack and all the concepts behind it. We spent a lot of space introducing the algorithm, yet the actual code is only a few lines; measured in lines of code alone, dynamic programming may be the shortest algorithm to implement. But short code does not mean a simple idea, especially details like the array indices and the order of the loops, so please don't take it lightly.

That's it for the zero-one knapsack today. In next week's algorithm topic we continue the knapsack series with the advanced versions of the 0-1 knapsack: the complete knapsack and multiple knapsack problems. Stay tuned.

If you feel you've gained something, please feel free to follow or share; your support means a lot to me.