The "nn criterions don't compute the gradient w.r.t. targets" error in PyTorch

# The reason and the solution: "nn criterions don't compute the gradient w.r.t. targets" when running pytorch-DDPG

1. Reason

Note: the evaluation criterion I use is `nn.MSELoss()(evaluate, target)`. The error message means that the criterion function does not compute the gradient of the target value (the expected value, also known as the target or label), yet in our current program `target` is a tensor (Variable) with `requires_grad=True`.

Solution: change `target` so that it does not require a gradient. Directly setting `requires_grad=False` on it is not possible. The correct approach is to call `target.detach()` or `target.detach_()` before passing it to the criterion, i.e. before `criterion(evaluate, target)`. The program then no longer raises this error.
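A minimal sketch of the fix. The names `net`, `evaluate`, and `target` are illustrative stand-ins, not the repo's actual code; the point is only that a target built from a forward pass carries `requires_grad=True` until it is detached.

```python
import torch
import torch.nn as nn

# A tiny network standing in for the critic (illustrative only).
net = nn.Linear(4, 1)
criterion = nn.MSELoss()

x = torch.randn(8, 4)
evaluate = net(x)                        # prediction: requires_grad=True
target = net(torch.randn(8, 4)) * 0.99   # target also from a forward pass

# target.requires_grad is True here, which (in older PyTorch versions)
# triggered "nn criterions don't compute the gradient w.r.t. targets".
assert target.requires_grad

# Fix: detach the target from the computation graph before the loss.
loss = criterion(evaluate, target.detach())
loss.backward()                          # gradients flow only into `net` via `evaluate`
print(target.detach().requires_grad)     # False
```

`target.detach()` returns a gradient-free view; the equivalent in-place form is `target.detach_()` followed by `criterion(evaluate, target)`.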

Explanation: in the program where I hit this problem, the target value is computed by a forward pass of the network whose parameters are being updated (for example, the target Q value in Q-learning), rather than being a given label (for example, a label in supervised learning). The target is therefore a tensor (Variable) with `requires_grad=True`, and any tensor derived from it carries gradient information as well. Using `detach()` or `detach_()` separates the target from the computation graph, after which its `requires_grad` attribute reverts to `False`.

2. The erroneous code I encountered and the correction, for reference.

  • Locate the pytorch-DDPG file marked by the red box in the figure, and click through to the file page.

  • Continue to the place shown in the figure: the code in the green box should be changed to the code in green font. After this change the program no longer raises the error.

Note: alternatively, add the line `target_q_value.detach_()` above the `# critic update` comment to separate the target from the computation graph, which also solves the problem.
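A sketch of what that critic update looks like with the in-place detach. The network names, shapes, and variable names here are assumptions for illustration; only `target_q_value.detach_()` and the `# critic update` placement come from the article.

```python
import torch
import torch.nn as nn

# Illustrative stand-ins for the DDPG critic and its target network.
critic = nn.Linear(3, 1)         # Q(s, a) -> scalar value
target_critic = nn.Linear(3, 1)  # slowly-updated copy of the critic
gamma = 0.99

state_action = torch.randn(16, 3)
next_state_action = torch.randn(16, 3)
reward = torch.randn(16, 1)

target_q_value = reward + gamma * target_critic(next_state_action)
target_q_value.detach_()  # in-place detach: target carries no gradient

# critic update
q_value = critic(state_action)
critic_loss = nn.MSELoss()(q_value, target_q_value)
critic_loss.backward()    # gradients flow only into `critic`'s parameters
```

Because the target is detached, `backward()` leaves the target network's parameters untouched, which is exactly the intended semantics of a TD target.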

3. Complete debugging process

  • Step 1: running the pytorch-DDPG program produced this error.
  • Step 2: I searched both major search engines every way I could, and found that many people have run into this problem. I tried the suggested answers one by one, such as setting `requires_grad=False`, and found that it cannot be forced that way; it produces another error, shown below. (The solution was actually already given in that error message, but I did not read it carefully at the time and only noticed it while writing this up, so jump to step 3.)

  • Step 3: since this example is similar to updating the Q value in Q-learning, I suddenly remembered the classic DQN routine (pytorch routines). That program runs without any errors, so I changed the relevant part of pytorch-DDPG to match the DQN routine. There was still a problem, so I compared them carefully and found one remaining difference: in the figure below, the red box marks DQN's target, which is followed by a function whose purpose I did not know. I searched for it, found the blog post "pytorch Variable detach and detach_" (CSDN blog), and then understood the cause of the problem (the explanation is given in the first part above).

It was a bit of a detour, but it ended well: problem solved. pytorch-DDPG trains an inverted pendulum with DDPG, using continuous actions in continuous time. After the error was fixed, the program ran normally; the results are posted below:

  • Result at the beginning of training

  • Result after 1000 training steps

And finally, hopefully you will never need to look up the solution to this error in the first place.

Related topics: `requires_grad`, `Variable.volatile`, and the two functions `Variable.detach()` and `Variable.detach_()`.
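The difference between the two detach functions can be shown in a few lines (a small sketch; `a`, `b`, `c` are arbitrary names):

```python
import torch

a = torch.randn(3, requires_grad=True)
b = a * 2   # non-leaf tensor in the graph

# detach() returns a NEW tensor that shares storage but is cut from the
# graph; the original `b` is unchanged and still requires grad.
c = b.detach()
print(b.requires_grad, c.requires_grad)  # True False

# detach_() detaches the tensor IN PLACE: `b` itself leaves the graph.
b.detach_()
print(b.requires_grad)  # False
```

This is why the fix in the article can be written either as `criterion(evaluate, target.detach())` or as `target.detach_()` followed by `criterion(evaluate, target)`.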