As we saw in the logistic regression explanation above, a forward pass computes the output, while a backward pass computes the gradients (derivatives) used to adjust the parameters. Working through a simple expression, the construction of its computation graph illustrates forward propagation and back propagation in deep learning.
I. Computation graph and forward propagation
Hypothesis function
In order of operations, suppose the function is decomposed step by step as follows:
P.S. If you have studied multivariable calculus, the following can be compared to the multivariable chain rule, which yields the figure below (in fact, the forward calculation is a distributed, step-by-step process). The step-by-step calculation is relatively easy, and handing it to a computer makes it more efficient, so this part is omitted.
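As a concrete illustration (the section does not restate the actual function, so the classic textbook example $J(a,b,c)=3(a+bc)$ is assumed here), the forward pass evaluates the graph node by node, from left to right:

```python
# Forward pass through a small computation graph, assuming the
# common textbook example J(a, b, c) = 3 * (a + b * c).
# Each intermediate node is computed in order of operations.

def forward(a, b, c):
    u = b * c      # first node:  u = b * c
    v = a + u      # second node: v = a + u
    J = 3 * v      # output node: J = 3 * v
    return u, v, J

u, v, J = forward(5, 3, 2)   # u = 6, v = 11, J = 33
```

Each line corresponds to one node of the graph; the computer simply evaluates them in order.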
II. Computing derivatives and back propagation
Conventions for programming notation
When taking derivatives, $\frac{dFinalOutputVar}{dvar}$ denotes the derivative of the final output variable with respect to some related variable. In programming, to express such derivative variables conveniently and uniformly, we introduce the variable name `dvar`. For example, $\frac{dJ}{du} \to du$ and $\frac{dJ}{da} \to da$. This also avoids writing out the intermediate variables.
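Under this convention, the Python variable names below mirror the mathematical derivatives (the example $J = 3(a+bc)$ with $u = bc$ and $v = a+u$ is assumed, since the section does not restate it):

```python
# Naming convention: the derivative dJ/dvar is stored in a variable
# named simply `dvar`. Assumed example: J = 3 * (a + b * c),
# with intermediates u = b * c and v = a + u.
a, b, c = 5, 3, 2
u = b * c
v = a + u

dv = 3.0          # dJ/dv, since J = 3v
du = dv * 1.0     # dJ/du = dJ/dv * dv/du  (chain rule, dv/du = 1)
da = dv * 1.0     # dJ/da = dJ/dv * dv/da  (dv/da = 1)
```

The name `dv` is shorthand for $\frac{dJ}{dv}$, not for a differential of `v` itself.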
IV.
- A computation graph calculates the cost function $J$, the function we need to optimize, via a forward (left-to-right) pass.
- The most efficient way to compute the series of derivatives is to go backwards (from right to left), following the red arrows and taking derivatives step by step via the chain rule.
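Putting the two bullets together, here is a minimal sketch (again assuming the textbook example $J = 3(a+bc)$): compute $J$ forward, then walk the graph from right to left, applying the chain rule at each node:

```python
# Forward then backward pass over the assumed example graph
# J(a, b, c) = 3 * (a + b * c), with u = b * c and v = a + u.

def forward_backward(a, b, c):
    # Forward pass (left to right): compute the cost J.
    u = b * c
    v = a + u
    J = 3 * v

    # Backward pass (right to left): chain rule at each node.
    dv = 3.0              # dJ/dv, since J = 3v
    du = dv * 1.0         # dJ/du = dJ/dv * dv/du
    da = dv * 1.0         # dJ/da = dJ/dv * dv/da
    db = du * c           # dJ/db = dJ/du * du/db
    dc = du * b           # dJ/dc = dJ/du * du/dc
    return J, {"da": da, "db": db, "dc": dc}

J, grads = forward_backward(5, 3, 2)  # J = 33; da = 3, db = 6, dc = 9
```

Note that each `d*` value reuses the derivative computed one step to its right, which is exactly why the right-to-left order is the most efficient.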