In mathematical optimization, the method of Lagrange multipliers (named after the mathematician Joseph-Louis Lagrange) finds the extrema of a multivariable function whose variables are subject to one or more constraints. It transforms an optimization problem in $n$ variables with $k$ constraints into an unconstrained extremum problem in $n + k$ variables. This article introduces the Lagrange multiplier method.
Overview
- What we are good at solving are unconstrained extremum problems: we only need to take the partial derivative with respect to each variable and set all the partial derivatives to 0 to find all extremum points and saddle points. When solving a problem with constraints, we try to transform it into an unconstrained optimization problem
- In fact, if we can obtain from $g$ an expression for some variable, say $x_1 = h(x_2, \cdots, x_n)$, substituting it into $y$ turns the problem into an unconstrained optimization (this is the familiar elimination method taught in secondary school). But sometimes no such expression can be obtained, and the Lagrange multiplier method lets us avoid this dilemma of the elimination method.
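As a toy illustration of the elimination method (the problem instance below is my own, not from the original text), minimize $f(x_1, x_2) = x_1^2 + x_2^2$ subject to $x_1 + x_2 = 1$: solving the constraint for $x_1 = h(x_2) = 1 - x_2$ and substituting into $f$ leaves an unconstrained one-variable problem.

```python
# Elimination method sketch: the constraint x1 + x2 = 1 gives x1 = 1 - x2,
# so f(x1, x2) = x1^2 + x2^2 becomes the unconstrained function F(x2).
def F(x2):
    x1 = 1.0 - x2               # eliminated variable x1 = h(x2)
    return x1 ** 2 + x2 ** 2

# F'(x2) = -2*(1 - x2) + 2*x2 = 0  =>  x2 = 1/2, hence x1 = 1/2
x2 = 0.5
x1 = 1.0 - x2
print(x1, x2, F(x2))            # -> 0.5 0.5 0.5
```

When no closed form $x_1 = h(x_2, \cdots, x_n)$ exists, this route is blocked, which is exactly where the multiplier method comes in.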
As an optimization algorithm, the Lagrange multiplier method is mainly used to solve constrained optimization problems. Its basic idea is to introduce Lagrange multipliers, transforming a constrained optimization problem with $n$ variables and $k$ constraints into an unconstrained optimization problem with $n + k$ variables. The mathematical meaning of the Lagrange multipliers is that they are the coefficients of the constraint gradients in the linear combination that expresses the objective's gradient.
Idea
- Consider finding the extrema of the binary function $f(x, y)$ under the constraint $g(x, y) = c$
- First, draw the contour lines of $f(x, y)$. Where a contour line is tangent to $g(x, y) = c$ may be an extremum point of the problem
Lagrange multiplier method
Single equality constraint
Consider the $n$-variable function $y = f(x_1, x_2, \cdots, x_n)$ under the equality constraint $g(x_1, x_2, \cdots, x_n) = 0$
- At an extremum point, the normal vectors of $y$ and $g$ must be parallel
- The normal vector of $y$ is:
{% raw %}
$$\nabla f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \cdots, \frac{\partial f}{\partial x_n}\right)$$
{% endraw %}
- The normal vector of $g$ is:
{% raw %}
$$\nabla g = \left(\frac{\partial g}{\partial x_1}, \frac{\partial g}{\partial x_2}, \cdots, \frac{\partial g}{\partial x_n}\right)$$
{% endraw %}
- If the two are parallel, there exists a constant $\lambda$ such that:
{% raw %}
$$\nabla f = \lambda \nabla g$$
{% endraw %}
- That is:
{% raw %}
$$\frac{\partial f}{\partial x_k} = \lambda \frac{\partial g}{\partial x_k}, \quad 1 \le k \le n$$
{% endraw %}
- This gives $n$ equations; together with $g(x_1, x_2, \cdots, x_n) = 0$ they form a system of $n + 1$ equations in the $n + 1$ unknowns $(x_1, x_2, \cdots, x_n, \lambda)$. The solutions of this system are exactly the extremum points and saddle points; in each solution, $\lambda$ is the ratio between the two parallel normal vectors, and it equals 0 when the constraint surface passes through an extremum point of $y$ itself.
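For a concrete check (a toy instance of my own, not from the text), take $f = x_1^2 + x_2^2$ with $g = x_1 + x_2 - 1 = 0$. The $n + 1 = 3$ equations are $2x_1 = \lambda$, $2x_2 = \lambda$, $x_1 + x_2 = 1$, whose unique solution is $x_1 = x_2 = 1/2$, $\lambda = 1$; a short script can verify that the residuals vanish:

```python
# Residuals of the Lagrange system  grad f = lambda * grad g,  g = 0
# for f = x1^2 + x2^2 and g = x1 + x2 - 1 (toy example).
def residuals(x1, x2, lam):
    return (2 * x1 - lam,        # df/dx1 - lam * dg/dx1
            2 * x2 - lam,        # df/dx2 - lam * dg/dx2
            x1 + x2 - 1.0)       # the constraint g = 0

print(residuals(0.5, 0.5, 1.0))  # -> (0.0, 0.0, 0.0)
```

Here $\lambda = 1$ is exactly the factor by which $\nabla g = (1, 1)$ must be scaled to match $\nabla f = (1, 1)$ at the solution.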
Multiple equality constraints
The principle is similar to that of a single equality constraint
Consider the $n$-variable function $y = f(x_1, x_2, \cdots, x_n)$ under the equality constraints $g_i(x_1, x_2, \cdots, x_n) = 0,\ 1 \le i \le m$
- The $n$-dimensional space, restricted by $m$ conditions, determines an $(n-m)$-dimensional surface; we discuss the extremum problem of $y$ on this surface
- This surface is the intersection of $m$ surfaces of dimension $n-1$, each contributing one normal vector; these $m$ normal vectors span the normal space of the $(n-m)$-dimensional surface
- At an extremum point of the problem, the normal vector of $y$ must lie in the normal space of the $(n-m)$-dimensional surface, that is, it can be written as a linear combination of those $m$ normal vectors:
{% raw %}
$$\nabla f = \sum_{i=1}^{m} \lambda_i \nabla g_i$$
{% endraw %}
- That is:
{% raw %}
$$\frac{\partial f}{\partial x_k} = \sum_{i=1}^{m} \lambda_i \frac{\partial g_i}{\partial x_k}, \quad 1 \le k \le n$$
{% endraw %}
- We now have $n$ equations; together with the $m$ equality constraints $g_i(x_1, x_2, \cdots, x_n) = 0,\ 1 \le i \le m$, they form a system of $n + m$ equations in the $n + m$ unknowns $(x_1, x_2, \cdots, x_n, \lambda_1, \lambda_2, \cdots, \lambda_m)$. The solutions are exactly the extremum points and saddle points; in each solution, $\lambda_i$ is the coefficient of the $i$-th normal vector in the linear combination expressing the normal vector of $y$ in the normal space of the $(n-m)$-dimensional surface.
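A small worked instance (my own, for illustration): $f = x^2 + y^2 + z^2$ with constraints $g_1 = x + y - 1 = 0$ and $g_2 = y + z - 1 = 0$. Stationarity $\nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2$ plus the two constraints gives $n + m = 5$ equations; solving them yields $x = z = 1/3$, $y = 2/3$, $\lambda_1 = \lambda_2 = 2/3$, which the script below confirms.

```python
# Residuals of the 5-equation system for two equality constraints
# (f = x^2 + y^2 + z^2, g1 = x + y - 1, g2 = y + z - 1).
def residuals(x, y, z, lam1, lam2):
    return (2 * x - lam1,          # df/dx - lam1*dg1/dx - lam2*dg2/dx
            2 * y - lam1 - lam2,   # df/dy - lam1*dg1/dy - lam2*dg2/dy
            2 * z - lam2,          # df/dz - lam1*dg1/dz - lam2*dg2/dz
            x + y - 1.0,           # g1 = 0
            y + z - 1.0)           # g2 = 0

r = residuals(1/3, 2/3, 1/3, 2/3, 2/3)
print(max(abs(v) for v in r))      # ~0 up to floating-point rounding
```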
Single inequality constraint
An inequality constraint is an extension of an equality constraint: an equality constraint specifies a definite contour line (surface), while an inequality constraint specifies a region of contour lines (surfaces)
Consider the $n$-variable function $y = f(x_1, x_2, \cdots, x_n)$ under the inequality constraint $g(x_1, x_2, \cdots, x_n) \le 0$
- If the problem has a solution, then there are two cases
  - The solution lies on the surface $g(x_1, x_2, \cdots, x_n) = 0$
  - The solution lies in the region $g(x_1, x_2, \cdots, x_n) < 0$
- When the solution lies on $g(x_1, x_2, \cdots, x_n) = 0$, the inequality restricts the region where $y$ attains its minimum, and the final solution falls where $y$ is tangent to the constraint surface. The normal vectors of $y$ and the constraint must then point in opposite directions (otherwise a smaller value of $y$ could be found inside $g(x_1, x_2, \cdots, x_n) < 0$), giving the equation:
{% raw %}
$$\nabla f = -\lambda \nabla g$$
{% endraw %}
- where $\lambda \ge 0$
- When the solution lies in $g(x_1, x_2, \cdots, x_n) < 0$, the inequality does not actually constrain the solution of $y$; this case is equivalent to $\lambda = 0$
- In both cases $\lambda g(x_1, x_2, \cdots, x_n) = 0$ holds, that is, at least one of $\lambda$ and $g$ must be 0
- Therefore, for the Lagrange multiplier method with a single inequality constraint, we only need to add the conditions $\lambda \ge 0$ and $\lambda g(x_1, x_2, \cdots, x_n) = 0$
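The two cases can be checked mechanically. Below is a sketch (toy examples of my own) for minimizing $f(x) = x^2$ under one inequality $g(x) \le 0$: with $g = 1 - x$ (i.e. $x \ge 1$) the constraint is active and $\lambda = 2$; with $g = x - 1$ (i.e. $x \le 1$) the unconstrained minimum $x = 0$ is already feasible, so $\lambda = 0$.

```python
# Check stationarity f' + lam*g' = 0, feasibility g <= 0, lam >= 0,
# and the condition lam*g = 0, for the objective f(x) = x^2.
def kkt_single(x, lam, g, dg, tol=1e-9):
    return (abs(2 * x + lam * dg(x)) < tol  # f'(x) + lam*g'(x) = 0
            and g(x) <= tol                 # feasible
            and lam >= 0.0                  # multiplier sign
            and abs(lam * g(x)) < tol)      # lam * g = 0

active   = kkt_single(1.0, 2.0, g=lambda x: 1 - x, dg=lambda x: -1.0)
inactive = kkt_single(0.0, 0.0, g=lambda x: x - 1, dg=lambda x: 1.0)
print(active, inactive)                     # -> True True
```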
Multiple inequality constraints
Consider the $n$-variable function $y = f(x_1, x_2, \cdots, x_n)$ under the inequality constraints $g_i(x_1, x_2, \cdots, x_n) \le 0,\ 1 \le i \le m$
- Essentially the same as a single inequality constraint, but with a larger number
- In this case, the following conditions are added on top of the equality-constrained Lagrange multiplier method:
{% raw %}
$$\lambda_i \ge 0, \quad \lambda_i g_i(x_1, x_2, \cdots, x_n) = 0, \quad 1 \le i \le m$$
{% endraw %}
Algorithm description
- Based on the above principles, the Lagrange multiplier method proceeds as follows:
- Consider the $n$-variable function $y = f(x_1, x_2, \cdots, x_n)$ under $m_1$ equality constraints $g_i(x_1, x_2, \cdots, x_n) = 0,\ 1 \le i \le m_1$, and $m_2$ inequality constraints $h_j(x_1, x_2, \cdots, x_n) \le 0,\ 1 \le j \le m_2$
- Introduce constants $\lambda, \mu$ and construct the Lagrangian:
{% raw %}
$$L = f(x_1, \cdots, x_n) + \sum_{i=1}^{m_1} \lambda_i g_i(x_1, \cdots, x_n) + \sum_{j=1}^{m_2} \mu_j h_j(x_1, \cdots, x_n)$$
{% endraw %}
- Take the partial derivative of all variables and set the derivative to 0:
{% raw %}
$$\frac{\partial L}{\partial x_k} = \frac{\partial f}{\partial x_k} + \sum_{i=1}^{m_1} \lambda_i \frac{\partial g_i}{\partial x_k} + \sum_{j=1}^{m_2} \mu_j \frac{\partial h_j}{\partial x_k} = 0$$
{% endraw %}
where $1 \le k \le n$
- Combine the above $n$ equations with the $m_1$ equality constraints $g_i(x_1, x_2, \cdots, x_n) = 0,\ 1 \le i \le m_1$
- Combining these $n + m_1$ equations with $\mu_j h_j = 0,\ 1 \le j \le m_2$, gives $n + m_1 + m_2$ equations
- Add the restrictions $\mu_j \ge 0$ and $h_j \le 0$, $1 \le j \le m_2$
- Solving this system of $n + m_1 + m_2$ equations in $n + m_1 + m_2$ unknowns under these restrictions yields the set of extremum points and saddle points
- The extreme points are screened from all the solutions
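A minimal end-to-end run of the procedure on a mixed toy problem (my example, not from the text): minimize $f = x^2 + y^2$ with one equality constraint $g = x + y - 1 = 0$ and one inequality constraint $h = x - 2 \le 0$. Trying the case $\mu = 0$ (inactive $h$) gives $x = y = 1/2$, $\lambda = -1$ from stationarity of $L = f + \lambda g + \mu h$, and $h = -3/2 \le 0$ holds, so all conditions are met:

```python
# Verify all conditions of the n + m1 + m2 system for
# f = x^2 + y^2,  g = x + y - 1 = 0,  h = x - 2 <= 0,
# with L = f + lam*g + mu*h.
def satisfies_all(x, y, lam, mu, tol=1e-9):
    g = x + y - 1.0
    h = x - 2.0
    return (abs(2 * x + lam + mu) < tol    # dL/dx = 0
            and abs(2 * y + lam) < tol     # dL/dy = 0
            and abs(g) < tol               # equality constraint
            and h <= tol                   # inequality feasible
            and mu >= 0.0                  # multiplier restriction
            and abs(mu * h) < tol)         # mu * h = 0

print(satisfies_all(0.5, 0.5, -1.0, 0.0))  # -> True
```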
KKT conditions
- The above $n + m_1 + m_2$ equations, together with the restrictions, are the KKT conditions
- The KKT conditions are necessary conditions for an extremum of the Lagrangian
{% raw %}
$$\begin{cases} \dfrac{\partial L}{\partial x_k} = 0, & 1 \le k \le n \\ g_i(x_1, x_2, \cdots, x_n) = 0 \\ h_j(x_1, x_2, \cdots, x_n) \le 0 \\ \mu_j \ge 0 \\ \mu_j h_j(x_1, x_2, \cdots, x_n) = 0 \end{cases}$$
{% endraw %}
where $i \in \{1, 2, \cdots, m_1\}$ and $j \in \{1, 2, \cdots, m_2\}$
- To summarize what all the conditions mean:

| content | meaning |
| --- | --- |
| {% raw %}$\frac{\partial L}{\partial x_k} = 0${% endraw %} | To find an extremum, the derivative along each independent variable must be 0 |
| {% raw %}$g_i = 0${% endraw %} | Equality constraints |
| {% raw %}$h_j \le 0${% endraw %} | Inequality constraints |
| {% raw %}$\mu_j \ge 0${% endraw %} | Two cases of an inequality constraint: 1. the inequality constraint is inactive ({% raw %}$\mu_j = 0${% endraw %}); 2. the normal vector of the constraint surface is opposite to that of the original function ({% raw %}$\mu_j > 0${% endraw %}) |
| {% raw %}$\mu_j h_j = 0${% endraw %} | Two cases of an inequality constraint: 1. the inequality constraint is inactive and the extremum point lies inside the region ({% raw %}$h_j < 0,\ \mu_j = 0${% endraw %}); 2. the inequality constraint is active and the extremum point lies on the surface ({% raw %}$h_j = 0${% endraw %}) |
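As a sanity check that a point satisfying the KKT conditions is really the constrained minimum, one can brute-force a toy problem (my example): minimize $f = x^2 + y^2$ subject to $x + y = 1$ and $x \le 2$. The KKT point is $x = y = 1/2$ with the inequality inactive; scanning feasible grid points on the constraint line confirms nothing does better.

```python
# Brute-force scan over feasible points on the line x + y = 1 with x <= 2.
def f(x, y):
    return x * x + y * y

best = min(f(k / 1000.0, 1.0 - k / 1000.0)  # points on x + y = 1
           for k in range(-3000, 2001))     # step 0.001, keeps x <= 2
print(best, f(0.5, 0.5))                    # both equal 0.5
```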