TODO: 1) Convergence analysis 2) interior point primal dual method
Convexity
Separating hyperplane theorum
If two convex sets, and are disjoint, then there exists a hyperplane separates the two, i.e., there exist and such that for all in and for all in .
Supporting hyperplane theorum
For every point in the boundary of a convex set , denoted as , there exists a supporting hyperplane, i.e. there exists an such that for all .
Duality
The Lagrange dual function
-
It is a pointwise infimum of a family of affine functions of , hence it is concave.
-
For every pair that is dual feasible, i.e. and (which usually translates to since is a mapping to ), is a lower bound for the primal optimum.
Proof: for every primal feasible point ,
-
constrained on dual feasible points can be viewed as a set of lower bounds of
-
Saddle point interpretation: .
Slater’s condition and strong duality
If there is a strictly feasible point, meaning one such point that all the inequality constraints hold strictly, then strong duality holds.
- The condition can be reduced to weaker form on affine inequility constraints that they can hold with equalities.
- If a problem is self-dual, meaning the dual of the dual problem is the primal, e.g. SDP, then either of the primal-dual problem pair holds the Slater condition will gurantee strong duality.
Sketch of proof:
Consider
and .
Then apply separating hyperplane theorum. The strictly feasible point is used to ensure that the hyperplane is non-vertical thus won’t pass thru .
Duality gap
For a primal dual feasible point pair and , is refered to as the duality gap, which also gives a certificate of how close the primal value is to its optimum.
Complementarity
Assume strong duality holds, then we must have
which implies that , so and are complementary.
KKT conditions:
Assumes zero duality gap.
KKT:
-
primal feasibility
1) ; 2)
-
dual feasibility
1) ; 2)
-
complementarity (zero duality gap)
i) If the primal problem is convex, then KKT conditions are equivalent for the points to be primal and dual optimal.
ii) If the primal problem is nonconvex, the KKT conditions are necessary.
Remark for dual feasibility:
means minimizes , then and hence dual feasible.
Conjugacy
Conjugate function
The conjugate of a function , , is defined as
but whose domain constrained on for which the supremum is finite.
- is a pointwise supremum of a family of affine functions, hence convex.
- Fenchel-Young’s inequality: .
- if is convex and closed, the .
Rewrite Lagrange dual function
For a general linear constraint problem
using conjugate we can rewrite its Lagrange dual function as
.
Unconstrained algorithms: descent methods
The general descent algorithm
-
Determine a direction
Look for a with
-
Line search
Find to update .
-
Check stopping criterion
Line search methods
-
Exact line search
-
Backtrack line search
while , do
Direction choice methods
-
Gradient descent
-
Newton’s method
Equality constrained algorithms
We are primarily converning linear equality constraints, since linear form is the only possibility of an equality constraint to be convex.
For , the KKT conditions become:
- dual feasibility:
- primal feasibility:
Elimination of equality constraints
For a problem
we can find a specific solution to the constraint, , and a matrix such that . Thus we convert the problem into an unconstrained problem
.
Dual method: ADMM
Alternating Direction Method of Multipliers
-
Motivation
-
Dual ascend
For a linear equality constraint (hence strong duality holds) problem , solve the dual problem (which is unconstrained) using gradient ascend.
The gradient of the dual objective at is , where .
The algorithm:
-
Dual decomposition
If the primal objective is separable in , then step can be decomposed into . Note that all the are evaluated in parallel holding other components constant as from .
-
Augmented Lagrangian and the method of multipliers
Still for the same linear equality constrained problem, one injects into the Lagrangian a penalty term for the residual of the equality constraint
The algorithm is similar to dual ascend but set .
-
-
ADMM
Setting: primal objective decomposable. Augment the Lagrangian and use dual decomposition but update the components sequentially, taking latest values of other components.
Primal method: extension of Newton’s method
Feasible starting point method
Using second order approximation to primal objective at a point yields
The method starts with a feasible , and update preserves the feasibility.
Note that . The Newton step is then taken as the solution to the second order approximation problem, which is also the solution of the KKT condition equations
.
Infeasible starting point method
With a infeasible start point, we make updates to best approximate feasibility and satisfies the optimal conditions.
Remark:
The infeasible start Newton’s method takes a step at a point not gaurenteeing a decrement step to the objective value. However, once it reaches feasibility, it will coincide with what is in the feasible case Newton’s method, and thereafter the step is sure to be a decrement step.
The algorithm:
Repeat until and
- Compute primal and dual Newton steps and . The dual step is the solution to (1) minus the previous .
- Backtrack on . is the primal-dual residual
- while , do
- ,
Inequality constrained methods
Barrier method
-
Log barrier
For a problem with inequality constraints:
relax it to:
or equivalently:
.
The log barrier function is an approximation to the ideal barrier , where the here is a parameter. The log barrieir is convex so it preserves the convexity of the objective. As , the log barrier approaches the ideal barrier.
-
Central path and -suboptimal
The central path associated with the problem is defined as the set of points for such that the KKT conditions enhanced with strictly feasibility hold for the barrier relaxed problem parametrized at .
Central path conditions can also be interpreted as relaxed KKT conditions on the original problem with relaxed complementarity/duality gap:
.
For on the central path, it is a -suboptimal point for the original problem, i.e.
.
-
Barrier method
Choose a strictly feasible starting point , and , , tolerance
- Centering: , by solving , with starting point at the latest
- Update: if else quit
Remark:
-
An infeasible starting point method can be used to solve the equality constrained problem at the centering step.
-
To choose a strictly feasible starting point, we add an additional phase before called phase I. By solving an optimization problem such as:
. (where is a scalar), or
. (where is a vector)
We can compute a strictly feasible point for the original problem, or conclude that the constraints are infeasible.
Note that the phase I problem is formulated such that a strict feasible point is easy to attain, so we can solve by pure barrier method. Or alternatively, we can also use infeasible starting point Newton’s method to solve the barrier problem associated with the phase I problem.
Primal dual interior point method
TODO
Solving symmetric linear system
We are considering a linear system
, where is PSD, to find the solution is equivalent to find a minimum of:
Conjugate gradient method - ref
-
-Conjugate
are said to be -conjugate/orthogonal if for any .
-
Conjugate direction theorum
Let a set of given vectors be -conjugate, denoted as , and be arbitrary starting point, then after steps of the following process, :
Remark:
This is essentially factorizing into the -conjugate basis sequentially. corresponds to the “coordinate” under the -conjugate basis in the sense that:
, since
which is -conjugate with .
-
Conjugate gradient algorithm
Since we are not given the -conjugate basis beforehand, so compute them also along the iteration:
The update steps are the same with conjugate direction theorum:
but with additional calculation
Remark:
To verify the correctness, observing that conjugate direction theorum holds for any basis, one only needs to show that the constructed by the conjugate gradient algorithm is indeed -conjugate.
In the induction step of the proof, it is sufficient to show that both of the following hold:
the first one is straightforward, the second is the same as in the remark in the conjugate direction theorum section.