Backtracking Line Search
A backtracking line search method determines the amount to move in a given search direction by iteratively decreasing a step size $\alpha$ until an acceptable level is reached. In SimpleSolvers we can use the sufficient decrease condition and the curvature condition to quantify this acceptable level. The sufficient decrease condition is also referred to as the Armijo condition, and together the sufficient decrease condition and the curvature condition are referred to as the Wolfe conditions[1] [3].
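To make the basic idea concrete, here is a minimal, self-contained sketch of generic backtracking against the sufficient decrease condition. It is illustrative only, not SimpleSolvers' implementation (which can also check the curvature condition); the function name backtrack_sketch, the contraction factor ρ and the constant c₁ are assumptions of this sketch:

# Minimal sketch of generic backtracking (illustrative, not the SimpleSolvers
# implementation): shrink α by the factor ρ until the sufficient decrease
# (Armijo) condition fls(α) ≤ fls(0) + c₁ α fls′(0) is satisfied.
# Assumes a descent direction, i.e. slope = fls′(0) < 0.
function backtrack_sketch(fls, f₀, slope, α; ρ = 0.5, c₁ = 1e-4)
    while fls(α) > f₀ + c₁ * α * slope
        α *= ρ
    end
    return α
end

backtrack_sketch(α -> (α - 1)^2, 1.0, -2.0, 50.0)   # returns 1.5625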
Backtracking Line Search for a Line Search Objective
Care needs to be taken when performing backtracking on a line search objective, because we need to find equivalent quantities for $\mathrm{grad}_{x_k}f$ and $p$. We first look at the derivative of the line search objective:
\[\frac{d}{d\alpha}f^\mathrm{ls}(\alpha) = \frac{d}{d\alpha}f(\mathcal{R}_{x_k}(\alpha{}p)) = \langle d|_{\mathcal{R}_{x_k}(\alpha{}p)}f, (T_{\alpha{}p}\mathcal{R}_{x_k})p \rangle,\]
which for $\alpha = 0$ reduces to $\langle d|_{x_k}f, p \rangle$ because the tangent map of a retraction is the identity at zero [5], i.e. $T_{0_x}\mathcal{R} = \mathrm{id}_{T_x\mathcal{M}}$. In the equation above $d|_{\mathcal{R}_{x_k}(\alpha{}p)}f\in{}T^*\mathcal{M}$ indicates the exterior derivative of $f$ evaluated at $\mathcal{R}_{x_k}(\alpha{}p)$ and $\langle \cdot, \cdot \rangle: T^*\mathcal{M}\times{}T\mathcal{M}\to\mathbb{R}$ is the natural pairing between the cotangent and tangent spaces[2] [6].
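For the vector-space case of footnote [2] this can be spelled out concretely. The following stand-in definitions (the objective f, the point x and the direction p are assumptions made for illustration, not the values of the original example) will also be reused in the snippets further below:

using LinearAlgebra

f(x) = sum(x .^ 2)       # assumed toy objective
∇f(x) = 2x               # its gradient
x = [1.0, 0.5]           # current iterate (assumed)
p = -∇f(x)               # descent direction (assumed)

fls(α)  = f(x .+ α .* p)              # line search objective fˡˢ
dfls(α) = dot(∇f(x .+ α .* p), p)     # d/dα fˡˢ(α) = ⟨∇f(x + αp), p⟩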
We again look at the example introduced when talking about the sufficient decrease condition and cast it in the form of a line search objective:
# obj and cache stem from the sufficient decrease example discussed earlier
ls_obj = linesearch_objective(obj, cache)
This objective only depends on the parameter $\alpha$. We plot it below.
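A sketch of how such a plot can be produced, here using the stand-in fls from above rather than ls_obj itself (the use of Plots.jl is an assumption of this sketch):

using Plots

αs = range(0, 2; length = 200)
plot(αs, fls.(αs); xlabel = "α", ylabel = "fˡˢ(α)", legend = false)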
Example
We show how to use linesearches in SimpleSolvers to solve a simple toy problem[3]:
sl = Backtracking()
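The snippets below reuse x, p and ∇f from the stand-in definitions above and additionally need a gradient vector and a sufficient decrease constant; obj and cache are assumed to carry over from the sufficient decrease example:

g = ∇f(x)     # gradient of f at x
c₁ = 1e-4     # sufficient decrease constant (assumed value)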
SimpleSolvers contains a function SimpleSolvers.linesearch_objective that allocates a SimpleSolvers.TemporaryUnivariateObjective that only depends on $\alpha$.
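This is the same call as in the snippet above; obj and cache are again assumed to come from the sufficient decrease example:

ls_obj = linesearch_objective(obj, cache)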
We now use this to compute a backtracking line search[4]:
ls = LinesearchState(sl)
α = 50.0
αₜ = ls(ls_obj, α)
0.78125
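The returned value is consistent with repeated halving of the initial step: assuming a contraction factor of $\tfrac{1}{2}$, we have $50 \cdot 2^{-6} = 0.78125$, i.e. the step was shrunk six times before being accepted.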
And we check whether the SimpleSolvers.SufficientDecreaseCondition is satisfied:
sdc = SufficientDecreaseCondition(c₁, x, f(x), g, p, obj)
sdc(αₜ)
true
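With the stand-in quantities from above, the same inequality can also be written out by hand (this check is illustrative and not part of the SimpleSolvers API):

# Sufficient decrease (Armijo) condition: f(x + αₜp) ≤ f(x) + c₁ αₜ ⟨g, p⟩
f(x .+ αₜ .* p) <= f(x) + c₁ * αₜ * dot(g, p)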
Similarly for the SimpleSolvers.CurvatureCondition:
c₂ = 0.9
cc = CurvatureCondition(c₂, x, g, p, obj, obj.G)
cc(αₜ)
true
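Again, the curvature condition can be checked by hand with the stand-in quantities (illustrative only):

# Curvature condition: ⟨∇f(x + αₜp), p⟩ ≥ c₂ ⟨g, p⟩
dot(∇f(x .+ αₜ .* p), p) >= c₂ * dot(g, p)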
- [1] If we use the strong curvature condition instead of the standard curvature condition we correspondingly also say that we use the strong Wolfe conditions.
- [2] If we are not dealing with general Riemannian manifolds but only vector spaces then $d|_{\mathcal{R}_{x_k}(\alpha{}p)}f$ simply becomes $\nabla_{\mathcal{R}_{x_k}(\alpha{}p)}f$ and we further have $\langle A, B\rangle = A^T B$.
- [3] Also compare this to the case of the static line search.
- [4] We also note the use of the SimpleSolvers.LinesearchState constructor here, which has to be used together with a SimpleSolvers.LinesearchMethod.