SimpleSolvers

SimpleSolvers.DEFAULT_WOLFE_c₁Constant
const DEFAULT_WOLFE_c₁

A constant $\epsilon$ on which a finite difference approximation of the derivative of the objective is computed. This is then used in the following stopping criterion:

\[\frac{f(\alpha) - f(\alpha_0)}{\epsilon} < \alpha\cdot{}f'(\alpha_0).\]

Extended help

source
SimpleSolvers.BacktrackingType
Backtracking <: LinesearchMethod

The backtracking method.

Constructors

Backtracking()

Extended help

The backtracking algorithm starts by setting $y_0 \gets f(0)$ and $d_0 \gets \nabla_0f$.

The algorithm is executed by calling the functor of BacktrackingState.

The following is then repeated until the stopping criterion is satisfied or config.max_iterations (1000 by default) is reached:

if value!(obj, α) ≥ y₀ + ls.ϵ * α * d₀
    α *= ls.p
else
    break
end

The stopping criterion as an equation can be written as:

\[f(lpha) < y_0 + psilon lpha abla_0f = y_0 + psilon (lpha - 0) abla_0f.\]

Note that if the stopping criterion is not reached, $lpha$ is multiplied with $p$ and the process continues.

Sometimes the parameters $p$ and $psilon$ have different names such as $au$ and $c$.

source
SimpleSolvers.BacktrackingStateType
BacktrackingState <: LinesearchState

Corresponding LinesearchState to Backtracking.

Keys

The keys are:

  • config::Options
  • α₀:
  • ϵ=$(DEFAULT_WOLFE_c₁): a default step size on whose basis we compute a finite difference approximation of the derivative of the objective. Also see DEFAULT_WOLFE_c₁.
  • p=$(DEFAULT_ARMIJO_p): a parameter with which $\alpha$ is decreased in every step until the stopping criterion is satisfied.

Functor

The functor is used the following way:

ls(obj, α = ls.α₀)

Implementation

The algorithm starts by setting:

\[x_0 \gets 0, y_0 \gets f(x_0), d_0 \gets f'(x_0), \alpha \gets \alpha_0,\]

where $f$ is the univariate objective (of type AbstractUnivariateObjective) and $\alpha_0$ is stored in ls. It then repeatedly does $\alpha \gets \alpha\cdot{}p$ until either (i) the maximum number of iterations is reached (the max_iterations keyword in Options) or (ii) the following holds:

\[ f(\alpha) < y_0 + \epsilon \cdot \alpha \cdot d_0,\]

where $\epsilon$ is stored in ls.

Info

The algorithm allocates an instance of SufficientDecreaseCondition by calling SufficientDecreaseCondition(ls.ϵ, x₀, y₀, d₀, one(α), obj), here we take the value one for the search direction $p$, this is because we already have the search direction encoded into the line search objective.

source
SimpleSolvers.BisectionType
Bisection <: LinesearchMethod

The bisection method.

Constructors

Bisection()

Extended help

The bisection algorithm starts with an interval and successively bisects it into smaller intervals until a root is found. See bisection.

source
SimpleSolvers.CurvatureConditionType
CurvatureCondition <: LinesearchCondition

The second of the Wolfe conditions [3]. The first one is the SufficientDecreaseCondition.

This encompasses the standard curvature condition and the strong curvature condition.

Constructor

CurvatureCondition(c, xₖ, gradₖ, pₖ, obj, grad; mode)

Here grad has to be a Gradient and obj an AbstractObjective. The other inputs are either arrays or numbers.

Implementation

For computational reasons CurvatureCondition also has a field gradₖ₊₁ in which the temporary new gradient is saved.

source
SimpleSolvers.GradientAutodiffType
GradientAutodiff <: Gradient

A struct that realizes Gradient by using ForwardDiff.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • ∇config: result of applying ForwardDiff.GradientConfig.

Constructors

GradientAutodiff(F, x::AbstractVector)
GradientAutodiff(F, nx::Integer)

Functor

The functor does:

grad(g, x) = ForwardDiff.gradient!(g, grad.F, x, grad.∇config)
source
SimpleSolvers.GradientFiniteDifferencesType
GradientFiniteDifferences <: Gradient

A struct that realizes Gradient by using finite differences.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • ϵ: small constant on whose basis the finite differences are computed.
  • e: auxiliary vector used for computing finite differences. It's of the form $e_1 = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}$.
  • tx: auxiliary vector used for computing finite differences. It stores the offset in the x vector.

Constructor(s)

GradientFiniteDifferences{T}(F, nx::Integer; ϵ)

By default for ϵ is DEFAULT_GRADIENT_ϵ.

Functor

The functor does:

for j in eachindex(x,g)
    ϵⱼ = grad.ϵ * x[j] + grad.ϵ
    fill!(grad.e, 0)
    grad.e[j] = 1
    grad.tx .= x .- ϵⱼ .* grad.e
    f1 = grad.F(grad.tx)
    grad.tx .= x .+ ϵⱼ .* grad.e
    f2 = grad.F(grad.tx)
    g[j] = (f2 - f1) / (2ϵⱼ)
end
source
SimpleSolvers.GradientFunctionType
GradientFunction <: Gradient

A struct that realizes a Gradient by explicitly supplying a function.

Keys

The struct stores:

  • ∇F!: a function that can be applied in place.

Functor

The functor does:

grad(g, x) = grad.∇F!(g, x)
source
SimpleSolvers.HessianType
Hessian

Abstract type. structs derived from this need an associated functor that computes the Hessian of a function (in-place).

Also see Gradient.

Implementation

When a custom Hessian is implemented, a functor is needed:

function (hessian::Hessian)(h::AbstractMatrix, x::AbstractVector) end

This functor can also be called with compute_hessian!.

Examples

Examples include:

These examples can also be called with e.g. Hessian(x; mode = :autodiff).

source
SimpleSolvers.HessianAutodiffType
HessianAutodiff <: Hessian

A struct that realizes Hessian by using ForwardDiff.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • H: a matrix in which the (updated) Hessian is stored.
  • Hconfig: result of applying ForwardDiff.HessianConfig.

Constructors

HessianAutodiff(F, x::AbstractVector)
HessianAutodiff(F, nx::Integer)

Functor

The functor does:

hes(g, x) = ForwardDiff.hessian!(hes.H, hes.F, x, grad.Hconfig)
source
SimpleSolvers.HessianFunctionType
HessianFunction <: Hessian

A struct that realizes a Hessian by explicitly supplying a function.

Keys

The struct stores:

  • H!: a function that can be applied in place.

Functor

The functor does:

hes(H, x) = hes.H!(H, x)
source
SimpleSolvers.JacobianAutodiffType
JacobianAutodiff <: Jacobian

A struct that realizes Jacobian by using ForwardDiff.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • Jconfig: result of applying ForwardDiff.JacobianConfig.
  • ty: vector that is used for evaluating ForwardDiff.jacobian!

Constructors

JacobianAutodiff(F, y::AbstractVector)
JacobianAutodiff(F, nx::Integer)

Functor

The functor does:

jac(J, x) = ForwardDiff.jacobian!(J, jac.ty, x, grad.Jconfig)
source
SimpleSolvers.JacobianFiniteDifferencesType
JacobianFiniteDifferences <: Jacobian

A struct that realizes Jacobian by using finite differences.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • ϵ: small constant on whose basis the finite differences are computed.
  • f1:
  • f2:
  • e1: auxiliary vector used for computing finite differences. It's of the form $e_1 = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}$.
  • e2:
  • tx: auxiliary vector used for computing finite differences. It stores the offset in the x vector.

Constructor(s)

JacobianFiniteDifferences{T}(F, nx::Integer, ny::Integer; ϵ)

By default for ϵ is DEFAULT_JACOBIAN_ϵ.

Functor

The functor does:

for j in eachindex(x)
    ϵⱼ = jac.ϵ * x[j] + jac.ϵ
    fill!(jac.e, 0)
    jac.e[j] = 1
    jac.tx .= x .- ϵⱼ .* jac.e
    f(jac.f1, jac.tx)
    jac.tx .= x .+ ϵⱼ .* jac.e
    f(jac.f2, jac.tx)
    for i in eachindex(x)
        J[i,j] = (jac.f2[i] - jac.f1[i]) / (2ϵⱼ)
    end
end
source
SimpleSolvers.JacobianFunctionType
JacobianFunction <: Jacobian

A struct that realizes a Jacobian by explicitly supplying a function.

Keys

The struct stores:

  • DF!: a function that can be applied in place.

Functor

The functor does:

jac(g, x) = jac.DF!(g, x)
source
SimpleSolvers.LUType
struct LU <: LinearSolverMethod

A custom implementation of an LU solver, meant to solve a LinearSystem.

Routines that use the LU solver include factorize!, ldiv! and solve!. In practice the LU solver is used by calling the LinearSolver constructor and ldiv! or solve!, or with an instance of LU as an argument directly, as shown in the Example section of this docstring.

constructor

The constructor is called with either no argument:

LU()

# output

LU{Missing}(missing, true)

or with pivot and static as optional booleans:

LU(; pivot=true, static=true)

# output

LU{Bool}(true, true)

Note that if we do not supply an explicit keyword static, the corresponding field is missing (as in the first case). Also see _static.

Example

We use the LU together with solve to solve a linear system:

A = [1. 2. 3.; 5. 7. 11.; 13. 17. 19.]
v = rand(3)
ls = LinearSystem(A, v)
update!(ls, A, v)

lu = LU()

solve(lu, ls) ≈ inv(A) * v

# output

true
source
SimpleSolvers.LinearSolverType
LinearSolver <: AbstractSolver

A struct that stores LinearSolverMethods and LinearSolverCaches. LinearSolvers are used to solve LinearSystems.

Constructors

LinearSolver(method, cache)
LinearSolver(method, A)
LinearSolver(method, ls::LinearSystem)
LinearSolver(method, x)
Info

We note that the constructors do not call the function factorize, so only allocate a new matrix. The factorization needs to be done manually.

You can manually factorize by either calling factorize! or solve!.

source
SimpleSolvers.LinearSolverMethodType
LinearSolverMethod <: SolverMethod

Summarizes all the methods used for solving linear systems of equations such as the LU method.

Extended help

The abstract type SolverMethod was imported from GeometricBase.

source
SimpleSolvers.LinearSystemType
LinearSystem

A LinearSystem describes $Ax = y$, where we want to solve for $x$.

Keys

  • A
  • y

Constructors

A LinearSystem can be allocated by calling:

LinearSystem(A, y)
LinearSystem)(A)
LinearSystem(y)
LinearSystem{T}(n, m)
LinearSystem{T}(n)

Note that in any case the allocated system is initialized with NaNs:

A = [1. 2. 3.; 4. 5. 6.; 7. 8. 9.]
y = [1., 2., 3.]
ls = LinearSystem(A, y)

# output

LinearSystem{Float64, Vector{Float64}, Matrix{Float64}}([NaN NaN NaN; NaN NaN NaN; NaN NaN NaN], [NaN, NaN, NaN])

In order to initialize the system with values, we have to call update!:

update!(ls, A, y)

# output

LinearSystem{Float64, Vector{Float64}, Matrix{Float64}}([1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0], [1.0, 2.0, 3.0])
source
SimpleSolvers.LinesearchMethodType
LinesearchMethod

Examples include StaticState, Backtracking, Bisection and Quadratic. See these examples for specific information on linesearch algorithms.

Extended help

A LinesearchMethod always goes together with a LinesearchState and each of those LinesearchStates has a functor implemented:

ls(obj, α)

where obj is a AbstractUnivariateObjective and α is an initial step size. The output of this functor is then a final step size that is used for updating the parameters.

source
SimpleSolvers.LinesearchStateType
LinesearchState

Abstract type.

Examples include StaticState, BacktrackingState, BisectionState and QuadraticState.

Implementation

A struct that is subtyped from LinesearchState needs to implement the functors:

ls(x; kwargs...)
ls(obj::AbstractUnivariateObjective, x; kwargs...)

Additionaly the following function needs to be extended:

LinesearchState(algorithm::LinesearchMethod; kwargs...)

Constructors

The following is used to construct a specific line search state based on a LinesearchMethod:

LinesearchState(algorithm::LinesearchMethod; T::DataType=Float64, kwargs...)

where the data type should be specified each time the constructor is called. This is done automatically when calling the constructor of NewtonSolver for example.

Functors

The following functors are auxiliary helper functions:

ls(f::Callable; kwargs...) = ls(TemporaryUnivariateObjective(f, missing); kwargs...)
ls(f::Callable, x::Number; kwargs...) = ls(TemporaryUnivariateObjective(f, missing), x; kwargs...)
ls(f::Callable, g::Callable; kwargs...) = ls(TemporaryUnivariateObjective(f, g); kwargs...)
ls(f::Callable, g::Callable, x::Number; kwargs...) = ls(TemporaryUnivariateObjective(f, g), x; kwargs...)
source
SimpleSolvers.NewtonOptimizerCacheType
NewtonOptimizerCache

Keys

  • : the previous iterate,
  • x: current iterate (this stores the guess called by the functions generated with linesearch_objective),
  • δ: direction of optimization step (difference between x and ); this is obtained by multiplying rhs with the inverse of the Hessian,
  • g: gradient value (this stores the gradient associated with x called by the derivative part of linesearch_objective),
  • rhs: the right hand side used to compute the update.

To understand how these are used in practice see e.g. linesearch_objective.

Also compare this to NewtonSolverCache.

source
SimpleSolvers.NewtonSolverType
NewtonSolver

A struct that comprises all Newton solvers. Those typically differ in the way the Jacobian is computed.

Constructors

The NewtonSolver can be called with an NonlinearSystem or with a Callable. Note however that the latter will probably be deprecated in the future.

linesearch = Quadratic()
F(y, x, params) = y .= tanh.(x)
x = [.5, .5]
y = zero(x)
F(y, x, nothing)

NewtonSolver(x, y; F = F, linesearch = linesearch)

# output

i=   0,
x= NaN,
f= NaN,
rxₐ= NaN,
rfₐ= NaN

What is shown here is the status of the NewtonSolver, i.e. an instance of NonlinearSolverStatus.

Keywords

source
SimpleSolvers.NewtonSolverCacheType
NewtonSolverCache

Stores , x, δx, rhs, y and J.

Compare this to NewtonOptimizerCache.

Keys

Constructor

NewtonSolverCache(x, y)
source
SimpleSolvers.NonlinearSolverStatusType
NonlinearSolverStatus

Stores absolute, relative and successive residuals for x and f. It is used as a diagnostic tool in NewtonSolver.

Keys

Examples

NonlinearSolverStatus{Float64}(3)

# output

i=   0,
x= NaN,
f= NaN,
rxₐ= NaN,
rfₐ= NaN
source
SimpleSolvers.NonlinearSystemType
NonlinearSystem

A NonlinearSystem describes $F(x) = y$, where we want to solve for $x$ and $F$ is in nonlinear in general (also compare this to LinearSystem and MultivariateObjective).

Info

NonlinearSystems are used for solvers whereas MultivariateObjectives are their equivalent for optimizers.

Keys

  • F: accessed by calling Function(nls),
  • J::Jacobian: accessed by calling Jacobian(nls),
  • f: accessed by calling value(nls),
  • j: accessed by calling jacobian(nls),
  • x_f: accessed by calling f_argument(nls),
  • x_j: accessed by calling j_argument(nls),
  • f_calls: accessed by calling f_calls(nls),
  • j_calls: accessed by calling j_calls(nls).
source
SimpleSolvers.OptimizationAlgorithmType

An OptimizationAlgorithm is a data structure that is used to dispatch on different algorithms.

It needs to implement three methods,

initialize!(alg::OptimizationAlgorithm, ::AbstractVector)
update!(alg::OptimizationAlgorithm, ::AbstractVector)
solver_step!(::AbstractVector, alg::OptimizationAlgorithm)

that initialize and update the state of the algorithm and perform an actual optimization step.

Further the following convenience methods should be implemented,

objective(alg::OptimizationAlgorithm)
gradient(alg::OptimizationAlgorithm)
hessian(alg::OptimizationAlgorithm)
linesearch(alg::OptimizationAlgorithm)

which return the objective to optimize, its gradient and (approximate) Hessian as well as the linesearch algorithm used in conjunction with the optimization algorithm if any.

See NewtonOptimizerState for a struct that was derived from OptimizationAlgorithm.

Info

Note that a OptimizationAlgorithm is not necessarily a NewtonOptimizerState as we can also have other optimizers, Adam for example.

source
SimpleSolvers.OptionsType
Options

Keys

Configurable options with defaults (values 0 and NaN indicate unlimited):

Some of the constants are defined by the functions default_tolerance and absolute_tolerance.

source
SimpleSolvers.QuadraticStateType
QuadraticState <: LinesearchState

Quadratic Polynomial line search.

Quadratic line search works by fitting a polynomial to a univariate objective (see AbstractUnivariateObjective) and then finding the minimum of that polynomial. Also compare this to BierlaireQuadraticState. The algorithm is taken from [1].

Keywords

source
SimpleSolvers.QuadraticState2Type
QuadraticState2 <: LinesearchState

Quadratic Polynomial line search. This is similar to QuadraticState, but performs multiple iterations in which all parameters $p_0$, $p_1$ and $p_2$ are changed. This is different from QuadraticState (taken from [1]), where only $p_2$ is changed. We further do not check the SufficientDecreaseCondition but rather whether the derivative is small enough.

This algorithm repeatedly builds new quadratic polynomials until a minimum is found (to sufficient accuracy). The iteration may also stop after we reaches the maximum number of iterations (see MAX_NUMBER_OF_ITERATIONS_FOR_QUADRATIC_LINESEARCH).

Keywords

  • config::Options
  • ε: A constant that checks the precision/tolerance.
  • s: A constant that determines the initial interval for bracketing. By default this is DEFAULT_BRACKETING_s.
  • s_reduction: A constant that determines the factor by which s is decreased in each new bracketing iteration.
source
SimpleSolvers.StaticType
Static <: LinesearchMethod

The static method.

Constructors

Static(α)

Keys

Keys include: -α: equivalent to a step size. The default is 1.

Extended help

source
SimpleSolvers.SufficientDecreaseConditionType
SufficientDecreaseCondition <: LinesearchCondition

The condition that determines if $\alpha_k$ is big enough.

Constructor

SufficientDecreaseCondition(c₁, xₖ, fₖ, gradₖ, pₖ, obj)

Functors

sdc(xₖ₊₁, αₖ)
sdc(αₖ)

The second functor is shorthand for sdc(compute_new_iterate(sdc.xₖ, αₖ, sdc.pₖ), T(αₖ)), also see compute_new_iterate.

Extended help

We call the constant that pertains to the sufficient decrease condition $c$. This is typically called $c_1$ in the literature [3]. See DEFAULT_WOLFE_c₁ for the relevant constant

source
SimpleSolvers.TemporaryUnivariateObjectiveType
TemporaryUnivariateObjective <: AbstractUnivariateObjective

Like UnivariateObjective but doesn't store f, d, x_f and x_d as well as f_calls and d_calls.

In practice TemporaryUnivariateObjectives are allocated by calling linesearch_objective.

Constructors

Calling line search objectives

Below we show a few constructors that can be used to allocate TemporaryUnivariateObjectives. Note however that in practice one probably should not do that and instead call linesearch_objective.

f(x) = x^2 - 1
g(x) = 2x
δx(x) = - g(x) / 2
x₀ = 3.
_f(α) = f(compute_new_iterate(x₀, α, δx(x₀)))
_d(α) = g(compute_new_iterate(x₀, α, δx(x₀)))
ls_obj = TemporaryUnivariateObjective{typeof(x₀)}(_f, _d)

# output

TemporaryUnivariateObjective{Float64, typeof(_f), typeof(_d)}(_f, _d)

Alternatively one can also do:

ls_obj = TemporaryUnivariateObjective(_f, _d, x₀)

# output

TemporaryUnivariateObjective{Float64, typeof(_f), typeof(_d)}(_f, _d)

Here we wrote ls_obj to mean line search objective.

source
SimpleSolvers.UnivariateObjectiveType
UnivariateObjective <: AbstractUnivariateObjective

Keywords

It stores the following:

  • F: objective
  • D: derivative of objective
  • f: cache for function output
  • d: cache for derivative output
  • x_f: x used to evaluate F (stored in f)
  • x_d: x used to evaluate D (stored in d)
  • f_calls: number of times F has been called
  • d_calls: number of times D has been called

Constructor

There are several constructors, the most generic (besides the default one) is:

UnivariateObjective(F, D, x; f, d)

Where no keys are inferred, except x_f and x_d. f_calls and d_calls are set to zero.

The most general constructor (i.e. the one the needs the least specification) is:

f(x::Number) = x ^ 2
UnivariateObjective(f, 1.)

# output

UnivariateObjective:

    f(x)              = NaN
    d(x)              = NaN
    x_f               = NaN
    x_d               = NaN
    number of f calls = 0
    number of d calls = 0

where ForwardDiff is used to generate the derivative of the (anonymous) function.

Functor

The functor calls value!.

source
GeometricBase.update!Method
update!(solver, x, params)

Update the solver::NewtonSolver based on x. This updates the cache (instance of type NewtonSolverCache) and the status (instance of type NonlinearSolverStatus). In course of updating the latter, we also update the nonlinear stored in solver (and status(solver)).

Info

At the moment this is neither used in solver_step! nor solve!.

source
GeometricBase.update!Method
update!(cache::NewtonOptimizerCache, x, g, hes)

Update an instance of NewtonOptimizerCache based on x.

This is used in update!(::NewtonOptimizerState, ::AbstractVector).

This sets:

\[\bar{x}^\mathtt{cache} \gets x, x^\mathtt{cache} \gets x, g^\mathtt{cache} \gets g, \mathrm{rhs}^\mathtt{cache} \gets -g, \delta^\mathtt{cache} \gets H^{-1}\mathrm{rhs}^\mathtt{cache},\]

where we wrote $H$ for the Hessian (i.e. the input argument hes).

Also see update!(::NewtonSolverCache, ::AbstractVector).

Warn

Note that this is not updating the Hessian hes. For this call update! on the Hessian.

Implementation

The multiplication by the inverse of $H$ is done with LinearAlgebra.ldiv!.

source
GeometricBase.update!Method
update!(state::NewtonOptimizerState, x, g, hes)

Update an instance of NewtonOptimizerState based on x, g and hes, where g can either be an AbstractVector or a Gradient and hes is a Hessian. This updates the NewtonOptimizerCache contained in the NewtonOptimizerState by calling update!(::NewtonOptimizerCache, ::AbstractVector, ::Union{AbstractVector, Gradient}, ::Hessian).

Info

An instance of NewtonOptimizerState stores the NewtonOptimizerCache as well as a LinesearchState. The LinesearchState stays the same at every iteration, which is why only the NewtonOptimizerState is updated.

Examples

We show that after initializing, update has to be called together with a Gradient and a Hessian:

If we only call update! once there are still NaNs in the ...

f(x) = sum(x.^2)
x = [1., 2.]
state = NewtonOptimizerState(x)
obj = MultivariateObjective(f, x)
g = gradient!(obj, x)
hes = HessianAutodiff(obj, x)
update!(hes, x)
update!(state, x, g, hes)

# output

NewtonOptimizerState{Float64, SimpleSolvers.BacktrackingState{Float64}, SimpleSolvers.NewtonOptimizerCache{Float64, Vector{Float64}}}(Backtracking with α₀ = 1.0, ϵ = 0.0001and p = 0.5., SimpleSolvers.NewtonOptimizerCache{Float64, Vector{Float64}}([1.0, 2.0], [1.0, 2.0], [-1.0, -2.0], [2.0, 4.0], [-2.0, -4.0]))
source
GeometricBase.update!Method
update!(status, x, nls)

Update the status::NonlinearSolverStatus based on x for the NonlinearSystem obj.

Info

This also updates the objective nls!

Sets and to x and f respectively and computes δ as well as γ. The new x and stored in status are used to compute δ. The new f and stored in status are used to compute γ. See NonlinearSolverStatus for an explanation of those variables.

source
GeometricBase.valueMethod
value(obj::AbstractObjective, x)

Evaluates the objective value at x (i.e. computes obj.F(x)).

Examples

using SimpleSolvers

obj = UnivariateObjective(x::Number -> x^2, 1.)
value(obj, 2.)
obj.f_calls

# output

1

Note that the f_calls counter increased by one!

If value is called on obj (an AbstractObjective) without supplying x than the output of the last obj.F call is returned:

using SimpleSolvers

obj = UnivariateObjective(x::Number -> x^2, 1.)
value(obj)

# output

NaN

In this example this is NaN since the function hasn't been called yet.

source
SimpleSolvers.absolute_toleranceMethod
absolute_tolerance(T)

Determine the absolute tolerance for a specific data type. This is used in the constructor of Options.

In comparison to default_tolerance, this should return a very small number, close to zero (i.e. not just machine precision).

Examples

absolute_tolerance(Float64)

# output

0.0
absolute_tolerance(Float32)

# output

0.0f0
source
SimpleSolvers.adjust_αMethod
adjust_α(ls, αₜ, α)

Check which conditions the new αₜ is in $[\sigma_0\alpha_0, \simga_1\alpha_0]$ and return the updated α accordingly (it is updated if it does not lie in the interval).

We first check the following:

\[ \alpha_t < \alpha_0\alpha,\]

where $\sigma_0$ is stored in ls (i.e. in an instance of QuadraticState). If this is not true we check:

\[ \alpha_t > \sigma_1\alpha,\]

where $\sigma_1$ is again stored in ls. If this second condition is also not true we simply return the unchanged $\alpha_t$. So if \alpha_t does not lie in the interval $(\sigma_0\alpha, \sigma_1\alpha)$ the interval is made bigger by either multiplying with $\sigma_0$ (default DEFAULT_ARMIJO_σ₀) or $\sigma_1$ (default DEFAULT_ARMIJO_σ₁).

source
SimpleSolvers.adjust_αMethod
adjust_α(αₜ, α)

Adjust αₜ based on the previous α. Also see adjust_α(::QuadraticState{T}, ::T, ::T) where {T}.

The check that $\alpha \in [\sigma_0\alpha_\mathrm{old}, \sigma_1\alpha_\mathrm{old}]$ should safeguard against stagnation in the iterates as well as checking that $\alpha$ decreases at least by a factor $\sigma_1$. The defaults for σ₀ and σ₁ are DEFAULT_ARMIJO_σ₀ and DEFAULT_ARMIJO_σ₁ respectively.

Implementation

Wee use defaults DEFAULT_ARMIJO_σ₀ and DEFAULT_ARMIJO_σ₁.

source
SimpleSolvers.bisectionMethod
bisection(f, xmin, xmax; config)

Perform bisection of f in the interval [xmin, xmax] with Options config.

The algorithm is repeated until a root is found (up to tolerance config.f_abstol which is determined by default_tolerance by default).

implementation

When calling bisection it first checks if $x_\mathrm{min} < x_\mathrm{max}$ and else flips the two entries.

Extended help

The bisection algorithm divides an interval into equal halves until a root is found (up to a desired accuracy).

We first initialize:

\[\begin{aligned} x_0 \gets & x_\mathrm{min}, x_1 \gets & x_\mathrm{max}, \end{aligned}\]

and then repeat:

\[\begin{aligned} & x \gets \frac{x_0 + x_1}{2}, \\ & \text{if $f(x_0)f(x) > 0$} \\ & \qquad x_0 \gets x, \\ & \text{else} \\ & \qquad x_1 \gets x, \\ & \text{end} \end{aligned}\]

So the algorithm checks in each step where the sign change occurred and moves the $x_0$ or $x_1$ accordingly. The loop is terminated (and errors) if config.max_iterations is reached (by default1000 and the Options struct).

Warning

The obvious danger with using bisections is that the supplied interval can have multiple roots (or no roots). One should be careful to avoid this when fixing the interval.

source
SimpleSolvers.bracket_minimumMethod
bracket_minimum(f, x)

Move a bracket successively in the search direction (starting at x) and increase its size until a local minimum of f is found. This is used for performing Bisections when only one x is given (and not an entire interval). This bracketing algorithm is taken from [4]. Also compare it to bracket_minimum_with_fixed_point.

Keyword arguments

Extended help

For bracketing we need two constants $s$ and $k$ (see DEFAULT_BRACKETING_s and DEFAULT_BRACKETING_k).

Before we start the algorithm we initialize it, i.e. we check that we indeed have a descent direction:

\[\begin{aligned} & a \gets x, \\ & b \gets a + s, \\ & \mathrm{if} \quad f(b) > f(a)\\ & \qquad\text{Flip $a$ and $b$ and set $s\gets-s$.}\\ & \mathrm{end} \end{aligned}\]

The algorithm then successively computes:

\[c \gets b + s,\]

and then checks whether $f(c) > f(b)$. If this is true it returns $(a, c)$ or $(c, a)$, depending on whether $a<c$ or $c<a$ respectively. If this is not satisfied $a,$ $b$ and $s$ are updated:

\[\begin{aligned} a \gets & b, \\ b \gets & c, \\ s \gets & sk, \end{aligned}\]

and the algorithm is continued. If we have not found a sign chance after $n_\mathrm{max}$ iterations (see DEFAULT_BRACKETING_nmax) the algorithm is terminated and returns an error. The interval that is returned by bracket_minimum is then typically used as a starting point for bisection.

Info

The function bracket_root is equivalent to bracket_minimum with the only difference that the criterion we check for is:

\[f(c)f(b) < 0,\]

i.e. that a sign change in the function occurs.

See bracket_root.

source
SimpleSolvers.bracket_minimum_with_fixed_pointMethod
bracket_minimum_with_fixed_point(f, x)

Find a bracket while keeping the left side (i.e. x) fixed. The algorithm is similar to bracket_minimum (also based on DEFAULT_BRACKETING_s and DEFAULT_BRACKETING_k) with the difference that for the latter the left side is also moving.

The function bracket_minimum_with_fixed_point is used as a starting point for Quadratic (taken from [1]), as the minimum of the polynomial approximation is:

\[p_2 = \frac{f(b) - f(a) - f'(0)b}{b^2},\]

where $b = \mathtt{bracket\_minimum\_with\_fixed\_point}(a)$. We check that $f(b) > f(a)$ in order to ensure that the curvature of the polynomial (i.e. $p_2$ is positive) and we have a minimum.

source
SimpleSolvers.check_gradientMethod
check_gradient(g)

Check norm, maximum value and minimum value of a vector.

Examples

using SimpleSolvers

g = [1., 1., 1., 2., 0.9, 3.]
SimpleSolvers.check_gradient(g; digits=3)

# output

norm(Gradient):               4.1
minimum(|Gradient|):          0.9
maximum(|Gradient|):          3.0
source
SimpleSolvers.check_hessianMethod
check_hessian(H)

Check the condition number, determinant, max and min value of the Hessian H.

using SimpleSolvers

H = [1. √2.; √2. 3.]
SimpleSolvers.check_hessian(H)

# output

Condition Number of Hessian: 13.9282
Determinant of Hessian:      1.0
minimum(|Hessian|):          1.0
maximum(|Hessian|):          3.0
source
SimpleSolvers.check_jacobianMethod
check_jacobian(J)

Check the condition number, determinant, max and min value of the Jacobian J.

using SimpleSolvers

J = [1. √2.; √2. 3.]
SimpleSolvers.check_jacobian(J)

# output

Condition Number of Jacobian: 13.9282
Determinant of Jacobian:      1.0
minimum(|Jacobian|):          1.0
maximum(|Jacobian|):          3.0
source
SimpleSolvers.compute_new_iterateMethod
compute_new_iterate(xₖ, αₖ, pₖ)

Compute xₖ₊₁ based on a direction pₖ and a step length αₖ.

Extended help

In the case of vector spaces this function simply does:

xₖ + αₖ * pₖ

For manifolds we instead perform a retraction [5].

source
SimpleSolvers.default_toleranceMethod
default_tolerance(T)

Determine the default tolerance for a specific data type. This is used in the constructor of Options.

Examples

default_tolerance(Float64)

# output

4.440892098500626e-16
default_tolerance(Float32)

# output

2.3841858f-7
default_tolerance(Float16)

# output

Float16(0.001953)
source
SimpleSolvers.f_callsMethod
f_calls(nls)

Tell how many times Function(nls) has been called.

Examples

F(x) = tanh.(x)
x = [1., 2., 3.]
F!(y, x, params) = y .= F(x)
nls = NonlinearSystem(F!, x, F(x))

f_calls(nls)

# output

0

After calling value once we get:

value!(nls, x, nothing)

f_calls(nls)

# output

1
source
SimpleSolvers.factorize!Method
factorize!(lsolver::LinearSolver, A)

Factorize the matrix A and store the result in cache(lsolver).A. Note that calling cache on lsolver returns the instance of LUSolverCache stored in lsolver.

Examples

A = [1. 2. 3.; 5. 7. 11.; 13. 17. 19.]
y = [1., 0., 0.]
x = similar(y)

lsolver = LinearSolver(LU(; static=false), x)
factorize!(lsolver, A)
cache(lsolver).A

# output

3×3 Matrix{Float64}:
 13.0        17.0       19.0
  0.0769231   0.692308   1.53846
  0.384615    0.666667   2.66667

Here cache(lsolver).A stores the factorized matrix. If we call factorize! with two input arguments as above, the method first copies the matrix A into the LUSolverCache. We can equivalently also do:

A = [1. 2. 3.; 5. 7. 11.; 13. 17. 19.]
y = [1., 0., 0.]

lsolver = LinearSolver(LU(), A)
factorize!(lsolver)
cache(lsolver).A

# output

3×3 StaticArraysCore.MMatrix{3, 3, Float64, 9} with indices SOneTo(3)×SOneTo(3):
 13.0        17.0       19.0
  0.0769231   0.692308   1.53846
  0.384615    0.666667   2.66667

Also note the difference between the output types of the two refactorized matrices. This is because we set the keyword static to false when calling LU. Also see _static.

source
SimpleSolvers.increase_iteration_number!Method
increase_iteration_number!(status)

Increase iteration number of status.

Examples

status = NonlinearSolverStatus{Float64}(5)
increase_iteration_number!(status)
iteration_number(status)

# output

1
source
SimpleSolvers.linearsolverMethod
linearsolver(solver)

Return the linear part (i.e. a LinearSolver) of an NewtonSolver.

Examples

x = rand(3)
y = rand(3)
F(x) = tanh.(x)
F!(y, x, params) = y .= F(x)
s = NewtonSolver(x, y; F = F!)
linearsolver(s)

# output

LinearSolver{Float64, LU{Missing}, SimpleSolvers.LUSolverCache{Float64, StaticArraysCore.MMatrix{3, 3, Float64, 9}}}(LU{Missing}(missing, true), SimpleSolvers.LUSolverCache{Float64, StaticArraysCore.MMatrix{3, 3, Float64, 9}}([0.0 0.0 0.0; 0.0 0.0 0.0; 0.0 0.0 0.0], [0, 0, 0], [0, 0, 0], 0))
source
SimpleSolvers.linesearch_objectiveMethod
linesearch_objective(objective, cache)

Create TemporaryUnivariateObjective for linesearch algorithm. The variable on which this objective depends is $\alpha$.

Example

x = [1, 0., 0.]
f = x -> sum(x .^ 3 / 6 + x .^ 2 / 2)
obj = MultivariateObjective(f, x)
gradient!(obj, x)
value!(obj, x)
cache = NewtonOptimizerCache(x)
hess = Hessian(obj, x; mode = :autodiff)
update!(hess, x)
update!(cache, x, obj.g, hess)
x₂ = [.9, 0., 0.]
gradient!(obj, x₂)
value!(obj, x₂)
update!(hess, x₂)
update!(cache, x₂, obj.g, hess)
ls_obj = linesearch_objective(obj, cache)
α = .1
(ls_obj.F(α), ls_obj.D(α))

# output

(0.4412947468016475, -0.8083161485821551)

In the example above we have to apply update! twice on the instance of NewtonOptimizerCache because it needs to store the current and the previous iterate.

Implementation

Calling the function and derivative stored in the TemporaryUnivariateObjective created with linesearch_objective does not allocate a new array, but uses the one stored in the instance of NewtonOptimizerCache.

source
SimpleSolvers.linesearch_objectiveMethod
linesearch_objective(nls, cache, params)

Make a line search objective for a Newton solver (the cache here is an instance of NewtonSolverCache).

Implementation

Producing a single-valued output

Different from the linesearch_objective for NewtonOptimizerCaches, we apply l2norm to the output of objective!. This is because the solver operates on an objective with multiple outputs from which we have to find roots, whereas an optimizer operates on an objective with a single output of which we should find a minimum.

Also see linesearch_objective(::MultivariateObjective{T}, ::NewtonOptimizerCache{T}) where {T}.

source
SimpleSolvers.meets_stopping_criteriaMethod
meets_stopping_criteria(status, config)

Determines whether the iteration stops based on the current NonlinearSolverStatus.

Warn

The function meets_stopping_criteria may return true even if the solver has not converged. To check convergence, call assess_convergence! (with the same input arguments).

The function meets_stopping_criteria returns true if one of the following is satisfied:

  • the status::NonlinearSolverStatus is converged (checked with assess_convergence!) and iteration_number(status) ≥ config.min_iterations,
  • status.f_increased and config.allow_f_increases = false (i.e. f increased even though we do not allow it),
  • iteration_number(status) ≥ config.max_iterations,
  • if any component in solution(status) is NaN,
  • if any component in status.f is NaN,
  • status.rxₐ > config.x_abstol_break (by default Inf. In theory this returns true if the residual gets too big,
  • status.rfₐ > config.f_abstol_break (by default Inf. In theory this returns true if the residual gets too big,

So convergence is only one possible criterion for which meets_stopping_criteria. We may also satisfy a stopping criterion without having convergence!

Examples

In the following example we show that meets_stopping_criteria evaluates to true when used on a freshly allocated NonlinearSolverStatus:

status = NonlinearSolverStatus{Float64}(5)
config = Options()
meets_stopping_criteria(status, config)

# output

true

This obviously has not converged. To check convergence we can use assess_convergence!:

status = NonlinearSolverStatus{Float64}(5)
config = Options()
assess_convergence!(status, config)

# output

false
source
SimpleSolvers.meets_stopping_criteriaMethod
meets_stopping_criteria(status, config)

Check if the optimizer has converged.

Implementation

meets_stopping_criteria first calls assess_convergence! and then checks if one of the following is true:

  • converged (the output of assess_convergence!) is true and status.i $\geq$ config.min_iterations,
  • if config.allow_f_increases is false: status.f_increased is true,
  • status.i $\geq$ config.max_iterations,
  • status.rxₐ $>$ config.x_abstol_break
  • status.rxᵣ $>$ config.x_reltol_break
  • status.rfₐ $>$ config.f_abstol_break
  • status.rfᵣ $>$ config.f_reltol_break
  • status.rg $>$ config.g_restol_break
  • status.x_isnan
  • status.f_isnan
  • status.g_isnan
source
SimpleSolvers.minimum_decrease_thresholdMethod
minimum_decrease_threshold(T)

The minimum value by which a function $f$ should decrease during an iteration.

The default value of $10^-4$ is often used in the literature [bierlaire2015optimization], nocedal2006numerical(@cite).

Examples

minimum_decrease_threshold(Float64)

# output

0.0001
minimum_decrease_threshold(Float32)

# output

0.0001f0
source
SimpleSolvers.print_statusMethod
print_status(status, config)

Print the solver status if:

  1. The following three are satisfied: (i) config.verbosity $\geq1$ (ii) assess_convergence!(status, config) is false (iii) iteration_number(status) > config.max_iterations
  2. config.verbosity > 1.
source
SimpleSolvers.solve!Method
solve!(x, opt)

Solve the optimization problem described by opt::Optimizer and store the result in x.

f(x) = sum(x .^ 2 + x .^ 3 / 3)
x = [1f0, 2f0]
opt = Optimizer(x, f; algorithm = Newton())

solve!(opt, x)

# output
2-element Vector{Float32}:
 4.6478817f-8
 3.0517578f-5

We can also check how many iterations it took:

iteration_number(opt)

# output

12

Too see the value of x after one iteration confer the docstring of solver_step!.

source
SimpleSolvers.solver_step!Method
solver_step!(x, state)

Compute a full iterate for an instance of NewtonOptimizerState state.

This also performs a line search.

Examples

f(x) = sum(x .^ 2 + x .^ 3 / 3)
x = [1f0, 2f0]
opt = Optimizer(x, f; algorithm = Newton())

solver_step!(opt, x)

# output

2-element Vector{Float32}:
 0.25
 0.6666666
source
SimpleSolvers.triple_point_finderMethod
triple_point_finder(f, x)

Find three points a > b > c s.t. f(a) > f(b) and f(c) > f(b). This is used for performing a quadratic line search (see QuadraticState).

Implementation

For δ we take DEFAULT_BRACKETING_s as default. For nmax we take [DEFAULTBRACKETINGnmax`](@ref) as default.

Extended help

The algorithm is taken from [2, Chapter 11.2.1].

source
SimpleSolvers.value!!Method
value!!(obj::AbstractObjective, x)

Set obj.x_f to x and obj.f to value(obj, x) and return value(obj).

source
SimpleSolvers.value!Method
value!(nls::NonlinearSystem, x)

Check if x is not equal to f_argument(nls) and then apply value!!. Else simply return value(nls).

source