SimpleSolvers

SimpleSolvers.DEFAULT_WOLFE_c₂Constant
const DEFAULT_WOLFE_c₂

The constant used in the second Wolfe condition (the CurvatureCondition). According to [1, 2] we should have

\[c_2 \in (c_1, 1),\]

where $c_1$ is the constant specified by DEFAULT_WOLFE_c₁.

Furthermore, [1] recommends $c_2 = 0.9$; in [2] the authors write: "it is common to set $c_2=0.1$ when approximate line search is used with the conjugate gradient method and to 0.9 when used with Newton's method." We use $c_2 = 0.9$ as the default.

source
SimpleSolvers.BacktrackingType
Backtracking <: LinesearchMethod

Keys

The keys are:

  • α₀=1.0: the initial step size $\alpha$. This is decreased iteratively by a factor $p$ until the Wolfe conditions (the SufficientDecreaseCondition and the CurvatureCondition) are satisfied.
  • c₁=0.0001: the constant on whose basis the SufficientDecreaseCondition is tested. Also see DEFAULT_WOLFE_c₁.
  • c₂=0.9: the constant on whose basis the CurvatureCondition is tested. We should have $c_2\in(c_1, 1).$ The closer this constant is to 1, the easier it is to satisfy the CurvatureCondition.
  • p=0.5: a parameter with which $\alpha$ is decreased in every step until the stopping criterion is satisfied.

Implementation

The algorithm starts by setting:

\[\begin{aligned} x_0 &\gets 0,\\ y_0 &\gets f(x_0),\\ d_0 &\gets f'(x_0),\\ \alpha &\gets \alpha_0, \end{aligned}\]

where $f$ is of type LinesearchProblem and $\alpha_0$ is stored in ls. It then repeatedly does $\alpha \gets \alpha\cdot{}p$ until either (i) the maximum number of iterations is reached (the max_iterations keyword in Options) or (ii) the SufficientDecreaseCondition and the CurvatureCondition are satisfied.

Extended help

Sometimes the parameters $p$ and $\epsilon$ have different names such as $\tau$ and $c$.
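
The decrease loop can be sketched in plain Julia. This is a minimal illustration with a hypothetical backtracking helper that only checks the sufficient decrease condition (the full Backtracking method also tests the CurvatureCondition):

```julia
# Minimal backtracking sketch (hypothetical helper, not the package implementation):
# decrease α by the factor p until the sufficient decrease condition holds.
function backtracking(f, d, α₀; c₁ = 1e-4, p = 0.5, max_iterations = 100)
    y₀, d₀ = f(0.0), d(0.0)        # function value and derivative at α = 0
    α = α₀
    for _ in 1:max_iterations
        # sufficient decrease (Armijo) condition
        f(α) ≤ y₀ + c₁ * α * d₀ && break
        α *= p                      # shrink the step size
    end
    α
end

f(α) = (α - 1.0)^2                  # toy line search problem with minimum at α = 1
d(α) = 2α - 2.0
backtracking(f, d, 1.0)             # α₀ = 1 already satisfies the condition
```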

source
SimpleSolvers.CurvatureConditionType
CurvatureCondition <: LinesearchCondition

The second of the Wolfe conditions [1]. The first one is the SufficientDecreaseCondition.

This encompasses the standard curvature condition and the strong curvature condition. This can be specified via the mode keyword.

With the standard curvature condition we check:

\[f'(\alpha) \geq c_2 d,\]

where $c_2$ is the associated hyperparameter and $d$ is the derivative at $\alpha_0$. Note that $d = f'(\alpha_0)$ should be negative.

With the strong curvature condition we check:

\[|f'(\alpha)| < c_2 |d|.\]
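
As a hedged illustration (plain Julia with illustrative names, not the package's CurvatureCondition type), both conditions can be checked for a toy problem:

```julia
# Toy line search problem f(α) = (α - 1)² with derivative f′; d = f′(0) = -2 < 0.
fprime(α) = 2α - 2.0
c₂ = 0.9
d  = fprime(0.0)

standard(α) = fprime(α) ≥ c₂ * d            # standard curvature condition
strong(α)   = abs(fprime(α)) < c₂ * abs(d)  # strong curvature condition

standard(0.5), strong(0.5)   # both hold: f′(0.5) = -1 ≥ -1.8 and |-1| < 1.8
```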

Constructor

CurvatureCondition(c, xₖ, dₖ, pₖ, grad; mode)

Here grad has to be a function computing the derivative of the objective. The other inputs are numbers.

source
SimpleSolvers.GradientAutodiffType
GradientAutodiff <: Gradient

A struct that realizes Gradient by using ForwardDiff.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • ∇config: result of applying ForwardDiff.GradientConfig.

Constructors

GradientAutodiff(F, x::AbstractVector)
GradientAutodiff{T}(F, nx::Integer)

Functor

The functor does:

grad(g, x) = ForwardDiff.gradient!(g, grad.F, x, grad.∇config)
source
SimpleSolvers.GradientFiniteDifferencesType
GradientFiniteDifferences <: Gradient

A struct that realizes Gradient by using finite differences.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • ϵ: small constant on whose basis the finite differences are computed.
  • e: auxiliary vector used for computing finite differences. It's of the form $e_1 = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^T$.
  • tx: auxiliary vector used for computing finite differences. It stores the offset in the x vector.

Constructor(s)

GradientFiniteDifferences{T}(F, nx::Integer; ϵ)

By default ϵ is DEFAULT_GRADIENT_ϵ.

Functor

The functor does (for grad(g, x)):

for j in eachindex(x,g)
    ϵⱼ = grad.ϵ * x[j] + grad.ϵ
    fill!(grad.e, 0)
    grad.e[j] = 1
    grad.tx .= x .- ϵⱼ .* grad.e
    f1 = grad.F(grad.tx)
    grad.tx .= x .+ ϵⱼ .* grad.e
    f2 = grad.F(grad.tx)
    g[j] = (f2 - f1) / (2ϵⱼ)
end
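
The loop can be packaged into a self-contained sketch (plain Julia; fd_gradient! and its default step size are illustrative assumptions, not the package implementation or DEFAULT_GRADIENT_ϵ):

```julia
# Central finite differences with the componentwise step ϵⱼ = ϵ * x[j] + ϵ,
# mirroring the functor above.
function fd_gradient!(g, F, x; ϵ = cbrt(eps(eltype(x))))
    tx = similar(x)
    for j in eachindex(x, g)
        ϵⱼ = ϵ * x[j] + ϵ
        tx .= x; tx[j] -= ϵⱼ       # x - ϵⱼ eⱼ
        f1 = F(tx)
        tx .= x; tx[j] += ϵⱼ       # x + ϵⱼ eⱼ
        f2 = F(tx)
        g[j] = (f2 - f1) / (2ϵⱼ)
    end
    g
end

F(x) = sum(abs2, x)                   # ∇F(x) = 2x
g = zeros(3)
fd_gradient!(g, F, [1.0, 2.0, 3.0])   # ≈ [2.0, 4.0, 6.0]
```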
source
SimpleSolvers.GradientFunctionType
GradientFunction <: Gradient

A struct that realizes a Gradient by explicitly supplying a function.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • ∇F!: a function that can be applied in place.

Functor

The functor does:

grad(g, x) = grad.∇F!(g, x)
source
SimpleSolvers.HessianType
Hessian

Abstract type. Structs derived from this need an associated functor that computes the Hessian of a function (in-place).

Also see Gradient.

Implementation

When a custom Hessian is implemented, a functor is needed:

function (hessian::Hessian)(h::AbstractMatrix, x::AbstractVector) end

Examples

Examples include HessianAutodiff and HessianFunction.

source
SimpleSolvers.HessianAutodiffType
HessianAutodiff <: Hessian

A struct that realizes Hessian by using ForwardDiff.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • Hconfig: result of applying ForwardDiff.HessianConfig.

Constructors

HessianAutodiff{T}(F, Hconfig)
HessianAutodiff(F, x::AbstractVector)
HessianAutodiff{T}(F, nx::Integer)

Functor

The functor does:

hes(H, x) = ForwardDiff.hessian!(H, hes.F, x, hes.Hconfig)
source
SimpleSolvers.HessianFunctionType
HessianFunction <: Hessian

A struct that realizes a Hessian by explicitly supplying a function.

Keys

The struct stores:

  • H!: a function that can be applied in place.

Functor

The functor does:

hes(H, x) = hes.H!(H, x)
source
SimpleSolvers.JacobianAutodiffType
JacobianAutodiff <: Jacobian

A struct that realizes Jacobian by using ForwardDiff.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • Jconfig: result of applying ForwardDiff.JacobianConfig.
  • ty: vector that is used for evaluating ForwardDiff.jacobian!

Constructors

JacobianAutodiff(F, y::AbstractVector)
JacobianAutodiff{T}(F, nx::Integer)

Functor

The functor does:

jac(J, x) = ForwardDiff.jacobian!(J, jac.F, jac.ty, x, jac.Jconfig)
source
SimpleSolvers.JacobianFiniteDifferencesType
JacobianFiniteDifferences <: Jacobian

A struct that realizes Jacobian by using finite differences.

Keys

The struct stores:

  • F: a function that has to be differentiated.
  • ϵ: small constant on whose basis the finite differences are computed.
  • f1: $f$ evaluated at $x - \epsilon(1 + x_j)e_j$ for all $j$.
  • f2: $f$ evaluated at $x + \epsilon(1 + x_j)e_j$ for all $j$.
  • e: auxiliary vector used for computing finite differences. It's of the form $e_1 = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^T$.
  • tx: auxiliary vector used for computing finite differences. It stores the offset in the x vector.

Constructor(s)

JacobianFiniteDifferences{T}(F, nx::Integer, ny::Integer; ϵ)

By default ϵ is DEFAULT_JACOBIAN_ϵ.

Functor

The functor does:

for j in eachindex(x)
    ϵⱼ = jac.ϵ * x[j] + jac.ϵ
    fill!(jac.e, 0)
    jac.e[j] = 1
    jac.tx .= x .- ϵⱼ .* jac.e
    f(jac.f1, jac.tx)
    jac.tx .= x .+ ϵⱼ .* jac.e
    f(jac.f2, jac.tx)
    for i in eachindex(x)
        J[i,j] = (jac.f2[i] - jac.f1[i]) / (2ϵⱼ)
    end
end
source
SimpleSolvers.JacobianFunctionType
JacobianFunction <: Jacobian

A struct that realizes a Jacobian by explicitly supplying a function taken from the NonlinearProblem.

Functor

The functor does:

f(y, x, params) = y .= [1. √2.; √2. 3.] * x
∇f(j, x, params) = j .= [1. √2.; √2. 3.]

jac = JacobianFunction(f, ∇f, Float64)
j = zeros(Float64, 2, 2)
x = ones(Float64, 2)

jac(j, x, NullParameters())

# output

2×2 Matrix{Float64}:
 1.0      1.41421
 1.41421  3.0
source
SimpleSolvers.LUType
struct LU <: DirectMethod

A custom implementation of an LU solver, meant to solve a LinearProblem.

Routines that use the LU solver include factorize!, ldiv! and solve!.

Constructor

The constructor is called with either no argument:

LU()

# output

LU{Missing}(missing, true)

or with pivot and static as optional booleans:

LU(; pivot=true, static=true)

# output

LU{Bool}(true, true)

Note that if we do not supply an explicit keyword static, the corresponding field is missing (as in the first case). Also see _static.

Example

We use the LU together with solve to solve a linear system:

A = [1. 2. 3.; 5. 7. 11.; 13. 17. 19.]
v = rand(3)
ls = LinearProblem(A, v)
update!(ls, A, v)

lu = LU()

solve(lu, ls) ≈ inv(A) * v

# output

true

Note the role of the LinearProblem here.

source
SimpleSolvers.LUSolverCacheType
LUSolverCache <: LinearSolverCache

The cache for the LU solver.

Keys

  • A: the factorized matrix A,
  • pivots: a vector of pivots used during factorization,
  • perms: a vector of permutations used during factorization,
  • info: stores an index regarding pivoting.
source
SimpleSolvers.LinearProblemType
LinearProblem

A LinearProblem describes $Ax = y$, where we want to solve for $x$.

Keys

  • A
  • y

Constructors

A LinearProblem can be allocated by calling:

LinearProblem(A, y)
LinearProblem(A)
LinearProblem(y)
LinearProblem{T}(n, m)
LinearProblem{T}(n)

Note that in any case the allocated system is initialized with NaNs:

A = [1. 2. 3.; 4. 5. 6.; 7. 8. 9.]
y = [1., 2., 3.]
ls = LinearProblem(A, y)

# output

LinearProblem{Float64, Vector{Float64}, Matrix{Float64}}([NaN NaN NaN; NaN NaN NaN; NaN NaN NaN], [NaN, NaN, NaN])

In order to initialize the system with values, we have to call update!:

update!(ls, A, y)

# output

LinearProblem{Float64, Vector{Float64}, Matrix{Float64}}([1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0], [1.0, 2.0, 3.0])
source
SimpleSolvers.LinearSolverType
LinearSolver <: AbstractSolver

A struct that stores LinearSolverMethods (for example LU) and LinearSolverCaches (for example LUSolverCache). LinearSolvers are used to solve LinearProblems.

Constructors

LinearSolver(method, cache)
LinearSolver(method, A)
LinearSolver(method, ls::LinearProblem)
LinearSolver(method, x)

Info

We note that the constructors do not call the function factorize!; they only allocate a new matrix. The factorization needs to be done manually.

You can manually factorize by either calling factorize! or solve!.

source
SimpleSolvers.LinearSolverMethodType
LinearSolverMethod <: SolverMethod

Summarizes all the methods used for solving linear systems of equations such as the LU method.

Extended help

The abstract type SolverMethod was imported from GeometricBase.

source
SimpleSolvers.LinesearchProblemType
LinesearchProblem <: AbstractProblem

In practice LinesearchProblems are allocated by calling linesearch_problem.

Constructors

Calling line search problems

Below we show a constructor that can be used to allocate a LinesearchProblem. Note however that in practice one should call linesearch_problem and not use the constructor directly.

f(x) = x^2 - 1
g(x) = 2x
δx(x) = - g(x) / 2
x₀ = 3.
_f(α,_) = f(compute_new_iterate(x₀, α, δx(x₀)))
_d(α,_) = g(compute_new_iterate(x₀, α, δx(x₀)))
ls_obj = LinesearchProblem{typeof(x₀)}(_f, _d)

# output

LinesearchProblem{Float64, typeof(_f), typeof(_d)}(_f, _d)
source
SimpleSolvers.NewtonMethodType
NewtonMethod <: NonlinearSolverMethod

Constructors

NewtonMethod()

# output

NewtonMethod{true}(1)

QuasiNewtonMethod()

# output

QuasiNewtonMethod(5)

Info

The refactorize parameter determines how often the Jacobian is refactored. This is the difference between the NewtonSolver and QuasiNewtonSolver.

source
SimpleSolvers.NewtonSolverType
NewtonSolver

A const derived from NonlinearSolver as NewtonSolver{T} = NonlinearSolver{T,NewtonMethod{true}}.

Constructors

The NewtonSolver can be called with a NonlinearProblem or with a Callable.

See NewtonSolver(::AbstractVector{T}, ::Callable, ::AbstractVector{T}) where {T}.

F(y, x, params) = y .= sin.(x) .^ 2
x = ones(5)
y = zeros(5)

ns = NewtonSolver(x, F, y)
typeof(ns) <: NewtonSolver

# output

true

Keywords

source
SimpleSolvers.NonlinearProblemType
NonlinearProblem

A NonlinearProblem describes $F(x) = y$, where we want to solve for $x$ and $F$ is nonlinear in general (also compare this to LinearProblem).

Info

NonlinearProblems are used for solvers whereas OptimizerProblems are their equivalent for optimizers.

Keys

  • F
  • J::Union{Callable, Missing}: accessed by calling jacobian.

Constructors

We show an example for one particular constructor:

F(y, x) = y .= sin.(x) .^ 2
NonlinearProblem(F, zeros(3))

# output

NonlinearProblem{Float64, typeof(F), Missing}(F, missing)
source
SimpleSolvers.NonlinearSolverType
NonlinearSolver

A struct that comprises Newton solvers (see NewtonMethod), the Picard solver (also known as fixed-point iteration; see PicardMethod) and the Dogleg solver (see DogLeg).

Info

The associated solvers are consts derived from NonlinearSolver. See NewtonSolver, PicardSolver and DogLegSolver. In practice we usually call those associated constructors directly rather than creating a NonlinearSolver instance manually.

Keys

source
SimpleSolvers.NonlinearSolverCacheType
NonlinearSolverCache <: AbstractNonlinearSolverCache

Derived from AbstractNonlinearSolverCache. Used in NonlinearSolver.

Keys

source
SimpleSolvers.NonlinearSolverStatusType
NonlinearSolverStatus

Stores absolute, relative and successive residuals for x and f. It is used as a diagnostic tool in NewtonSolver.

Info

Compare this to the NonlinearSolverState and the NonlinearSolverCache.

Keys

  • iteration: number of iterations
  • rxₛ: successive residual in x,
  • rfₐ: absolute residual in f,
  • rfₛ: successive residual in f,
  • x_converged::Bool
  • f_converged::Bool
  • f_increased::Bool

Examples

x = [1., 2., 3., 4.]
state = NonlinearSolverState(x)
cache = NonlinearSolverCache(x, x)
config = Options()
NonlinearSolverStatus(state, config)

# output

i=   0,
rxₛ= NaN,
rfₐ= NaN,
rfₛ= NaN
source
SimpleSolvers.OptionsType
Options

Examples

Options()

# output

                x_abstol = 4.440892098500626e-16
                x_reltol = 4.440892098500626e-16
                x_suctol = 4.440892098500626e-16
                f_abstol = 0.0
                f_reltol = 4.440892098500626e-16
                f_suctol = 4.440892098500626e-16
                f_mindec = 0.0001
                g_restol = 1.4901161193847656e-8
          x_abstol_break = Inf
          x_reltol_break = Inf
          f_abstol_break = Inf
          f_reltol_break = Inf
          g_restol_break = Inf
       allow_f_increases = true
          min_iterations = 0
          max_iterations = 1000
         warn_iterations = 1000
              show_trace = false
             store_trace = false
          extended_trace = false
              show_every = 1
               verbosity = 1
      nan_max_iterations = 10
              nan_factor = 0.5
   regularization_factor = 0.0
Info

For the first few constants (x_abstol to g_restol) the default constructor uses the functions default_tolerance and absolute_tolerance.

source
SimpleSolvers.PicardSolverMethod
PicardSolver(x, F)

Arguments

  • x: the initial guess for the solution.
  • F: the nonlinear function to solve.
  • y

Keywords

  • DF!: the Jacobian of F,
  • linesearch: the linesearch algorithm to use, defaults to Backtracking,
  • jacobian: the Jacobian of F, defaults to JacobianAutodiff,
  • options_kwargs: see Options.

Examples

F(y, x, params) = y .= sin.(x) .^ 2
x = zeros(2)
y = similar(x)

s = PicardSolver(x, F, y)
state = SolverState(s)

solve!(x, s, state)

# output

2-element Vector{Float64}:
 0.0
 0.0
source
SimpleSolvers.QuadraticType
Quadratic <: LinesearchMethod

Quadratic Polynomial line search based on the polynomial

\[p(α) = p_0 + p_1(\alpha - \alpha_0) + p_2(\alpha - \alpha_0)^2.\]

Performs multiple iterations in which all parameters $p_0$, $p_1$ and $p_2$ are adapted. We do not check the SufficientDecreaseCondition here; instead we repeatedly build new quadratic polynomials until a minimum is found (to sufficient accuracy). The iteration may also stop once the maximum number of iterations is reached (see MAX_NUMBER_OF_ITERATIONS_FOR_QUADRATIC_LINESEARCH).

Keywords

  • ε: A constant that checks the precision/tolerance.
  • s: A constant that determines the initial interval for bracketing. By default this is DEFAULT_BRACKETING_s.
  • s_reduction: A constant that determines the factor by which s is decreased in each new bracketing iteration.

Extended help

The quadratic method. Compare this to BierlaireQuadratic. The algorithm is adjusted from [5].

source
SimpleSolvers.QuasiNewtonSolverMethod
QuasiNewtonSolver

A convenience constructor for NewtonSolver. Also see DEFAULT_ITERATIONS_QUASI_NEWTON_SOLVER.

Calling QuasiNewtonSolver hence produces an instance of NewtonSolver with the difference that refactorize ≠ 1. The Jacobian is thus not evaluated and refactored in every step.

Implementation

It does:

QuasiNewtonSolver(args...; kwargs...) = NewtonSolver(args...; refactorize=DEFAULT_ITERATIONS_QUASI_NEWTON_SOLVER, kwargs...)
Warning

This will be deprecated in the future in favour of using NewtonSolver directly and specifying the refactorize integer.

source
SimpleSolvers.StaticType
Static <: LinesearchMethod

The static method.

Keys

Keys include:

  • α: equivalent to a step size. The default is 1.

Examples

Static()

# output

Static with α = 1.0.
source
SimpleSolvers.SufficientDecreaseConditionType
SufficientDecreaseCondition <: BacktrackingCondition

The condition that determines whether the change induced by $\alpha_k$ is big enough. This is used in Backtracking.

Example

c = SimpleSolvers.DEFAULT_WOLFE_c₁
f(x) = (x - 1.) ^ 2
xₖ = 0.
fₖ = f(xₖ)
dₖ = 2xₖ - 2.

sdc = SufficientDecreaseCondition(c, fₖ, dₖ, f)
sdc(1.9), sdc(2.)

# output

(true, false)

Extended help

We call the constant that pertains to the sufficient decrease condition $c$. This is typically called $c_1$ in the literature [1]. See DEFAULT_WOLFE_c₁ for the relevant constant.

source
GeometricBase.update!Method
update!(state, x, y)

Update the previous and current solution and value stored in state based on x and y.

Examples

julia> f(y, x, params) = y .= sin.(x .- .5) .^ 2
f (generic function with 1 method)

julia> x = ones(1) / 4
1-element Vector{Float64}:
 0.25

julia> y = zero(x); f(y, x, NullParameters())
1-element Vector{Float64}:
 0.06120871905481365

julia> state = NonlinearSolverState(x)
NonlinearSolverState{Float64, Vector{Float64}, Vector{Float64}}(0, [NaN], [NaN], [NaN], [NaN])

julia> update!(state, x, y)
NonlinearSolverState{Float64, Vector{Float64}, Vector{Float64}}(0, [0.25], [NaN], [0.06120871905481365], [NaN])

julia> x = ones(1) / 2
1-element Vector{Float64}:
 0.5

julia> f(y, x, NullParameters())
1-element Vector{Float64}:
 0.0

julia> update!(state, x, y)
NonlinearSolverState{Float64, Vector{Float64}, Vector{Float64}}(0, [0.5], [0.25], [0.0], [0.06120871905481365])

The NonlinearSolverState stores the previous solution, the previous value, the current solution and the current value.

All of these are updated during one update! step (and initialized with NaNs).

source
LinearAlgebra.ldiv!Method
ldiv!(x, lu, b)

Compute inv(cache(lsolver).A) * b by utilizing the factorization of the lu solver (see LU and LinearSolver) and store the result in x.

Examples

julia> A = [1.; 0.; 0.;; 0.; 2.; 0.;; 0.; 0.; 4.]
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  2.0  0.0
 0.0  0.0  4.0

julia> b = [1., 1., 1.]
3-element Vector{Float64}:
 1.0
 1.0
 1.0

julia> s = LinearSolver(LU(), A); factorize!(s); x = zeros(3)
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> ldiv!(x, s, b)
3-element Vector{Float64}:
 1.0
 0.5
 0.25
Info

Note that we need to call factorize! here after having allocated the LinearSolver.

source
SimpleSolvers.absolute_toleranceMethod
absolute_tolerance(T)

Determine the absolute tolerance for a specific data type. This is used in the constructor of Options.

In comparison to default_tolerance, this should return a very small number, close to zero (i.e. not just machine precision).

Examples

julia> absolute_tolerance(Float64)
0.0
julia> absolute_tolerance(Float32)
0.0f0
source
SimpleSolvers.bisectionMethod
bisection(f, xmin, xmax; config)

Perform bisection of f in the interval [xmin, xmax] with Options config.

The algorithm is repeated until a root is found (up to tolerance config.f_abstol which is determined by default_tolerance by default).

Info

When calling bisection, it first checks whether $x_\mathrm{min} < x_\mathrm{max}$ and otherwise flips the two entries.

Info

You can also call bisection with only one x as input argument. It then uses bracket_minimum to find a suitable interval.

Extended help

The bisection algorithm divides an interval into equal halves until a root is found (up to a desired accuracy).

We first initialize:

\[\begin{aligned} \alpha_0 \gets & \alpha_\mathrm{min}, \\ \alpha_1 \gets & \alpha_\mathrm{max}, \end{aligned}\]

and then repeat:

\[\begin{aligned} & \alpha \gets \frac{\alpha_0 + \alpha_1}{2}, \\ & \text{if $f(\alpha_0)f(\alpha) > 0$} \\ & \qquad \alpha_0 \gets \alpha, \\ & \text{else} \\ & \qquad \alpha_1 \gets \alpha, \\ & \text{end} \end{aligned}\]

So the algorithm checks in each step where the sign change occurred and moves $\alpha_0$ or $\alpha_1$ accordingly. The loop is terminated (with an error) if config.max_iterations is reached (by default 1000 in the Options struct).
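
A minimal sketch of this loop (plain Julia; bisect and its tolerance default are hypothetical names, not the package's bisection):

```julia
# Bisection sketch: halve the interval until |f(α)| is below the tolerance.
function bisect(f, α₀, α₁; f_abstol = 1e-12, max_iterations = 1000)
    α₀ > α₁ && ((α₀, α₁) = (α₁, α₀))   # flip the interval if supplied in the wrong order
    for _ in 1:max_iterations
        α = (α₀ + α₁) / 2
        abs(f(α)) ≤ f_abstol && return α
        # keep the half of the interval in which the sign change occurs
        f(α₀) * f(α) > 0 ? (α₀ = α) : (α₁ = α)
    end
    error("no root found after $max_iterations iterations")
end

bisect(α -> α^2 - 2, 0.0, 2.0)   # ≈ √2
```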

Warning

The obvious danger when using bisection is that the supplied interval can contain multiple roots (or no roots). One should be careful to avoid this when fixing the interval.

source
SimpleSolvers.bracket_minimumMethod
bracket_minimum(f, x)

Move a bracket successively in the search direction (starting at x) and increase its size until a local minimum of f is found.

This is used in bisection when only one x is given (and not an entire interval).

This bracketing algorithm is taken from [2]. Also compare it to bracket_minimum_with_fixed_point.

Arguments

Extended help

For bracketing we need two constants $s$ and $k$ (see DEFAULT_BRACKETING_s and DEFAULT_BRACKETING_k).

Before we start the algorithm we initialize it, i.e. we check that we indeed have a descent direction:

\[\begin{aligned} & a \gets x, \\ & b \gets a + s, \\ & \mathrm{if} \quad f(b) > f(a)\\ & \qquad\text{Flip $a$ and $b$ and set $s\gets-s$.}\\ & \mathrm{end} \end{aligned}\]

The algorithm then successively computes:

\[c \gets b + s,\]

and then checks whether $f(c) > f(b)$ (also see BracketMinimumCriterion). If this is true it returns $(a, c)$ or $(c, a)$, depending on whether $a<c$ or $c<a$ respectively. If this is not satisfied $a,$ $b$ and $s$ are updated:

\[\begin{aligned} a \gets & b, \\ b \gets & c, \\ s \gets & sk, \end{aligned}\]

and the algorithm is continued. If the criterion has not been satisfied after $n_\mathrm{max}$ iterations (see DEFAULT_BRACKETING_nmax), the algorithm is terminated with an error. The interval that is returned by bracket_minimum is then typically used as a starting point for bisection.
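
The steps above can be sketched as follows (plain Julia; bracket_min, s and k are illustrative stand-ins for the package function and the constants DEFAULT_BRACKETING_s and DEFAULT_BRACKETING_k):

```julia
# Bracketing sketch: move and grow the bracket until f(c) > f(b).
function bracket_min(f, x; s = 1e-2, k = 2.0, nmax = 100)
    a, b = x, x + s
    if f(b) > f(a)                  # make sure we move in a descent direction
        a, b = b, a
        s = -s
    end
    for _ in 1:nmax
        c = b + s
        f(c) > f(b) && return a < c ? (a, c) : (c, a)
        a, b, s = b, c, s * k       # shift the bracket and grow the step
    end
    error("no bracket found after $nmax iterations")
end

a, b = bracket_min(α -> (α - 1.0)^2, 0.0)
a < 1.0 < b    # the bracket contains the minimizer
```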

Info

The function bracket_root is equivalent to bracket_minimum with the only difference that the criterion we check for is:

\[f(c)f(b) < 0,\]

i.e. that a sign change in the function occurs. Also see BracketRootCriterion.

source
SimpleSolvers.bracket_minimum_with_fixed_pointMethod
bracket_minimum_with_fixed_point(f, x, s, k, nmax)

Find a bracket while keeping the left side (i.e. x) fixed.

The algorithm is similar to bracket_minimum (also based on DEFAULT_BRACKETING_s and DEFAULT_BRACKETING_k) with the difference that for the latter the left side is also moving.

The function bracket_minimum_with_fixed_point is used as a starting point for Quadratic (adapted from [5]), as the coefficient $p_2$ of the fitted polynomial is:

\[p_2 = \frac{f(b) - f(a) - f'(a)b}{b^2},\]

where $b = \mathtt{bracket\_minimum\_with\_fixed\_point}(a)$. We check that $f(b) > f(a)$ in order to ensure that the curvature of the polynomial (i.e. $p_2$) is positive and we have a minimum.
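
The fit itself can be illustrated in plain Julia (all names are illustrative; here the left point is fixed at $a = 0$, so $p(\alpha) = f(a) + f'(a)\alpha + p_2\alpha^2$ has its minimum at $-f'(a)/(2p_2)$):

```julia
# Fit a parabola through f(a), f′(a) and f(b), then take its minimizer.
f(α)  = (α - 1.0)^2          # toy problem, minimum at α = 1
df(α) = 2α - 2.0

a, b = 0.0, 3.0              # f(b) > f(a), so the fitted curvature p₂ is positive
p₂ = (f(b) - f(a) - df(a) * b) / b^2
αmin = -df(a) / (2p₂)        # minimizer of the fitted parabola

(p₂, αmin)                   # exact for a quadratic f: (1.0, 1.0)
```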

source
SimpleSolvers.cacheMethod
cache(ls)

Return the cache of the LinearSolver.

Examples

julia> ls = LinearSolver(LU(), [1 2; 3 4]);

julia> cache(ls)
SimpleSolvers.LUSolverCache{Int64, StaticArraysCore.MMatrix{2, 2, Int64, 4}}([1 2; 3 4], [0, 0], [0, 0], 0)
source
SimpleSolvers.check_gradientMethod
check_gradient(g)

Check norm, maximum value and minimum value of a vector.

Examples

julia> g = [1., 1., 1., 2., 0.9, 3.];

julia> SimpleSolvers.check_gradient(g; digits=3)
norm(Gradient):               4.1
minimum(|Gradient|):          0.9
maximum(|Gradient|):          3.0
source
SimpleSolvers.check_hessianMethod
check_hessian(H)

Check the condition number, determinant, max and min value of the Hessian H.

Info

Here the Hessian H is a matrix and not of type Hessian.

julia> H = [1. √2.; √2. 3.];

julia> SimpleSolvers.check_hessian(H)
Condition Number of Hessian: 13.9282
Determinant of Hessian:      1.0
minimum(|Hessian|):          1.0
maximum(|Hessian|):          3.0
source
SimpleSolvers.check_jacobianMethod
check_jacobian(J)

Check the condition number, determinant, max and min value of the Jacobian J.

Info

Here the Jacobian J is a matrix. It is not a Jacobian object.

julia> J = [1. √2.; √2. 3.];

julia> SimpleSolvers.check_jacobian(J)
Condition Number of Jacobian: 13.9282
Determinant of Jacobian:      1.0
minimum(|Jacobian|):          1.0
maximum(|Jacobian|):          3.0
source
SimpleSolvers.compute_new_iterate!Method
compute_new_iterate!(xₖ₊₁, xₖ, αₖ, pₖ)

Compute xₖ₊₁ based on a direction pₖ and a step length αₖ.

Extended help

In the case of vector spaces this function simply does:

xₖ = xₖ + αₖ * pₖ

For manifolds we instead perform a retraction [6].
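
In the vector-space case, a minimal stand-alone sketch (not the package method, which also handles manifolds) is:

```julia
# In-place update xₖ₊₁ = xₖ + αₖ pₖ for the vector-space case.
compute_new_iterate!(xₖ₊₁, xₖ, αₖ, pₖ) = xₖ₊₁ .= xₖ .+ αₖ .* pₖ

x  = [1.0, 2.0]
x₁ = similar(x)
compute_new_iterate!(x₁, x, 0.5, [2.0, -2.0])   # → [2.0, 1.0]
```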

source
SimpleSolvers.default_toleranceMethod
default_tolerance(T)

Determine the default tolerance for a specific data type. This is used in the constructor of Options.

Compare this to default_precision.

Examples

julia> default_tolerance(Float64)
4.440892098500626e-16
julia> default_tolerance(Float32)
2.3841858f-7
julia> default_tolerance(Float16)
Float16(0.001953)
source
SimpleSolvers.directions!Method
directions!(s, x, params)

Compute direction₁ and direction₂ for the DogLegSolver.

This is equivalent to direction! for the NewtonSolver.

Examples

julia> J = [0 1; -1 0];

julia> f(y, x, params) = y .= cos.(J * x .- 2.) .^ 2 / l2norm(sin.(x) .- 1.);

julia> x = zeros(2); y = similar(x); s = DogLegSolver(x, y; F = f);

julia> directions!(s, x, NullParameters());

julia> direction₁(cache(s))
2-element Vector{Float64}:
 -0.25513686072399455
  0.1601152321012896

julia> direction₂(cache(s))
2-element Vector{Float64}:
 -0.22882877718014286
  0.22882877718014288

Extended help

The Gauss-Newton direction (i.e. direction₂) is computed the usual way:

\[\mathbf{d}_2 = -\mathbf{J}^{-1} \mathbf{r}\]

where $\mathbf{J}$ is the Jacobian matrix and $\mathbf{r}$ is the residual vector. The steepest descent direction (taken from [1, Equation (11.46)]) is different:

\[\mathbf{d}_1 = -\frac{||\mathbf{J}^T\mathbf{r}||^2}{\mathbf{r}^T(\mathbf{J}\mathbf{J}^T)(\mathbf{J}\mathbf{J}^T)\mathbf{r}}\mathbf{J}^T\mathbf{r}.\]

The DogLegSolver then interpolates between these two directions (this interpolation is piecewise linear).
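
The two directions can be computed for a toy residual as a plain-Julia sketch of the formulas above (illustrative values, not the package implementation):

```julia
using LinearAlgebra

# Toy Jacobian and residual.
J = [2.0 0.0; 0.0 1.0]
r = [1.0, 1.0]

d₂ = -(J \ r)                 # Gauss-Newton direction: -J⁻¹ r
g  = J' * r                   # Jᵀr
A  = J * J'
d₁ = -(norm(g)^2 / dot(r, A * (A * r))) * g   # steepest descent direction

(d₁, d₂)
```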

source
SimpleSolvers.factorize!Method
factorize!(lsolver::LinearSolver, A)

Factorize the matrix A and store the result in cache(lsolver).A.

Note that calling cache on lsolver returns the instance of LUSolverCache stored in lsolver.

Examples

julia> A = [1. 2. 3.; 5. 7. 11.; 13. 17. 19.]
3×3 Matrix{Float64}:
  1.0   2.0   3.0
  5.0   7.0  11.0
 13.0  17.0  19.0

julia> x = zeros(3);

julia> lsolver = LinearSolver(LU(; static=false), x);

julia> factorize!(lsolver, A).cache.A
3×3 Matrix{Float64}:
 13.0        17.0       19.0
  0.0769231   0.692308   1.53846
  0.384615    0.666667   2.66667

julia> y = A * ldiv!(x, lsolver, ones(3));

julia> round.(y; digits = 10)
3-element Vector{Float64}:
 1.0
 1.0
 1.0

Here cache(lsolver).A stores the factorized matrix. If we call factorize! with two input arguments as above, the method first copies the matrix A into the LUSolverCache. We can equivalently also do:

julia> lsolver = LinearSolver(LU(), A);

julia> factorize!(lsolver).cache.A
3×3 StaticArraysCore.MMatrix{3, 3, Float64, 9} with indices SOneTo(3)×SOneTo(3):
 13.0        17.0       19.0
  0.0769231   0.692308   1.53846
  0.384615    0.666667   2.66667

Also note the difference between the output types of the two refactorized matrices. This is because we set the keyword static to false in the first call to LU. Also see _static.

Also see ldiv! for how the refactorized matrix is used.

source
SimpleSolvers.linearsolverMethod
linearsolver(solver)

Return the linear part (i.e. a LinearSolver) of a NewtonSolver.

Examples

x = rand(3)
y = rand(3)
F(x) = tanh.(x)
F!(y, x, params) = y .= F(x)
s = NewtonSolver(x, y; F = F!)
linearsolver(s)

# output

LinearSolver{Float64, LU{Missing}, SimpleSolvers.LUSolverCache{Float64, StaticArraysCore.MMatrix{3, 3, Float64, 9}}}(LU{Missing}(missing, true), SimpleSolvers.LUSolverCache{Float64, StaticArraysCore.MMatrix{3, 3, Float64, 9}}([0.0 0.0 0.0; 0.0 0.0 0.0; 0.0 0.0 0.0], [0, 0, 0], [0, 0, 0], 0))
source
SimpleSolvers.linesearch_problemMethod
linesearch_problem(nl::NonlinearSolver)

Build a line search problem based on a NonlinearSolver.

Producing a single-valued output

Different from the linesearch_problem for NewtonOptimizerCaches, we apply L2norm to the output of problem!. This is because the solver operates on a nonlinear problem with multiple outputs whose roots we have to find, whereas an optimizer operates on an optimizer problem with a single output of which we should find a minimum.

Examples

We show how to set up the LinesearchProblem for a simple example and compute $f^\mathrm{ls}(\alpha_0)$ and $\partial{}f^\mathrm{ls}/\partial\alpha(\alpha_0)$.

julia> F(y, x, params) = y .= (x .- 1.).^2;

julia> x = ones(3)/2; y = similar(x); nl = NewtonSolver(x, y; F = F);

julia> _params = NullParameters();

julia> direction!(nl, x, _params, 1)
3-element Vector{Float64}:
 0.25
 0.25
 0.25

julia> ls_prob = linesearch_problem(nl);

julia> state = NonlinearSolverState(x); update!(state, x, F(y, x, _params));

julia> params = (parameters = _params, x = state.x)
(parameters = NullParameters(), x = [0.5, 0.5, 0.5])

julia> ls_prob.F(0., params)
0.1875

julia> ls_prob.D(0., params)
-0.375
source
SimpleSolvers.meets_stopping_criteriaMethod
meets_stopping_criteria(status, config)

Determines whether the iteration stops based on the current NonlinearSolverStatus.

Warning

The function meets_stopping_criteria may return true even if the solver has not converged. To check convergence, call assess_convergence (with the same input arguments).

The function meets_stopping_criteria returns true if one of the following is satisfied:

  • the status::NonlinearSolverStatus is converged (checked with isconverged) and iteration_number(status) ≥ config.min_iterations,
  • status.f_increased and config.allow_f_increases = false (i.e. f increased even though we do not allow it),
  • iteration_number(status) ≥ config.max_iterations,
  • if any component in solution(status) is NaN,
  • if any component in status.f is NaN,
  • status.rxₐ > config.x_abstol_break (by default Inf); in theory this returns true if the residual gets too big,
  • status.rfₐ > config.f_abstol_break (by default Inf); in theory this returns true if the residual gets too big.

So convergence is only one possible criterion for which meets_stopping_criteria returns true. We may also satisfy a stopping criterion without having converged!

Examples

In the following example we show that meets_stopping_criteria evaluates to true when used on a freshly allocated NonlinearSolverStatus:

julia> config = Options(verbosity=0);

julia> x = [NaN, 2., 3.]
3-element Vector{Float64}:
 NaN
   2.0
   3.0

julia> f = [NaN, 10., 20.]
3-element Vector{Float64}:
 NaN
  10.0
  20.0

julia> cache = NonlinearSolverCache(x, copy(x));

julia> state = NonlinearSolverState(x);

julia> update!(state, x, f); state.iterations += 1
1

julia> status = NonlinearSolverStatus(state, config);

julia> meets_stopping_criteria(state, config)
true

This obviously has not converged. To check convergence we can use assess_convergence.

source
SimpleSolvers.minimum_decrease_thresholdMethod
minimum_decrease_threshold(T)

The minimum value by which a function $f$ should decrease during an iteration.

The default value of $10^{-4}$ is often used in the literature [3], [1].

Examples

julia> minimum_decrease_threshold(Float64)
0.0001
julia> minimum_decrease_threshold(Float32)
0.0001f0
source
SimpleSolvers.print_statusMethod
print_status(status, config)

Print the solver status if:

  • config.verbosity $\geq1$ and one of the following two
  1. the solver is converged,
  2. status.iterations > config.max_iterations
  • config.verbosity $>1.$
source
SimpleSolvers.residualsMethod
residuals(state)

Compute the residuals based on state. The computed residuals are the following:

  • rxₛ: successive residual (the norm of $x - \bar{x}$),
  • rfₐ: absolute residual in $f$,
  • rfₛ: successive residual (the norm of $y - \bar{y}$).
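
As a hedged sketch (illustrative data; the exact norm used by the package is an assumption here), the three quantities can be computed as:

```julia
using LinearAlgebra

# Illustrative data: previous (x̄, ȳ) and current (x, y) solution and value.
x, x̄ = [1.0, 2.0], [0.5, 1.5]
y, ȳ = [0.1, 0.2], [0.3, 0.4]

rxₛ = norm(x .- x̄)   # successive residual in x
rfₐ = norm(y)        # absolute residual in f
rfₛ = norm(y .- ȳ)   # successive residual in f
```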
source
SimpleSolvers.solve!Method
solve!(x, ls::LinearSolver, lsys::LinearProblem)

Solve the LinearProblem lsys with the LinearSolver ls and store the result in x.

Also see solve(::LU, ::AbstractMatrix, ::AbstractVector).

Examples

julia> x = zeros(3)
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> A = [1.; 0.; 0.;; 0.; 2.; 0.;; 0.; 0.; 4.]
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  2.0  0.0
 0.0  0.0  4.0

julia> b = ones(3)
3-element Vector{Float64}:
 1.0
 1.0
 1.0

julia> ls = LinearSolver(LU(), x);

julia> problem = LinearProblem(x); update!(problem, A, b);

julia> solve!(x, ls, problem)
3-element Vector{Float64}:
 1.0
 0.5
 0.25
source
SimpleSolvers.solveMethod
solve(lu, A, b)

Solve the linear problem determined by A and b.

This is the most straightforward way to solve this system.

Examples

julia> A = [1.; 0.; 0.;; 0.; 2.; 0.;; 0.; 0.; 4.]
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  2.0  0.0
 0.0  0.0  4.0

julia> b = ones(3)
3-element Vector{Float64}:
 1.0
 1.0
 1.0

julia> solve(LU(), A, b)
3-element StaticArraysCore.SizedVector{3, Float64, Vector{Float64}} with indices SOneTo(3):
 1.0
 0.5
 0.25

Compare this to solve!(::AbstractVector, ::LinearSolver, ::LinearProblem).

source
SimpleSolvers.solver_step!Method
solver_step!(x, s, state, params)

Compute one step for solving the problem stored in an instance s of NonlinearSolver.

Examples

julia> f(y, x, params) = y .= sin.(x .- .5) .^ 2
f (generic function with 1 method)

julia> x = ones(3) / 4
3-element Vector{Float64}:
 0.25
 0.25
 0.25

julia> y = zero(x)
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> s = NewtonSolver(x, similar(x); F = f);

julia> state = NonlinearSolverState(x); update!(state, x, f(y, x, NullParameters()));

julia> solver_step!(x, s, state, NullParameters())
3-element Vector{Float64}:
 0.37767096061051814
 0.37767096061051814
 0.37767096061051814
source
SimpleSolvers.triple_point_finderMethod
triple_point_finder(f, x)

Find three points a < b < c such that f(a) > f(b) and f(c) > f(b). This is used for performing a quadratic line search (see Quadratic).

Implementation

For δ we take DEFAULT_BRACKETING_s as default. For nmax we take DEFAULT_BRACKETING_nmax as default.

Examples

julia> f(x) = x ^ 2
f (generic function with 1 method)

julia> x = -1.
-1.0

julia> a, b, c = round10.(triple_point_finder(f, x))
(-0.37, 0.27, 1.55)

julia> round10.((f(a), f(b), f(c)))
(0.1369, 0.0729, 2.4025)

Extended help

The algorithm is taken from [3, Chapter 11.2.1].

source