# Implementing a solver

In this section we are going to implement the Landweber solver, which essentially is a special form of gradient descent. In iteration \( k \), the new parameter vector \( x_{k+1} \) is calculated from the previous parameter vector \( x_k \) and the gradient at \( x_k \) according to the following update rule:

\[ x_{k+1} = x_k - \omega * \nabla f(x_k) \]

In order to implement this using argmin, one first needs to define the struct `Landweber`

which holds data specific to the solver (the step length \( \omega \)).
Then, the `Solver`

trait needs to be implemented for this struct.

The `Solver`

trait consists of several methods; however, not all of them need to be implemented since most come with default implementations.

`NAME`

: a`&'static str`

which holds the solvers name (mainly needed for the observers).`init(...)`

: Run before the the actual iterations and initializes the solver. Does nothing by default.`next_iter(...)`

: One iteration of the solver. Will be executed by the`Executor`

until a stopping criterion is met.`terminate(...)`

: Solver specific stopping criteria. This method is run after every iteration. Note that one can also terminate from within`next_iter`

if necessary.`terminate_internal(...)`

: By default calls`terminate`

and in addition checks if the maximum number of iterations was reached or if the best cost function value is below the target cost value. Should only be overwritten if absolutely necessary.

Both `init`

and `next_iter`

have access to the optimization problem (`problem`

) as well as the internal state (`state`

).
The methods `terminate`

and `terminate_internal`

only have access to `state`

.

The function parameter `problem`

is a wrapped version of the optimization problem and as such gives access to the cost function, gradient, Hessian, Jacobian,...).
It also keeps track of how often each of these is called.

Via `state`

the solver has access to the current parameter vector, the current best parameter vector, gradient, Hessian, Jacobian, population, the current iteration number, and so on.
The `state`

can be modified (for instance a new parameter vector is set) and is then returned by both `init`

and `next_iter`

.
The `Executor`

then takes care of updating the state properly, for instance by updating the current best parameter vector if the new parameter vector is better than the previous best.

It is advisable to design the solver such that it is generic over the actual type of the parameter vector, gradient, and so on.

The current naming convention for generics in argmin is as follows:

`O`

: Optimization problem`P`

: Parameter vector`G`

: Gradient`J`

: Jacobian`H`

: Hessian`F`

: Floats (`f32`

or`f64`

)

These individual generic parameters are then constrained by type constraints.
For instance, the Landweber iteration requires the problem `O`

to implement `Gradient`

, therefore a trait bound of the form `O: Gradient<Param = P, Gradient = G>`

is necessary.

From the Landweber update formula, we know that a scaled subtraction of two vectors is required.
This must be represented in form of a trait bound as well: `P: ArgminScaledSub<G, F, P>`

.
`ArgminScaledSub`

is a trait from `argmin-math`

which represents a scaled subtraction.
With this trait bound, we require that it must be possible to subtract a value of type `G`

scaled with a value of type `F`

from a value of type `P`

, resulting in a value of type `P`

.

The generic type `F`

represents floating point value and therefore allows users to choose which precision they want.

Implementing the algorithm is straightforward: First we get the current parameter vector `xk`

from the state via `state.take_param()`

.
Note that `take_param`

moves the parameter vector from the `state`

into `xk`

, therefore one needs to make sure to move the updated parameter vector into `state`

at the end of `next_iter`

via `state.param(...)`

.
Landweber requires the user to provide an initial parameter vector.
If this is not the case than we return an error to inform the user.
Then the gradient `grad`

is computed by calling `problem.gradient(...)`

on the parameter vector.
This will return the gradient and internally increase the gradient function evaluation count.
We compute the updated parameter vector `xkp1`

by computing `xk.scaled_sub(&self.omega, &grad)`

(which is possible because of the `ArgminScaledSub`

trait bound introduced before).
Finally, the state is updated via `state.param(xkp1)`

and returned by the function.

`#![allow(unused)] fn main() { extern crate argmin; use argmin::core::{ ArgminFloat, KV, Error, Gradient, IterState, Problem, Solver, State }; use argmin::argmin_error_closure; use serde::{Deserialize, Serialize}; use argmin_math::ArgminScaledSub; // Define a struct which holds any parameters/data which are needed during the // execution of the solver. Note that this does not include parameter vectors, // gradients, Hessians, cost function values and so on, as those will be // handled by the `Executor` and its internal state. #[derive(Serialize, Deserialize)] pub struct Landweber<F> { /// Step length omega: F, } impl<F> Landweber<F> { /// Constructor pub fn new(omega: F) -> Self { Landweber { omega } } } impl<O, F, P, G> Solver<O, IterState<P, G, (), (), F>> for Landweber<F> where // The Landweber solver requires `O` to implement `Gradient`. // `P` and `G` indicate the types of the parameter vector and gradient, // respectively. O: Gradient<Param = P, Gradient = G>, // The parameter vector of type `P` needs to implement `ArgminScaledSub` // because of the update formula P: Clone + ArgminScaledSub<G, F, P>, // `F` is the floating point type (`f32` or `f64`) F: ArgminFloat, { // This gives the solver a name which will be used for logging const NAME: &'static str = "Landweber"; // Defines the computations performed in a single iteration. fn next_iter( &mut self, // This gives access to the problem supplied to the `Executor`. `O` implements // `Gradient` and `Problem` takes care of counting the calls to the respective // functions. problem: &mut Problem<O>, // Current state of the optimization. This gives access to the parameter // vector, gradient, Hessian and cost function value of the current, // previous and best iteration as well as current iteration number, and // many more. mut state: IterState<P, G, (), (), F>, ) -> Result<(IterState<P, G, (), (), F>, Option<KV>), Error> { // First we obtain the current parameter vector from the `state` struct (`x_k`). // Landweber requires an initial parameter vector. Return an error if this was // not provided by the user. let xk = state.take_param().ok_or_else(argmin_error_closure!( NotInitialized, "Initial parameter vector required!" ))?; // Then we compute the gradient at `x_k` (`\nabla f(x_k)`) let grad = problem.gradient(&xk)?; // Now subtract `\nabla f(x_k)` scaled by `omega` from `x_k` // to compute `x_{k+1}` let xkp1 = xk.scaled_sub(&self.omega, &grad); // Return new the updated `state` Ok((state.param(xkp1), None)) } } }`

Another example with a more complex solver which requires `init`

as well as returning a `KV`

is (hopefully) coming soon!