# Introduction

argmin is a numerical optimization library written entirely in Rust.

Its goal is to offer a wide range of optimization algorithms with a consistent interface.
It is type-agnostic by design, meaning that any type and/or math backend, such as `nalgebra`

or `ndarray`

can be used -- even your own.

Observers allow one to track the progress of iterations, either by using one of the provided ones for logging to screen or disk or by implementing your own.

An optional checkpointing mechanism helps to mitigate the negative effects of crashes in unstable computing environments.

Due to Rusts powerful generics and traits, most features can be exchanged by your own tailored implementations.

argmin is designed to simplify the implementation of optimization algorithms and as such can also be used as a toolbox for the development of new algorithms. One can focus on the algorithm itself, while the handling of termination, parameter vectors, populations, gradients, Jacobians and Hessians is taken care of by the library.

IMPORTANT NOTEThis book covers version 0.7 of argmin! Parts of this book may not apply to versions below 0.7.

## The argmin ecosystem

The ecosystem consists of a number of crates:

- argmin: Optimization algorithms and framework
- argmin-math: Interface for math backend abstraction and implementations for various versions of ndarray, nalgebra and
`Vec`

s. - argmin-testfunctions: A collection of test functions
- finitediff: Finite differentiation
- modcholesky: Modified cholesky decompositions

## Algorithms

argmin comes with a number of line searches (Backtracking, More-Thuente, Hager-Zhang, trust region methods (Cauchy point, Dogleg, Steihaug), Steepest descent, (Nonlinear) conjugate gradient, Newton method, Newton-CG, Quasi-Newton methods (BFGS, L-BFGS, DFP, SR1-TrustRegion), Gauss-Newton methods (with and without line search), Golden-section search, Landweber, Brents optimization and root finding methods, Nelder-Mead, Simulated Annealing, Particle Swarm Optimization and CMA-ES.

For a complete and up-to-date list of all algorithms please visit the API documentation.

Examples for each algorithm can be found on Github. Make sure to choose the tag matching the argmin version you are using.

## Documentation

This book is a guide on how to use argmins algorithms as well as on how to implement algorithms using argmins framework. For detailed information on specific algorithms or traits, please refer to argmins API documentation.

The argmin-math documentation outlines the abstractions over the math backends and the argmin-testfunctions API documentation lists all available test functions.

For details on how to use `finitediff`

for finite differentiation, please refer to the corresponding API documentation.

The documentation of `modcholesky`

can be found here.

## License

Both the source code as well as the documentation are licensed under either of

- Apache License, Version 2.0, ([LICENSE-APACHE][__link35] or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License ([LICENSE-MIT][__link37] or http://opensource.org/licenses/MIT)

at your option.

#### Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

# Getting started

This chapter shows how to integrate argmin into your project and how to define an optimization problem and apply a solver to it. It also show cases other features such as observers and checkpointing.

# Using argmin

In order to use argmin, one needs to add both `argmin`

and `argmin-math`

to `Cargo.toml`

:

```
[dependencies]
argmin = { version = "0.7" }
argmin-math = { version = "0.2", features = ["ndarray_latest-serde,nalgebra_latest-serde"] }
```

or, for the current development version:

```
[dependencies]
argmin = { git = "https://github.com/argmin-rs/argmin" }
argmin-math = { git = "https://github.com/argmin-rs/argmin", features = ["ndarray_latest-serde,nalgebra_latest-serde"] }
```

Via adding `argmin-math`

one can choose which math backend should be available (and whether `serde`

-support is enabled or not).
The examples above use both the `ndarray_latest-serde`

and `nalgebra_latest-serde`

features of `argmin-math`

.
For details on which options are available in `argmin-math`

, please refer to the documentation.

## Crate features

argmin offers a number of features which can be enabled or disabled depending on your needs.

### Default

`slog-logger`

: Support for logging observers based on`slog`

.`serde1`

: Support for`serde`

. Needed for checkpointing and writing parameters to disk as well as logging to disk. Deactivating this feature leads to fewer dependencies and can lower compilation time, but it will also disable checkpointing and logging to disk.

### Optional

`ctrl`

: This feature uses the`ctrlc`

crate to properly stop the optimization (and return the current best result) after pressing`Ctrl+C`

during an optimization run.`rayon`

: This feature adds`rayon`

as a depenceny and allows for parallel computation of cost functions, operators, gradients, Jacobians and Hessians. Note that only solvers that operate on multiple parameter vectors per iteration benefit from this feature (e.g. Particle Swarm Optimization).`full`

: Enables all default and optional features.

### Experimental support for compiling to WebAssembly

Compiling to WASM requires the feature `wasm-bindgen`

.
WASM support is still experimental. Please report any issues you encounter when using argmin in a WASM context.

## Which math backend to use

argmin offers abstractions over basic `Vec`

s, `ndarray`

and `nalgebra`

types.
For performance reasons, the latter two should be prefered. Which one to use is a matter of taste and may depend on what you are already using.

`Vec`

s on the other hand do not have very efficient implementations for the different mathematical operations and therefore are not well suited for solvers which heavily rely on matrix operations.
However, `Vec`

s are suitable for solvers such as Simulated Annealing and Particle Swarm Optimization, which mainly operate on the parameter vectors themselves.

# Basic concept

There are three components needed for solving an optimization problem in argmin:

- A definition of the optimization problem / model
- A solver
- An executor

The Executor applies the solver to the optimization problem. It also accepts observers and checkpointing mechanisms, as well as an initial guess of the parameter vector, the cost function value at that initial guess, gradient, and so on.

A solver is anything that implements the Solver trait. This trait defines how the optimization algorithm is initialized, how a single iteration is performed and when and how to terminate the iterations.

The optimization problem needs to implement a subset of the traits
`CostFunction`

,
`Gradient`

,
`Jacobian`

,
`Hessian`

, and
`Operator`

.
Which subset is needed is given by the requirements of the solver.
For example, `SteepestDescent`

, as a gradient descent method, requires `CostFunction`

and `Gradient`

, while Newton's method expects `Gradient`

and `Hessian`

.

# Defining an optimization problem

Depending on the requirements of the solver that is to be used, the optimization problem needs to implement a subset of the traits

`CostFunction`

: Computes the cost or fitness for a parameter vector`p`

`Gradient`

: Computes the gradient for a parameter vector`p`

`Jacobian`

: Computes the Jacobian for a parameter vector`p`

`Hessian`

: Computes the Hessian for a parameter vector`p`

`Operator`

: Applies an operator to the parameter vector`p`

`Anneal`

: Create a new parameter vector by "annealing" of the curent parameter vector`p`

(needed for SimulatedAnnealing).

Which subset is needed is given in the documentation of each solver.

## Example

The following code snippet shows how to use the Rosenbrock test functions from `argmin-testfunctions`

in argmin.
For the sake of simplicity, this example will use the `vec`

math backend.

`#![allow(unused)] fn main() { extern crate argmin; extern crate argmin_testfunctions; use argmin_testfunctions::{ rosenbrock_2d, rosenbrock_2d_derivative, rosenbrock_2d_hessian }; use argmin::core::{Error, CostFunction, Gradient, Hessian}; /// First, we create a struct called `Rosenbrock` for your problem struct Rosenbrock { a: f64, b: f64, } /// Implement `CostFunction` for `Rosenbrock` /// /// First, we need to define the types which we will be using. Our parameter /// vector will be a `Vec` of `f64` values and our cost function value will /// be a 64 bit floating point value. /// This is reflected in the associated types `Param` and `Output`, respectively. /// /// The method `cost` then defines how the cost function is computed for a /// parameter vector `p`. Note that we have access to the fields `a` and `b` /// of `Rosenbrock`. impl CostFunction for Rosenbrock { /// Type of the parameter vector type Param = Vec<f64>; /// Type of the return value computed by the cost function type Output = f64; /// Apply the cost function to a parameter `p` fn cost(&self, p: &Self::Param) -> Result<Self::Output, Error> { // Evaluate 2D Rosenbrock function Ok(rosenbrock_2d(p, self.a, self.b)) } } /// Implement `Gradient` for `Rosenbrock` /// /// Similarly to `CostFunction`, we need to define the type of our parameter /// vectors and of the gradient we are computing. Since the gradient is also /// a vector, it is of type `Vec<f64>` just like `Param`. impl Gradient for Rosenbrock { /// Type of the parameter vector type Param = Vec<f64>; /// Type of the gradient type Gradient = Vec<f64>; /// Compute the gradient at parameter `p`. fn gradient(&self, p: &Self::Param) -> Result<Self::Gradient, Error> { // Compute gradient of 2D Rosenbrock function Ok(rosenbrock_2d_derivative(p, self.a, self.b)) } } /// Implement `Hessian` for `Rosenbrock` /// /// Again the types of the involved parameter vector and the Hessian needs to /// be defined. Since the Hessian is a 2D matrix, we use `Vec<Vec<f64>>` here. impl Hessian for Rosenbrock { /// Type of the parameter vector type Param = Vec<f64>; /// Type of the Hessian type Hessian = Vec<Vec<f64>>; /// Compute the Hessian at parameter `p`. fn hessian(&self, p: &Self::Param) -> Result<Self::Hessian, Error> { // Compute Hessian of 2D Rosenbrock function let t = rosenbrock_2d_hessian(p, self.a, self.b); // Reshape the output Ok(vec![vec![t[0], t[1]], vec![t[2], t[3]]]) } } }`

The following example shows how to implement the `Operator`

trait when the operator is the 2x2 matrix `[4.0, 1.0; 1.0, 3.0]`

.

`#![allow(unused)] fn main() { extern crate argmin; extern crate argmin_testfunctions; use argmin::core::{Error, Operator}; struct MyProblem {} impl Operator for MyProblem { /// Type of the parameter vector type Param = Vec<f64>; /// Type of the output type Output = Vec<f64>; fn apply(&self, p: &Self::Param) -> Result<Self::Output, Error> { Ok(vec![4.0 * p[0] + 1.0 * p[1], 1.0 * p[0] + 3.0 * p[1]]) } } }`

## Parallel evaluation with `bulk_*`

methods

NOTESo far only Particle Swarm Optimization allows parallel evaluations of parameter vectors. Therefore no increse in performance should be expected for other solvers.

All of the above mentioned traits come with additional methods which enable processing several parameter vectors at once.
These methods require the `rayon`

feature to be enabled. Without this feature, the methods resort to sequental processing of all inputs.
The methods are shown in the following table:

Trait | Single input | Multiple inputs |
---|---|---|

`CostFunction` | `cost` | `bulk_cost` |

`Gradient` | `gradient` | `bulk_gradient` |

`Jacobian` | `jacobian` | `bulk_jacobian` |

`Hessian` | `hessian` | `bulk_hessian` |

`Operator` | `apply` | `bulk_apply` |

`Anneal` | `anneal` | `bulk_anneal` |

These `bulk_*`

methods come with a default implementation, which essentially looks like this:

`#![allow(unused)] fn main() { extern crate argmin; extern crate argmin_testfunctions; use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative, rosenbrock_2d_hessian}; use argmin::core::{Error, CostFunction, SyncAlias, SendAlias}; struct Rosenbrock { a: f64, b: f64 } impl CostFunction for Rosenbrock { type Param = Vec<f64>; type Output = f64; /// Conventional cost function which only processes a single parameter vector fn cost(&self, p: &Self::Param) -> Result<Self::Output, Error> { // [ ... ] Ok(rosenbrock_2d(p, self.a, self.b)) } //////////////////////////////////////////////////////////////////////////// /// Bulk cost function which processes an array of parameter vectors /// //////////////////////////////////////////////////////////////////////////// fn bulk_cost<'a, P>(&self, params: &'a [P]) -> Result<Vec<Self::Output>, Error> where P: std::borrow::Borrow<Self::Param> + SyncAlias, Self::Output: SendAlias, Self: SyncAlias, { #[cfg(feature = "rayon")] { if self.parallelize() { params.par_iter().map(|p| self.cost(p.borrow())).collect() } else { params.iter().map(|p| self.cost(p.borrow())).collect() } } #[cfg(not(feature = "rayon"))] { params.iter().map(|p| self.cost(p.borrow())).collect() } } /// Indicates whether parameter vectors passed to `bulk_cost` should be /// evaluated in parallel or sequentially. /// This allows one to turn off parallelization for individual traits, even if /// the `rayon` feature is enabled. fn parallelize(&self) -> bool { true } } }`

If needed, the `bulk_*`

methods can be overwritten by custom implementations.

Besides the `bulk_*`

method, there is also a method `parallelize`

which by default returns `true`

.
This is a convenient way to turn parallelization off, even if the `rayon`

feature is enabled.
It allows one to for instance turn parallel processing on for `CostFunction`

and off for `Gradient`

.
To turn it off, simply overwrite `parallelize`

such that it returns `false`

.

# Running a solver

The `Executor`

s constructor takes a solver and an optimization problem as input and applies the solver to the problem.
The initial state of the optimization run can be modified via the `configure`

method.
This method accepts a closure with the state as only parameter.
This allows one to provide initial parameter vectors, cost function values, Hessians, and so on via the closure.
There are diffent kinds/types of state and the particular kind of state used depends on the solver.
Most solvers internally use `IterState`

, but some (for instance Particle Swarm Optimization) use `PopulationState`

.
Please refer to the respective documentation for details on how to modify the state.

Once the `Executor`

is configured, the optimization is run via the `run`

method.
This method returns an `OptimizationResult`

which contains the provided problem, the solver and the final state.
Assuming the variable is called `res`

, the final parameter vector can be accessed via `res.state().get_best_param()`

and the corresponding cost function value via `res.state().get_best_cost()`

.

For an overview, `OptimizationResult`

s `Display`

implementation can be used to print the result: `println!("{}", res)`

.

The following example shows how to use the `SteepestDescent`

solver to solve a problem which implements `CostFunction`

and `Gradient`

(which are both required by the solver).

`#![allow(unused_imports)] extern crate argmin; extern crate argmin_testfunctions; use argmin::core::{State, Error, Executor, CostFunction, Gradient}; use argmin::solver::gradientdescent::SteepestDescent; use argmin::solver::linesearch::MoreThuenteLineSearch; use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative}; struct MyProblem {} // Implement `CostFunction` for `MyProblem` impl CostFunction for MyProblem { // [...] /// Type of the parameter vector type Param = Vec<f64>; /// Type of the return value computed by the cost function type Output = f64; /// Apply the cost function to a parameter `p` fn cost(&self, p: &Self::Param) -> Result<Self::Output, Error> { Ok(rosenbrock_2d(p, 1.0, 100.0)) } } // Implement `Gradient` for `MyProblem` impl Gradient for MyProblem { // [...] /// Type of the parameter vector type Param = Vec<f64>; /// Type of the return value computed by the cost function type Gradient = Vec<f64>; /// Compute the gradient at parameter `p`. fn gradient(&self, p: &Self::Param) -> Result<Self::Gradient, Error> { Ok(rosenbrock_2d_derivative(p, 1.0, 100.0)) } } fn run() -> Result<(), Error> { // Create new instance of cost function let cost = MyProblem {}; // Define initial parameter vector let init_param: Vec<f64> = vec![-1.2, 1.0]; // Set up line search needed by `SteepestDescent` let linesearch = MoreThuenteLineSearch::new(); // Set up solver -- `SteepestDescent` requires a linesearch let solver = SteepestDescent::new(linesearch); // Create an `Executor` object let res = Executor::new(cost, solver) // Via `configure`, one has access to the internally used state. // This state can be initialized, for instance by providing an // initial parameter vector. // The maximum number of iterations is also set via this method. // In this particular case, the state exposed is of type `IterState`. // The documentation of `IterState` shows how this struct can be // manipulated. .configure(|state| state // Set initial parameters .param(init_param) // Set maximum iterations to 10 .max_iters(10) ) // run the solver on the defined problem .run()?; // print result println!("{}", res); // Extract results from state // Best parameter vector let best = res.state().get_best_param().unwrap(); // Cost function value associated with best parameter vector let best_cost = res.state().get_best_cost(); // Reason why the optimizer terminated let termination_reason = res.state().get_termination_reason(); // Time needed for optimization let time_needed = res.state().get_time().unwrap(); // Total number of iterations needed let num_iterations = res.state().get_iter(); // Iteration number where the last best parameter vector was found let num_iterations_best = res.state().get_last_best_iter(); // Number of evaluation counts per method (Cost, Gradient) let function_evaluation_counts = res.state().get_func_counts(); Ok(()) } fn main() { if let Err(ref e) = run() { println!("{}", e); std::process::exit(1); } }`

# Observing progress

Argmin offers an interface to observe the state of the solver at initialization as well as after every iteration. This includes the parameter vector, gradient, Jacobian, Hessian, iteration number, cost values and many more as well as solver-specific metrics. This interface can be used to implement loggers, send the information to a storage or to plot metrics.

The observer `WriteToFile`

saves the parameter vector to disk and as such requires the parameter vector to be serializable.
Hence this feature is only available with the `serde1`

feature.

The observer `SlogLogger`

logs the progress of the optimization to screen or to disk.
This requires the `slog-logger`

feature.
Writing to disk in addtion requires the `serde1`

feature.

For each observer it can be defined how often it will observe the progress of the solver.
This is indicated via the enum `ObserverMode`

which can be either `Always`

, `Never`

, `NewBest`

(whenever a new best solution is found) or `Every(i)`

which means every `i`

th iteration.

Custom observers can be used as well by implementing the `Observe`

trait (see the chapter on implementing an observer for details).

The following example shows how to add an observer to an `Executor`

which logs progress to the terminal.
The observer is configured via `ObserverMode::Always`

such that it will log every iteration to screen.
Multiple observers can be added to a single `Executor`

.

`#![allow(unused_imports)] extern crate argmin; extern crate argmin_testfunctions; use argmin::core::{Error, Executor, CostFunction, Gradient}; use argmin::core::observers::{SlogLogger, ObserverMode}; use argmin::solver::gradientdescent::SteepestDescent; use argmin::solver::linesearch::MoreThuenteLineSearch; use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative}; struct Rosenbrock { a: f64, b: f64, } /// Implement `CostFunction` for `Rosenbrock` impl CostFunction for Rosenbrock { /// Type of the parameter vector type Param = Vec<f64>; /// Type of the return value computed by the cost function type Output = f64; /// Apply the cost function to a parameter `p` fn cost(&self, p: &Self::Param) -> Result<Self::Output, Error> { Ok(rosenbrock_2d(p, 1.0, 100.0)) } } /// Implement `Gradient` for `Rosenbrock` impl Gradient for Rosenbrock { /// Type of the parameter vector type Param = Vec<f64>; /// Type of the return value computed by the cost function type Gradient = Vec<f64>; /// Compute the gradient at parameter `p`. fn gradient(&self, p: &Self::Param) -> Result<Self::Gradient, Error> { Ok(rosenbrock_2d_derivative(p, 1.0, 100.0)) } } fn run() -> Result<(), Error> { // Define cost function (must implement `CostFunction` and `Gradient`) let cost = Rosenbrock { a: 1.0, b: 100.0 }; // Define initial parameter vector let init_param: Vec<f64> = vec![-1.2, 1.0]; // Set up line search let linesearch = MoreThuenteLineSearch::new(); // Set up solver let solver = SteepestDescent::new(linesearch); // [...] let res = Executor::new(cost, solver) .configure(|state| state.param(init_param).max_iters(10)) // Add an observer which will log all iterations to the terminal .add_observer(SlogLogger::term(), ObserverMode::Always) .run()?; println!("{}", res); Ok(()) } fn main() { if let Err(ref e) = run() { println!("{}", e); std::process::exit(1); } }`

# Checkpointing

Checkpointing is a useful mechanism for mitigating the effects of crashes when software is run in an unstable environment, particularly for long run times. Checkpoints are snapshots of the current state of the optimization which can be resumed from in case of a crash. These checkpoints are saved regularly at a user-chosen frequency.

Currently only saving checkpoints to disk with `FileCheckpoint`

is provided.
Via the `Checkpoint`

trait other checkpointing approaches can be implemented (see the chapter on implementing a checkpointing method for details).

The `CheckpointingFrequency`

defines how often checkpoints are saved and can be chosen to be either `Always`

(every iteration), `Every(u64)`

(every Nth iteration) or `Never`

.

The following example shows how the `checkpointing`

method is used to configure and activate checkpointing.
If no checkpoint is available on disk yet, an optimization will be started from scratch.
If the run crashes and a checkpoint is found on disk, then it will resume from the checkpoint.

## Example

`extern crate argmin; extern crate argmin_testfunctions; use argmin::core::{CostFunction, Error, Executor, Gradient, observers::ObserverMode}; #[cfg(feature = "serde1")] use argmin::core::checkpointing::{FileCheckpoint, CheckpointingFrequency}; #[cfg(feature = "slog-logger")] use argmin::core::observers::SlogLogger; use argmin::solver::landweber::Landweber; use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative}; #[derive(Default)] struct Rosenbrock {} /// Implement `CostFunction` for `Rosenbrock` impl CostFunction for Rosenbrock { /// Type of the parameter vector type Param = Vec<f64>; /// Type of the return value computed by the cost function type Output = f64; /// Apply the cost function to a parameter `p` fn cost(&self, p: &Self::Param) -> Result<Self::Output, Error> { Ok(rosenbrock_2d(p, 1.0, 100.0)) } } /// Implement `Gradient` for `Rosenbrock` impl Gradient for Rosenbrock { /// Type of the parameter vector type Param = Vec<f64>; /// Type of the return value computed by the cost function type Gradient = Vec<f64>; /// Compute the gradient at parameter `p`. fn gradient(&self, p: &Self::Param) -> Result<Self::Gradient, Error> { Ok(rosenbrock_2d_derivative(p, 1.0, 100.0)) } } fn run() -> Result<(), Error> { // define inital parameter vector let init_param: Vec<f64> = vec![1.2, 1.2]; let my_optimization_problem = Rosenbrock {}; let iters = 35; let solver = Landweber::new(0.001); // [...] #[cfg(feature = "serde1")] let checkpoint = FileCheckpoint::new( // Directory ".checkpoints", // File base name "optim", // How often to save a checkpoint (in this case every 20 iterations) CheckpointingFrequency::Every(20) ); #[cfg(feature = "serde1")] let res = Executor::new(my_optimization_problem, solver) .configure(|state| state.param(init_param).max_iters(iters)) .checkpointing(checkpoint) .run()?; // [...] Ok(()) } fn main() { if let Err(ref e) = run() { println!("{}", e); } }`

# Advanced topics

# Implementing a solver

In this section we are going to implement the Landweber solver, which essentially is a special form of gradient descent. In iteration \( k \), the new parameter vector \( x_{k+1} \) is calculated from the previous parameter vector \( x_k \) and the gradient at \( x_k \) according to the following update rule:

\[ x_{k+1} = x_k - \omega * \nabla f(x_k) \]

In order to implement this using argmin, one first needs to define the struct `Landweber`

which holds data specific to the solver (the step length \( \omega \)).
Then, the `Solver`

trait needs to be implemented for this struct.

The `Solver`

trait consists of several methods; however, not all of them need to be implemented since most come with default implementations.

`NAME`

: a`&'static str`

which holds the solvers name (mainly needed for the observers).`init(...)`

: Run before the the actual iterations and initializes the solver. Does nothing by default.`next_iter(...)`

: One iteration of the solver. Will be executed by the`Executor`

until a stopping critereon is met.`terminate(...)`

: Solver specific stopping criteria. This method is run after every iteration. Note that one can also terminate from within`next_iter`

if necessary.`terminate_internal(...)`

: By default calls`terminate`

and in addition checks if the maximum number of iterations was reached or if the current cost function value is below the target cost function value. Should only be overwritten if absolutely necessary.

Both `init`

and `next_iter`

have access to the optimization problem (`problem`

) as well as the internal state (`state`

).
The methods `terminate`

and `terminate_interal`

only have access to `state`

.

The function parameter `problem`

is a wrapped version of the optimization problem and as such gives access to the cost function, gradient, Hessian, Jacobian,...).
It also keeps track of how often each of these is called.

Via `state`

the solver has access to the current parameter vector, the current best parameter vector, gradient, Hessian, Jacobian, population, the current iteration number, and so on.
The `state`

can be modified (for instance a new parameter vector is set) and is then returned by both `init`

and `next_iter`

.
The `Executor`

then takes care of updating the state properly, for instance by updating the current best parameter vector if the new parameter vector is better than the previous best.

It is advisable to design the solver such that it is generic over the actual type of the parameter vector, gradient, and so on.

The current naming convention for generics in argmin is as follows:

`O`

: Optimization problem`P`

: Parameter vector`G`

: Gradient`J`

: Jacobian`H`

: Hessian`F`

: Floats (`f32`

or`f64`

)

These individual generic parameters are then constrained by type constraints.
For instance, the Landweber iteration requires the problem `O`

to implement `Gradient`

, therefore a trait bound of the form `O: Gradient<Param = P, Gradient = G>`

is necessary.

From the Landweber update formula, we know that a scaled subtraction of two vectors is required.
This must be represented in form of a trait bound as well: `P: ArgminScaledSub<G, F, P>`

.
`ArgminScaledSub`

is a trait from `argmin-math`

which represents a scaled subtraction.
With this trait bound, we require that it must be possible to substract a value of type `G`

scaled with a value of type `F`

from a value of type `P`

, resulting in a value of type `P`

.

The generic type `F`

represents floating point value and therefore allows users to choose which precision they want.

Implementing the algorithm is straightforward: First we get the current parameter vector `xk`

from the state via `state.take_param()`

.
Note that `take_param`

moves the parameter vector from the `state`

into `xk`

, therefore one needs to make sure to move the updated parameter vector into `state`

at the end of `next_iter`

via `state.param(...)`

.
Landweber requires the user to provide an initial parameter vector.
If this is not the case than we return an error to inform the user.
Then the gradient `grad`

is computed by calling `problem.gradient(...)`

on the parameter vector.
This will return the gradient and internally increase the gradient function evaluation count.
We compute the updated parameter vector `xkp1`

by computing `xk.scaled_sub(&self.omega, &grad)`

(which is possible because of the `ArgminScaledSub`

trait bound introduced before).
Finally, the state is updated via `state.param(xkp1)`

and returned by the function.

`#![allow(unused)] fn main() { extern crate argmin; use argmin::core::{ ArgminFloat, KV, Error, Gradient, IterState, Problem, Solver, State }; use argmin::argmin_error_closure; use serde::{Deserialize, Serialize}; use argmin_math::ArgminScaledSub; // Define a struct which holds any parameters/data which are needed during the // execution of the solver. Note that this does not include parameter vectors, // gradients, Hessians, cost function values and so on, as those will be // handled by the `Executor` and its interal state. #[derive(Serialize, Deserialize)] pub struct Landweber<F> { /// Step length omega: F, } impl<F> Landweber<F> { /// Constructor pub fn new(omega: F) -> Self { Landweber { omega } } } impl<O, F, P, G> Solver<O, IterState<P, G, (), (), F>> for Landweber<F> where // The Landweber solver requires `O` to implement `Gradient`. // `P` and `G` indicate the types of the parameter vector and gradient, // respectively. O: Gradient<Param = P, Gradient = G>, // The parameter vector of type `P` needs to implement `ArgminScaledSub` // because of the update formula P: Clone + ArgminScaledSub<G, F, P>, // `F` is the floating point type (`f32` or `f64`) F: ArgminFloat, { // This gives the solver a name which will be used for logging const NAME: &'static str = "Landweber"; // Defines the computations performed in a single iteration. fn next_iter( &mut self, // This gives access to the problem supplied to the `Executor`. `O` implements // `Gradient` and `Problem` takes care of counting the calls to the respective // functions. problem: &mut Problem<O>, // Current state of the optimization. This gives access to the parameter // vector, gradient, Hessian and cost function value of the current, // previous and best iteration as well as current iteration number, and // many more. mut state: IterState<P, G, (), (), F>, ) -> Result<(IterState<P, G, (), (), F>, Option<KV>), Error> { // First we obtain the current parameter vector from the `state` struct (`x_k`). // Landweber requires an initial parameter vector. Return an error if this was // not provided by the user. let xk = state.take_param().ok_or_else(argmin_error_closure!( NotInitialized, "Initial parameter vector required!" ))?; // Then we compute the gradient at `x_k` (`\nabla f(x_k)`) let grad = problem.gradient(&xk)?; // Now subtract `\nabla f(x_k)` scaled by `omega` from `x_k` // to compute `x_{k+1}` let xkp1 = xk.scaled_sub(&self.omega, &grad); // Return new the updated `state` Ok((state.param(xkp1), None)) } } }`

Another example with a more complex solver which requires `init`

as well as returning a `KV`

is (hopefully) coming soon!

# Implementing an observer

Coming soon!

# Implementing a checkpointing method

Coming soon!

# Contributing

This crate is looking for contributors! Potential projects can be found in the Github issues, but feel free to suggest your own ideas. Besides adding optimization methods and new features, other contributions are also highly welcome, for instance improving performance, documentation, writing examples (with real world problems), developing tests, adding observers, implementing a C interface or Python wrappers. Bug reports (and fixes) are of course also highly appreciated.

## Running the tests

The repository is organized as a workspace with the two crates `argmin`

and `argmin-math`

.
It is recommended to run the tests for both crates individually, as this simplifies the choice of features.
In case of `argmin`

, all combinations of features should lead to working tests.
For `argmin-math`

, this is not the case, since some of the features are mutually exclusive.

Therefore it is recommended to run the tests for `argmin`

from the root directory of the repository like this:

```
cargo test -p argmin --all-features
```

This will work for `--all-features`

and any other combination of features.
Note that not all tests will run if only a subset of the features is enabled.

In terms of `argmin-math`

, one can just test the default features:

```
cargo test -p argmin-math
```

Or the default features plus the latest `ndarray`

/`nalgebra`

backends:

```
cargo test -p argmin-math --features "latest_all"
```

Individual backends can be tested as well; however, care has to be taken to not add two different versions of the same backend, as that may not work.