optimizers

Module: optimizers.brute2fine

class dmipy.optimizers.brute2fine.GlobalBruteOptimizer(model, acquisition_scheme, x0_vector=None, Ns=5, N_sphere_samples=30)

Brute-Force optimizer. Given a model and an acquisition scheme, first computes a global grid of parameters and corresponding signal attenuations. All parameters except the spherical orientation parameter ‘mu’ are sampled between their corresponding parameter_ranges in ‘Ns’ equal steps. For ‘mu’ a spherical grid of ‘N_sphere_samples” points is used, which were generated using the work of Caruyer et al. [1].

When calling the function with measured data, the closest parameters are return based on the sum-squared error between the signal grid and the data.

Parameters:

model: dmipy MultiCompartmentModel instance, :

Can be composed of any model combination.

acquisition_scheme: DmipyAcquisitionScheme instance, :

acquisition scheme of the to-be-fitted data.

x0_vector: array of size (Nparameters,) :

optional initial guess parameters. As long as the initial guess does not vary voxel-by-voxel the parameter grid will be estimated, only including the initial guess value for the parameters that were included in x0_vector.

Ns: integer, :

number of equally spaced sampling points for regular parameters.

N_sphere_sampled: integer, :

number of sampled sphere points to sample orientation ‘mu’.

References

[R28]Caruyer, Emmanuel, et al. “Design of multishell sampling schemes with uniform coverage in diffusion MRI.” Magnetic resonance in medicine 69.6 (2013): 1534-1540.

Methods

precompute_signal_grid(model, x0_vector, Ns, N_sphere_samples)

Function that estimates the parameter grid and corresponding signal attenuation.

NOTE: In the current implementation initial guesses for volume fractions are still ignored

class dmipy.optimizers.brute2fine.Brute2FineOptimizer(model, acquisition_scheme, Ns=5)

Brute force optimizer with refining. Essentially this function does both the brute force optimization like GlobalBruteOptimizer (without treating mu differently, which is currently suboptimal), but then follows it with a gradient-descent based refining step [1, 2] to find the local minimum.

All parameters are optimized within their parameter_bounds. Volume fraction are optimized by ‘nesting’ them, such that given a set of models \(m_1...m_N\), and the partial volume ratios math:v_1…v_{N-1}, the partial volume function is

\[v_1 m_1 + (1 - v_1) v_2 m_2 + ... + (1 - v_1)...(1-v_{N-1}) m_N\]
Parameters:

model: dmipy MultiCompartmentModel instance, :

Can be composed of any model combination.

acquisition_scheme: DmipyAcquisitionScheme instance, :

acquisition scheme of the to-be-fitted data.

Ns: integer, :

number of equally spaced sampling points for regular parameters.

References

[R29]Byrd, R H and P Lu and J. Nocedal. 1995. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM Journal on Scientific and Statistical Computing 16 (5): 1190-1208.
[R30]Zhu, C and R H Byrd and J Nocedal. 1997. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software 23 (4): 550-560.

Methods

objective_function(parameter_vector, data)

The objective function for brute-force and gradient-based optimizer.

dmipy.optimizers.brute2fine.nested_to_normalized_fractions(nested_fractions)

Calculates the normal volume fractions from nested ones.

dmipy.optimizers.brute2fine.normalized_to_nested_fractions_array(normalized_fractions)

Calculates the nested volume fractions from normal ones.

dmipy.optimizers.brute2fine.find_minimum_argument(data_grid, signal)

Finds the index in the simulated data_grid that has the lowest sum-squared error to the signal.

Module: optimizers.mix

class dmipy.optimizers.mix.MixOptimizer(model, acquisition_scheme, maxiter=150)

The stochastic Microstructure In Crossings (MIX) optimizer [1] uses a three-step process to fit the parameters of a multi-compartment (MC) model to data. The key innovation is that they separate linear from non-linear parameters in the fitting process, meaning the linear volume fractions and non-linear other ones(e.g. diffusivities) are optimized at different stages in the process.

In the first step [1] describes using a genetic algorithm to estimate the non-linear parameters of an MC model. For this we use scipy’s differential_evolution algorithm.

In the second step [1] describes using CVX to estimate the linear volume fractions of an MC model. For this we use scipy’s COBYLA algorithm since it allows us to impose the parameter constraints we need for volume fractions; namely that they are positive and sum up to one.

The third and last step in [1] is a refining step to find a local minimum given the solutions of step one and two. For this we use scipy’s gradient-based L-BFGS-B algorithm with nested volume fractions.

The final result takes a model’s parameter_ranges into account, only yielding parameters within their allowed optimization domain.

Parameters:

model : MultiCompartmentModel instance,

A multicompartment model that has been instantiated using dMipy.

acquisition_scheme : DmipyAcquisitionScheme instance,

An acquisition scheme that has been instantiated using dMipy.

maxiter : integer

The maximum allowed iterations for the differential evolution algorithm

References

[R31]Farooq, Hamza, et al. “Microstructure Imaging of Crossing (MIX) White Matter Fibers from diffusion MRI.” Nature Scientific reports 6 (2016).

Methods

cobyla_cost_function(fractions, phi, data)

Objective function of linear parameter estimation using COBYLA.

objective_function(optimized_parameter_vector, data, acquisition_scheme, x0_params)

Objective function of final refining step using L-BFGS-B

stochastic_objective_function(optimized_parameter_vector, data, acquisition_scheme, x0_params)

Objective function for stochastic non-linear parameter estimation using differential_evolution

dmipy.optimizers.mix.nested_to_normalized_fractions(nested_fractions)

Function to convert nested to normalized volume fractions.

dmipy.optimizers.mix.cobyla_positivity_constraint(volume_fractions, *args)

COBYLA positivity constraint on volume fractions

dmipy.optimizers.mix.cobyla_unity_constraint(volume_fractions, *args)

COBYLA unity constraint on volume fractions

Brute2FineOptimizer

class dmipy.optimizers.brute2fine.Brute2FineOptimizer(model, acquisition_scheme, Ns=5)

Brute force optimizer with refining. Essentially this function does both the brute force optimization like GlobalBruteOptimizer (without treating mu differently, which is currently suboptimal), but then follows it with a gradient-descent based refining step [1, 2] to find the local minimum.

All parameters are optimized within their parameter_bounds. Volume fraction are optimized by ‘nesting’ them, such that given a set of models \(m_1...m_N\), and the partial volume ratios math:v_1…v_{N-1}, the partial volume function is

\[v_1 m_1 + (1 - v_1) v_2 m_2 + ... + (1 - v_1)...(1-v_{N-1}) m_N\]
Parameters:

model: dmipy MultiCompartmentModel instance, :

Can be composed of any model combination.

acquisition_scheme: DmipyAcquisitionScheme instance, :

acquisition scheme of the to-be-fitted data.

Ns: integer, :

number of equally spaced sampling points for regular parameters.

References

[R32]Byrd, R H and P Lu and J. Nocedal. 1995. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM Journal on Scientific and Statistical Computing 16 (5): 1190-1208.
[R33]Zhu, C and R H Byrd and J Nocedal. 1997. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software 23 (4): 550-560.

Methods

__init__(model, acquisition_scheme, Ns=5)
objective_function(parameter_vector, data)

The objective function for brute-force and gradient-based optimizer.

GlobalBruteOptimizer

class dmipy.optimizers.brute2fine.GlobalBruteOptimizer(model, acquisition_scheme, x0_vector=None, Ns=5, N_sphere_samples=30)

Brute-Force optimizer. Given a model and an acquisition scheme, first computes a global grid of parameters and corresponding signal attenuations. All parameters except the spherical orientation parameter ‘mu’ are sampled between their corresponding parameter_ranges in ‘Ns’ equal steps. For ‘mu’ a spherical grid of ‘N_sphere_samples” points is used, which were generated using the work of Caruyer et al. [1].

When calling the function with measured data, the closest parameters are return based on the sum-squared error between the signal grid and the data.

Parameters:

model: dmipy MultiCompartmentModel instance, :

Can be composed of any model combination.

acquisition_scheme: DmipyAcquisitionScheme instance, :

acquisition scheme of the to-be-fitted data.

x0_vector: array of size (Nparameters,) :

optional initial guess parameters. As long as the initial guess does not vary voxel-by-voxel the parameter grid will be estimated, only including the initial guess value for the parameters that were included in x0_vector.

Ns: integer, :

number of equally spaced sampling points for regular parameters.

N_sphere_sampled: integer, :

number of sampled sphere points to sample orientation ‘mu’.

References

[R34]Caruyer, Emmanuel, et al. “Design of multishell sampling schemes with uniform coverage in diffusion MRI.” Magnetic resonance in medicine 69.6 (2013): 1534-1540.

Methods

__init__(model, acquisition_scheme, x0_vector=None, Ns=5, N_sphere_samples=30)
precompute_signal_grid(model, x0_vector, Ns, N_sphere_samples)

Function that estimates the parameter grid and corresponding signal attenuation.

NOTE: In the current implementation initial guesses for volume fractions are still ignored

brute

dmipy.optimizers.brute2fine.brute(func, ranges, args=(), Ns=20, full_output=0, finish=<function fmin>, disp=False)

Minimize a function over a given range by brute force.

Uses the “brute force” method, i.e. computes the function’s value at each point of a multidimensional grid of points, to find the global minimum of the function.

The function is evaluated everywhere in the range with the datatype of the first call to the function, as enforced by the vectorize NumPy function. The value and type of the function evaluation returned when full_output=True are affected in addition by the finish argument (see Notes).

Parameters:

func : callable

The objective function to be minimized. Must be in the form f(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function.

ranges : tuple

Each component of the ranges tuple must be either a “slice object” or a range tuple of the form (low, high). The program uses these to create the grid of points on which the objective function will be computed. See Note 2 for more detail.

args : tuple, optional

Any additional fixed parameters needed to completely specify the function.

Ns : int, optional

Number of grid points along the axes, if not otherwise specified. See Note2.

full_output : bool, optional

If True, return the evaluation grid and the objective function’s values on it.

finish : callable, optional

An optimization function that is called with the result of brute force minimization as initial guess. finish should take func and the initial guess as positional arguments, and take args as keyword arguments. It may additionally take full_output and/or disp as keyword arguments. Use None if no “polishing” function is to be used. See Notes for more details.

disp : bool, optional

Set to True to print convergence messages.

Returns:

x0 : ndarray

A 1-D array containing the coordinates of a point at which the objective function had its minimum value. (See Note 1 for which point is returned.)

fval : float

Function value at the point x0. (Returned when full_output is True.)

grid : tuple

Representation of the evaluation grid. It has the same length as x0. (Returned when full_output is True.)

Jout : ndarray

Function values at each point of the evaluation grid, i.e., Jout = func(*grid). (Returned when full_output is True.)

See also

basinhopping, differential_evolution

Notes

Note 1: The program finds the gridpoint at which the lowest value of the objective function occurs. If finish is None, that is the point returned. When the global minimum occurs within (or not very far outside) the grid’s boundaries, and the grid is fine enough, that point will be in the neighborhood of the global minimum.

However, users often employ some other optimization program to “polish” the gridpoint values, i.e., to seek a more precise (local) minimum near brute’s best gridpoint. The brute function’s finish option provides a convenient way to do that. Any polishing program used must take brute’s output as its initial guess as a positional argument, and take brute’s input values for args as keyword arguments, otherwise an error will be raised. It may additionally take full_output and/or disp as keyword arguments.

brute assumes that the finish function returns either an OptimizeResult object or a tuple in the form: (xmin, Jmin, ... , statuscode), where xmin is the minimizing value of the argument, Jmin is the minimum value of the objective function, “…” may be some other returned values (which are not used by brute), and statuscode is the status code of the finish program.

Note that when finish is not None, the values returned are those of the finish program, not the gridpoint ones. Consequently, while brute confines its search to the input grid points, the finish program’s results usually will not coincide with any gridpoint, and may fall outside the grid’s boundary. Thus, if a minimum only needs to be found over the provided grid points, make sure to pass in finish=None.

Note 2: The grid of points is a numpy.mgrid object. For brute the ranges and Ns inputs have the following effect. Each component of the ranges tuple can be either a slice object or a two-tuple giving a range of values, such as (0, 5). If the component is a slice object, brute uses it directly. If the component is a two-tuple range, brute internally converts it to a slice object that interpolates Ns points from its low-value to its high-value, inclusive.

Examples

We illustrate the use of brute to seek the global minimum of a function of two variables that is given as the sum of a positive-definite quadratic and two deep “Gaussian-shaped” craters. Specifically, define the objective function f as the sum of three other functions, f = f1 + f2 + f3. We suppose each of these has a signature (z, *params), where z = (x, y), and params and the functions are as defined below.

>>> params = (2, 3, 7, 8, 9, 10, 44, -1, 2, 26, 1, -2, 0.5)
>>> def f1(z, *params):
...     x, y = z
...     a, b, c, d, e, f, g, h, i, j, k, l, scale = params
...     return (a * x**2 + b * x * y + c * y**2 + d*x + e*y + f)
>>> def f2(z, *params):
...     x, y = z
...     a, b, c, d, e, f, g, h, i, j, k, l, scale = params
...     return (-g*np.exp(-((x-h)**2 + (y-i)**2) / scale))
>>> def f3(z, *params):
...     x, y = z
...     a, b, c, d, e, f, g, h, i, j, k, l, scale = params
...     return (-j*np.exp(-((x-k)**2 + (y-l)**2) / scale))
>>> def f(z, *params):
...     return f1(z, *params) + f2(z, *params) + f3(z, *params)

Thus, the objective function may have local minima near the minimum of each of the three functions of which it is composed. To use fmin to polish its gridpoint result, we may then continue as follows:

>>> rranges = (slice(-4, 4, 0.25), slice(-4, 4, 0.25))
>>> from scipy import optimize
>>> resbrute = optimize.brute(f, rranges, args=params, full_output=True,
...                           finish=optimize.fmin)
>>> resbrute[0]  # global minimum
array([-1.05665192,  1.80834843])
>>> resbrute[1]  # function value at global minimum
-3.4085818767

Note that if finish had been set to None, we would have gotten the gridpoint [-1.0 1.75] where the rounded function value is -2.892.

cart2mu

dmipy.optimizers.brute2fine.cart2mu(xyz)

Function to estimate spherical coordinates from cartesian coordinates according to wikipedia. Conforms with the dipy notation.

Parameters:

cartesian_coordinates : array of size (3) or (N x 3),

array of cartesian coordinate vectors [x, y, z].

Returns:

spherical_coordinates : array of size (2) or (N x 2),

array of spherical coordinate vectors [theta, phi]. range of theta [0, pi]. range of phi [-pi, pi].

find_minimum_argument

dmipy.optimizers.brute2fine.find_minimum_argument(data_grid, signal)

Finds the index in the simulated data_grid that has the lowest sum-squared error to the signal.

minimize

dmipy.optimizers.brute2fine.minimize(fun, x0, args=(), method=None, jac=None, hess=None, hessp=None, bounds=None, constraints=(), tol=None, callback=None, options=None)

Minimization of scalar function of one or more variables.

In general, the optimization problems are of the form:

minimize f(x) subject to

g_i(x) >= 0,  i = 1,...,m
h_j(x)  = 0,  j = 1,...,p

where x is a vector of one or more variables. g_i(x) are the inequality constraints. h_j(x) are the equality constrains.

Optionally, the lower and upper bounds for each element in x can also be specified using the bounds argument.

Parameters:

fun : callable

The objective function to be minimized. Must be in the form f(x, *args). The optimizing argument, x, is a 1-D array of points, and args is a tuple of any additional fixed parameters needed to completely specify the function.

x0 : ndarray

Initial guess. len(x0) is the dimensionality of the minimization problem.

args : tuple, optional

Extra arguments passed to the objective function and its derivatives (Jacobian, Hessian).

method : str or callable, optional

Type of solver. Should be one of

  • ‘Nelder-Mead’ (see here)
  • ‘Powell’ (see here)
  • ‘CG’ (see here)
  • ‘BFGS’ (see here)
  • ‘Newton-CG’ (see here)
  • ‘L-BFGS-B’ (see here)
  • ‘TNC’ (see here)
  • ‘COBYLA’ (see here)
  • ‘SLSQP’ (see here)
  • ‘dogleg’ (see here)
  • ‘trust-ncg’ (see here)
  • ‘trust-exact’ (see here)
  • ‘trust-krylov’ (see here)
  • custom - a callable object (added in version 0.14.0), see below for description.

If not given, chosen to be one of BFGS, L-BFGS-B, SLSQP, depending if the problem has constraints or bounds.

jac : bool or callable, optional

Jacobian (gradient) of objective function. Only for CG, BFGS, Newton-CG, L-BFGS-B, TNC, SLSQP, dogleg, trust-ncg, trust-krylov, trust-region-exact. If jac is a Boolean and is True, fun is assumed to return the gradient along with the objective function. If False, the gradient will be estimated numerically. jac can also be a callable returning the gradient of the objective. In this case, it must accept the same arguments as fun.

hess, hessp : callable, optional

Hessian (matrix of second-order derivatives) of objective function or Hessian of objective function times an arbitrary vector p. Only for Newton-CG, dogleg, trust-ncg, trust-krylov, trust-region-exact. Only one of hessp or hess needs to be given. If hess is provided, then hessp will be ignored. If neither hess nor hessp is provided, then the Hessian product will be approximated using finite differences on jac. hessp must compute the Hessian times an arbitrary vector.

bounds : sequence, optional

Bounds for variables (only for L-BFGS-B, TNC and SLSQP). (min, max) pairs for each element in x, defining the bounds on that parameter. Use None for one of min or max when there is no bound in that direction.

constraints : dict or sequence of dict, optional

Constraints definition (only for COBYLA and SLSQP). Each constraint is defined in a dictionary with fields:

type : str

Constraint type: ‘eq’ for equality, ‘ineq’ for inequality.

fun : callable

The function defining the constraint.

jac : callable, optional

The Jacobian of fun (only for SLSQP).

args : sequence, optional

Extra arguments to be passed to the function and Jacobian.

Equality constraint means that the constraint function result is to be zero whereas inequality means that it is to be non-negative. Note that COBYLA only supports inequality constraints.

tol : float, optional

Tolerance for termination. For detailed control, use solver-specific options.

options : dict, optional

A dictionary of solver options. All methods accept the following generic options:

maxiter : int

Maximum number of iterations to perform.

disp : bool

Set to True to print convergence messages.

For method-specific options, see show_options().

callback : callable, optional

Called after each iteration, as callback(xk), where xk is the current parameter vector.

Returns:

res : OptimizeResult

The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, success a Boolean flag indicating if the optimizer exited successfully and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

See also

minimize_scalar
Interface to minimization algorithms for scalar univariate functions
show_options
Additional options accepted by the solvers

Notes

This section describes the available solvers that can be selected by the ‘method’ parameter. The default method is BFGS.

Unconstrained minimization

Method Nelder-Mead uses the Simplex algorithm [R35], [R36]. This algorithm is robust in many applications. However, if numerical computation of derivative can be trusted, other algorithms using the first and/or second derivatives information might be preferred for their better performance in general.

Method Powell is a modification of Powell’s method [R37], [R38] which is a conjugate direction method. It performs sequential one-dimensional minimizations along each vector of the directions set (direc field in options and info), which is updated at each iteration of the main minimization loop. The function need not be differentiable, and no derivatives are taken.

Method CG uses a nonlinear conjugate gradient algorithm by Polak and Ribiere, a variant of the Fletcher-Reeves method described in [R39] pp. 120-122. Only the first derivatives are used.

Method BFGS uses the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno (BFGS) [R39] pp. 136. It uses the first derivatives only. BFGS has proven good performance even for non-smooth optimizations. This method also returns an approximation of the Hessian inverse, stored as hess_inv in the OptimizeResult object.

Method Newton-CG uses a Newton-CG algorithm [R39] pp. 168 (also known as the truncated Newton method). It uses a CG method to the compute the search direction. See also TNC method for a box-constrained minimization with a similar algorithm. Suitable for large-scale problems.

Method dogleg uses the dog-leg trust-region algorithm [R39] for unconstrained minimization. This algorithm requires the gradient and Hessian; furthermore the Hessian is required to be positive definite.

Method trust-ncg uses the Newton conjugate gradient trust-region algorithm [R39] for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems.

Method trust-krylov uses the Newton GLTR trust-region algorithm [14]_, [15]_ for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems. On indefinite problems it requires usually less iterations than the trust-ncg method and is recommended for medium and large-scale problems.

Method trust-exact is a trust-region method for unconstrained minimization in which quadratic subproblems are solved almost exactly [13]_. This algorithm requires the gradient and the Hessian (which is not required to be positive definite). It is, in many situations, the Newton method to converge in fewer iteraction and the most recommended for small and medium-size problems.

Constrained minimization

Method L-BFGS-B uses the L-BFGS-B algorithm [R40], [R41] for bound constrained minimization.

Method TNC uses a truncated Newton algorithm [R39], [R42] to minimize a function with variables subject to bounds. This algorithm uses gradient information; it is also called Newton Conjugate-Gradient. It differs from the Newton-CG method described above as it wraps a C implementation and allows each variable to be given upper and lower bounds.

Method COBYLA uses the Constrained Optimization BY Linear Approximation (COBYLA) method [R43], [10]_, [11]_. The algorithm is based on linear approximations to the objective function and each constraint. The method wraps a FORTRAN implementation of the algorithm. The constraints functions ‘fun’ may return either a single number or an array or list of numbers.

Method SLSQP uses Sequential Least SQuares Programming to minimize a function of several variables with any combination of bounds, equality and inequality constraints. The method wraps the SLSQP Optimization subroutine originally implemented by Dieter Kraft [12]_. Note that the wrapper handles infinite values in bounds by converting them into large floating values.

Custom minimizers

It may be useful to pass a custom minimization method, for example when using a frontend to this method such as scipy.optimize.basinhopping or a different library. You can simply pass a callable as the method parameter.

The callable is called as method(fun, x0, args, **kwargs, **options) where kwargs corresponds to any other parameters passed to minimize (such as callback, hess, etc.), except the options dict, which has its contents also passed as method parameters pair by pair. Also, if jac has been passed as a bool type, jac and fun are mangled so that fun returns just the function values and jac is converted to a function returning the Jacobian. The method shall return an OptimizeResult object.

The provided method callable must be able to accept (and possibly ignore) arbitrary parameters; the set of parameters accepted by minimize may expand in future versions and then these parameters will be passed to the method. You can find an example in the scipy.optimize tutorial.

New in version 0.11.0.

References

[R35](1, 2) Nelder, J A, and R Mead. 1965. A Simplex Method for Function Minimization. The Computer Journal 7: 308-13.
[R36](1, 2) Wright M H. 1996. Direct search methods: Once scorned, now respectable, in Numerical Analysis 1995: Proceedings of the 1995 Dundee Biennial Conference in Numerical Analysis (Eds. D F Griffiths and G A Watson). Addison Wesley Longman, Harlow, UK. 191-208.
[R37](1, 2) Powell, M J D. 1964. An efficient method for finding the minimum of a function of several variables without calculating derivatives. The Computer Journal 7: 155-162.
[R38](1, 2) Press W, S A Teukolsky, W T Vetterling and B P Flannery. Numerical Recipes (any edition), Cambridge University Press.
[R39](1, 2, 3, 4, 5, 6, 7, 8) Nocedal, J, and S J Wright. 2006. Numerical Optimization. Springer New York.
[R40](1, 2) Byrd, R H and P Lu and J. Nocedal. 1995. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM Journal on Scientific and Statistical Computing 16 (5): 1190-1208.
[R41](1, 2) Zhu, C and R H Byrd and J Nocedal. 1997. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software 23 (4): 550-560.
[R42](1, 2) Nash, S G. Newton-Type Minimization Via the Lanczos Method. 1984. SIAM Journal of Numerical Analysis 21: 770-778.
[R43](1, 2) Powell, M J D. A direct search optimization method that models the objective and constraint functions by linear interpolation. 1994. Advances in Optimization and Numerical Analysis, eds. S. Gomez and J-P Hennart, Kluwer Academic (Dordrecht), 51-67.
[10]Powell M J D. Direct search algorithms for optimization calculations. 1998. Acta Numerica 7: 287-336.
[11]Powell M J D. A view of algorithms for optimization without derivatives. 2007.Cambridge University Technical Report DAMTP 2007/NA03
[12]Kraft, D. A software package for sequential quadratic programming. 1988. Tech. Rep. DFVLR-FB 88-28, DLR German Aerospace Center – Institute for Flight Mechanics, Koln, Germany.
[13]Conn, A. R., Gould, N. I., and Toint, P. L. Trust region methods. 2000. Siam. pp. 169-200.
[14]F. Lenders, C. Kirches, A. Potschka: “trlib: A vector-free implementation of the GLTR method for iterative solution of the trust region problem”, https://arxiv.org/abs/1611.04718
[15]N. Gould, S. Lucidi, M. Roma, P. Toint: “Solving the Trust-Region Subproblem using the Lanczos Method”, SIAM J. Optim., 9(2), 504–525, (1999).

Examples

Let us consider the problem of minimizing the Rosenbrock function. This function (and its respective derivatives) is implemented in rosen (resp. rosen_der, rosen_hess) in the scipy.optimize.

>>> from scipy.optimize import minimize, rosen, rosen_der

A simple application of the Nelder-Mead method is:

>>> x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
>>> res = minimize(rosen, x0, method='Nelder-Mead', tol=1e-6)
>>> res.x
array([ 1.,  1.,  1.,  1.,  1.])

Now using the BFGS algorithm, using the first derivative and a few options:

>>> res = minimize(rosen, x0, method='BFGS', jac=rosen_der,
...                options={'gtol': 1e-6, 'disp': True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 26
         Function evaluations: 31
         Gradient evaluations: 31
>>> res.x
array([ 1.,  1.,  1.,  1.,  1.])
>>> print(res.message)
Optimization terminated successfully.
>>> res.hess_inv
array([[ 0.00749589,  0.01255155,  0.02396251,  0.04750988,  0.09495377],  # may vary
       [ 0.01255155,  0.02510441,  0.04794055,  0.09502834,  0.18996269],
       [ 0.02396251,  0.04794055,  0.09631614,  0.19092151,  0.38165151],
       [ 0.04750988,  0.09502834,  0.19092151,  0.38341252,  0.7664427 ],
       [ 0.09495377,  0.18996269,  0.38165151,  0.7664427,   1.53713523]])

Next, consider a minimization problem with several constraints (namely Example 16.4 from [R39]). The objective function is:

>>> fun = lambda x: (x[0] - 1)**2 + (x[1] - 2.5)**2

There are three constraints defined as:

>>> cons = ({'type': 'ineq', 'fun': lambda x:  x[0] - 2 * x[1] + 2},
...         {'type': 'ineq', 'fun': lambda x: -x[0] - 2 * x[1] + 6},
...         {'type': 'ineq', 'fun': lambda x: -x[0] + 2 * x[1] + 2})

And variables must be positive, hence the following bounds:

>>> bnds = ((0, None), (0, None))

The optimization problem is solved using the SLSQP method as:

>>> res = minimize(fun, (2, 0), method='SLSQP', bounds=bnds,
...                constraints=cons)

It should converge to the theoretical solution (1.4 ,1.7).

nested_to_normalized_fractions

dmipy.optimizers.brute2fine.nested_to_normalized_fractions(nested_fractions)

Calculates the normal volume fractions from nested ones.

normalized_to_nested_fractions_array

dmipy.optimizers.brute2fine.normalized_to_nested_fractions_array(normalized_fractions)

Calculates the nested volume fractions from normal ones.

optional_package

dmipy.optimizers.brute2fine.optional_package(name, trip_msg=None)

Return package-like thing and module setup for package name

Parameters:

name : str

package name

trip_msg : None or str

message to give when someone tries to use the return package, but we could not import it, and have returned a TripWire object instead. Default message if None.

Returns:

pkg_like : module or TripWire instance

If we can import the package, return it. Otherwise return an object raising an error when accessed

have_pkg : bool

True if import for package was successful, false otherwise

module_setup : function

callable usually set as setup_module in calling namespace, to allow skipping tests.

MixOptimizer

class dmipy.optimizers.mix.MixOptimizer(model, acquisition_scheme, maxiter=150)

The stochastic Microstructure In Crossings (MIX) optimizer [1] uses a three-step process to fit the parameters of a multi-compartment (MC) model to data. The key innovation is that they separate linear from non-linear parameters in the fitting process, meaning the linear volume fractions and non-linear other ones(e.g. diffusivities) are optimized at different stages in the process.

In the first step [1] describes using a genetic algorithm to estimate the non-linear parameters of an MC model. For this we use scipy’s differential_evolution algorithm.

In the second step [1] describes using CVX to estimate the linear volume fractions of an MC model. For this we use scipy’s COBYLA algorithm since it allows us to impose the parameter constraints we need for volume fractions; namely that they are positive and sum up to one.

The third and last step in [1] is a refining step to find a local minimum given the solutions of step one and two. For this we use scipy’s gradient-based L-BFGS-B algorithm with nested volume fractions.

The final result takes a model’s parameter_ranges into account, only yielding parameters within their allowed optimization domain.

Parameters:

model : MultiCompartmentModel instance,

A multicompartment model that has been instantiated using dMipy.

acquisition_scheme : DmipyAcquisitionScheme instance,

An acquisition scheme that has been instantiated using dMipy.

maxiter : integer

The maximum allowed iterations for the differential evolution algorithm

References

[R44]Farooq, Hamza, et al. “Microstructure Imaging of Crossing (MIX) White Matter Fibers from diffusion MRI.” Nature Scientific reports 6 (2016).

Methods

__init__(model, acquisition_scheme, maxiter=150)
cobyla_cost_function(fractions, phi, data)

Objective function of linear parameter estimation using COBYLA.

objective_function(optimized_parameter_vector, data, acquisition_scheme, x0_params)

Objective function of final refining step using L-BFGS-B

stochastic_objective_function(optimized_parameter_vector, data, acquisition_scheme, x0_params)

Objective function for stochastic non-linear parameter estimation using differential_evolution

cobyla_positivity_constraint

dmipy.optimizers.mix.cobyla_positivity_constraint(volume_fractions, *args)

COBYLA positivity constraint on volume fractions

cobyla_unity_constraint

dmipy.optimizers.mix.cobyla_unity_constraint(volume_fractions, *args)

COBYLA unity constraint on volume fractions

differential_evolution

dmipy.optimizers.mix.differential_evolution(func, bounds, args=(), strategy='best1bin', maxiter=1000, popsize=15, tol=0.01, mutation=(0.5, 1), recombination=0.7, seed=None, callback=None, disp=False, polish=True, init='latinhypercube', atol=0)

Finds the global minimum of a multivariate function. Differential Evolution is stochastic in nature (does not use gradient methods) to find the minimium, and can search large areas of candidate space, but often requires larger numbers of function evaluations than conventional gradient based techniques.

The algorithm is due to Storn and Price [R45].

Parameters:

func : callable

The objective function to be minimized. Must be in the form f(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function.

bounds : sequence

Bounds for variables. (min, max) pairs for each element in x, defining the lower and upper bounds for the optimizing argument of func. It is required to have len(bounds) == len(x). len(bounds) is used to determine the number of parameters in x.

args : tuple, optional

Any additional fixed parameters needed to completely specify the objective function.

strategy : str, optional

The differential evolution strategy to use. Should be one of:

  • ‘best1bin’
  • ‘best1exp’
  • ‘rand1exp’
  • ‘randtobest1exp’
  • ‘best2exp’
  • ‘rand2exp’
  • ‘randtobest1bin’
  • ‘best2bin’
  • ‘rand2bin’
  • ‘rand1bin’

The default is ‘best1bin’.

maxiter : int, optional

The maximum number of generations over which the entire population is evolved. The maximum number of function evaluations (with no polishing) is: (maxiter + 1) * popsize * len(x)

popsize : int, optional

A multiplier for setting the total population size. The population has popsize * len(x) individuals.

tol : float, optional

Relative tolerance for convergence, the solving stops when np.std(pop) <= atol + tol * np.abs(np.mean(population_energies)), where and atol and tol are the absolute and relative tolerance respectively.

mutation : float or tuple(float, float), optional

The mutation constant. In the literature this is also known as differential weight, being denoted by F. If specified as a float it should be in the range [0, 2]. If specified as a tuple (min, max) dithering is employed. Dithering randomly changes the mutation constant on a generation by generation basis. The mutation constant for that generation is taken from U[min, max). Dithering can help speed convergence significantly. Increasing the mutation constant increases the search radius, but will slow down convergence.

recombination : float, optional

The recombination constant, should be in the range [0, 1]. In the literature this is also known as the crossover probability, being denoted by CR. Increasing this value allows a larger number of mutants to progress into the next generation, but at the risk of population stability.

seed : int or np.random.RandomState, optional

If seed is not specified the np.RandomState singleton is used. If seed is an int, a new np.random.RandomState instance is used, seeded with seed. If seed is already a np.random.RandomState instance, then that np.random.RandomState instance is used. Specify seed for repeatable minimizations.

disp : bool, optional

Display status messages

callback : callable, callback(xk, convergence=val), optional

A function to follow the progress of the minimization. xk is the current value of x0. val represents the fractional value of the population convergence. When val is greater than one the function halts. If callback returns True, then the minimization is halted (any polishing is still carried out).

polish : bool, optional

If True (default), then scipy.optimize.minimize with the L-BFGS-B method is used to polish the best population member at the end, which can improve the minimization slightly.

init : string, optional

Specify how the population initialization is performed. Should be one of:

  • ‘latinhypercube’
  • ‘random’

The default is ‘latinhypercube’. Latin Hypercube sampling tries to maximize coverage of the available parameter space. ‘random’ initializes the population randomly - this has the drawback that clustering can occur, preventing the whole of parameter space being covered.

atol : float, optional

Absolute tolerance for convergence, the solving stops when np.std(pop) <= atol + tol * np.abs(np.mean(population_energies)), where and atol and tol are the absolute and relative tolerance respectively.

Returns:

res : OptimizeResult

The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, success a Boolean flag indicating if the optimizer exited successfully and message which describes the cause of the termination. See OptimizeResult for a description of other attributes. If polish was employed, and a lower minimum was obtained by the polishing, then OptimizeResult also contains the jac attribute.

Notes

Differential evolution is a stochastic population based method that is useful for global optimization problems. At each pass through the population the algorithm mutates each candidate solution by mixing with other candidate solutions to create a trial candidate. There are several strategies [R46] for creating trial candidates, which suit some problems more than others. The ‘best1bin’ strategy is a good starting point for many systems. In this strategy two members of the population are randomly chosen. Their difference is used to mutate the best member (the best in best1bin), \(b_0\), so far:

\[b' = b_0 + mutation * (population[rand0] - population[rand1])\]

A trial vector is then constructed. Starting with a randomly chosen ‘i’th parameter the trial is sequentially filled (in modulo) with parameters from b’ or the original candidate. The choice of whether to use b’ or the original candidate is made with a binomial distribution (the ‘bin’ in ‘best1bin’) - a random number in [0, 1) is generated. If this number is less than the recombination constant then the parameter is loaded from b’, otherwise it is loaded from the original candidate. The final parameter is always loaded from b’. Once the trial candidate is built its fitness is assessed. If the trial is better than the original candidate then it takes its place. If it is also better than the best overall candidate it also replaces that. To improve your chances of finding a global minimum use higher popsize values, with higher mutation and (dithering), but lower recombination values. This has the effect of widening the search radius, but slowing convergence.

New in version 0.15.0.

References

[R45](1, 2) Storn, R and Price, K, Differential Evolution - a Simple and Efficient Heuristic for Global Optimization over Continuous Spaces, Journal of Global Optimization, 1997, 11, 341 - 359.
[R46](1, 2) http://www1.icsi.berkeley.edu/~storn/code.html
[R47]http://en.wikipedia.org/wiki/Differential_evolution

Examples

Let us consider the problem of minimizing the Rosenbrock function. This function is implemented in rosen in scipy.optimize.

>>> from scipy.optimize import rosen, differential_evolution
>>> bounds = [(0,2), (0, 2), (0, 2), (0, 2), (0, 2)]
>>> result = differential_evolution(rosen, bounds)
>>> result.x, result.fun
(array([1., 1., 1., 1., 1.]), 1.9216496320061384e-19)

Next find the minimum of the Ackley function (http://en.wikipedia.org/wiki/Test_functions_for_optimization).

>>> from scipy.optimize import differential_evolution
>>> import numpy as np
>>> def ackley(x):
...     arg1 = -0.2 * np.sqrt(0.5 * (x[0] ** 2 + x[1] ** 2))
...     arg2 = 0.5 * (np.cos(2. * np.pi * x[0]) + np.cos(2. * np.pi * x[1]))
...     return -20. * np.exp(arg1) - np.exp(arg2) + 20. + np.e
>>> bounds = [(-5, 5), (-5, 5)]
>>> result = differential_evolution(ackley, bounds)
>>> result.x, result.fun
(array([ 0.,  0.]), 4.4408920985006262e-16)

fmin_cobyla

dmipy.optimizers.mix.fmin_cobyla(func, x0, cons, args=(), consargs=None, rhobeg=1.0, rhoend=0.0001, maxfun=1000, disp=None, catol=0.0002)

Minimize a function using the Constrained Optimization BY Linear Approximation (COBYLA) method. This method wraps a FORTRAN implementation of the algorithm.

Parameters:

func : callable

Function to minimize. In the form func(x, *args).

x0 : ndarray

Initial guess.

cons : sequence

Constraint functions; must all be >=0 (a single function if only 1 constraint). Each function takes the parameters x as its first argument, and it can return either a single number or an array or list of numbers.

args : tuple, optional

Extra arguments to pass to function.

consargs : tuple, optional

Extra arguments to pass to constraint functions (default of None means use same extra arguments as those passed to func). Use () for no extra arguments.

rhobeg : float, optional

Reasonable initial changes to the variables.

rhoend : float, optional

Final accuracy in the optimization (not precisely guaranteed). This is a lower bound on the size of the trust region.

disp : {0, 1, 2, 3}, optional

Controls the frequency of output; 0 implies no output.

maxfun : int, optional

Maximum number of function evaluations.

catol : float, optional

Absolute tolerance for constraint violations.

Returns:

x : ndarray

The argument that minimises f.

See also

minimize
Interface to minimization algorithms for multivariate functions. See the ‘COBYLA’ method in particular.

Notes

This algorithm is based on linear approximations to the objective function and each constraint. We briefly describe the algorithm.

Suppose the function is being minimized over k variables. At the jth iteration the algorithm has k+1 points v_1, …, v_(k+1), an approximate solution x_j, and a radius RHO_j. (i.e. linear plus a constant) approximations to the objective function and constraint functions such that their function values agree with the linear approximation on the k+1 points v_1,.., v_(k+1). This gives a linear program to solve (where the linear approximations of the constraint functions are constrained to be non-negative).

However the linear approximations are likely only good approximations near the current simplex, so the linear program is given the further requirement that the solution, which will become x_(j+1), must be within RHO_j from x_j. RHO_j only decreases, never increases. The initial RHO_j is rhobeg and the final RHO_j is rhoend. In this way COBYLA’s iterations behave like a trust region algorithm.

Additionally, the linear program may be inconsistent, or the approximation may give poor improvement. For details about how these issues are resolved, as well as how the points v_i are updated, refer to the source code or the references below.

References

Powell M.J.D. (1994), “A direct search optimization method that models the objective and constraint functions by linear interpolation.”, in Advances in Optimization and Numerical Analysis, eds. S. Gomez and J-P Hennart, Kluwer Academic (Dordrecht), pp. 51-67

Powell M.J.D. (1998), “Direct search algorithms for optimization calculations”, Acta Numerica 7, 287-336

Powell M.J.D. (2007), “A view of algorithms for optimization without derivatives”, Cambridge University Technical Report DAMTP 2007/NA03

Examples

Minimize the objective function f(x,y) = x*y subject to the constraints x**2 + y**2 < 1 and y > 0:

>>> def objective(x):
...     return x[0]*x[1]
...
>>> def constr1(x):
...     return 1 - (x[0]**2 + x[1]**2)
...
>>> def constr2(x):
...     return x[1]
...
>>> from scipy.optimize import fmin_cobyla
>>> fmin_cobyla(objective, [0.0, 0.1], [constr1, constr2], rhoend=1e-7)
array([-0.70710685,  0.70710671])

The exact solution is (-sqrt(2)/2, sqrt(2)/2).

minimize

dmipy.optimizers.mix.minimize(fun, x0, args=(), method=None, jac=None, hess=None, hessp=None, bounds=None, constraints=(), tol=None, callback=None, options=None)

Minimization of scalar function of one or more variables.

In general, the optimization problems are of the form:

minimize f(x) subject to

g_i(x) >= 0,  i = 1,...,m
h_j(x)  = 0,  j = 1,...,p

where x is a vector of one or more variables. g_i(x) are the inequality constraints. h_j(x) are the equality constrains.

Optionally, the lower and upper bounds for each element in x can also be specified using the bounds argument.

Parameters:

fun : callable

The objective function to be minimized. Must be in the form f(x, *args). The optimizing argument, x, is a 1-D array of points, and args is a tuple of any additional fixed parameters needed to completely specify the function.

x0 : ndarray

Initial guess. len(x0) is the dimensionality of the minimization problem.

args : tuple, optional

Extra arguments passed to the objective function and its derivatives (Jacobian, Hessian).

method : str or callable, optional

Type of solver. Should be one of

  • ‘Nelder-Mead’ (see here)
  • ‘Powell’ (see here)
  • ‘CG’ (see here)
  • ‘BFGS’ (see here)
  • ‘Newton-CG’ (see here)
  • ‘L-BFGS-B’ (see here)
  • ‘TNC’ (see here)
  • ‘COBYLA’ (see here)
  • ‘SLSQP’ (see here)
  • ‘dogleg’ (see here)
  • ‘trust-ncg’ (see here)
  • ‘trust-exact’ (see here)
  • ‘trust-krylov’ (see here)
  • custom - a callable object (added in version 0.14.0), see below for description.

If not given, chosen to be one of BFGS, L-BFGS-B, SLSQP, depending if the problem has constraints or bounds.

jac : bool or callable, optional

Jacobian (gradient) of objective function. Only for CG, BFGS, Newton-CG, L-BFGS-B, TNC, SLSQP, dogleg, trust-ncg, trust-krylov, trust-region-exact. If jac is a Boolean and is True, fun is assumed to return the gradient along with the objective function. If False, the gradient will be estimated numerically. jac can also be a callable returning the gradient of the objective. In this case, it must accept the same arguments as fun.

hess, hessp : callable, optional

Hessian (matrix of second-order derivatives) of objective function or Hessian of objective function times an arbitrary vector p. Only for Newton-CG, dogleg, trust-ncg, trust-krylov, trust-region-exact. Only one of hessp or hess needs to be given. If hess is provided, then hessp will be ignored. If neither hess nor hessp is provided, then the Hessian product will be approximated using finite differences on jac. hessp must compute the Hessian times an arbitrary vector.

bounds : sequence, optional

Bounds for variables (only for L-BFGS-B, TNC and SLSQP). (min, max) pairs for each element in x, defining the bounds on that parameter. Use None for one of min or max when there is no bound in that direction.

constraints : dict or sequence of dict, optional

Constraints definition (only for COBYLA and SLSQP). Each constraint is defined in a dictionary with fields:

type : str

Constraint type: ‘eq’ for equality, ‘ineq’ for inequality.

fun : callable

The function defining the constraint.

jac : callable, optional

The Jacobian of fun (only for SLSQP).

args : sequence, optional

Extra arguments to be passed to the function and Jacobian.

Equality constraint means that the constraint function result is to be zero whereas inequality means that it is to be non-negative. Note that COBYLA only supports inequality constraints.

tol : float, optional

Tolerance for termination. For detailed control, use solver-specific options.

options : dict, optional

A dictionary of solver options. All methods accept the following generic options:

maxiter : int

Maximum number of iterations to perform.

disp : bool

Set to True to print convergence messages.

For method-specific options, see show_options().

callback : callable, optional

Called after each iteration, as callback(xk), where xk is the current parameter vector.

Returns:

res : OptimizeResult

The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, success a Boolean flag indicating if the optimizer exited successfully and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

See also

minimize_scalar
Interface to minimization algorithms for scalar univariate functions
show_options
Additional options accepted by the solvers

Notes

This section describes the available solvers that can be selected by the ‘method’ parameter. The default method is BFGS.

Unconstrained minimization

Method Nelder-Mead uses the Simplex algorithm [R48], [R49]. This algorithm is robust in many applications. However, if numerical computation of derivative can be trusted, other algorithms using the first and/or second derivatives information might be preferred for their better performance in general.

Method Powell is a modification of Powell’s method [R50], [R51] which is a conjugate direction method. It performs sequential one-dimensional minimizations along each vector of the directions set (direc field in options and info), which is updated at each iteration of the main minimization loop. The function need not be differentiable, and no derivatives are taken.

Method CG uses a nonlinear conjugate gradient algorithm by Polak and Ribiere, a variant of the Fletcher-Reeves method described in [R52] pp. 120-122. Only the first derivatives are used.

Method BFGS uses the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno (BFGS) [R52] pp. 136. It uses the first derivatives only. BFGS has proven good performance even for non-smooth optimizations. This method also returns an approximation of the Hessian inverse, stored as hess_inv in the OptimizeResult object.

Method Newton-CG uses a Newton-CG algorithm [R52] pp. 168 (also known as the truncated Newton method). It uses a CG method to the compute the search direction. See also TNC method for a box-constrained minimization with a similar algorithm. Suitable for large-scale problems.

Method dogleg uses the dog-leg trust-region algorithm [R52] for unconstrained minimization. This algorithm requires the gradient and Hessian; furthermore the Hessian is required to be positive definite.

Method trust-ncg uses the Newton conjugate gradient trust-region algorithm [R52] for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems.

Method trust-krylov uses the Newton GLTR trust-region algorithm [14]_, [15]_ for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems. On indefinite problems it requires usually less iterations than the trust-ncg method and is recommended for medium and large-scale problems.

Method trust-exact is a trust-region method for unconstrained minimization in which quadratic subproblems are solved almost exactly [13]_. This algorithm requires the gradient and the Hessian (which is not required to be positive definite). It is, in many situations, the Newton method to converge in fewer iteraction and the most recommended for small and medium-size problems.

Constrained minimization

Method L-BFGS-B uses the L-BFGS-B algorithm [R53], [R54] for bound constrained minimization.

Method TNC uses a truncated Newton algorithm [R52], [R55] to minimize a function with variables subject to bounds. This algorithm uses gradient information; it is also called Newton Conjugate-Gradient. It differs from the Newton-CG method described above as it wraps a C implementation and allows each variable to be given upper and lower bounds.

Method COBYLA uses the Constrained Optimization BY Linear Approximation (COBYLA) method [R56], [10]_, [11]_. The algorithm is based on linear approximations to the objective function and each constraint. The method wraps a FORTRAN implementation of the algorithm. The constraints functions ‘fun’ may return either a single number or an array or list of numbers.

Method SLSQP uses Sequential Least SQuares Programming to minimize a function of several variables with any combination of bounds, equality and inequality constraints. The method wraps the SLSQP Optimization subroutine originally implemented by Dieter Kraft [12]_. Note that the wrapper handles infinite values in bounds by converting them into large floating values.

Custom minimizers

It may be useful to pass a custom minimization method, for example when using a frontend to this method such as scipy.optimize.basinhopping or a different library. You can simply pass a callable as the method parameter.

The callable is called as method(fun, x0, args, **kwargs, **options) where kwargs corresponds to any other parameters passed to minimize (such as callback, hess, etc.), except the options dict, which has its contents also passed as method parameters pair by pair. Also, if jac has been passed as a bool type, jac and fun are mangled so that fun returns just the function values and jac is converted to a function returning the Jacobian. The method shall return an OptimizeResult object.

The provided method callable must be able to accept (and possibly ignore) arbitrary parameters; the set of parameters accepted by minimize may expand in future versions and then these parameters will be passed to the method. You can find an example in the scipy.optimize tutorial.

New in version 0.11.0.

References

[R48](1, 2) Nelder, J A, and R Mead. 1965. A Simplex Method for Function Minimization. The Computer Journal 7: 308-13.
[R49](1, 2) Wright M H. 1996. Direct search methods: Once scorned, now respectable, in Numerical Analysis 1995: Proceedings of the 1995 Dundee Biennial Conference in Numerical Analysis (Eds. D F Griffiths and G A Watson). Addison Wesley Longman, Harlow, UK. 191-208.
[R50](1, 2) Powell, M J D. 1964. An efficient method for finding the minimum of a function of several variables without calculating derivatives. The Computer Journal 7: 155-162.
[R51](1, 2) Press W, S A Teukolsky, W T Vetterling and B P Flannery. Numerical Recipes (any edition), Cambridge University Press.
[R52](1, 2, 3, 4, 5, 6, 7, 8) Nocedal, J, and S J Wright. 2006. Numerical Optimization. Springer New York.
[R53](1, 2) Byrd, R H and P Lu and J. Nocedal. 1995. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM Journal on Scientific and Statistical Computing 16 (5): 1190-1208.
[R54](1, 2) Zhu, C and R H Byrd and J Nocedal. 1997. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software 23 (4): 550-560.
[R55](1, 2) Nash, S G. Newton-Type Minimization Via the Lanczos Method. 1984. SIAM Journal of Numerical Analysis 21: 770-778.
[R56](1, 2) Powell, M J D. A direct search optimization method that models the objective and constraint functions by linear interpolation. 1994. Advances in Optimization and Numerical Analysis, eds. S. Gomez and J-P Hennart, Kluwer Academic (Dordrecht), 51-67.
[10]Powell M J D. Direct search algorithms for optimization calculations. 1998. Acta Numerica 7: 287-336.
[11]Powell M J D. A view of algorithms for optimization without derivatives. 2007.Cambridge University Technical Report DAMTP 2007/NA03
[12]Kraft, D. A software package for sequential quadratic programming. 1988. Tech. Rep. DFVLR-FB 88-28, DLR German Aerospace Center – Institute for Flight Mechanics, Koln, Germany.
[13]Conn, A. R., Gould, N. I., and Toint, P. L. Trust region methods. 2000. Siam. pp. 169-200.
[14]F. Lenders, C. Kirches, A. Potschka: “trlib: A vector-free implementation of the GLTR method for iterative solution of the trust region problem”, https://arxiv.org/abs/1611.04718
[15]N. Gould, S. Lucidi, M. Roma, P. Toint: “Solving the Trust-Region Subproblem using the Lanczos Method”, SIAM J. Optim., 9(2), 504–525, (1999).

Examples

Let us consider the problem of minimizing the Rosenbrock function. This function (and its respective derivatives) is implemented in rosen (resp. rosen_der, rosen_hess) in the scipy.optimize.

>>> from scipy.optimize import minimize, rosen, rosen_der

A simple application of the Nelder-Mead method is:

>>> x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
>>> res = minimize(rosen, x0, method='Nelder-Mead', tol=1e-6)
>>> res.x
array([ 1.,  1.,  1.,  1.,  1.])

Now using the BFGS algorithm, using the first derivative and a few options:

>>> res = minimize(rosen, x0, method='BFGS', jac=rosen_der,
...                options={'gtol': 1e-6, 'disp': True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 26
         Function evaluations: 31
         Gradient evaluations: 31
>>> res.x
array([ 1.,  1.,  1.,  1.,  1.])
>>> print(res.message)
Optimization terminated successfully.
>>> res.hess_inv
array([[ 0.00749589,  0.01255155,  0.02396251,  0.04750988,  0.09495377],  # may vary
       [ 0.01255155,  0.02510441,  0.04794055,  0.09502834,  0.18996269],
       [ 0.02396251,  0.04794055,  0.09631614,  0.19092151,  0.38165151],
       [ 0.04750988,  0.09502834,  0.19092151,  0.38341252,  0.7664427 ],
       [ 0.09495377,  0.18996269,  0.38165151,  0.7664427,   1.53713523]])

Next, consider a minimization problem with several constraints (namely Example 16.4 from [R52]). The objective function is:

>>> fun = lambda x: (x[0] - 1)**2 + (x[1] - 2.5)**2

There are three constraints defined as:

>>> cons = ({'type': 'ineq', 'fun': lambda x:  x[0] - 2 * x[1] + 2},
...         {'type': 'ineq', 'fun': lambda x: -x[0] - 2 * x[1] + 6},
...         {'type': 'ineq', 'fun': lambda x: -x[0] + 2 * x[1] + 2})

And variables must be positive, hence the following bounds:

>>> bnds = ((0, None), (0, None))

The optimization problem is solved using the SLSQP method as:

>>> res = minimize(fun, (2, 0), method='SLSQP', bounds=bnds,
...                constraints=cons)

It should converge to the theoretical solution (1.4 ,1.7).

nested_to_normalized_fractions

dmipy.optimizers.mix.nested_to_normalized_fractions(nested_fractions)

Function to convert nested to normalized volume fractions.