leastsq: Levenberg Marquardt with constraints#

This module implements a Levenberg-Marquardt algorithm with constraints on the fitted parameters without introducing any other dependendency than numpy.

If scipy dependency is not an issue, and no constraints are applied to the fitting parameters, there is no real gain compared to the use of scipy.optimize.curve_fit other than a more conservative calculation of uncertainties on fitted parameters.

This module is a refactored version of PyMca Gefit.py module.

For a tutorial on how to use leastsq(), see Using leastsq().

Functions#

leastsq(model, xdata, ydata, p0, sigma=None, constraints=None, model_deriv=None, epsfcn=None, deltachi=None, full_output=None, check_finite=True, left_derivative=False, max_iter=100)[source]#

Use non-linear least squares Levenberg-Marquardt algorithm to fit a function, f, to data with optional constraints on the fitted parameters.

Assumes ydata = f(xdata, *params) + eps

Parameters:
  • model – callable The model function, f(x, …). It must take the independent variable as the first argument and the parameters to fit as separate remaining arguments. The returned value is a one dimensional array of floats.

  • xdata – An M-length sequence. The independent variable where the data is measured.

  • ydata – An M-length sequence The dependent data — nominally f(xdata, …)

  • p0 – N-length sequence Initial guess for the parameters.

  • sigma – None or M-length sequence, optional If not None, the uncertainties in the ydata array. These are used as weights in the least-squares problem i.e. minimising np.sum( ((f(xdata, *popt) - ydata) / sigma)**2 ) If None, the uncertainties are assumed to be 1

  • constraints (optional, None or 2D sequence) –

    If provided, it is a 2D sequence of dimension (n_parameters, 3) where, for each parameter denoted by the index i, the meaning is

    • constraints[i][0]

      • 0 - Free (CFREE)

      • 1 - Positive (CPOSITIVE)

      • 2 - Quoted (CQUOTED)

      • 3 - Fixed (CFIXED)

      • 4 - Factor (CFACTOR)

      • 5 - Delta (CDELTA)

      • 6 - Sum (CSUM)

    • constraints[i][1]

      • Ignored if constraints[i][0] is 0, 1, 3

      • Min value of the parameter if constraints[i][0] is CQUOTED

      • Index of fitted parameter to which it is related

    • constraints[i][2]

      • Ignored if constraints[i][0] is 0, 1, 3

      • Max value of the parameter if constraints[i][0] is CQUOTED

      • Factor to apply to related parameter with index constraints[i][1]

      • Difference with parameter with index constraints[i][1]

      • Sum obtained when adding parameter with index constraints[i][1]

  • model_deriv (optional, None or callable) – None (default) or function providing the derivatives of the fitting function respect to the fitted parameters. It will be called as model_deriv(xdata, parameters, index) where parameters is a sequence with the current values of the fitting parameters, index is the fitting parameter index for which the the derivative has to be provided in the supplied array of xdata points.

  • epsfcn (optional, float) – float A variable used in determining a suitable parameter variation when calculating the numerical derivatives (for model_deriv=None). Normally the actual step length will be sqrt(epsfcn)*x Original Gefit module was using epsfcn 1.0e-5 while default value is now numpy.finfo(numpy.float64).eps as in scipy

  • deltachi (optional, float) – float A variable used to control the minimum change in chisq to consider the fitting process not worth to be continued. Default is 0.1 %.

  • full_output – bool, optional non-zero to return all optional outputs. The default is None what will give a warning in case of a constrained fit without having set this kweyword.

  • check_finite – bool, optional If True, check that the input arrays do not contain nans of infs, and raise a ValueError if they do. Setting this parameter to False will ignore input arrays values containing nans. Default is True.

  • left_derivative (optional, bool) – This parameter only has an influence if no derivative function is provided. When True the left and right derivatives of the model will be calculated for each fitted parameters thus leading to the double number of function evaluations. Default is False. Original Gefit module was always using left_derivative as True.

  • max_iter – Maximum number of iterations (default is 100)

Returns:

Returns a tuple of length 2 (or 3 if full_ouput is True) with the content:

popt: array

Optimal values for the parameters so that the sum of the squared error of f(xdata, *popt) - ydata is minimized

pcov: 2d array

If no constraints are applied, this array contains the estimated covariance of popt. The diagonal provides the variance of the parameter estimate. To compute one standard deviation errors use perr = np.sqrt(np.diag(pcov)). If constraints are applied, this array does not contain the estimated covariance of the parameters actually used during the fitting process but the uncertainties after recalculating the covariance if all the parameters were free. To get the actual uncertainties following error propagation of the actually fitted parameters one should set full_output to True and access the uncertainties key.

infodict: dict

a dictionary of optional outputs with the keys:

uncertainties

The actual uncertainty on the optimized parameters.

nfev

The number of function calls

fvec

The function evaluated at the output

niter

The number of iterations performed

chisq

The chi square np.sum( ((f(xdata, *popt) - ydata) / sigma)**2 )

reduced_chisq

The chi square np.sum( ((f(xdata, *popt) - ydata) / sigma)**2 ) divided by the number of degrees of freedom (M - number_of_free_parameters)

chisq_alpha_beta(model, parameters, x, y, weight, constraints=None, model_deriv=None, epsfcn=None, left_derivative=False, last_evaluation=None, full_output=False)[source]#

Get chi square, the curvature matrix alpha and the matrix beta according to the input parameters. If all the parameters are unconstrained, the covariance matrix is the inverse of the alpha matrix.

Parameters:
  • model – callable The model function, f(x, …). It must take the independent variable as the first argument and the parameters to fit as separate remaining arguments. The returned value is a one dimensional array of floats.

  • parameters – N-length sequence Values of parameters at which function and derivatives are to be calculated.

  • x – An M-length sequence. The independent variable where the data is measured.

  • y – An M-length sequence The dependent data — nominally f(xdata, …)

  • weight – M-length sequence Weights to be applied in the calculation of chi square As a reminder chisq = np.sum(weigth * (model(x, *parameters) - y)**2)

  • constraints (optional, None or 2D sequence) –

    If provided, it is a 2D sequence of dimension (n_parameters, 3) where, for each parameter denoted by the index i, the meaning is

    • constraints[i][0]

      • 0 - Free (CFREE)

      • 1 - Positive (CPOSITIVE)

      • 2 - Quoted (CQUOTED)

      • 3 - Fixed (CFIXED)

      • 4 - Factor (CFACTOR)

      • 5 - Delta (CDELTA)

      • 6 - Sum (CSUM)

    • constraints[i][1]

      • Ignored if constraints[i][0] is 0, 1, 3

      • Min value of the parameter if constraints[i][0] is CQUOTED

      • Index of fitted parameter to which it is related

    • constraints[i][2]

      • Ignored if constraints[i][0] is 0, 1, 3

      • Max value of the parameter if constraints[i][0] is CQUOTED

      • Factor to apply to related parameter with index constraints[i][1]

      • Difference with parameter with index constraints[i][1]

      • Sum obtained when adding parameter with index constraints[i][1]

  • model_deriv (optional, None or callable) – None (default) or function providing the derivatives of the fitting function respect to the fitted parameters. It will be called as model_deriv(xdata, parameters, index) where parameters is a sequence with the current values of the fitting parameters, index is the fitting parameter index for which the the derivative has to be provided in the supplied array of xdata points.

  • epsfcn (optional, float) – float A variable used in determining a suitable parameter variation when calculating the numerical derivatives (for model_deriv=None). Normally the actual step length will be sqrt(epsfcn)*x Original Gefit module was using epsfcn 1.0e-10 while default value is now numpy.finfo(numpy.float64).eps as in scipy

  • left_derivative (optional, bool) – This parameter only has an influence if no derivative function is provided. When True the left and right derivatives of the model will be calculated for each fitted parameters thus leading to the double number of function evaluations. Default is False. Original Gefit module was always using left_derivative as True.

  • last_evaluation – An M-length array Used for optimization purposes. If supplied, this array will be taken as the result of evaluating the function, that is as the result of model(x, *parameters) thus avoiding the evaluation call.

  • full_output

    bool, optional

    Additional output used for internal purposes with the keys:

    function_calls

    The number of model function calls performed.

    fitparam

    A sequence with the actual free parameters

    free_index

    Sequence with the indices of the free parameters in input parameters sequence.

    noigno

    Sequence with the indices of the original parameters considered in the calculations.