Question
I have come across a problem when playing with the parameters of curve_fit from scipy. I initially copied the code suggested by the docs, then changed the equation slightly and it still worked, but after increasing the range of np.linspace, the whole prediction ends up being a straight line. Any ideas?
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

def f(x, a, b, c):
    # This works fine on smaller numbers
    return (a - c) * np.exp(-x / b) + c

xdata = np.linspace(60, 3060, 200)
ydata = f(xdata, 100, 400, 20)

# noise
np.random.seed(1729)
ydata = ydata + np.random.normal(size=xdata.size) * 0.2

# graph
fig, ax = plt.subplots()
plt.plot(xdata, ydata, marker="o")
pred, covar = curve_fit(f, xdata, ydata)
plt.plot(xdata, f(xdata, *pred), label="prediction")
plt.show()
Answer 1:
Here is example code using your data and equation, with the initial parameter estimates supplied by scipy's differential_evolution genetic algorithm module. That module uses the Latin Hypercube algorithm (its default initialization) to ensure a thorough search of parameter space, which requires bounds within which to search. In this example, those bounds are taken from the data's maximum and minimum values; it is much easier to supply ranges for the initial parameter estimates than specific values.
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings

def func(x, a, b, c):
    return (a - c) * numpy.exp(-x / b) + c

xData = numpy.linspace(60, 3060, 200)
yData = func(xData, 100, 400, 20)

# noise
numpy.random.seed(1729)
yData = yData + numpy.random.normal(size=xData.size) * 0.2

# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore")  # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)

def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)

    parameterBounds = []
    parameterBounds.append([minY, maxY])  # search bounds for a
    parameterBounds.append([minX, maxX])  # search bounds for b
    parameterBounds.append([minY, maxY])  # search bounds for c

    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x

# by default (polish=True), differential_evolution finishes by polishing its
# best result with L-BFGS-B minimization, within the parameter bounds
geneticParameters = generate_Initial_Parameters()

# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best-fit parameters are outside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()

modelPredictions = func(xData, *fittedParameters)

absError = modelPredictions - yData
SE = numpy.square(absError)  # squared errors
MSE = numpy.mean(SE)  # mean squared errors
RMSE = numpy.sqrt(MSE)  # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()

##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth / 100.0, graphHeight / 100.0), dpi=100)
    axes = f.add_subplot(111)

    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')

    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)

    # now the model as a line plot
    axes.plot(xModel, yModel)

    axes.set_xlabel('X Data')  # X axis data label
    axes.set_ylabel('Y Data')  # Y axis data label

    plt.show()
    plt.close('all')  # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
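As a side note (an addition for illustration, not part of the original answer), the same idea can be expressed with curve_fit alone: passing a bounds argument makes SciPy use the bound-aware Trust Region Reflective method instead of Levenberg–Marquardt and, when no p0 is given, start from the midpoint of the bounds. The ranges must be chosen generously, though: the data extremes used above would exclude the true a = 100 here, since the data start at x = 60 and max(yData) is only about 89, which is exactly why the answer drops the bounds in its final curve_fit call. A minimal sketch, assuming the func, xData, and yData from the listing above:

# Minimal sketch: hand generous search ranges to curve_fit as hard bounds.
# Assumes func, xData, yData from the listing above.
lowerBounds = [0.0, 1.0, 0.0]                                   # a, b, c
upperBounds = [2.0 * max(yData), max(xData), 2.0 * max(yData)]  # a, b, c
boundedParameters, boundedCovariance = curve_fit(
    func, xData, yData, bounds=(lowerBounds, upperBounds)
)
print('Bounded-fit parameters:', boundedParameters)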
Answer 2:
You may need to start with a better guess. The default initial guess, (1.0, 1.0, 1.0), seems to lie in a divergent region. Using the initial guess p0 = (50, 200, 100) works:
fig, ax = plt.subplots()
plt.plot(xdata, ydata, marker="o")
pred, covar = curve_fit(f, xdata, ydata, p0 = (50,200,100))
plt.plot(xdata, f(xdata, *pred), label="prediction")
plt.show()
Answer 3:
This is due to a limitation of the Levenberg–Marquardt algorithm, which curve_fit uses by default. The right way to use it is to provide a decent initial guess for the parameters before optimizing. In my experience this is particularly important when fitting exponential functions like the one in your example. With an iterative algorithm such as LM, the quality of the starting point determines where the result converges, and the more parameters you have, the more likely the final result converges to a completely unwanted curve. Overall, the solution is to find a good initial guess somehow, as the other answers did.
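To make this concrete (an illustration added here, not part of the original answer): with the default guess b = 1, the term np.exp(-x / b) is about 1e-26 at x = 60 and underflows to exactly zero further out, so the model is numerically constant; the partial derivatives with respect to a and b are numerically zero, only c moves, and the fit collapses to a horizontal line. Rough starting values can often be read straight off the data. A minimal sketch using the question's setup, with admittedly crude heuristics:

import numpy as np
from scipy.optimize import curve_fit

def f(x, a, b, c):
    return (a - c) * np.exp(-x / b) + c

xdata = np.linspace(60, 3060, 200)
np.random.seed(1729)
ydata = f(xdata, 100, 400, 20) + np.random.normal(size=xdata.size) * 0.2

# Why the default guess fails: exp(-60 / 1.0) is ~8.8e-27, so with
# b = 1 the exponential term vanishes and the model is a flat line.
print(np.exp(-60 / 1.0))

# Crude, data-driven starting values (heuristics, not from the answers):
a0 = ydata[0]                      # y at the left edge approximates a
c0 = ydata[-1]                     # the long-x plateau approximates c
b0 = (xdata[-1] - xdata[0]) / 3.0  # decay scale: a fraction of the x-range

pred, covar = curve_fit(f, xdata, ydata, p0=(a0, b0, c0))
print(pred)  # lands close to the true (100, 400, 20)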
Source: https://stackoverflow.com/questions/59391249/python-scipy-optimise-curve-fit-gives-linear-fit