scipy curve_fit strange result

Submitted by 纵饮孤独 on 2021-02-07 10:47:15

Question


I am trying to fit a distribution with scipy's curve_fit. A one-component exponential function gave an almost straight line (see figure), whereas a two-component exponential fit seemed to work nicely. "Two components" just means that part of the equation is repeated with different input parameters. Here is the one-component fit function:

def Exponential(Z,w0,z0,Z0):
    z = Z - Z0
    termB = (newsigma**2 + z*z0) / (numpy.sqrt(2.0)*newsigma*z0)
    termA = (newsigma**2 - z*z0) / (numpy.sqrt(2.0)*newsigma*z0)
    return w0/2.0 * numpy.exp(-(z**2 / (2.0*newsigma**2))) * (numpy.exp(termA**2)*erfc(termA) + numpy.exp(termB**2)*erfc(termB))

and the fitting is done with

fitexp = curve_fit(Exponential,newx,y2)

Then, as an experiment, I added the two extra parameters of the two-component fit to the function signature but did not use them in the calculation:

def ExponentialNew(Z,w0,z0,w1,z1,Z0):
    z = Z - Z0
    termB = (newsigma**2 + z*z0) / (numpy.sqrt(2.0)*newsigma*z0)
    termA = (newsigma**2 - z*z0) / (numpy.sqrt(2.0)*newsigma*z0)
    return w0/2.0 * numpy.exp(-(z**2 / (2.0*newsigma**2))) * (numpy.exp(termA**2)*erfc(termA) + numpy.exp(termB**2)*erfc(termB))

And suddenly this works.

Now my question is: why? As you can see, the calculation of the fit is exactly the same; the function just takes two extra arguments that are not used. Shouldn't this give the same result?

@Andras_Deak An actual example:

from scipy.special import erfc
import numpy
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

#setup data
x = [-58.,-54.,-50.,-46.,-42.,-38.,-34.,-30.,-26.,-22.,-18.,-14.,-10.,-6.,-2.,2.,6.,10.,14.,18.,22.,26.,30.,34.,38.,42.,46.,50.,54.,58.]
y = [23.06763817, 16.89802085, 17.83258379, 16.63446237, 13.81878965, 12.97965839, 14.30451789, 16.98288216, 22.26811491, 28.56756908, 33.06990344, 38.59842098, 54.19860393, 86.37381604, 137.47253315, 199.49724512, 238.66047662, 219.89405445, 160.68820199, 103.88901303, 65.92405727, 43.84596266, 31.5395342, 25.9610156, 22.71683709, 18.06740651, 13.85362374, 11.12867065, 10.36502799, 11.31855619]
y_err = [17.9823065, 4.13684885, 1.66490726, 2.4109372, 2.93359141, 1.9701747, 3.19214881,  3.65593012, 2.89089074, 3.58922121, 4.25505348, 4.72728874, 6.77736567, 11.3888196, 21.87771722, 39.0087495, 56.6910311, 51.7592369, 26.39750958, 10.62678862, 7.85893395, 8.11741621, 7.91731416, 7.07739132, 5.41818744, 6.11286843, 8.27070757, 7.85323065, 4.26885499, 0.9047867]

#function to fit
def Exponential2(Z, w0, z0, w1, z1, Z0):
    z = Z - Z0
    s = 3.98098937586
    a = z**2 / (2.0*s**2)
    b = (s**2 + z*z0) / (numpy.sqrt(2.0)*s*z0)
    c = (s**2 - z*z0) / (numpy.sqrt(2.0)*s*z0)
    d = (s**2 + z*z1) / (numpy.sqrt(2.0)*s*z1)
    e = (s**2 - z*z1) / (numpy.sqrt(2.0)*s*z1)
    return w0/2.0 * numpy.exp(-a) * (numpy.exp(c**2)*erfc(c) + numpy.exp(b**2)*erfc(b)) + w1/2.0 * numpy.exp(-a) * (numpy.exp(e**2)*erfc(e) + numpy.exp(d**2)*erfc(d))


#derive and set initial guess
ymaxpos = x[int(numpy.argmax(y))]  # x position of the largest y value (works with plain lists)
p0_2 = [numpy.max(y),5,numpy.max(y)/2.0,20,ymaxpos]

#fit
fitexp2 = curve_fit(Exponential2,x,y,p0=p0_2,sigma=y_err)

#get results
perr = numpy.sqrt(numpy.diag(fitexp2[1]))  # 1-sigma uncertainties from the covariance matrix
w0err, z0err, w1err, z1err = perr[:4]
w0, z0, w1, z1, Z0 = fitexp2[0]
#new x array for smoother curve
smoothx = numpy.arange(-58,59,0.1)
y2 = Exponential2(smoothx,w0,z0,w1,z1,Z0)

print('Exponential 2: w0: '+str(w0.round(3))+' +/- '+str(w0err.round(3))+' \t z0: '+str(z0.round(3))+' +/- '+str(z0err.round(3))+' \t w1: '+str(w1.round(3))+' +/- '+str(w1err.round(3))+' \t\t z1: '+str(z1.round(3))+' +/- '+str(z1err.round(3)))

#plot
fig = plt.figure()
ax = fig.add_subplot(111)
ax.errorbar(x,y,y_err,fmt='o',markersize=2,label='data')
ax.plot(smoothx,y2,label='fit',color='red')
ax.grid()
ax.legend()
plt.show()

As you can see, the plot does look good, but the returned value of z1 is totally unrealistic:

Exponential 2: w0: 312.608 +/- 36.764    z0: 8.263 +/- 1.158     w1: 12.689 +/- 9.138        z1: 1862257.883 +/- 45201809883.8

Answer 1:


In my experience, curve_fit can sometimes act up and stick with the initial values of the parameters. I suspect that in your case adding a few dummy parameters changed the heuristics of how the relevant parameters are initialized (although this contradicts the documentation's statement that, when no initial values are given, they all default to 1).
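The documented default can be checked directly by recording the arguments that curve_fit passes to the model function. The sketch below is only an illustration: the data are made up and noise-free, and the names decay, decay_padded, unused1 and unused2 are hypothetical, not taken from the question. It assumes the default unbounded solver is used, so the first call to the model is made with the starting values.

```python
import warnings

import numpy as np
from scipy.optimize import curve_fit

# Made-up, noise-free data for illustration only (not from the question).
xdata = np.linspace(0.0, 4.0, 20)
ydata = 2.5 * np.exp(-1.3 * xdata)

calls_plain = []   # parameter tuples seen by the two-parameter model
calls_padded = []  # parameter tuples seen by the padded model


def decay(x, a, b):
    """Simple exponential decay; records every parameter set it is called with."""
    calls_plain.append((float(a), float(b)))
    return a * np.exp(-b * x)


def decay_padded(x, a, b, unused1, unused2):
    """Same calculation, with two extra arguments that are never used."""
    calls_padded.append((float(a), float(b), float(unused1), float(unused2)))
    return a * np.exp(-b * x)


# No p0 is given, so curve_fit counts the parameters from the signature.
popt_plain, _ = curve_fit(decay, xdata, ydata)

with warnings.catch_warnings():
    # The unused parameters leave the covariance undetermined, which
    # typically triggers a warning; it is silenced here for readability.
    warnings.simplefilter("ignore")
    popt_padded, _ = curve_fit(decay_padded, xdata, ydata)

print("first call, plain: ", calls_plain[0])
print("first call, padded:", calls_padded[0])
```

Both variants should start from a vector of ones, so any difference between the two fits would have to come from how the optimizer proceeds afterwards rather than from the starting point.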

Specifying reasonable initial values and bounds for the fitting parameters (the p0 and bounds keywords) helps a lot in obtaining reliable fits. The fact that the starting values all default to 1 suggests that for most use cases the default won't cut it.
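As a rough sketch of that advice applied to the one-component model, the example below passes both p0 and bounds. The data are copied from the question. The starting values and bounds are illustrative assumptions, and so is the choice of fixing the Gaussian width to the value used in Exponential2, because the question does not give the value of newsigma. The name exponential1 is hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erfc

# Data copied from the question.
x = np.array([-58., -54., -50., -46., -42., -38., -34., -30., -26., -22.,
              -18., -14., -10., -6., -2., 2., 6., 10., 14., 18.,
              22., 26., 30., 34., 38., 42., 46., 50., 54., 58.])
y = np.array([23.06763817, 16.89802085, 17.83258379, 16.63446237, 13.81878965,
              12.97965839, 14.30451789, 16.98288216, 22.26811491, 28.56756908,
              33.06990344, 38.59842098, 54.19860393, 86.37381604, 137.47253315,
              199.49724512, 238.66047662, 219.89405445, 160.68820199, 103.88901303,
              65.92405727, 43.84596266, 31.5395342, 25.9610156, 22.71683709,
              18.06740651, 13.85362374, 11.12867065, 10.36502799, 11.31855619])
y_err = np.array([17.9823065, 4.13684885, 1.66490726, 2.4109372, 2.93359141,
                  1.9701747, 3.19214881, 3.65593012, 2.89089074, 3.58922121,
                  4.25505348, 4.72728874, 6.77736567, 11.3888196, 21.87771722,
                  39.0087495, 56.6910311, 51.7592369, 26.39750958, 10.62678862,
                  7.85893395, 8.11741621, 7.91731416, 7.07739132, 5.41818744,
                  6.11286843, 8.27070757, 7.85323065, 4.26885499, 0.9047867])

# Assumption: the fixed width equals the value of s used in Exponential2.
S = 3.98098937586


def exponential1(Z, w0, z0, Z0):
    """One-component model from the question, with the width fixed to S."""
    z = Z - Z0
    a = z**2 / (2.0 * S**2)
    b = (S**2 + z * z0) / (np.sqrt(2.0) * S * z0)
    c = (S**2 - z * z0) / (np.sqrt(2.0) * S * z0)
    return w0 / 2.0 * np.exp(-a) * (np.exp(c**2) * erfc(c) + np.exp(b**2) * erfc(b))


# Illustrative starting values and bounds (assumptions, not from the question).
p0 = [y.max(), 5.0, x[np.argmax(y)]]   # amplitude, scale height, peak position
lower = [0.0, 0.5, x.min()]
upper = [np.inf, 100.0, x.max()]

popt, pcov = curve_fit(exponential1, x, y, p0=p0, sigma=y_err,
                       bounds=(lower, upper))
perr = np.sqrt(np.diag(pcov))

for name, value, err in zip(("w0", "z0", "Z0"), popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

Keeping z0 away from zero also keeps the exp(c**2)*erfc(c) terms within floating-point range over this data set. A parameter that ends up pinned at one of its bounds is a hint that the model or the bounds deserve a second look.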



Source: https://stackoverflow.com/questions/40008017/scipy-curve-fit-strange-result
