I\'m trying to fit a Gaussian for my data (which is already a rough gaussian). I\'ve already taken the advice of those here and tried curve_fit
and leasts
Here is corrected code:
import pylab as plb
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy import asarray as ar,exp
x = ar(range(10))
y = ar([0,1,2,3,4,5,4,3,2,1])
n = len(x) #the number of data
mean = sum(x*y)/n #note this correction
sigma = sum(y*(x-mean)**2)/n #note this correction
def gaus(x,a,x0,sigma):
return a*exp(-(x-x0)**2/(2*sigma**2))
popt,pcov = curve_fit(gaus,x,y,p0=[1,mean,sigma])
plt.plot(x,y,'b+:',label='data')
plt.plot(x,gaus(x,*popt),'ro:',label='fit')
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.show()
result:
sigma = sum(y*(x - mean)**2)
should be
sigma = np.sqrt(sum(y*(x - mean)**2))
You get a horizontal straight line because it did not converge.
Better convergence is attained if the first parameter of the fitting (p0) is put as max(y), 5 in the example, instead of 1.
After losing hours trying to find my error, the problem is your formula:
sigma = sum(y*(x-mean)**2)/n
This previous formula is wrong, the correct formula is the square root of this!;
sqrt(sum(y*(x-mean)**2)/n)
Hope this helps
You need good starting values such that the curve_fit
function converges at "good" values. I can not really say why your fit did not converge (even though the definition of your mean is strange - check below) but I will give you a strategy that works for non-normalized Gaussian-functions like your one.
The estimated parameters should be close to the final values (use the weighted arithmetic mean - divide by the sum of all values):
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
x = np.arange(10)
y = np.array([0, 1, 2, 3, 4, 5, 4, 3, 2, 1])
# weighted arithmetic mean (corrected - check the section below)
mean = sum(x * y) / sum(y)
sigma = np.sqrt(sum(y * (x - mean)**2) / sum(y))
def Gauss(x, a, x0, sigma):
return a * np.exp(-(x - x0)**2 / (2 * sigma**2))
popt,pcov = curve_fit(Gauss, x, y, p0=[max(y), mean, sigma])
plt.plot(x, y, 'b+:', label='data')
plt.plot(x, Gauss(x, *popt), 'r-', label='fit')
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.show()
I personally prefer using numpy.
Since the reviewers did not like my edit on #Developer's code, I will explain for what case I would suggest an improved code. The mean of developer does not correspond to one of the normal definitions of the mean.
Your definition returns:
>>> sum(x * y)
125
Developer's definition returns:
>>> sum(x * y) / len(x)
12.5 #for Python 3.x
The weighted arithmetic mean:
>>> sum(x * y) / sum(y)
5.0
Similarly you can compare the definitions of standard deviation (sigma
). Compare with the figure of the resulting fit:
In Python 2.x you should additionally use the new division to not run into weird results or convert the the numbers before the division explicitly:
from __future__ import division
or e.g.
sum(x * y) * 1. / sum(y)
There is another way of performing the fit, which is by using the 'lmfit' package. It basically uses the cuve_fit but is much better in fitting and offers complex fitting as well. Detailed step by step instructions are given in the below link. http://cars9.uchicago.edu/software/python/lmfit/model.html#model.best_fit