When using curve_fit from scipy.optimize to fit some data in Python, one first defines the fitting function (e.g. a 2nd order polynomial) as follows:
def f
Here's an example of fitting a curve defined in terms of an integral. The curve is the integral of sin(t*w)/t + p over t from 0 to pi. Our x data points correspond to w, and we're adjusting the p parameter to get the data to fit.
import math, numpy, scipy.optimize, scipy.integrate

def integrand(t, w, p):
    # Function integrated over t; w is the x value, p is the fit parameter.
    return math.sin(t * w)/t + p

def curve(w, p):
    # quad returns (value, estimated absolute error); keep only the value.
    res = scipy.integrate.quad(integrand, 0.0, math.pi, args=(w, p))
    return res[0]

# Vectorize over w only; p (positional argument 1) is passed through unchanged.
vcurve = numpy.vectorize(curve, excluded={1})

truexdata = numpy.asarray([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
trueydata = vcurve(truexdata, 1.0)

xdata = truexdata + 0.1 * numpy.random.randn(8)
ydata = trueydata + 0.1 * numpy.random.randn(8)

popt, pcov = scipy.optimize.curve_fit(vcurve,
                                      xdata, ydata,
                                      p0=[2.0])
print(popt)
That'll print out something fairly close to 1.0, which is what we used as p when we created trueydata.
Note that we use numpy.vectorize on the curve function to produce a vectorized version compatible with scipy.optimize.curve_fit.
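For this particular integrand the integral can also be written in closed form: the integral of sin(t*w)/t over t from 0 to pi is the sine integral Si(pi*w), and the constant p contributes pi*p. The sketch below is an addition, not part of the answer's code. It uses scipy.special.sici to check a quad-based curve against that closed form, then fits with the closed form directly, which works on arrays and so needs no numpy.vectorize. The names curve_quad and curve_closed, the fixed random seed, and the use of numpy.sinc to keep the integrand finite at t = 0 are choices made for this illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import curve_fit
from scipy.special import sici

def curve_quad(w, p):
    """Integral of sin(t*w)/t + p over t in [0, pi], evaluated numerically."""
    # w * sinc(w*t/pi) equals sin(t*w)/t and stays finite at t = 0.
    value, _abserr = quad(lambda t: w * np.sinc(w * t / np.pi) + p, 0.0, np.pi)
    return value

def curve_closed(w, p):
    """Same integral in closed form: Si(pi*w) + pi*p (accepts arrays)."""
    si, _ci = sici(np.pi * np.asarray(w, dtype=float))
    return si + np.pi * p

w = np.linspace(0.0, 7.0, 8)
numeric = np.array([curve_quad(wi, 1.0) for wi in w])
closed = curve_closed(w, 1.0)
print(np.max(np.abs(numeric - closed)))  # size of the disagreement

# Fit the closed-form model to noisy synthetic data generated with p = 1.0.
rng = np.random.default_rng(0)
ydata = closed + 0.1 * rng.standard_normal(w.size)
popt, pcov = curve_fit(curve_closed, w, ydata, p0=[2.0])
perr = np.sqrt(np.diag(pcov))  # one-sigma uncertainty estimate for p
print(popt, perr)
```

When a closed form like this exists it is usually cheaper than calling quad once per data point on every iteration of the fit; when it does not, the vectorized quad approach remains the general fallback.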
Sometimes you're lucky and can evaluate the integral analytically. In the following example, the product of h(t)=exp(-(t-x)**2/2) and a second-degree polynomial g(t) is integrated from 0 to infinity. Sympy is used to evaluate the integral and generate a function usable by curve_fit():
import sympy as sy
sy.init_printing()  # LaTeX-like pretty printing in IPython

t, x = sy.symbols("t, x", real=True)
h = sy.exp(-(t-x)**2/2)

a0, a1, a2 = sy.symbols('a:3', real=True)  # unknown coefficients
g = a0 + a1*t + a2*t**2

gh = (g*h).simplify()  # the integrand
G = sy.integrate(gh, (t, 0, sy.oo)).simplify()  # integrate from 0 to infinity

# Generate a numeric function usable by curve_fit().
# t has been integrated out, so G depends only on x and the coefficients.
G_opt = sy.lambdify((x, a0, a1, a2), G)

print(G_opt(1, 2, 3, 4))  # example usage: x=1, a0=2, a1=3, a2=4
Note that in general the problem is often ill-posed, since the integral does not necessarily converge in a large enough neighborhood of the solution (which curve_fit() assumes).
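To show how such an analytic result plugs into a fit, here is a minimal sketch that does not depend on sympy. curve_fit calls the model as f(x, a0, a1, a2), with the independent variable first and the parameters after it. The function G_model below is a hand-derived closed form of the same integral, written for this illustration rather than taken from sympy's output, and it is checked against numerical quadrature (G_quad) before being used. The data grid, the true parameter values (2, 3, 4), the noise level, and the random seed are all assumptions made for the example.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import curve_fit
from scipy.special import erf

def G_model(x, a0, a1, a2):
    """Hand-derived closed form of the integral over t from 0 to infinity of
    (a0 + a1*t + a2*t**2) * exp(-(t - x)**2 / 2)."""
    x = np.asarray(x, dtype=float)
    gauss = np.exp(-x**2 / 2)
    i0 = np.sqrt(np.pi / 2) * (1 + erf(x / np.sqrt(2)))  # integral of h(t) alone
    return i0 * (a0 + a1 * x + a2 * (1 + x**2)) + gauss * (a1 + a2 * x)

def G_quad(x, a0, a1, a2):
    """Same integral by numerical quadrature, for one scalar x."""
    f = lambda t: (a0 + a1 * t + a2 * t**2) * np.exp(-(t - x)**2 / 2)
    return quad(f, 0.0, np.inf)[0]

true_params = (2.0, 3.0, 4.0)
xdata = np.linspace(-2.0, 4.0, 30)
exact = G_model(xdata, *true_params)
check = np.array([G_quad(xi, *true_params) for xi in xdata])
print(np.max(np.abs(exact - check)))  # closed form vs. quadrature

# Fit the three coefficients to lightly noisy synthetic data.
rng = np.random.default_rng(0)
ydata = exact + 0.01 * rng.standard_normal(xdata.size)
popt, pcov = curve_fit(G_model, xdata, ydata, p0=[1.0, 1.0, 1.0])
print(popt)
```

Because this model is linear in a0, a1 and a2, the fit is well behaved here; a model whose integral converges only for some parameter values would need bounds or a careful starting guess.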