I have been doing some fitting in Python using numpy (which uses least squares).
I was wondering if there was a way of getting it to fit data while forcing it throug
One simple and straightforward way is to utilize constrained least squares where constraints are weighted with a largish number M, like:
from numpy import dot
from numpy.linalg import solve
from numpy.polynomial.polynomial import Polynomial as P, polyvander as V

def clsq(A, b, C, d, M= 1e5):
    """A simple constrained least squares solution of Ax= b, s.t. Cx= d,
    based on the idea of weighting constraints with a largish number M."""
    return solve(dot(A.T, A)+ M* dot(C.T, C), dot(A.T, b)+ M* dot(C.T, d))

def cpf(x, y, x_c, y_c, n, M= 1e5):
    """Constrained polynomial fit based on clsq solution."""
    return P(clsq(V(x, n), y, V(x_c, n), y_c, M))
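Written out, the weighting idea in clsq appears to be a quadratic penalty: the constraint residual is added to the least-squares objective with weight M, and setting the gradient to zero gives the linear system passed to solve. The symbols follow the function arguments; the derivation is a sketch added for illustration:

```latex
\min_x \; \|Ax - b\|_2^2 + M\,\|Cx - d\|_2^2
\quad\Longrightarrow\quad
\left(A^{\mathsf T}A + M\,C^{\mathsf T}C\right)x
  = A^{\mathsf T}b + M\,C^{\mathsf T}d
```

For finite M the constraints are generally met only approximately, and a very large M can make the system poorly conditioned.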
This weighting approach is obviously not an all-inclusive silver bullet, but it seems to work reasonably well on a simple example (for M in [0, 4, 24, 124, 624, 3124]):
In []: from pylab import *  # the session relies on pylab-style names (linspace, rand, plot, ...)
In []: x= linspace(-6, 6, 23)
In []: y= sin(x)+ 4e-1* rand(len(x))- 2e-1
In []: x_f, y_f= linspace(-(3./ 2)* pi, (3./ 2)* pi, 4), array([1, -1, 1, -1])
In []: n, x_s= 5, linspace(-6, 6, 123)
In []: plot(x, y, 'bo', x_f, y_f, 'bs', x_s, sin(x_s), 'b--')
Out[]: <snip>
In []: for M in 5** (arange(6))- 1:
  ....:     plot(x_s, cpf(x, y, x_f, y_f, n, M)(x_s))
  ....:
Out[]: <snip>
In []: ylim([-1.5, 1.5])
Out[]: <snip>
In []: show()
and producing a plot with the data, the fixed points, sin(x), and one fitted curve for each M.
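The effect of M can also be checked numerically. The sketch below repeats the experiment with a fixed random seed and prints how far the fitted polynomial is from the fixed points for each M. The helper name clsq_penalty and the seed are assumptions made for this illustration; the formula is the same as in clsq above.

```python
import numpy as np
from numpy.polynomial.polynomial import polyvander

def clsq_penalty(A, b, C, d, M):
    """Penalty-weighted least squares (hypothetical helper, same formula as clsq)."""
    return np.linalg.solve(A.T @ A + M * (C.T @ C), A.T @ b + M * (C.T @ d))

rng = np.random.default_rng(0)  # fixed seed, assumed here for repeatability
x = np.linspace(-6, 6, 23)
y = np.sin(x) + 0.4 * rng.random(x.size) - 0.2
x_f = np.linspace(-1.5 * np.pi, 1.5 * np.pi, 4)
y_f = np.array([1.0, -1.0, 1.0, -1.0])
n = 5

A = polyvander(x, n)    # design matrix for the data points
C = polyvander(x_f, n)  # design matrix for the fixed points

weights = (5 ** np.arange(6) - 1).tolist()  # [0, 4, 24, 124, 624, 3124]
misfit = []
for M in weights:
    coef = clsq_penalty(A, y, C, y_f, M)
    # Euclidean norm of the constraint residual C x - d
    misfit.append(np.linalg.norm(C @ coef - y_f))
    print(f"M = {M:5d}   ||p(x_f) - y_f|| = {misfit[-1]:.3e}")
```

The residual shrinks as M grows but does not reach zero, which is why the fit passes near the fixed points rather than exactly through them.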
Edit: Added 'exact' solution:
from numpy import dot
from numpy.linalg import solve
from numpy.polynomial.polynomial import Polynomial as P, polyvander as V
from scipy.linalg import qr

def solve_ns(A, b): return solve(dot(A.T, A), dot(A.T, b))

def clsq(A, b, C, d):
    """An 'exact' constrained least squares solution of Ax= b, s.t. Cx= d"""
    p= C.shape[0]
    # full QR of C.T: the first p columns of Q span the constrained directions
    Q, R= qr(C.T)
    xr, AQ= solve(R[:p].T, d), dot(A, Q)
    # ordinary least squares in the remaining (unconstrained) directions
    xaq= solve_ns(AQ[:, p:], b- dot(AQ[:, :p], xr))
    return dot(Q[:, :p], xr)+ dot(Q[:, p:], xaq)

def cpf(x, y, x_c, y_c, n):
    """Constrained polynomial fit based on clsq solution."""
    return P(clsq(V(x, n), y, V(x_c, n), y_c))
and testing the fit:
In []: x= linspace(-6, 6, 23)
In []: y= sin(x)+ 4e-1* rand(len(x))- 2e-1
In []: x_f, y_f= linspace(-(3./ 2)* pi, (3./ 2)* pi, 4), array([1, -1, 1, -1])
In []: n, x_s= 5, linspace(-6, 6, 123)
In []: p= cpf(x, y, x_f, y_f, n)
In []: p(x_f)
Out[]: array([ 1., -1., 1., -1.])
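Reading the code, the 'exact' clsq looks like a null-space method built on a QR factorization of the transposed constraint matrix. The following is a sketch of that reading, assuming C has full row rank p; the symbols Q_1, Q_2, R_1, u and v are notation introduced here, with u and v corresponding to xr and xaq in the code:

```latex
\begin{align}
C^{\mathsf T} &= Q R
  = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix}
    \begin{bmatrix} R_1 \\ 0 \end{bmatrix},
  \qquad x = Q_1 u + Q_2 v \\
C x = d &\;\Longrightarrow\; R_1^{\mathsf T} u = d \\
v &= \arg\min_{v} \bigl\| A Q_2 v - (b - A Q_1 u) \bigr\|_2
\end{align}
```

Because Q_2 spans the null space of C, any choice of v keeps the constraints satisfied, so the remaining freedom is spent on an ordinary least-squares fit.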
The mathematically correct way of doing a fit with fixed points is to use Lagrange multipliers. Basically, you modify the objective function you want to minimize, which is normally the sum of squares of the residuals, adding an extra parameter for every fixed point. I have not succeeded in feeding a modified objective function to one of scipy's minimizers. But for a polynomial fit, you can figure out the details with pen and paper and convert your problem into the solution of a linear system of equations:
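import numpy as np

def polyfit_with_fixed_points(n, x, y, xf, yf):
    # Unknowns: n + 1 polynomial coefficients plus one multiplier per fixed point
    mat = np.empty((n + 1 + len(xf),) * 2)
    vec = np.empty((n + 1 + len(xf),))
    # Power sums of x up to degree 2n, and the moments sum(y * x**j)
    x_n = x**np.arange(2 * n + 1)[:, None]
    yx_n = np.sum(x_n[:n + 1] * y, axis=1)
    x_n = np.sum(x_n, axis=1)
    # Normal-equation block: entry (j, m) is sum(x**(j + m))
    idx = np.arange(n + 1) + np.arange(n + 1)[:, None]
    mat[:n + 1, :n + 1] = np.take(x_n, idx)
    # Constraint blocks built from powers of the fixed abscissas
    xf_n = xf**np.arange(n + 1)[:, None]
    mat[:n + 1, n + 1:] = xf_n / 2
    mat[n + 1:, :n + 1] = xf_n.T
    mat[n + 1:, n + 1:] = 0
    vec[:n + 1] = yx_n
    vec[n + 1:] = yf
    params = np.linalg.solve(mat, vec)
    # Drop the multipliers; coefficients are in increasing-degree order
    return params[:n + 1]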
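The pen-and-paper step can be sketched as follows. The symbols a_m (polynomial coefficients), lambda_k (multipliers), N (number of data points) and K (number of fixed points, with coordinates x^f_k, y^f_k) are notation introduced here. Setting the derivatives of the Lagrangian to zero gives the blocks that the function fills into mat and vec:

```latex
\begin{align}
L(a, \lambda) &= \sum_{i=1}^{N} \Bigl( \sum_{m=0}^{n} a_m x_i^{m} - y_i \Bigr)^{2}
  + \sum_{k=1}^{K} \lambda_k \Bigl( \sum_{m=0}^{n} a_m (x^f_k)^{m} - y^f_k \Bigr) \\
\tfrac{1}{2}\,\frac{\partial L}{\partial a_j} = 0 &\;\Longrightarrow\;
  \sum_{m=0}^{n} a_m \sum_{i=1}^{N} x_i^{\,j+m}
  + \tfrac{1}{2} \sum_{k=1}^{K} \lambda_k (x^f_k)^{j}
  = \sum_{i=1}^{N} y_i\, x_i^{\,j},
  \qquad j = 0, \dots, n \\
\frac{\partial L}{\partial \lambda_k} = 0 &\;\Longrightarrow\;
  \sum_{m=0}^{n} a_m (x^f_k)^{m} = y^f_k,
  \qquad k = 1, \dots, K
\end{align}
```

The factor 1/2 on the multiplier term is where xf_n / 2 in the code comes from.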
To test that it works, try the following, where n is the number of points, d the degree of the polynomial and f the number of fixed points:
import numpy as np
import matplotlib.pyplot as plt

n, d, f = 50, 8, 3
x = np.random.rand(n)
xf = np.random.rand(f)
poly = np.polynomial.Polynomial(np.random.rand(d + 1))
y = poly(x) + np.random.rand(n) - 0.5
yf = np.random.uniform(np.min(y), np.max(y), size=(f,))
params = polyfit_with_fixed_points(d, x , y, xf, yf)
poly = np.polynomial.Polynomial(params)
xx = np.linspace(0, 1, 1000)
plt.plot(x, y, 'bo')
plt.plot(xf, yf, 'ro')
plt.plot(xx, poly(xx), '-')
plt.show()
And of course the fitted polynomial goes exactly through the points:
>>> yf
array([ 1.03101335, 2.94879161, 2.87288739])
>>> poly(xf)
array([ 1.03101335, 2.94879161, 2.87288739])
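For comparison, one way a general-purpose minimizer might be given the same kind of constraint is scipy.optimize.minimize with method="SLSQP" and an equality constraint, instead of a hand-modified objective. This is only a sketch under assumed data (a low degree and two fixed points chosen for illustration), not the approach used in the function above:

```python
import numpy as np
from numpy.polynomial.polynomial import polyvander
from scipy.optimize import minimize

# Illustrative data (assumed for this sketch, not from the original post)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)
xf = np.array([0.0, 1.0])  # abscissas of the fixed points
yf = np.array([1.0, 0.5])  # values the polynomial must take there
deg = 3

A = polyvander(x, deg)   # design matrix, increasing powers of x
C = polyvander(xf, deg)  # constraint matrix, increasing powers of xf

def sum_sq(c):
    """Sum of squared residuals of the polynomial with coefficients c."""
    r = A @ c - y
    return r @ r

def sum_sq_grad(c):
    """Gradient of sum_sq with respect to the coefficients."""
    return 2.0 * A.T @ (A @ c - y)

res = minimize(
    sum_sq,
    x0=np.zeros(deg + 1),
    jac=sum_sq_grad,
    method="SLSQP",
    constraints={"type": "eq", "fun": lambda c: C @ c - yf, "jac": lambda c: C},
)
coef = res.x  # coefficients in increasing-degree order
print(res.success, C @ coef - yf)
```

For a plain polynomial the linear system is simpler and more direct; an iterative solver like this mainly becomes interesting when the model is not linear in its parameters.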
If you use curve_fit(), you can use the sigma argument to give every point a weight. The following example gives the first, middle, and last points a very small sigma, so the fitted curve passes very close to these three points:
import numpy as np
from scipy import optimize
from matplotlib.pyplot import plot, legend

N = 20
x = np.linspace(0, 2, N)
np.random.seed(1)
noise = np.random.randn(N)*0.2
sigma = np.ones(N)
sigma[[0, N//2, -1]] = 0.01  # small sigma = large weight
pr = (-2, 3, 0, 1)
y = 1+3.0*x**2-2*x**3+0.3*x**4 + noise

def f(x, *p):
    return np.poly1d(p)(x)

p1, _ = optimize.curve_fit(f, x, y, (0, 0, 0, 0, 0), sigma=sigma)
p2, _ = optimize.curve_fit(f, x, y, (0, 0, 0, 0, 0))
x2 = np.linspace(0, 2, 100)
y2 = np.poly1d(pr)(x2)  # was np.poly1d(p), but p is undefined here; y2 is not plotted
plot(x, y, "o")
plot(x2, f(x2, *p1), "r", label=u"fix three points")
plot(x2, f(x2, *p2), "b", label=u"no fix")
legend(loc="best")
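To put a number on "very close", the sketch below repeats the fit without plotting and compares the largest residual at the three weighted points for the weighted and unweighted fits. The names pinned, gap_w and gap_u are introduced here for illustration only.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.RandomState(1)  # fixed seed so the comparison is repeatable
N = 20
x = np.linspace(0, 2, N)
y = 1 + 3.0 * x**2 - 2 * x**3 + 0.3 * x**4 + rng.randn(N) * 0.2

pinned = [0, N // 2, N - 1]  # first, middle and last point
sigma = np.ones(N)
sigma[pinned] = 0.01         # small sigma = large weight

def f(x, *p):
    """Polynomial with coefficients p in decreasing-degree order."""
    return np.poly1d(p)(x)

p0 = np.zeros(5)
p_w, _ = curve_fit(f, x, y, p0, sigma=sigma)  # weighted fit
p_u, _ = curve_fit(f, x, y, p0)               # unweighted fit

# Largest absolute residual at the three heavily weighted points
gap_w = np.abs(f(x[pinned], *p_w) - y[pinned]).max()
gap_u = np.abs(f(x[pinned], *p_u) - y[pinned]).max()
print("weighted:", gap_w, "unweighted:", gap_u)
```

Note that sigma only pulls the curve toward those points; unlike the constrained solutions, it does not force the fit to pass through them exactly.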