Consider the following MWE (minimal working example):
import numpy as np
from scipy.optimize import curve_fit

X = np.arange(1, 10, 1)
Y = abs(X + np.random.randn(15, 9))

def linear(x, a, b):
    return (x / b)**a    # same model as fitted below
Least squares on the log-transformed data won't give exactly the same result, because taking logs also transforms the noise: nonlinear least squares minimises the squared residuals in Y, while the log-space fit minimises them in log(Y), which weights the points differently. If the noise is zero, both methods give the same result.
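As a quick illustration of the noise-free case, here is a minimal sketch (assuming the same (x/b)**a model as in the example below); both routes should recover a and b essentially exactly:

import numpy as np
from scipy.optimize import curve_fit

x = np.arange(1.0, 7.0)
a_true, b_true = 2.0, 3.0
y = (x / b_true)**a_true                  # noise-free data

# Nonlinear least squares on the original data
popt, _ = curve_fit(lambda x, a, b: (x / b)**a, x, y)

# Ordinary least squares on the log-transformed data:
# slope = a, and the divisor follows from intercept = -a*log(b)
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
print(popt)                               # ~ [2., 3.]
print(slope, np.exp(-intercept / slope))  # ~ 2.0 3.0

The example below adds a little noise and fits several datasets, first with curve_fit and then via the log transform: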
import numpy as np
from numpy import random as rng
from scipy.optimize import curve_fit

rng.seed(0)
X = np.arange(1, 7)

# Four synthetic datasets following (X/b)**a with a little noise
Y = np.zeros((4, 6))
for i in range(4):
    b = a = i + 1
    Y[i] = (X / b)**a + 0.01 * rng.randn(6)

def linear(x, a, b):
    return (x / b)**a

# Fit each row of Y separately with nonlinear least squares
coeffs = []
for ix in range(Y.shape[0]):
    print(ix)
    c0, pcov = curve_fit(linear, X, Y[ix])
    coeffs.append(c0)
coeffs is
[array([ 0.99309127, 0.98742861]),
array([ 2.00197613, 2.00082722]),
array([ 2.99130237, 2.99390585]),
array([ 3.99644048, 3.9992937 ])]
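For easier side-by-side comparison with the vectorised fit below, these per-row results can be stacked into one array (cf is just a name introduced here, not from the original code):

cf = np.array(coeffs)   # shape (4, 2): column 0 holds the exponents a, column 1 the divisors b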
I'll use scikit-learn's implementation of linear regression, since I believe it scales well when fitting many datasets at once.
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
Take logs of X and Y. Since Y = (X/b)**a, we have log(Y) = a*log(X) - a*log(b), which is linear in log(X).
lX = np.log(X)[None, :]
lY = np.log(Y)
Now fit and check that the coefficients match the ones found before.
lr.fit(lX.T, lY.T)
lr.coef_.ravel()
This gives exponents similar to those from curve_fit:
array([ 0.98613517, 1.98643974, 2.96602892, 4.01718514])
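For reference, the same log-space fit works without scikit-learn: np.polyfit accepts several datasets at once when y has one column per dataset, so a sketch like the following should reproduce the exponents above.

p = np.polyfit(np.log(X), lY.T, 1)   # lY.T has shape (6, 4): one column per dataset
p[0]                                 # slopes, i.e. the exponents a; p[1] holds the intercepts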
Now check the divisor. From the relation above, the intercept equals -a*log(b), so b = exp(-intercept / a):
np.exp(-lr.intercept_ / lr.coef_.ravel())
This gives similar divisors, though here you can see the two methods diverging somewhat in their answers.
array([ 0.99199406, 1.98234916, 2.90677142, 3.73416501])
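To make the comparison concrete, the two sets of estimates can be printed side by side (using the cf array stacked from the curve_fit results earlier):

# columns: exponent (curve_fit), exponent (log fit), divisor (curve_fit), divisor (log fit)
print(np.c_[cf[:, 0], lr.coef_.ravel(), cf[:, 1], np.exp(-lr.intercept_ / lr.coef_.ravel())])

The exponents agree closely, while the divisors drift apart a bit more, which is exactly the effect of the log transform on the noise mentioned at the start.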