Question
I want to fit a line to a scatter plot using polyfit, and I also want the residual so I can judge how good the fit is. I'm not sure how to get it: as far as I can tell, polyfit doesn't return the residual as an output value, since the fit is one-dimensional. My code:
p = np.polyfit(lengths, breadths, 1)
m = p[0]
b = p[1]
yfit = np.polyval(p,lengths)
newlengths = []
for y in lengths:
    newlengths.append(y*m+b)
ax.plot(lengths, newlengths, '-', color="#2c3e50")
I saw a Stack Overflow answer that used polyval, but I'm not sure what it gives me. Are those the fitted values at each of the lengths? Should I find the error by taking the difference between each value from polyval and the corresponding element of breadths?
Answer 1:
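You can pass the keyword full=True when calling polyfit (see http://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html) to get the least-squares error of your fit, i.e. the sum of squared residuals. With full=True, polyfit returns five values, and the second one holds that sum (as a one-element array for 1-D data):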
coefs, residual, _, _, _ = np.polyfit(lengths, breadths, 1, full=True)
You can get the same answer by doing:
coefs = np.polyfit(lengths, breadths, 1)
yfit = np.polyval(coefs,lengths)
residual = np.sum((breadths-yfit)**2)
or, because the residuals of a least-squares line with an intercept have zero mean:
residual = np.std(breadths-yfit)**2 * len(breadths)
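As a rough check of the equivalence above, the sketch below runs all three computations on made-up data. The sample arrays and the sse_* variable names are assumptions for illustration, not the asker's measurements.

```python
import numpy as np

# Hypothetical sample data standing in for the question's lengths/breadths.
lengths = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
breadths = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# full=True returns (coefficients, residuals, rank, singular_values, rcond).
coefs, residuals, rank, _, _ = np.polyfit(lengths, breadths, 1, full=True)
sse_from_polyfit = residuals[0]  # one-element array for 1-D y data

# The same quantity computed by hand from the fitted values.
yfit = np.polyval(coefs, lengths)
sse_manual = np.sum((breadths - yfit) ** 2)

# Variance form: np.std uses ddof=0 by default, and the residuals of a
# fit that includes an intercept have zero mean, so this matches too.
sse_from_std = np.std(breadths - yfit) ** 2 * len(breadths)

print(sse_from_polyfit, sse_manual, sse_from_std)
```

The three printed numbers should agree up to floating-point rounding. Note that polyfit returns an empty residuals array when the fit is rank-deficient or there are no more data points than coefficients, so the residuals[0] lookup assumes more than two distinct points here.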
Additionally, if you want to plot the residuals, you can do:
coefs = np.polyfit(lengths, breadths, 1)
yfit = np.polyval(coefs,lengths)
plt.plot(lengths, breadths-yfit)  # assumes: import matplotlib.pyplot as plt
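For a version of the residual plot that runs on its own, here is a minimal sketch. The sample data, the Agg backend, the axis labels, and the output file name residuals.png are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample data standing in for the question's lengths/breadths.
lengths = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
breadths = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit a straight line and take the per-point residuals.
coefs = np.polyfit(lengths, breadths, 1)
residuals = breadths - np.polyval(coefs, lengths)

fig, ax = plt.subplots()
ax.plot(lengths, residuals, "o")            # one marker per data point
ax.axhline(0.0, color="gray", linewidth=1)  # zero line for reference
ax.set_xlabel("length")
ax.set_ylabel("breadth - fitted breadth")
fig.savefig("residuals.png")
```

Points scattered evenly around the zero line suggest the straight line is a reasonable model, while a visible curve or trend in the residuals suggests it is not.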
Source: https://stackoverflow.com/questions/29632733/how-to-get-the-sum-of-least-squares-error-from-polyfit-in-one-dimension-python