Question
I have fitted a 3D data set using the scipy.linalg.lstsq function.
I was using:
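import numpy as np
import scipy.linalg
# best-fit quadratic surface; data is an (n, 3) array of x, y, z points
A = np.c_[np.ones(data.shape[0]), data[:,:2], np.prod(data[:,:2], axis=1), data[:,:2]**2]
C,_,_,_ = scipy.linalg.lstsq(A, data[:,2])
# evaluate on a grid; XX, YY are assumed to be the flattened meshgrid arrays X, Y
Z = np.dot(np.c_[np.ones(XX.shape), XX, YY, XX*YY, XX**2, YY**2], C).reshape(X.shape)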
But how can I get the R^2 value of the fitted surface from this?
Is there any way I can check the significance of the fitting result?
Any ideas would be highly appreciated.
Thank you.
Answer 1:
Following http://en.wikipedia.org/wiki/Coefficient_of_determination:
B = data[:,2]
SStot = ((B - B.mean())**2).sum()
SSres = ((B - np.dot(A,C))**2).sum()
R2 = 1 - SSres / SStot
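The same calculation can be sketched end to end on synthetic data, so the number can be sanity-checked. The synthetic surface, noise level, random seed, and the helper name r_squared are assumptions made for illustration; they are not part of the question. The adjusted R^2 at the end is an addition that addresses one of the known shortcomings: plain R^2 never decreases when more terms are added.

```python
import numpy as np
import scipy.linalg


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SSres / SStot."""
    ss_res = ((observed - predicted) ** 2).sum()
    ss_tot = ((observed - observed.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


# Assumed synthetic data: 200 points on a noisy quadratic surface.
rng = np.random.default_rng(0)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
z = (1.0 + 2.0 * xy[:, 0] - xy[:, 1]
     + 0.5 * xy[:, 0] * xy[:, 1] + 3.0 * xy[:, 0] ** 2
     + rng.normal(scale=0.1, size=200))
data = np.c_[xy, z]

# Same design matrix as in the question: 1, x, y, x*y, x^2, y^2.
A = np.c_[np.ones(data.shape[0]), data[:, :2],
          np.prod(data[:, :2], axis=1), data[:, :2] ** 2]
C, _, _, _ = scipy.linalg.lstsq(A, data[:, 2])

R2 = r_squared(data[:, 2], A @ C)

# Adjusted R^2: n observations, p fitted coefficients (intercept included).
n, p = A.shape
R2_adj = 1.0 - (1.0 - R2) * (n - 1) / (n - p)

print(f"R^2 = {R2:.4f}, adjusted R^2 = {R2_adj:.4f}")
```

Because the data really do come from a quadratic surface with small noise, R^2 should come out close to 1 here; on real data a lower value is expected.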
As noted in the Wikipedia article, R2 has a lot of shortcomings. To the best of my knowledge, scipy/numpy compare poorly to a library like statsmodels when it comes to regression diagnostics.
If you want to run multivariate regressions and know what's going on in your data, you also need the ex-post estimated coefficient standard errors, t-stats, p-values and so on.
There are plenty of posts dedicated to running OLS with Python, so just pick one, e.g.: http://www.datarobot.com/blog/ordinary-least-squares-in-python/
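If installing another library is not an option, the standard errors, t-stats and p-values mentioned above can be sketched by hand from the textbook OLS formulas. This is a minimal sketch, assuming independent, homoscedastic, normally distributed errors and a full-rank design matrix; the function name ols_inference and the synthetic data are hypothetical and only serve the illustration.

```python
import numpy as np
import scipy.linalg
from scipy import stats


def ols_inference(A, b):
    """Return coefficients, standard errors, t-stats and two-sided p-values."""
    n, p = A.shape
    coef, _, _, _ = scipy.linalg.lstsq(A, b)
    resid = b - A @ coef
    dof = n - p                               # residual degrees of freedom
    sigma2 = (resid ** 2).sum() / dof         # unbiased estimate of noise variance
    cov = sigma2 * np.linalg.inv(A.T @ A)     # covariance matrix of the coefficients
    se = np.sqrt(np.diag(cov))
    t_stats = coef / se
    p_values = 2.0 * stats.t.sf(np.abs(t_stats), dof)
    return coef, se, t_stats, p_values


# Assumed synthetic data: z depends on x but not on y.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 300)
y = rng.uniform(-1.0, 1.0, 300)
z = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=300)

A = np.c_[np.ones_like(x), x, y]
coef, se, t_stats, p_values = ols_inference(A, z)

for name, c, s, t, pv in zip(["const", "x", "y"], coef, se, t_stats, p_values):
    print(f"{name:>5}: coef={c: .4f}  se={s:.4f}  t={t: .2f}  p={pv:.3g}")
```

A small p-value for a coefficient suggests that the corresponding term contributes to the fit; in this synthetic case the x term should be clearly significant, while the y term has no true effect.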
Source: https://stackoverflow.com/questions/30319891/get-the-r2-value-from-scipy-linalg-lstsq