r in stats.linregress compared to r-squared in statsmodels

Submitted by 社会主义新天地 on 2019-12-24 09:26:08

Question


I'm working on a program to investigate the correlation between magnitude and redshift for some quasars. I compute the statistics of the data with two libraries: statsmodels for r-squared (among other parameters) and scipy.stats.linregress for r (among others).

Some example output is:

W1 r-squared: 0.855715
W1 r-value  : 0.414026
W2 r-squared: 0.861169
W2 r-value  : 0.517381
W3 r-squared: 0.874051
W3 r-value  : 0.418523
W4 r-squared: 0.856747
W4 r-value  : 0.294094
Visual minus WISE r-squared: 0.87366
Visual minus WISE r-value  : -0.521463

My question is: why do the r and r-squared values not match (e.g. for the W1 band, 0.414026**2 != 0.855715)?

The code for my computation function is as follows:

def computeStats(x, y, yName):
    import numpy as np
    from scipy import stats
    import statsmodels.api as sm

    #   Compute model parameters
    model = sm.OLS(y, x, missing='drop')
    results = model.fit()
    #   Mask NaN values in both axes
    mask = ~np.isnan(y) & ~np.isnan(x)
    #   Compute fit parameters
    params = stats.linregress(x[mask], y[mask])
    fit = params[0]*x + params[1]
    #   Backslashes are doubled so that \pm and \times reach the LaTeX renderer intact
    fitEquation = '$(%s)=(%.4g \\pm %.4g) \\times redshift+%.4g$'%(yName,
                params[0],  #   slope
                params[4],  #   stderr in slope
                params[1])  #   y-intercept

    print('%s r-squared: %g'%(yName, results.rsquared))
    print('%s r-value  : %g'%(yName, params[2]))

    return results, params, fit, fitEquation

Am I interpreting the statistics incorrectly? Or do the two modules compute the regressions using different methods?


Answer 1:


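By default, OLS in statsmodels does not include the constant term (i.e. the intercept) in the linear equation. (The constant term corresponds to a column of ones in the design matrix.) Without it, the line is forced through the origin, and statsmodels reports the uncentered r-squared, 1 - SSR/sum(y**2), instead of the centered one, 1 - SSR/sum((y - mean(y))**2). Only the centered value from a fit with an intercept equals the square of the r returned by linregress.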
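To see why the two numbers can differ so much, here is a minimal NumPy-only sketch. The synthetic data, the seed, and all variable names are assumptions made up for illustration; they are not the asker's quasar data. It fits one line through the origin and one with an intercept, then compares each r-squared with the squared Pearson r.

```python
import numpy as np

# Hypothetical data: y sits far from zero and depends only weakly on x.
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 3.0, size=200)
y = 15.0 + 0.4 * x + rng.normal(0.0, 0.5, size=200)

# Fit through the origin (no constant term): y ~ b * x
b = np.sum(x * y) / np.sum(x * x)
ssr_origin = np.sum((y - b * x) ** 2)
r2_uncentered = 1.0 - ssr_origin / np.sum(y ** 2)

# Fit with an intercept: y ~ slope * x + intercept
slope, intercept = np.polyfit(x, y, 1)
ssr_intercept = np.sum((y - (slope * x + intercept)) ** 2)
r2_centered = 1.0 - ssr_intercept / np.sum((y - y.mean()) ** 2)

# Pearson correlation coefficient, the quantity linregress reports as r
r = np.corrcoef(x, y)[0, 1]

print("uncentered r-squared (no intercept):", r2_uncentered)
print("centered r-squared (with intercept):", r2_centered)
print("r squared                          :", r ** 2)
```

In this sketch the centered r-squared agrees with r squared, while the uncentered one is much larger, because sum(y**2) is dominated by the large mean of y rather than by its scatter.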

To match linregress, create the model like this:

    model = sm.OLS(y, sm.add_constant(x), missing='drop')


Source: https://stackoverflow.com/questions/51738734/r-in-stats-linregress-compared-to-r-squared-in-statsmodels
