python sklearn multiple linear regression display r-squared

礼貌的吻别 2021-01-30 14:59

I fitted a multiple linear regression model and I want to see the adjusted R-squared. I know that the score function gives me R-squared, but it is not adjusted.

2 Answers
  • 2021-01-30 15:33
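    from sklearn.linear_model import LinearRegression

    # score() returns the plain (unadjusted) R^2 on the data passed to it
    regressor = LinearRegression(fit_intercept=False)
    regressor.fit(x_train, y_train)
    print(f'r_sqr value: {regressor.score(x_train, y_train)}')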
    
  • 2021-01-30 15:52

    There are several ways to compute R^2 and the adjusted R^2. Here are a few of them (computed with the data you provided):

    from sklearn.linear_model import LinearRegression

    # df is the DataFrame from the question
    model = LinearRegression()
    X, y = df[['NumberofEmployees', 'ValueofContract']], df.AverageNumberofTickets
    model.fit(X, y)
    

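    The computation relies on the decomposition SST = SSR + SSE: the total sum of squares equals the regression (explained) sum of squares plus the error (residual) sum of squares.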
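
    Written out, the quantities involved look like this. The symbols are my own shorthand rather than anything from the question: SSE is the residual sum of squares (SS_Residual in the code), SST is the total sum of squares (SS_Total), n is the number of observations, and p is the number of predictors, not counting the intercept.

```latex
\[
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
SST = \sum_{i=1}^{n} (y_i - \bar{y})^2
\]
\[
R^2 = 1 - \frac{SSE}{SST}, \qquad
R^2_{\mathrm{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
\]
```

    The factor (n - 1)/(n - p - 1) is at least 1, so the adjusted value never exceeds the plain R^2 for a model fitted with an intercept.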

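    # compute with formulas from the theory
    import numpy as np

    yhat = model.predict(X)
    SS_Residual = sum((y - yhat)**2)       # sum of squared residuals
    SS_Total = sum((y - np.mean(y))**2)    # total sum of squares around the mean
    r_squared = 1 - float(SS_Residual) / SS_Total
    # len(y) is the number of observations, X.shape[1] the number of predictors
    adjusted_r_squared = 1 - (1 - r_squared) * (len(y) - 1) / (len(y) - X.shape[1] - 1)
    print(r_squared, adjusted_r_squared)
    # 0.877643371323 0.863248473832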
    
    # compute with sklearn's score(); I could not find a function in the
    # documentation that returns the adjusted R^2 directly
    print(model.score(X, y), 1 - (1 - model.score(X, y)) * (len(y) - 1) / (len(y) - X.shape[1] - 1))
    # 0.877643371323 0.863248473832
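
    Since the same expression appears twice above, it may be handy to wrap it in a small helper. This is only a sketch: the function name `adjusted_r_squared` and its parameters are hypothetical, not part of sklearn, and it assumes the model was fitted with an intercept. The sample size of 20 in the usage line is not stated in the question; it is inferred from the two numbers reported above with 2 predictors.

```python
def adjusted_r_squared(r_squared, n_samples, n_features):
    """Adjusted R^2 from plain R^2 (hypothetical helper, not an sklearn function).

    n_samples  -- number of observations (len(y))
    n_features -- number of predictors, excluding the intercept (X.shape[1])
    """
    dof = n_samples - n_features - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / dof


# With a fitted sklearn model this would be called as:
#   adjusted_r_squared(model.score(X, y), len(y), X.shape[1])
# Here the reported R^2 is plugged in, assuming 20 observations and 2 predictors.
print(adjusted_r_squared(0.877643371323, 20, 2))
```

    Under those assumptions the result agrees with the adjusted value reported above to the digits shown.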
    

    Another way:

    # compute with statsmodels, adding the intercept column manually
    import statsmodels.api as sm
    X1 = sm.add_constant(X)
    result = sm.OLS(y, X1).fit()
    # print(dir(result))
    print(result.rsquared, result.rsquared_adj)
    # 0.877643371323 0.863248473832
    

    Yet another way:

    # compute with statsmodels using the formula interface
    # (the intercept is included automatically)
    import statsmodels.formula.api as smf
    result = smf.ols(formula="AverageNumberofTickets ~ NumberofEmployees + ValueofContract", data=df).fit()
    # print(result.summary())
    print(result.rsquared, result.rsquared_adj)
    # 0.877643371323 0.863248473832
    