python statsmodels linear regression

Submitted by 早过忘川 on 2019-12-24 06:23:14

Question


I am trying to build a linear regression model from pre-project data, and ultimately use it to calculate modeled data so that I can compare pre- and post-project data. Can anyone tell me what the best practice is? Otherwise I may be off in the weeds somewhere.

For starters:

import statsmodels.api as sm
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

ng = pd.read_csv('C:/Users/ngDataBaseline.csv',  thousands=',', index_col='Date', parse_dates=True)
ng.head()

This will output:

             HDD  Therm
Date
2011-10-01   386    498
2011-11-01   663   1810
2011-12-01   972   4263
2012-01-01  1131   5981
2012-02-01   977   6951

And from statsmodels to fit my model I am using:

import statsmodels.formula.api as smf

formula = 'Therm ~ HDD'
model = smf.ols(formula, data=ng)
results = model.fit()
results.summary()

inter = results.params['Intercept']
slope = results.params['HDD']
inter, slope

prints:

(-532.6244255918659, 6.331883644532255)

So now I think I can import the post-project data and calculate modeled data with the straight-line equation Y = mX + b, where m is the fitted slope and b is the fitted intercept:

ng_postproject = pd.read_csv('C:/Users/ng_postproject.csv',  thousands=',', index_col='Date', parse_dates=True)

ng_postproject.head()

And this will output:

             HDD  Therm
Date
2014-10-01   291    663
2014-11-01   545   1413
2014-12-01  1069   6754
2015-01-01  1134   7782
2015-02-01  1415  10285

This is what I am using to calculate a modeled Therm usage.

ng_postproject['Therm_modeled'] = ng_postproject['HDD'].apply(lambda x: x * slope + inter)


ng_postproject['Therm_modeled']

Date
2014-10-01    1309.953715
2014-11-01    2918.252161
2014-12-01    6236.159190
2015-01-01    6647.731627
2015-02-01    8426.990931
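
For reference, the same calculation can be written without apply/lambda, since pandas does the arithmetic on the whole column at once. The sketch below rebuilds the five post-project rows shown above by hand (an assumption, since the CSV itself is not available here) and uses the coefficients exactly as printed earlier. The column name Therm_diff is hypothetical and only illustrates one way to put modeled and actual usage side by side.

```python
import pandas as pd

# Coefficients exactly as printed in the question.
inter, slope = -532.6244255918659, 6.331883644532255

# Only the five post-project rows shown by head(); the real CSV may hold more.
ng_postproject = pd.DataFrame(
    {
        "HDD": [291, 545, 1069, 1134, 1415],
        "Therm": [663, 1413, 6754, 7782, 10285],
    },
    index=pd.to_datetime(
        ["2014-10-01", "2014-11-01", "2014-12-01", "2015-01-01", "2015-02-01"]
    ),
)
ng_postproject.index.name = "Date"

# Vectorized Y = m*X + b over the whole HDD column.
ng_postproject["Therm_modeled"] = slope * ng_postproject["HDD"] + inter

# Hypothetical comparison column: baseline-model usage minus actual usage.
ng_postproject["Therm_diff"] = (
    ng_postproject["Therm_modeled"] - ng_postproject["Therm"]
)

print(ng_postproject)
```

With the fitted results object from earlier still in scope, results.predict(ng_postproject) should give the same modeled values, because the formula interface looks up the HDD column by name.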

Now, if I am not too far off in the weeds, I should be able to add this as a new column and compare pre- and post-project data. It would also be really nice if I could include a confidence interval. Thanks for any response.
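
One possible way to get an interval is the get_prediction method on the fitted results, whose summary_frame reports both a confidence interval for the mean and a wider prediction interval for individual observations. This is a minimal sketch, not a definitive recipe: it fits on only the five baseline rows shown by ng.head() (an assumption, since the full CSV is not available), so its coefficients will differ from the ones printed in the question. The outside_interval column name is hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Assumption: only the five baseline rows shown by ng.head() are used here,
# so the fitted coefficients will NOT match the ones printed in the question.
ng = pd.DataFrame(
    {
        "HDD": [386, 663, 972, 1131, 977],
        "Therm": [498, 1810, 4263, 5981, 6951],
    },
    index=pd.to_datetime(
        ["2011-10-01", "2011-11-01", "2011-12-01", "2012-01-01", "2012-02-01"]
    ),
)

# The five post-project rows shown in the question.
ng_postproject = pd.DataFrame(
    {
        "HDD": [291, 545, 1069, 1134, 1415],
        "Therm": [663, 1413, 6754, 7782, 10285],
    },
    index=pd.to_datetime(
        ["2014-10-01", "2014-11-01", "2014-12-01", "2015-01-01", "2015-02-01"]
    ),
)

results = smf.ols("Therm ~ HDD", data=ng).fit()

# Evaluate the baseline model on post-project HDD values.
# alpha=0.05 requests 95% intervals:
#   mean_ci_*  -> confidence interval for the expected Therm at that HDD
#   obs_ci_*   -> prediction interval for a single observed month
pred = results.get_prediction(ng_postproject[["HDD"]]).summary_frame(alpha=0.05)
pred.index = ng_postproject.index  # align row labels with the dates

comparison = pd.concat([ng_postproject, pred], axis=1)

# Hypothetical flag: actual usage falls outside the baseline prediction interval.
comparison["outside_interval"] = (
    (comparison["Therm"] < comparison["obs_ci_lower"])
    | (comparison["Therm"] > comparison["obs_ci_upper"])
)

print(comparison[["HDD", "Therm", "mean", "obs_ci_lower", "obs_ci_upper"]])
```

The mean column is the same quantity as the hand-computed Therm_modeled. For comparing actual monthly usage against the baseline, the obs_ci_lower and obs_ci_upper columns are usually the more relevant pair, since they account for month-to-month scatter and not only uncertainty in the fitted line.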

Source: https://stackoverflow.com/questions/52635962/python-statsmodels-linear-regression
