I am running OLS on products by month. This works fine for a single product, but my dataframe contains many products. If I create a groupby object and pass it to OLS, I get an error.
Use get_group to get each individual group and fit an OLS model on each one:
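    import statsmodels.api as sm

    # linear_regression_grouped is the groupby object from the question
    for group in linear_regression_grouped.groups.keys():
        df = linear_regression_grouped.get_group(group)
        X = df['period_num']
        y = df['TOTALS']
        model = sm.OLS(y, X)  # no intercept: the fit is forced through the origin
        results = model.fit()
        print(results.summary())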
In practice you usually also want an intercept term, so the model should be defined slightly differently:
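    for group in linear_regression_grouped.groups.keys():
        # copy so that adding a column does not touch the original frame
        df = linear_regression_grouped.get_group(group).copy()
        df['constant'] = 1  # column of ones acts as the intercept
        X = df[['period_num', 'constant']]
        y = df['TOTALS']
        model = sm.OLS(y, X)
        results = model.fit()
        print(results.summary())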
The results with and without the intercept can differ substantially.
You could do something like this ...
    import pandas as pd
    import statsmodels.api as sm

    for product in linear_regression_df.product_desc.unique():
        # keep only the rows for this product
        tempdf = linear_regression_df[linear_regression_df.product_desc == product]
        X = tempdf['period_num']
        y = tempdf['TOTALS']
        model = sm.OLS(y, X)
        results = model.fit()
        print(results.params)  # Or whatever summary info you want