I'm doing logistic regression using pandas 0.11.0 (data handling) and statsmodels 0.4.3 to do the actual regression, on Mac OSX Lion.
I'm goin
write_path = '/my/path/here/output.csv'
with open(write_path, 'w') as f:
    f.write(result.summary().as_csv())
There is actually a built-in method for this, as_csv() on the summary object, which is covered in the documentation:
f = open('csvfile.csv','w')
f.write(result.summary().as_csv())
f.close()
I believe this is a much easier (and cleaner) way to output the summaries to csv files.
There is no premade table of parameters and their result statistics currently available.
Essentially you need to stack all the results yourself; whether you do that in a list, a numpy array or a pandas DataFrame depends on what's more convenient for you.
For example, if I want one numpy array per model that holds llf and the values shown in the summary parameter table, then I could use
import numpy

res_all = []
for res in results:
    # conf_int() gives one (lower, upper) pair per parameter; unpack the columns
    low, upp = numpy.asarray(res.conf_int()).T
    res_all.append(numpy.concatenate(([res.llf], res.params, res.tvalues,
                                      res.pvalues, low, upp)))
But it might be better to align with pandas, depending on what structure you have across models.
You could write a helper function that takes all the results from the results instance and concatenates them in a row.
(I'm not sure what's the most convenient for writing to csv by rows)
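A rough sketch of such a helper, writing one row per model with the standard csv module. The names result_to_row, write_results_csv and param_names are made up for this example, and it only assumes that each results object exposes llf, params, tvalues, pvalues and conf_int(), as in the loop above:

```python
import csv

import numpy as np


def result_to_row(res):
    """Flatten one results object into a flat list of floats.

    Hypothetical helper: assumes `res` has llf, params, tvalues, pvalues
    and a conf_int() method returning one (lower, upper) pair per parameter.
    """
    ci = np.asarray(res.conf_int())  # shape (n_params, 2)
    parts = [[res.llf], res.params, res.tvalues, res.pvalues, ci[:, 0], ci[:, 1]]
    return np.concatenate([np.asarray(p, dtype=float).ravel() for p in parts]).tolist()


def write_results_csv(results, path, param_names):
    """Write a header plus one CSV row per model.

    `param_names` is supplied by the caller and only used to label columns;
    every model is assumed to have the same parameters in the same order.
    """
    stats = ["coef", "t", "p", "ci_low", "ci_upp"]
    header = ["llf"] + [f"{stat}_{name}" for stat in stats for name in param_names]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for res in results:
            writer.writerow(result_to_row(res))
```

Usage would be something like write_results_csv(results, write_path, ["const", "x1"]). If the models do not share the same parameters, a flat row layout like this stops lining up, and a pandas DataFrame per model is probably the safer choice.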
edit:
Here is an example that stores the regression results in a DataFrame:
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/multilinear.py#L21
The loop is on line 159.
summary() and similar code outside of statsmodels, for example http://johnbeieler.org/py_apsrtable/ for combining several results, are oriented towards printing, not towards storing values.
BTW, you can use dir(results) to find all the attributes of an object.
I found this formulation to be a little more straightforward. You can add/subtract columns by following the syntax from the examples (pvals,coeff,conf_lower,conf_higher).
import pandas as pd  # This can be left out if already present...

def results_summary_to_dataframe(results):
    '''This takes the result of a statsmodels results table and transforms it into a dataframe'''
    pvals = results.pvalues
    coeff = results.params
    conf_lower = results.conf_int()[0]
    conf_higher = results.conf_int()[1]

    results_df = pd.DataFrame({"pvals": pvals,
                               "coeff": coeff,
                               "conf_lower": conf_lower,
                               "conf_higher": conf_higher
                               })

    # Reordering...
    results_df = results_df[["coeff", "pvals", "conf_lower", "conf_higher"]]
    return results_df
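To get from that DataFrame to a csv file, to_csv should be enough. A small sketch, using a hand-made frame in place of real regression output; the numbers, the parameter names and the file name results_summary.csv are invented for illustration:

```python
import pandas as pd

# Stand-in for the frame returned by results_summary_to_dataframe(results);
# these values are made up and do not come from a real model.
results_df = pd.DataFrame(
    {
        "coeff": [0.5, -1.2],
        "pvals": [0.03, 0.40],
        "conf_lower": [0.1, -3.0],
        "conf_higher": [0.9, 0.6],
    },
    index=["const", "x1"],
)

# index_label names the csv column that holds the parameter names.
results_df.to_csv("results_summary.csv", index_label="param")

# Read the file back to check the round trip.
round_trip = pd.read_csv("results_summary.csv", index_col="param")
print(round_trip)
```

Note that the function above indexes conf_int() with [0] and [1], which picks columns only when the model was fit on pandas data; with plain numpy input conf_int() returns an array, and [0] would be the first row instead.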