Python 2.7 - statsmodels - formatting and writing summary output


I'm doing logistic regression using pandas 0.11.0 (data handling) and statsmodels 0.4.3 (the actual regression), on Mac OS X Lion.

I'm goin

5 Answers

北恋

2021-02-05 21:28

```
write_path = '/my/path/here/output.csv'
with open(write_path, 'w') as f:
    f.write(result.summary().as_csv())
```


星月不相逢

2021-02-05 21:29
There is actually a built-in method for this, as_csv() on the summary object, described in the statsmodels documentation:
```
# the with block closes the file even if the write fails
with open('csvfile.csv', 'w') as f:
    f.write(result.summary().as_csv())
```
I believe this is a much easier (and cleaner) way to write the summaries to CSV files.
余生分开走

2021-02-05 21:45
There is no premade table of parameters and their result statistics currently available.

Essentially, you need to stack all the results yourself. Whether you use a list, a numpy array, or a pandas DataFrame depends on what is most convenient for you.

For example, to collect one numpy array per model that holds llf and the statistics from the summary parameter table, I could use
```
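import numpy

res_all = []
for res in results:  # results: an iterable of fitted results instances
    # conf_int() has one [lower, upper] row per parameter; unpack the two columns
    low, upp = numpy.asarray(res.conf_int()).T
    # one flat row per model: llf, params, t-values, p-values, lower and upper bounds
    res_all.append(numpy.concatenate(([res.llf], res.params, res.tvalues,
                                      res.pvalues, low, upp)))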
```
But it might be better to align with pandas, depending on what structure you have across models.

You could write a helper function that takes all the results from the results instance and concatenates them in a row.

(I'm not sure what's the most convenient for writing to csv by rows)
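
For writing by rows, the standard-library csv module is one option. The following is only a minimal sketch of such a helper: the names result_to_row and write_result_rows are made up for illustration, and it assumes each results instance exposes llf, params, tvalues, pvalues and conf_int() as used above.

```python
import csv


def result_to_row(res):
    """Flatten one fitted-results object into a single flat list (hypothetical helper)."""
    ci = res.conf_int()
    # A pandas DataFrame exposes its data as .values; arrays and lists pass through.
    ci = getattr(ci, "values", ci)
    low = [pair[0] for pair in ci]   # lower bounds, one per parameter
    upp = [pair[1] for pair in ci]   # upper bounds, one per parameter
    return ([res.llf] + list(res.params) + list(res.tvalues)
            + list(res.pvalues) + low + upp)


def write_result_rows(results, fileobj):
    """Write one CSV row per fitted model to an open, writable file object."""
    writer = csv.writer(fileobj)
    for res in results:
        writer.writerow(result_to_row(res))
```

In Python 3 the file would typically be opened with open(path, 'w', newline=''); in Python 2.7 the csv module expects open(path, 'wb').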

edit:

Here is an example that stores the regression results in a DataFrame:

https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/multilinear.py#L21

The loop is on line 159.

summary() and similar code outside of statsmodels (for example http://johnbeieler.org/py_apsrtable/, which combines several results) are oriented towards printing, not towards storing variables.
囚心锁ツ

2021-02-05 21:47
- results.params : the coefficients
- results.pvalues : the p-values
By the way, you can use dir(results) to list all the attributes of an object.
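
The dir() output also includes many underscore-prefixed names, so filtering them out can make the list easier to scan. A small sketch follows; public_attributes is a hypothetical helper name, not part of statsmodels.

```python
def public_attributes(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


# Usage with a fitted results instance (not defined here):
# print(public_attributes(results))
```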

野趣味

2021-02-05 21:52

I found this formulation a little more straightforward. You can add or remove columns by following the pattern of the existing ones (pvals, coeff, conf_lower, conf_higher).

```
import pandas as pd  # can be left out if already imported

def results_summary_to_dataframe(results):
    '''Take a statsmodels results instance and return its main statistics as a DataFrame.'''
    pvals = results.pvalues
    coeff = results.params
    # Assumes the model was fit on pandas data, so conf_int() returns a DataFrame
    # whose column 0 holds the lower bounds and column 1 the upper bounds
    conf_lower = results.conf_int()[0]
    conf_higher = results.conf_int()[1]

    results_df = pd.DataFrame({"pvals": pvals,
                               "coeff": coeff,
                               "conf_lower": conf_lower,
                               "conf_higher": conf_higher
                               })

    # Reorder the columns
    results_df = results_df[["coeff", "pvals", "conf_lower", "conf_higher"]]
    return results_df
```
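
Once the statistics are in a DataFrame, pandas can write them out directly with to_csv. The sketch below uses a hand-made stand-in for the frame returned by results_summary_to_dataframe(results); the row labels, the numbers, the "term" index label and the commented-out file path are all made up for illustration.

```python
import pandas as pd

# Stand-in for the frame returned by results_summary_to_dataframe(results);
# the row labels and numbers are invented for illustration only.
results_df = pd.DataFrame(
    {"coeff": [0.5, -1.2],
     "pvals": [0.03, 0.40],
     "conf_lower": [0.1, -3.0],
     "conf_higher": [0.9, 0.6]},
    index=["x1", "x2"],
)[["coeff", "pvals", "conf_lower", "conf_higher"]]

# With no path argument, to_csv() returns the CSV text instead of writing a file.
csv_text = results_df.to_csv(index_label="term")
print(csv_text)

# To write a file instead (hypothetical path):
# results_df.to_csv("/my/path/here/results.csv", index_label="term")
```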
