Print OLS regression summary to text file

Solution 1

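To write out the result of pandas.stats.api.ols, use a text file: the summary is preformatted plain text, so a .txt file preserves its layout. (pandas.stats.api was removed in pandas 0.20, so this only runs on older versions.) For instance: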

from pandas.stats.api import ols
grps = df.groupby(['FID'])
for fid, grp in grps:
    result = ols(y=grp.loc[:, 'MEAN'], x=grp.loc[:, ['Accum_Prcp', 'Accum_HDD']])

    # A with-block closes the file even if the write fails
    with open("Output {}.txt".format(fid), "w") as text_file:
        text_file.write(result.summary)

Solution 2

As of statsmodels 0.9, the Summary class supports export to multiple formats, including CSV and text:

import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

dat = sm.datasets.get_rdataset("Guerry", "HistData").data
results = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()

with open('summary.txt', 'w') as fh:
    fh.write(results.summary().as_text())

with open('summary.csv', 'w') as fh:
    fh.write(results.summary().as_csv())

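Note that the output of as_csv() is not really machine-readable: it is the formatted summary tables with commas as separators, not a single tidy table. Dumping the result parameters themselves (for example with repr(), or by writing results.params out directly) would be.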
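
For a machine-readable export, one option is to assemble the coefficient statistics into a DataFrame and write that out. The sketch below is one possible approach rather than anything built into Summary: it uses synthetic data so it runs offline, and the column names (`x1`, `x2`, `y`) and the output headers are made up for illustration.

```python
import io

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data so the example runs offline; the column names are hypothetical.
rng = np.random.default_rng(0)
dat = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
dat["y"] = 1.0 + 2.0 * dat["x1"] + rng.normal(scale=0.5, size=50)

results = smf.ols("y ~ x1 + x2", data=dat).fit()

# One row per term, one column per statistic.
ci = results.conf_int()  # DataFrame with columns 0 (lower) and 1 (upper)
coef_table = pd.DataFrame({
    "coef": results.params,
    "std_err": results.bse,
    "t": results.tvalues,
    "p_value": results.pvalues,
    "ci_lower": ci[0],
    "ci_upper": ci[1],
})

buf = io.StringIO()  # pass a file path to to_csv instead to save it on disk
coef_table.to_csv(buf, index_label="term")
csv_text = buf.getvalue()
```

The resulting CSV has a single header row and can be read back with pandas.read_csv, unlike the as_csv() output.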

Author by

Stefano Potter

Updated on October 07, 2022

Comments

  • Stefano Potter about 1 year

    I am running OLS regression using pandas.stats.api.ols using a groupby with the following code:

    import pandas as pd
    from pandas.stats.api import ols
    df=pd.read_csv(r'F:\file.csv')
    
    result=df.groupby(['FID']).apply(lambda d: ols(y=d.loc[:, 'MEAN'], x=d.loc[:, ['Accum_Prcp', 'Accum_HDD']]))
    for i in result:
        x=pd.DataFrame({'FID':i.index, 'delete':i.values})
        frame = pd.concat([x,pd.DataFrame(x['delete'].tolist())], axis=1, join='outer')
        del frame['delete']
        print(frame)
    

    but this returns the error:

    AttributeError: 'OLS' object has no attribute 'index'
    

    I have about 2,000 items in my group by and when I print each one out they look something like this:

    ------------------------Summary of Regression Analysis-------------------------
    
    Formula: Y ~ <Accum_Prcp> + <Accum_HDD> + <intercept>
    
    Number of Observations:         79
    Number of Degrees of Freedom:   3
    
    R-squared:         0.1242
    Adj R-squared:     0.1012
    
    Rmse:              0.1929
    
    F-stat (2, 76):     5.3890, p-value:     0.0065
    
    Degrees of Freedom: model 2, resid 76
    
    -----------------------Summary of Estimated Coefficients------------------------
          Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
    --------------------------------------------------------------------------------
        Accum_Prcp     0.0009     0.0003       3.28     0.0016     0.0004     0.0015
         Accum_HDD     0.0000     0.0000       1.98     0.0516     0.0000     0.0000
         intercept     0.4750     0.0811       5.86     0.0000     0.3161     0.6340
    ---------------------------------End of Summary---------------------------------
    

    I want to be able to export each one to a CSV file so that I can view them individually.

  • Stefan about 7 years
    For statsmodels 0.8.0, there is result.summary().as_text().