Why do I get only one parameter from a statsmodels OLS fit?
Solution 1
Try this:
import statsmodels.api as sm
X = sm.add_constant(X)  # prepend a column of ones for the intercept
results = sm.OLS(y, X).fit()
as in the documentation:
An intercept is not included by default and should be added by the user
statsmodels.tools.tools.add_constant
Solution 2
Just to be complete, this works:
>>> import numpy
>>> import statsmodels.api as sm
>>> y = numpy.array([1,2,3,4,5,6,7,8,9])
>>> X = numpy.array([1,1,2,2,3,3,4,4,5])
>>> X = sm.add_constant(X)
>>> res_ols = sm.OLS(y, X).fit()
>>> res_ols.params
array([-0.35714286, 1.92857143])
The slope coefficient is different from the one in the question, but that makes sense now that the model has an intercept.
Solution 3
Try this, it worked for me:
import statsmodels.api as sm
X_train = sm.add_constant(X_train)
X_test = sm.add_constant(X_test)
model = sm.OLS(y_train, X_train)
results = model.fit()
y_pred = results.predict(X_test)
results.params
Solution 4
I'm running 0.6.1 and it looks like the "add_constant" function has been moved into the statsmodels.tools module. Here's what I ran that worked:
res_ols = sm.OLS(y, statsmodels.tools.add_constant(X)).fit()
Solution 5
I did add the line X = sm.add_constant(X), but Python did not return the intercept value, so with a little algebra I decided to do it myself in code.
This code fits a regression over 35 samples with 7 features, plus one intercept column that I added as an extra feature:
import statsmodels.api as sm
import numpy as np
import pandas as pd

x = np.empty((35, 8))  # (numSamples, oneIntercept + numFeatures)
feature_names = [str(j) for j in range(8)]  # column "0" is the intercept
y = np.empty((35,))
with open("dataset.csv") as f:
    dbfv = f.readlines()
interceptConstant = 1
# reading data and writing in numpy arrays
for i, line in enumerate(dbfv):
    cells = line.split(",")
    x[i][0] = interceptConstant  # first column: constant for the intercept
    for j in range(len(cells) - 1):
        x[i][j + 1] = cells[j]  # feature columns
    y[i] = cells[-1]  # last cell is the target
# creating dataframes
df = pd.DataFrame(x, columns=feature_names)
target = pd.DataFrame(y, columns=["TARGET"])
X = df
y = target["TARGET"]
model = sm.OLS(y, X).fit()
print(model.params)
# predictions = model.predict(X) # make the predictions by the model
# Print out the statistics
print(model.summary())
Tom
Updated on July 12, 2022
Comments
-
Tom almost 2 years
Here is what I am doing:
$ python
Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
>>> import statsmodels.api as sm
>>> statsmodels.__version__
'0.5.0'
>>> import numpy
>>> y = numpy.array([1,2,3,4,5,6,7,8,9])
>>> X = numpy.array([1,1,2,2,3,3,4,4,5])
>>> res_ols = sm.OLS(y, X).fit()
>>> res_ols.params
array([ 1.82352941])
I had expected an array with two elements?!? The intercept and the slope coefficient?
-
Tom over 10 years
I was looking at the OLS example on the WLS page, so I guess that is why I overlooked add_constant(), as it's not mentioned on that page.
-
Desta Haileselassie Hagos almost 7 years
@behzad-nouri, I would appreciate it if you could have a look at this: stackoverflow.com/questions/44747203/…
-
FaCoffee over 6 years
I am quite puzzled by this. Why isn't an intercept added by default? Why would you want to run the linear regression without the bloody constant? It makes no sense to me.
-
Josef over 5 years
Use import statsmodels.api as sm instead. formula.api will not have OLS (capital case) in the next release, only ols (lower case, for the formula interface).
-
Golden Lion about 2 years
What does adding a column of ones to an array do to X?