How to predict new values using statsmodels.formula.api (python)
Solution 1
You can pass new values to the model's .predict() method, as illustrated in output #11 in this notebook from the docs for a single observation. You can provide multiple observations as a 2d array, for instance a DataFrame - see docs.
Since you are using the formula API, your input needs to be a pd.DataFrame so that the column names referenced in the formula are available. In your case, you could use something like .predict(pd.DataFrame({'mean_area': [1,2,3]})).
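As a minimal sketch of the approach above, the following fits a one-feature logit model and predicts for new values. The breast cancer data is not reproduced here, so the `breast` DataFrame is a synthetic stand-in with made-up values. Only the column names `target` and `mean_area` come from the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the breast cancer data (hypothetical values, not the real dataset).
rng = np.random.default_rng(0)
mean_area = rng.uniform(100, 2000, size=200)
# Assumption for illustration: larger areas make target == 1 less likely, with noise.
p = 1 / (1 + np.exp(-(5 - 0.008 * mean_area)))
breast = pd.DataFrame({"mean_area": mean_area, "target": rng.binomial(1, p)})

result = smf.logit("target ~ mean_area", data=breast).fit(disp=False)

# New observations must be a DataFrame whose column names match the formula.
new_data = pd.DataFrame({"mean_area": [30, 500, 1500]})
new_probs = result.predict(new_data)  # predicted probabilities, one per row
print(new_probs)
```

For a logit model, the returned values are predicted probabilities rather than class labels, so a threshold still has to be applied if labels are needed.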
By default, when no new data is provided, statsmodels .predict() uses only the observations that were used for fitting.
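The default behaviour can be checked with a small sketch: calling .predict() with no arguments should give the same values as passing the training DataFrame explicitly. The data and the names `df`, `x`, and `y` are made up for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical training data for illustration only.
rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.normal(size=50)})
df["y"] = 2.0 + 3.0 * df["x"] + rng.normal(scale=0.5, size=50)

model = smf.ols("y ~ x", data=df).fit()

in_sample = model.predict()    # no data given: predictions for the training rows
explicit = model.predict(df)   # same rows, passed explicitly
same = np.allclose(in_sample, explicit)
print(len(in_sample), same)
```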
Solution 2
import statsmodels.formula.api as smf
model = smf.ols('y ~ x', data=df).fit()
# Predict for a list of observations; the list can hold one or many values.
prediction = model.get_prediction(exog=dict(x=[5,10,25]))
prediction.summary_frame(alpha=0.05)
Comments
-
vishmay almost 2 years
I trained a logistic model on the breast cancer data, using ONLY one feature, 'mean_area':
from statsmodels.formula.api import logit
logistic_model = logit('target ~ mean_area', breast)
result = logistic_model.fit()
The trained model has a built-in predict method. However, it returns the predicted values for all the training samples, as follows:
predictions = result.predict()
Suppose I want the prediction for a new value, say 30. How do I use the trained model to output the value (rather than reading the coefficients and computing it manually)?