How to do gaussian/polynomial regression with scikit-learn?

Solution 1

Either use Support Vector Regression (sklearn.svm.SVR) and set the appropriate kernel (see here).
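A minimal sketch of the SVR route, assuming toy data (a sine curve) and arbitrary values for C, gamma and degree that are not from the original answer. In SVR, kernel="rbf" is the Gaussian kernel and kernel="poly" is the polynomial one:

```python
import numpy as np
from sklearn.svm import SVR

# Toy 1-D data (assumed for illustration only): y = sin(x)
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(40, 1), axis=0)
y = np.sin(X).ravel()

# Gaussian (RBF) kernel; C and gamma are illustrative, not tuned
svr_rbf = SVR(kernel="rbf", C=100, gamma=0.1)
svr_rbf.fit(X, y)

# Polynomial kernel of degree 3; again an illustrative choice
svr_poly = SVR(kernel="poly", degree=3, C=100)
svr_poly.fit(X, y)

# Predict for one new sample (inputs must be 2-D: one row per sample)
pred_rbf = svr_rbf.predict([[2.5]])
pred_poly = svr_poly.predict([[2.5]])
print(pred_rbf, pred_poly)
```

In practice the kernel parameters would probably be chosen by cross-validation rather than fixed by hand as they are here.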

Or you install the latest master version of sklearn and use the recently added sklearn.preprocessing.PolynomialFeatures (see here) and then OLS or Ridge on top of that.
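As a rough sketch of the second route, the feature expansion and the linear model can be chained with a pipeline. The data below is made up (an exact quadratic), and LinearRegression stands in for OLS; swapping in Ridge should work the same way:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Made-up data following y = 1 + 2x + 3x^2 exactly (assumption for illustration)
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 1 + 2 * X.ravel() + 3 * X.ravel() ** 2

# Expand to degree-2 polynomial features, then fit ordinary least squares
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

# The pipeline applies the same expansion to new data before predicting
prediction = model.predict([[3.0]])
print(prediction)
```

The pipeline keeps the transformer and the regressor together, so new data goes through the already-fitted expansion automatically.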

Solution 2

Theory

Polynomial regression is a special case of linear regression; the main idea lies in how you select your features. Consider multivariate regression with 2 variables, x1 and x2. Linear regression looks like this: y = a1 * x1 + a2 * x2.

Now you want a polynomial regression (say, a degree-2 polynomial). We create a few additional features: x1*x2, x1^2 and x2^2. So we get a 'linear regression' in the expanded set of features:

y = a1 * x1 + a2 * x2 + a3 * x1*x2 + a4 * x1^2 + a5 * x2^2
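To make the expansion concrete, here is a small sketch with an assumed sample x1 = 2, x2 = 3. PolynomialFeatures generates the same five terms as the formula above, plus a leading constant column of ones (its default include_bias=True), which the formula leaves out:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with x1 = 2 and x2 = 3 (values chosen only for illustration)
sample = np.array([[2.0, 3.0]])

poly = PolynomialFeatures(degree=2)
expanded = poly.fit_transform(sample)

# Column order: 1, x1, x2, x1^2, x1*x2, x2^2
print(expanded)

# powers_ gives the exponent of each input variable in each output column
print(poly.powers_)
```

A linear model fitted on these columns is exactly the 'linear regression' written above, with the constant column playing the role of an intercept.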

This nicely shows an important concept, the curse of dimensionality: the number of new features grows much faster than linearly with the degree of the polynomial. You can read more about this concept here.
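A quick way to see this growth is to count the generated columns. The sketch below assumes 10 input variables (an arbitrary choice) and compares the count reported by PolynomialFeatures with the closed form C(n + d, d), which counts all monomials of degree at most d in n variables, including the constant term:

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

n_vars = 10  # assumed number of input variables, for illustration only
counts = {}

for degree in range(1, 6):
    # Fitting on a dummy row is enough; only the shape matters here
    poly = PolynomialFeatures(degree=degree).fit(np.zeros((1, n_vars)))
    counts[degree] = poly.n_output_features_
    print(degree, counts[degree], comb(n_vars + degree, degree))
```

The same loop also shows that degrees above 2 need nothing special: only the degree argument changes.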

Practice with scikit-learn

You do not need to build these features by hand in scikit-learn. Polynomial feature expansion is already available there (as of version 0.15; check how to update it here).

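from sklearn.preprocessing import PolynomialFeatures
from sklearn import linear_model

# Training inputs (two samples, two features each) and their targets
X = [[0.44, 0.68], [0.99, 0.23]]
vector = [109.85, 155.72]
# Sample to predict for; must be 2-D (one row per sample)
predict = [[0.49, 0.18]]

# Expand the inputs into degree-2 polynomial features
poly = PolynomialFeatures(degree=2)
X_ = poly.fit_transform(X)
# Reuse the already-fitted transformer on the new sample
predict_ = poly.transform(predict)

# Ordinary least squares on the expanded features
clf = linear_model.LinearRegression()
clf.fit(X_, vector)
print(clf.predict(predict_))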
Author by

Jagat

Updated on July 21, 2022

Comments

  • Jagat almost 2 years

    Does scikit-learn provide facility to perform regression using a gaussian or polynomial kernel? I looked at the APIs and I don't see any. Has anyone built a package on top of scikit-learn that does this?

  • amos over 7 years
    sklearn's Pipeline makes this even easier: scikit-learn.org/0.17/auto_examples/model_selection/…
  • Gianluca John Massimiani over 7 years
    @Salvador Dali. Sorry, what is "vector" exactly?
  • Admin over 7 years
    @GianlucaJohnMassimiani, vector = y_training and predict = X_test.
  • Charlie Parker over 6 years
    I am trying to get the code for PolynomialFeatures for d>2, do you have it?