Is stochastic gradient descent a classifier or an optimizer?

I am new to machine learning and I am trying to analyze classification algorithms for a project of mine. I came across SGDClassifier in the sklearn library, but a lot of papers refer to SGD as an optimization technique. Can someone please explain how SGDClassifier is implemented?


Solution 1

Taken from the scikit-learn SGDClassifier documentation:

loss="hinge": (soft-margin) linear Support Vector Machine, loss="modified_huber": smoothed hinge loss, loss="log": logistic regression

Solution 2

SGD is indeed a technique used to find the minimum of a function. SGDClassifier is a linear classifier (by default in sklearn it is a linear SVM) that uses SGD for training (that is, it looks for the minimum of the loss using SGD). According to the documentation:

SGDClassifier is a linear classifier (SVM, logistic regression, among others) with SGD training.

This estimator implements regularized linear models with stochastic gradient descent (SGD) learning: the gradient of the loss is estimated one sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate). SGD allows minibatch (online/out-of-core) learning; see the partial_fit method. For best results using the default learning rate schedule, the data should have zero mean and unit variance.

This implementation works with data represented as dense or sparse arrays of floating point values for the features. The model it fits can be controlled with the loss parameter; by default, it fits a linear support vector machine (SVM).
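
To make the "one sample at a time" update concrete, here is a minimal NumPy sketch of this kind of training loop (hinge loss, decreasing learning rate; the function and variable names are illustrative, not scikit-learn internals, and regularization is omitted for brevity):

import numpy as np

def sgd_hinge(X, y, epochs=5, eta0=0.1):
    # Per-sample SGD for a linear SVM (hinge loss); y is assumed to be in {-1, +1}.
    w = np.zeros(X.shape[1])
    b = 0.0
    t = 1
    for _ in range(epochs):
        for xi, yi in zip(X, y):        # gradient estimated one sample at a time
            eta = eta0 / t              # decreasing strength schedule
            if yi * (xi @ w + b) < 1:   # hinge loss is active: step along its subgradient
                w += eta * yi * xi
                b += eta * yi
            t += 1
    return w, b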

Solution 3

SGDClassifier is a linear classifier that implements regularized linear models with stochastic gradient descent (SGD) learning.

Other classifiers:

from sklearn.linear_model import (
    SGDClassifier, Perceptron, PassiveAggressiveClassifier, LogisticRegression
)

# X is assumed to be the training feature matrix (n_samples, n_features).
classifiers = [
    ("ASGD", SGDClassifier(average=True, max_iter=100)),
    ("Perceptron", Perceptron(tol=1e-3)),
    ("Passive-Aggressive I", PassiveAggressiveClassifier(loss='hinge',
                                                         C=1.0, tol=1e-4)),
    ("Passive-Aggressive II", PassiveAggressiveClassifier(loss='squared_hinge',
                                                          C=1.0, tol=1e-4)),
    ("SAG", LogisticRegression(solver='sag', tol=1e-1, C=1.e4 / X.shape[0])),
]
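
A hedged usage sketch (y is assumed to be the label vector matching X above):

for name, clf in classifiers:
    clf.fit(X, y)                  # each estimator trains with its own SGD-style solver
    print(name, clf.score(X, y))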

Stochastic gradient descent (SGD) is a solver. It is a simple and efficient approach for discriminative learning of linear classifiers under convex loss functions, such as (linear) Support Vector Machines and logistic regression.

In neural_network.MLPClassifier, the alternative solvers to sgd are lbfgs and adam:

solver : {‘lbfgs’, ‘sgd’, ‘adam’}, default ‘adam’

The solver for weight optimization.

‘lbfgs’ is an optimizer in the family of quasi-Newton methods.

‘sgd’ refers to stochastic gradient descent.

‘adam’ refers to a stochastic gradient-based optimizer proposed by Diederik Kingma and Jimmy Ba.
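
For example, a minimal sketch trying each solver on a toy dataset (the dataset and max_iter value are assumptions for illustration):

from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, random_state=0)

# The same network can be trained with any of the three optimizers.
for solver in ("lbfgs", "sgd", "adam"):
    clf = MLPClassifier(solver=solver, max_iter=1000, random_state=0)
    clf.fit(X, y)
    print(solver, clf.score(X, y))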

Details about the implementation of SGDClassifier can be read on the SGDClassifier documentation page.

In brief:

This estimator implements regularized linear models with stochastic gradient descent (SGD) learning: the gradient of the loss is estimated one sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate). SGD allows minibatch (online/out-of-core) learning.
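
The minibatch (out-of-core) path goes through partial_fit. A minimal sketch (the dataset and batch count are assumptions; with real out-of-core data the batches would be read from disk):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1000, random_state=0)
clf = SGDClassifier(loss="hinge")

# Feed the data in minibatches, as if streaming it from disk.
# All classes must be declared on the first call to partial_fit.
for X_batch, y_batch in zip(np.array_split(X, 10), np.array_split(y, 10)):
    clf.partial_fit(X_batch, y_batch, classes=np.unique(y))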

Share:
12,432
Arpitha
Author by

Arpitha

Updated on June 22, 2022

Comments

  • Arpitha
    Arpitha about 2 years

    I am new to Machine Learning and I am trying analyze the classification algorithm for a project of mine. I came across SGDClassifier in sklearn library. But a lot of papers have referred to SGD as an optimization technique. Can someone please explain how is SGDClassifier implemented?