How to choose number of hidden layers and nodes in neural network?

artificial-intelligence machine-learning neural-network

32,494

Solution 1

Note: this answer was correct at the time it was made, but has since become outdated.

It is rare to have more than two hidden layers in a neural network. The number of layers will usually not be a parameter of your network you will worry much about.

Although multi-layer neural networks with many layers can represent deep circuits, training deep networks has always been seen as somewhat of a challenge. Until very recently, empirical studies often found that deep networks generally performed no better, and often worse, than neural networks with one or two hidden layers.

Bengio, Y. & LeCun, Y., 2007. Scaling learning algorithms towards AI. Large-Scale Kernel Machines, (1), pp.1-41.

The cited paper is a good reference for learning about the effect of network depth, recent progress in training deep networks, and deep learning in general.

Solution 2

The general answer for picking hyperparameters is to cross-validate. Hold out some data, train networks with different configurations, and use the one that performs best on the held-out set.

Solution 3

Most of the problems I have seen were solved with one or two hidden layers. It has been proven that MLPs with only one hidden layer are universal function approximators (Hornik et al.). More hidden layers can make the problem easier or harder, so you usually have to try different topologies. I have heard that you cannot add an arbitrary number of hidden layers if you want to train your MLP with backprop, because the gradient becomes too small in the first layers (I have no reference for that). Still, there are some applications where people have used up to nine layers. You may be interested in a standard benchmark problem that has been solved by different classifiers and MLP topologies.
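The remark about gradients shrinking in the first layers can be illustrated with a toy calculation. This is only a sketch under strong assumptions: a chain of single sigmoid units with a fixed weight of 1.0 (the function name, weight and input value are made up for illustration), not a real MLP. Since the derivative of the sigmoid never exceeds 0.25, each extra layer in this chain multiplies the gradient by at most 0.25.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gradient_through_chain(depth, weight=1.0, x=0.5):
    """Derivative of the output with respect to the input for a chain of
    `depth` one-unit sigmoid layers (a toy stand-in for a deep MLP)."""
    grad = 1.0
    activation = x
    for _ in range(depth):
        activation = sigmoid(weight * activation)
        # Chain rule: each layer contributes weight * sigmoid'(pre-activation),
        # and sigmoid'(z) = s * (1 - s) where s = sigmoid(z).
        grad *= weight * activation * (1.0 - activation)
    return grad


for depth in (1, 3, 9):
    print(depth, gradient_through_chain(depth))
```

In this toy setting the printed gradient drops quickly as the depth grows, which is the effect described above. Real networks differ in weights, widths and activation functions, so treat this as intuition rather than a proof.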

Solution 4

Cross-validation over different model configurations (number of hidden layers or neurons per layer) will lead you to a better configuration.

Besides that, one approach is to train a model that is as big and deep as possible and to use dropout regularization to turn off some neurons and reduce overfitting.

A reference for this approach is this paper: https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf
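To make "turning off some neurons" concrete, here is a minimal sketch of dropout applied to one layer's activations. It uses the "inverted" variant (scaling the kept activations during training), which is a common formulation and not necessarily the exact one in the paper. The function name and parameters are illustrative assumptions.

```python
import random


def dropout(activations, drop_prob, training=True, rng=random):
    """Randomly zero each activation with probability `drop_prob`.

    Kept activations are divided by the keep probability so that the
    expected value of each unit stays the same (inverted dropout).
    At test time (training=False) the activations pass through unchanged.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


hidden = [0.2, 1.5, 0.7, 0.0, 0.9]
print(dropout(hidden, drop_prob=0.5))                   # some units zeroed
print(dropout(hidden, drop_prob=0.5, training=False))   # unchanged
```

In practice a deep learning framework's own dropout layer would normally be used. The sketch only shows the mechanism.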

Solution 5

All the above answers are of course correct, but to add some more ideas, here are some general rules based on the paper 'Approximating Number of Hidden layer neurons in Multiple Hidden Layer BPNN Architecture' by Saurabh Karsoliya.

In general:

The number of hidden layer neurons is 2/3 (or 70% to 90%) of the size of the input layer. If this is insufficient, the number of output layer neurons can be added later on.
The number of hidden layer neurons should be less than twice the number of neurons in the input layer.
The number of hidden layer neurons is between the input layer size and the output layer size.
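The three rules above can be turned into starting values with a few lines of arithmetic. The sketch below is only an illustration: the function name and the example sizes (64 input pixels, 10 output classes) are assumptions, and the results are starting points for experiments, not final answers.

```python
def hidden_size_heuristics(n_inputs, n_outputs):
    """Candidate hidden layer sizes from the three rules of thumb."""
    two_thirds = round(2 * n_inputs / 3)
    low, high = sorted((n_inputs, n_outputs))
    return {
        # Rule 1: about 2/3 of the input layer size ...
        "two_thirds_of_inputs": two_thirds,
        # ... optionally plus the output layer size if that is insufficient.
        "two_thirds_plus_outputs": two_thirds + n_outputs,
        # Rule 2: stay strictly below twice the input layer size.
        "strict_upper_bound": 2 * n_inputs,
        # Rule 3: somewhere between the output and input layer sizes.
        "between_output_and_input": (low, high),
    }


# Assumed example: 8x8 pixel images (64 inputs) and 10 character classes.
print(hidden_size_heuristics(64, 10))
```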

Always keep in mind that you need to explore and try many different combinations. Using GridSearch, you can also search for the "best" model and parameters.

For example, we can run a GridSearch to determine the "best" size of the hidden layer.
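A possible version of such a search is sketched below. It assumes scikit-learn is installed and uses its GridSearchCV with an MLPClassifier on the small built-in digits dataset. The dataset, the candidate sizes and the other settings are illustrative choices, not taken from the answer.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 8x8 images of handwritten digits: 64 pixel inputs, 10 classes.
X, y = load_digits(return_X_y=True)

# Scale the inputs inside the pipeline so each CV fold is scaled separately.
pipeline = make_pipeline(
    StandardScaler(),
    MLPClassifier(max_iter=500, random_state=0),
)

# Candidate topologies (assumed for illustration): one tuple entry per hidden layer.
param_grid = {
    "mlpclassifier__hidden_layer_sizes": [(16,), (43,), (64,), (43, 43)],
}

search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)

print("best topology:", search.best_params_)
print("mean cross-validated accuracy:", search.best_score_)
```

The "best" configuration here is only the best among the candidates in the grid and for this dataset. A wider grid, or one that also varies regularization and learning rate, may give a different answer.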

Author by

gintas

PHP, CakePHP, Python, Django, MySQL, jQuery

Updated on July 30, 2020

Comments

gintas almost 4 years

How does the number of hidden layers in a multilayer perceptron neural network affect the way the network behaves? Same question for the number of nodes in the hidden layers.

Let's say I want to use a neural network for handwritten character recognition. In this case I use pixel colour intensity values as input nodes and character classes as output nodes.

How would I choose the number of hidden layers and nodes to solve such a problem?