Neural network regression using PyBrain
Solution 1
pybrain.tools.neuralnets.NNregression is a tool which "Learns to numerically predict the targets of a set of data, with optional online progress plots", so it seems well suited for constructing a neural network for your regression task.
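A minimal sketch of how it might be used, assuming the setupNN/runTraining interface found in PyBrain's pybrain/tools/neuralnets.py (the toy dataset here is made up purely for illustration):
from pybrain.datasets import SupervisedDataSet
from pybrain.tools.neuralnets import NNregression

# hypothetical toy dataset: one input, one target
ds = SupervisedDataSet(1, 1)
for x in range(10):
    ds.addSample((x,), (2.0 * x,))

nn = NNregression(ds)  # wraps the dataset together with a network
nn.setupNN(hidden=10)  # build network and trainer; 'hidden' is the hidden-layer size
nn.runTraining()       # run the training loop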
Solution 2
As originally pointed out by Ben Allison, for the network to be able to approximate arbitrary values (i.e. not necessarily in the range 0..1) it is important not to use an activation function with a limited output range in the final layer. A linear activation function, for example, should work well.
Here is a simple regression example built from the basic elements of pybrain:
#----------
# build the dataset
#----------
from pybrain.datasets import SupervisedDataSet
import numpy, math
xvalues = numpy.linspace(0,2 * math.pi, 1001)
yvalues = 5 * numpy.sin(xvalues)
ds = SupervisedDataSet(1, 1)
for x, y in zip(xvalues, yvalues):
    ds.addSample((x,), (y,))
#----------
# build the network
#----------
from pybrain.structure import SigmoidLayer, LinearLayer
from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(1,
                   100,  # number of hidden units
                   1,
                   bias = True,
                   hiddenclass = SigmoidLayer,
                   outclass = LinearLayer
                   )
#----------
# train
#----------
from pybrain.supervised.trainers import BackpropTrainer
trainer = BackpropTrainer(net, ds, verbose = True)
trainer.trainUntilConvergence(maxEpochs = 100)
#----------
# evaluate
#----------
import pylab
# neural net approximation
pylab.plot(xvalues,
           [net.activate([x]) for x in xvalues],
           linewidth = 2, color = 'blue', label = 'NN output')
# target function
pylab.plot(xvalues,
           yvalues, linewidth = 2, color = 'red', label = 'target')
pylab.grid()
pylab.legend()
pylab.show()
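If you also want a numeric error measure in addition to the plot, one option is to compute the mean squared error of the trained network; a small sketch reusing net, xvalues and yvalues from the code above:
import numpy
# collect scalar predictions for every input point
predictions = numpy.array([net.activate([x])[0] for x in xvalues])
mse = numpy.mean((predictions - yvalues) ** 2)
print('MSE: %f' % mse)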
A side remark (since your code example has a hidden layer with linear activation functions): in any hidden layer, linear activation functions are not useful, because:
- the weights at the input side to this layer form a linear transformation
- the activation function is linear
- the weights at the output side to this layer form a linear transformation
which can be reduced to one single linear transformation. In other words, the corresponding layer may as well be eliminated without any reduction in the set of functions that can be approximated. An essential point of neural networks is that the activation functions in the hidden layers are non-linear.
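This collapse is easy to verify numerically; here is a small numpy sketch (the shapes are arbitrary) showing that two stacked linear layers compute exactly the same function as one combined linear layer:
import numpy

numpy.random.seed(0)
W1 = numpy.random.randn(5, 3); b1 = numpy.random.randn(5)  # input -> hidden (linear)
W2 = numpy.random.randn(2, 5); b2 = numpy.random.randn(2)  # hidden -> output (linear)
x = numpy.random.randn(3)

two_layers = W2.dot(W1.dot(x) + b1) + b2      # network with a linear hidden layer

W = W2.dot(W1)                                # combined weight matrix
b = W2.dot(b1) + b2                           # combined bias
one_layer = W.dot(x) + b                      # single equivalent linear layer

print(numpy.allclose(two_layers, one_layer))  # True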
Solution 3
I think there could be a couple of things going on here.
First, I'd recommend a different configuration of layer activations. In particular, for starters, try sigmoidal nonlinearities for the hidden layers in your network and linear activations for the output layer. This is by far the most common setup for a typical supervised network and should help you get started.
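Applied to your code, that configuration could look like the following sketch (regressionTrain is the SupervisedDataSet from your question; the hidden-layer size of 5 just mirrors your example):
from pybrain.structure import SigmoidLayer, LinearLayer
from pybrain.tools.shortcuts import buildNetwork

# sigmoid hidden layer, linear output layer
fnn = buildNetwork(regressionTrain.indim,
                   5,  # hidden units, as in the question
                   regressionTrain.outdim,
                   bias = True,
                   hiddenclass = SigmoidLayer,
                   outclass = LinearLayer)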
The second thing that caught my eye is the relatively large value of the weightdecay parameter in your trainer (though what constitutes "relatively large" depends on the natural scale of your input and output values). For starters, I would remove that parameter or set it to 0. Weight decay is a regularizer that helps prevent your network from overfitting, but if you increase that parameter too much, your network weights will all go to 0 very quickly (and then your network's gradient will be basically 0, so learning will halt). Only set weightdecay to a nonzero value if your performance on a validation dataset starts to decrease during training.
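A sketch of the trainer call from your question with the decay removed (fnn and regressionTrain as defined there):
from pybrain.supervised.trainers import BackpropTrainer

# no weight decay until validation error suggests overfitting
trainer = BackpropTrainer(fnn, dataset = regressionTrain,
                          momentum = 0.1, verbose = True,
                          weightdecay = 0.0)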
Comments
- Alberto A (about 1 year ago):
I need to solve a regression problem with a feed forward network and I've been trying to use PyBrain to do it. Since there are no examples of regression in PyBrain's reference, I tried to adapt its classification example for regression instead, but with no success (the classification example can be found here: http://pybrain.org/docs/tutorial/fnn.html). Following is my code:
This first function converts my data, in numpy array form, to a pybrain SupervisedDataSet. I use the SupervisedDataSet because, according to pybrain's reference, it is the dataset to use when the problem is regression. The parameters are an array with the feature vectors (data) and their expected output (values):
def convertDataNeuralNetwork(data, values):
    fulldata = SupervisedDataSet(data.shape[1], 1)
    for d, v in zip(data, values):
        fulldata.addSample(d, v)
    return fulldata
Next is the function to run the regression. train_data and train_values are the training feature vectors and their expected output; test_data and test_values are the test feature vectors and their expected output:
regressionTrain = convertDataNeuralNetwork(train_data, train_values)
regressionTest = convertDataNeuralNetwork(test_data, test_values)

fnn = FeedForwardNetwork()

inLayer = LinearLayer(regressionTrain.indim)
hiddenLayer = LinearLayer(5)
outLayer = GaussianLayer(regressionTrain.outdim)

fnn.addInputModule(inLayer)
fnn.addModule(hiddenLayer)
fnn.addOutputModule(outLayer)

in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)

fnn.addConnection(in_to_hidden)
fnn.addConnection(hidden_to_out)

fnn.sortModules()

trainer = BackpropTrainer(fnn, dataset=regressionTrain, momentum=0.1, verbose=True, weightdecay=0.01)

for i in range(10):
    trainer.trainEpochs(5)
    res = trainer.testOnClassData(dataset=regressionTest)
    print res
When I print res, all its values are 0. I've tried to use the buildNetwork function as a shortcut to build the network, but it didn't work either. I've also tried different kinds of layers and different numbers of nodes in the hidden layer, with no luck.
Does somebody have any idea of what I am doing wrong? Also, some pybrain regression examples would really help! I couldn't find any when I looked.
Thanks in advance
- Ben Allison (over 10 years ago): pretty sure you want the output layer to be linear for regression; you probably also want to use sigmoidal/tanh hidden units
- Andre Holzner: strictly speaking it does not have to be linear, but it must not be an activation whose output is bounded to a range like 0..1. Also, I'm not sure what the purpose of a linear hidden layer is (as in the code posted); this can normally be absorbed into the weights to the next layer.