Keras neural network outputs same result for every input
Solution 1
I had the very same problem.
I would suggest reducing the learning rate for SGD. In my case I had used the Adam optimizer with lr=0.001, but changing it to 0.0001 solved the problem.
The default parameters for SGD are:
keras.optimizers.SGD(lr=0.01, momentum=0.0, decay=0.0, nesterov=False)
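For instance, a minimal sketch of passing an explicit optimizer with a lower learning rate, assuming model is the Sequential network from the question below and the Keras 1.x lr argument it uses (newer Keras versions spell it learning_rate):

from keras.optimizers import SGD

# Build an explicit optimizer instance instead of the string 'SGD',
# overriding the default lr=0.01 with a smaller value
sgd = SGD(lr=0.001)
model.compile(loss='mse', optimizer=sgd, metrics=['accuracy'])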
Solution 2
The output is similar to multi-label classification, so I would recommend:
1. Change the loss function to binary_crossentropy
2. Keep sigmoid as the activation of the last layer and change the others - relu is a good choice
3. Add validation to your "fit" call and increase verbosity - this lets you see how the network changes over the epochs, and especially when it over/underfits (see the sketch after this list)
4. Add depth to the network until you overfit
5. Add regularization to your network until you no longer overfit
6. Repeat 4 and 5
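A minimal sketch of points 1-3, assuming the 8-bit X and y arrays from the question below; the hidden-layer width of 16, the adam optimizer, and the 0.2 validation split are illustrative choices, and the newer epochs argument replaces the asker's nb_epoch:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(16, input_dim=8, activation='relu'))  # relu in the hidden layers
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='sigmoid'))             # sigmoid only on the output

# binary_crossentropy treats each of the 8 output bits as an independent label
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# validation_split holds out 20% of the data; verbose=1 prints per-epoch
# train/validation loss, making over/underfitting visible as training runs
model.fit(X, y, epochs=1000, batch_size=50, validation_split=0.2, verbose=1)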
Solution 3
If you have tried all of the above and it still does not work, it means you are trying to fit noise: there is no connection/correlation/relevance between your inputs and outputs.
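One way to check this (a memorization sanity test of my own, not part of the original answer) is to train hard on a handful of fixed examples: if the network cannot even memorize them, the model or training setup is at fault rather than the data; if it memorizes them but never fits the full set, the targets may indeed be noise. A sketch, assuming the model, X, and y from the question below, and the newer epochs argument:

# Try to drive the training loss toward 0 on just 8 fixed samples
X_small, y_small = X[:8], y[:8]
history = model.fit(X_small, y_small, epochs=2000, batch_size=8, verbose=0)
print("final loss on 8 samples:", history.history['loss'][-1])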
Chack Rodríguez
Updated on June 09, 2022

Comments
-
Chack Rodríguez, almost 2 years ago
I tried to implement a feedforward neural network.
This is the structure: an input layer of 8 neurons, a hidden layer of 8 neurons, and an output layer of 8 neurons.
The input data are 8-bit vectors (one bit per input neuron), and the outputs of the network are also 8-bit vectors, so the dataset has 256 examples in total.
Example: if given x = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
the output must be y = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0]
This is the implementation:
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import random
from math import ceil

# Dimension of layers
dim = 8

# Generate dataset: all 2**dim binary vectors, zero-padded to dim bits
X = []
for i in range(0, 2**dim):
    n = [float(x) for x in bin(i)[2:]]
    X.append([0.]*(dim-len(n)) + n)
y = X[:]
random.shuffle(y)
X = np.array(X)
y = np.array(y)

# create model
model = Sequential()
model.add(Dense(dim, input_dim=dim, init='normal', activation='sigmoid'))
model.add(Dense(dim, init='normal', activation='sigmoid'))
model.add(Dense(dim, init='normal', activation='sigmoid'))

# Compile model
model.compile(loss='mse', optimizer='SGD', metrics=['accuracy'])

# Fit the model
model.fit(X, y, nb_epoch=1000, batch_size=50, verbose=0)

# evaluate the model
scores = model.evaluate(X, y)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

output = model.predict(X)
# Threshold the output at 0.5 to make it binary
for i in range(0, output[:,0].size):
    for j in range(0, output[0].size):
        if output[i][j] >= 0.5:
            output[i][j] = 1
        else:
            output[i][j] = 0
print(output)
This is what I get in output:
acc: 50.39%
[[ 1.  0.  0. ...,  0.  1.  1.]
 [ 1.  0.  0. ...,  0.  1.  1.]
 [ 1.  0.  0. ...,  0.  1.  1.]
 ...,
 [ 1.  0.  0. ...,  0.  1.  1.]
 [ 1.  0.  0. ...,  0.  1.  1.]
 [ 1.  0.  0. ...,  0.  1.  1.]]
It seems that all outputs have the same value, so I don't know what is wrong with the configuration. I tried the suggestion from Cannot train a neural network in keras - stackoverflow, which is to remove the activation function from the output layer, but when I run that, every output vector has this value:
[ 0. 1. 1. ..., 1. 1. 1.]
Any insights on how to make it work?
-
xtluo, over 5 years ago: This solved my problem, which was that my CNN produced the same probability distribution for all inputs... confused!