Keras neural network outputs same result for every input


Solution 1

I had the very same problem.

I would suggest reducing the learning rate for SGD. In my case I was using the Adam optimizer with lr=0.001, and changing it to 0.0001 solved the problem.

Default parameters for SGD are:

keras.optimizers.SGD(lr=0.01, momentum=0.0, decay=0.0, nesterov=False)
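
For example, a minimal sketch of the fix, assuming the model from the question: recompile with a smaller learning rate. (The lr argument matches older Keras versions; newer releases call it learning_rate.)

from keras.optimizers import SGD, Adam

# SGD with a reduced learning rate
model.compile(loss='mse', optimizer=SGD(lr=0.001), metrics=['accuracy'])

# or, as in my case, Adam with lr reduced from 0.001 to 0.0001
# model.compile(loss='mse', optimizer=Adam(lr=0.0001), metrics=['accuracy'])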

Solution 2

The output is essentially a multi-label classification, so I would recommend the following (a sketch applying points 1-3 follows the list):

  1. Change the loss function to binary_crossentropy.
  2. Keep sigmoid on the last layer and change the others; relu is a good choice for the hidden layers.
  3. Add validation data to your "fit" call and increase the verbosity. This will let you see how the network changes over the epochs and, in particular, when it over- or underfits.
  4. Add depth to the network until you overfit.
  5. Add regularization to the network until you no longer overfit.
  6. Repeat steps 4 and 5.
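
A minimal sketch of points 1-3 applied to the model from the question (the old-style init and nb_epoch arguments are kept to match the Keras version used there, and the validation_split value is just an illustrative choice):

model = Sequential()
model.add(Dense(dim, input_dim=dim, init='normal', activation='relu'))
model.add(Dense(dim, init='normal', activation='relu'))
model.add(Dense(dim, init='normal', activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='SGD', metrics=['accuracy'])
# hold out 20% of the data for validation and print progress every epoch
model.fit(X, y, nb_epoch=1000, batch_size=50, validation_split=0.2, verbose=1)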

Solution 3

If you have tried all of the above and it still does not work, it means you are trying to fit noise: there is no connection/correlation/relevance between your inputs and outputs.

Author: Chack Rodríguez

Updated on June 09, 2022

Comments

  • Chack Rodríguez, almost 2 years ago

    I tried to implement a feedforward neural network.

    This is the structure: an input layer with 8 neurons, a hidden layer with 8 neurons, and an output layer with 8 neurons.

    The inputs are vectors of 8 bits (one bit per neuron of the input layer), and the outputs are also 8-bit vectors, so the dataset has 256 examples in total.

    Example: if given x = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0]

    the output must be y = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0]

    This is the implementation:

    from keras.models import Sequential
    from keras.layers import Dense
    import numpy as np
    import random
    from math import ceil
    
    #Dimension of layers
    dim = 8
    
    #Generate dataset
    X = []
    for i in range(0,2**dim):
        n = [float(x) for x in bin(i)[2:]]
        X.append([0.]*(dim-len(n))+n)
    y = X[:]
    random.shuffle(y)
    X = np.array(X)
    y = np.array(y)
    
    # create model
    model = Sequential()
    model.add(Dense(dim, input_dim=dim, init='normal', activation='sigmoid'))
    model.add(Dense(dim, init='normal', activation='sigmoid'))
    model.add(Dense(dim, init='normal', activation='sigmoid'))
    
    # Compile model
    model.compile(loss='mse', optimizer='SGD', metrics=['accuracy'])
    # Fit the model
    model.fit(X, y, nb_epoch=1000, batch_size=50, verbose=0)
    # evaluate the model
    scores = model.evaluate(X, y)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    output = model.predict(X)
    
    #Make the output binary
    for i in range(0, output[:,0].size):
        for j in range(0, output[0].size):
            if output[i][j] >= 0.5:
                output[i][j] = 1
            else:
                output[i][j] = 0
    print(output)
    

    This is what I get in output:

    acc: 50.39%
    [[ 1.  0.  0. ...,  0.  1.  1.]
    [ 1.  0.  0. ...,  0.  1.  1.]
    [ 1.  0.  0. ...,  0.  1.  1.]
    ..., 
    [ 1.  0.  0. ...,  0.  1.  1.]
    [ 1.  0.  0. ...,  0.  1.  1.]
    [ 1.  0.  0. ...,  0.  1.  1.]]
    

    It seems that all outputs have the same value, so I don't know what is wrong with the configuration. I tried the suggestion from Cannot train a neural network in keras - stackoverflow, which is to remove the activation function at the output layer, but when I run that, every output vector has this value:

    [ 0. 1. 1. ..., 1. 1. 1.]

    Any insights on how to make it work?

  • xtluo, over 5 years ago
    This solved my problem, where my CNN was producing the same probability distribution for every input... I was confused!