How to calculate confidence score of a Neural Network prediction

tensorflow machine-learning keras confidence-interval uncertainty

11,917

The softmax output is a problematic way to estimate the confidence of the model's prediction: the raw scores of modern networks are often overconfident.

There are a few recent papers about this topic.

Search for "calibration" of neural networks to find relevant papers.

One example you can start with is "On Calibration of Modern Neural Networks": https://arxiv.org/pdf/1706.04599.pdf
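To check whether a model's scores can be trusted as confidences, one common diagnostic from the calibration literature is the expected calibration error (ECE): bin held-out predictions by confidence and compare the mean confidence in each bin with the accuracy in that bin. The following is a minimal sketch for a binary classifier with a single sigmoid output, not a definitive implementation. The function name, the 10 equal-width bins, and the 0.5 decision threshold are assumptions made for illustration.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between confidence and accuracy over confidence bins.

    probs:  predicted P(class 1) per sample, e.g. a sigmoid output
    labels: true classes (0 or 1)
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        predicted = 1 if p >= 0.5 else 0          # assumed decision threshold
        confidence = p if predicted == 1 else 1.0 - p
        # Equal-width bins over [0, 1]; a confidence of 1.0 goes in the last bin
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, predicted == y))

    total = sum(len(b) for b in bins)
    ece = 0.0
    for b in bins:
        if not b:
            continue  # empty bins contribute nothing
        mean_confidence = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, hit in b if hit) / len(b)
        ece += len(b) / total * abs(mean_confidence - accuracy)
    return ece


# Hypothetical held-out scores: the model is 90% confident on every sample
# but only 4 of the 5 predictions are correct.
probs = [0.9, 0.9, 0.9, 0.9, 0.1]
labels = [1, 1, 1, 0, 0]
print(round(expected_calibration_error(probs, labels), 3))  # 0.1
```

A value near 0 suggests the scores roughly match the observed accuracy; a large value suggests the scores should be recalibrated on a validation set (the linked paper proposes temperature scaling for this) before being reported as confidences.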

Author by

yamini goel

Updated on July 16, 2022

Comments

yamini goel almost 2 years
I am using a deep neural network model (implemented in Keras) to make predictions. Something like this:
```
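import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Lambda, SimpleRNN

def make_model():
    model = Sequential()
    model.add(Conv2D(20, (5, 5), activation="relu"))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(20, activation="relu"))
    # Add a time axis of length 1 so the RNN layer receives (batch, 1, 20)
    model.add(Lambda(lambda x: tf.expand_dims(x, axis=1)))
    model.add(SimpleRNN(50, activation="relu"))
    model.add(Dense(1, activation="sigmoid"))
    # The optimizer is given by name here; the original snippet did not define `adagrad`
    model.compile(loss="binary_crossentropy", optimizer="adagrad", metrics=["accuracy"])
    return model

model = make_model()
model.fit(x_train, y_train, validation_data=(x_validation, y_validation), epochs=25, batch_size=25, verbose=1)

# Prediction (predict_classes and predict_proba are not available in recent TensorFlow releases):
probabilities = model.predict(x)  # sigmoid output; I assume this is the probability of class 1
prediction = (probabilities > 0.5).astype("int32")  # predicted class per sample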
```
My problem is a binary classification problem. I wish to calculate the confidence score of each of these predictions, i.e. I wish to know: is my model 99% certain it is "0", or is it only 58% certain it is "0"?

I have found some views on how to do it, but can't implement them. The approach I wish to follow says: "With classifiers, when you output you can interpret values as the probability of belonging to each specific class. You can use their distribution as a rough measure of how confident you are that an observation belongs to that class."

How should I predict with something like the above model so that I get its confidence about each prediction? I would appreciate some practical examples (preferably in Keras).
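For a single sigmoid output like the one in the model above, the quoted approach could be sketched as follows: the output is read as the estimated probability of class 1, so the confidence in the predicted class is that value when class 1 is predicted and one minus it otherwise. This is only a rough sketch. The helper name `label_with_confidence`, the 0.5 threshold, and the sample scores are hypothetical, and the resulting number is the raw model score, which may be miscalibrated as the answer above points out.

```python
def label_with_confidence(p_class1, threshold=0.5):
    """Turn a sigmoid output into (predicted class, confidence in that class)."""
    if p_class1 >= threshold:
        return 1, p_class1
    return 0, 1.0 - p_class1


# Hypothetical sigmoid outputs standing in for model.predict(x).ravel()
probabilities = [0.01, 0.42, 0.93]
for p in probabilities:
    label, confidence = label_with_confidence(p)
    print(f"class {label} with confidence {confidence:.0%}")
# class 0 with confidence 99%
# class 0 with confidence 58%
# class 1 with confidence 93%
```

With a real model, the list would be replaced by the flattened output of `model.predict(x)`.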