How to plot a learning curve for a Keras experiment?

Solution 1

To get accuracy values, you need to request that they are calculated during fit, because accuracy is not an objective function, but a (common) metric. Sometimes calculating accuracy does not make sense, so it is not enabled by default in Keras. However, it is a built-in metric, and easy to add.

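To add the metric, pass metrics=['accuracy'] to model.compile.

In your example, that means compiling with the metric before calling fit:

model.compile(loss='categorical_crossentropy', optimizer='adam',  # placeholders: keep your own loss and optimizer
              metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size = 512,
          nb_epoch = 5, validation_split = 0.05)  # nb_epoch is called epochs in Keras 2+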

You can then access validation accuracy as history.history['val_acc']
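
The key name depends on the Keras version: older releases record `val_acc`, while newer ones record `val_accuracy` when the metric is requested as 'accuracy'. Below is a minimal sketch of a lookup that tolerates both spellings. `validation_accuracy` is a hypothetical helper, not part of Keras, and `example_history` is a made-up stand-in for `history.history`.

```python
def validation_accuracy(history_dict):
    """Return the per-epoch validation accuracy list from a Keras-style history dict."""
    # Try the newer key name first, then the older one.
    for key in ("val_accuracy", "val_acc"):
        if key in history_dict:
            return history_dict[key]
    raise KeyError(
        "no validation accuracy recorded; pass metrics=['accuracy'] to "
        "model.compile and provide validation data to model.fit"
    )


# Stand-in for history.history (values are invented for illustration).
example_history = {
    "loss": [0.97, 0.67, 0.63],
    "val_loss": [0.84, 0.76, 0.74],
    "acc": [0.55, 0.68, 0.71],
    "val_acc": [0.60, 0.66, 0.69],
}

val_acc = validation_accuracy(example_history)
print(val_acc[-1])  # final-epoch validation accuracy
```

With a real model you would call `validation_accuracy(history.history)` after `fit` returns. The explicit KeyError makes the situation in the question (only 'loss' and 'val_loss' present) easier to diagnose.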

Solution 2

The History object is created while the model is being fitted with fit(). See keras/engine/training.py for details.

You can access the history using the history attribute on the model: model.history.

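After fitting the model, you can average over the values stored in its history dictionary:

import numpy as np
np.mean(model.history.history['val_acc'])  # model.history is a History object; the lists live in its .history dict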

Note that the pattern is val_<your output name here> for every output you specify.

Solution 3

Why do you find the average accuracy more important than the final accuracy? Depending on your initial values, your average might be quite misleading. It's easy to come up with different curves that have the same average but different interpretations.
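
To make that concrete, here is a small sketch with three invented accuracy curves (in percent, one value per epoch). The numbers are made up purely for illustration and do not come from any real training run.

```python
from statistics import mean

# Three hypothetical validation-accuracy curves, in percent per epoch.
curves = {
    "improving": [50, 60, 70, 80, 90],
    "flat": [70, 70, 70, 70, 70],
    "degrading": [90, 80, 70, 60, 50],
}

for name, curve in curves.items():
    # All three share the same mean, but the final values differ widely.
    print(f"{name}: mean={mean(curve)}, final={curve[-1]}")
```

All three curves average to 70, yet they end at 90, 70, and 50 respectively, so the mean alone cannot tell a model that is still learning from one that is getting worse.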

I'd just plot the complete history of train_acc and val_acc to decide whether the RNN is performing well within the given setup. Also, don't forget to use a sample size N > 1. Random initialization can have a big impact on RNNs, so take at least N=10 different initializations for each setup to make sure that a difference in performance is actually caused by your set size and not by better or worse initializations.
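
One possible way to combine repeated runs into a learning curve over data set size is sketched below. It assumes you have already collected the per-epoch validation accuracy list of every run, grouped by training-set size. `summarize_final_accuracy` and `plot_learning_curve` are hypothetical helpers, the numbers are invented, and the plotting part assumes matplotlib is installed.

```python
from statistics import mean, stdev


def summarize_final_accuracy(runs_by_size):
    """Map each set size to (mean, std) of the final validation accuracy over runs.

    runs_by_size: {set_size: [val_acc_list_run_1, val_acc_list_run_2, ...]}
    """
    summary = {}
    for size, runs in sorted(runs_by_size.items()):
        finals = [run[-1] for run in runs]  # last epoch of each run
        spread = stdev(finals) if len(finals) > 1 else 0.0
        summary[size] = (mean(finals), spread)
    return summary


def plot_learning_curve(summary, path="learning_curve.png"):
    """Plot mean final validation accuracy against set size, with error bars."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    sizes = list(summary)
    means = [summary[s][0] for s in sizes]
    stds = [summary[s][1] for s in sizes]
    plt.errorbar(sizes, means, yerr=stds, marker="o", capsize=3)
    plt.xlabel("training set size")
    plt.ylabel("final validation accuracy")
    plt.savefig(path)
    plt.close()


# Invented numbers, 3 runs per size for brevity (N=10 or more is suggested above).
runs_by_size = {
    500: [[0.50, 0.58], [0.52, 0.62], [0.49, 0.60]],
    1000: [[0.55, 0.68], [0.57, 0.70], [0.54, 0.72]],
}
summary = summarize_final_accuracy(runs_by_size)
print(summary)
```

Calling `plot_learning_curve(summary)` then writes the figure to `learning_curve.png`. The error bars show the spread across initializations, which helps judge whether a difference between set sizes is larger than the run-to-run noise.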

Author: akilat90 (real name Akila Thalgahagoda)

Updated on June 05, 2022

Comments

  • akilat90
    akilat90 almost 2 years

    I'm training an RNN using keras and would like to see how the validation accuracy changes with the data set size. Keras has a list called val_acc in its history object which gets appended after every epoch with the respective validation set accuracy (link to the post in google group). I want to get the average of val_acc for the number of epochs run and plot that against the respective data set size.

    Question: How can I retrieve the elements in the val_acc list and perform an operation like numpy.mean(val_acc)?


    EDIT: As @runDOSrun said, getting the mean of the val_accs doesn't make sense. Let me focus on getting the final val_acc.

    I tried what's been suggested by @nemo but no luck. Here's what I got when I print

    model.fit(X_train, y_train, batch_size = 512, nb_epoch = 5, validation_split = 0.05).__dict__

    output:

    {'model': <keras.models.Sequential object at 0x000000001F752A90>, 'params': {'verbose': 1, 'nb_epoch': 5, 'batch_size': 512, 'metrics': ['loss', 'val_loss'], 'nb_sample': 1710, 'do_validation': True}, 'epoch': [0, 1, 2, 3, 4], 'history': {'loss': [0.96936064512408959, 0.66933631673890948, 0.63404161288724303, 0.62268789783555867, 0.60833334699708819], 'val_loss': [0.84040999412536621, 0.75676006078720093, 0.73714292049407959, 0.71032363176345825, 0.71341043710708618]}}
    

    It turns out there's no val_acc list in my history dictionary.

    Question: How can I include val_acc in the history dictionary?

  • akilat90
    akilat90 almost 8 years
    Thanks for making me think about it. Yes, I was wrong and was ignoring the fact that the network gets better with each epoch. In that sense, like you said, getting the final `val_acc` would suffice.
  • akilat90
    akilat90 almost 8 years
    Thanks for the answer @nemo, Can you have a look at my recent edit?
  • Casimir
    Casimir over 5 years
    Shouldn't this throw an error? According to the docs, metrics need to be specified in model.compile() rather than model.fit().
  • Neil Slater
    Neil Slater over 5 years
    Good catch, either I got this wrong at the time, or it has changed. I will edit now anyway to match the docs.