Keras LSTM - why different results with "same" model & same weights?

Solution 1

Machine learning algorithms are in general non-deterministic: every time you run them the outcome can vary, largely because of the random initialization of the weights. If you want to make the results reproducible you have to take the randomness out of the equation. A simple way to do this is to use a random seed.

import numpy as np
import tensorflow as tf

np.random.seed(1234)      # seed NumPy's global RNG
tf.random.set_seed(1234)  # seed TensorFlow's global RNG

# rest of your code

If you want to keep the randomness but reduce the variance of your output, I would suggest either lowering your learning rate or changing your optimizer (an SGD optimizer with a relatively low learning rate, for example). A cool overview of gradient descent optimization is available here!
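For instance, re-compiling with plain SGD could look like this (a minimal sketch; model refers to the model built in the question, and the learning rate value is purely illustrative):

from tensorflow.keras.optimizers import SGD

# plain SGD with a small learning rate tends to give less run-to-run
# variance than adaptive optimizers with aggressive default step sizes
model.compile(loss="mse", optimizer=SGD(learning_rate=0.001))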


A note on TensorFlow's random generators: besides the global seed (i.e. tf.random.set_seed()), they also use an internal counter, so if you run

tf.random.set_seed(1234)
print(tf.random.uniform([1]).numpy())
print(tf.random.uniform([1]).numpy())

You'll get 0.5380393 and 0.3253647, respectively. However, if you re-run that same snippet, you'll get the same two numbers again.
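Re-setting the global seed resets that internal counter too, so you can reproduce the sequence within a single run (a small sketch of the behaviour described above):

import tensorflow as tf

tf.random.set_seed(1234)
a = tf.random.uniform([1]).numpy()  # 0.5380393
tf.random.set_seed(1234)            # resets the internal counter as well
b = tf.random.uniform([1]).numpy()  # 0.5380393 again, equal to a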

A detailed explanation of how random seeds work in TensorFlow can be found here.


For newer TF versions, take care of this too: TensorFlow 2.2 ships with an environment variable, TF_DETERMINISTIC_OPS, which, if set to '1', ensures that only deterministic GPU ops are used.

Solution 2

This code is for Keras using the TensorFlow backend.

This is because the weights are initialised using random numbers, hence you will get different results every time. This is expected behaviour. To have reproducible results you need to set the random seeds. The example below sets operation-level and graph-level seeds; for more information look here.

import os
import random as rn

import numpy as np
import tensorflow as tf

os.environ['PYTHONHASHSEED'] = '0'

# Setting the seed for numpy-generated random numbers
np.random.seed(37)

# Setting the seed for python random numbers
rn.seed(1254)

# Setting the graph-level random seed
tf.set_random_seed(89)

from keras import backend as K

# Force TensorFlow to use a single thread; multi-threaded ops are a
# potential source of non-reproducible results
session_conf = tf.ConfigProto(
      intra_op_parallelism_threads=1,
      inter_op_parallelism_threads=1)

sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)

K.set_session(sess)

# Rest of the code follows from here on ...
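Note that tf.set_random_seed, tf.ConfigProto and tf.Session are TensorFlow 1.x APIs; on TF 2.x they raise an AttributeError (as a comment below points out). A roughly equivalent TF 2.x setup would be this sketch:

import os
os.environ['PYTHONHASHSEED'] = '0'

import random as rn
import numpy as np
import tensorflow as tf

np.random.seed(37)
rn.seed(1254)
tf.random.set_seed(89)  # TF 2.x replacement for tf.set_random_seed

# TF 2.x replacement for the ConfigProto/Session single-thread setup
tf.config.threading.set_intra_op_parallelism_threads(1)
tf.config.threading.set_inter_op_parallelism_threads(1)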

Solution 3

I resolved this issue by adding os.environ['TF_DETERMINISTIC_OPS'] = '1'

Here is an example:

import os
os.environ['TF_DETERMINISTIC_OPS'] = '1'

# rest of the code
# tf version 2.3.1
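For what it's worth, newer TensorFlow releases (2.8 and later) also expose this switch as an API call; a minimal sketch:

import tensorflow as tf

# roughly equivalent to setting TF_DETERMINISTIC_OPS='1' on TF >= 2.8
tf.config.experimental.enable_op_determinism()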

Comments

  • NeuronQ
    NeuronQ over 1 year

    (NOTE: Properly fixing the RNG state before each model creation, as described in the comments below, practically fixed my problem: within 3 decimals the results are consistent, but not exactly so, which means there's a hidden source of randomness somewhere that isn't fixed by seeding the RNG... probably some lib uses time in milliseconds or smth... if anyone has an idea on that, it would be cool to know, so I will wait and not close the question yet :) )

    I create a Keras LSTM model (used to predict some time series data, not important what), and every time I try to re-create an identical model (same model config loaded from JSON, same weights loaded from file, same args to the compile function), I get wildly different results on the same train and test data. WHY?

    Code is roughly like this:

    # fix random
    import random
    random.seed(42)

    # imports used by the snippet below
    from keras.models import Sequential, model_from_json
    from keras.layers import LSTM, Dense, Activation
    
    # make model & compile
    model = Sequential([
        LSTM(50, input_shape=(None, 1), return_sequences=True),
        LSTM(100, return_sequences=False),
        Dense(1),
        Activation("linear")
    ])
    model.compile(loss="mse", optimizer="rmsprop")
    
    # save it and its initial random weights
    model_json = model.to_json()
    model.save_weights("model.h5")
    
    # fit and predict
    model.fit(x_train, y_train, epochs=3)
    r = model.predict(x_test)
    
    # create new "identical" model
    model2 = model_from_json(model_json)
    model2.load_weights("model.h5")
    model2.compile(loss="mse", optimizer="rmsprop")
    
    # fit and predict "identical" model
    model2.fit(x_train, y_train, epochs=3)
    r2 = model2.predict(x_test)
    
    # ...different results :(
    

    I know that the model has initial random weights, so I'm saving them and reloading them. I'm also paranoid enough to assume there are some "hidden" params that I may not know of, so I serialize the model to JSON and reload it instead of recreating an identical one by hand (tried that, same thing btw). And I also fixed the random number generator.

    It's my first time with Keras, and I'm also a beginner to neural networks in general. But this drives me crazy... wtf can vary?!


    On fixing random number generators: I run Keras with the TensorFlow backend, and I have these lines of code at the start to try and fix the RNGs for experimental purposes:

    import random
    random.seed(42)
    import numpy
    numpy.random.seed(42)
    from tensorflow import set_random_seed
    set_random_seed(42)
    

    ...but they still don't fix the randomness.

    And I understand that the goal is to make my model behave non-randomly despite the inherent stochastic nature of NNs. But I need to temporarily fix this for experimental purposes (I'm even OK with it being reproducible on one machine only!).

    • senderle
      senderle over 6 years
      I am not sure how this could affect the results, but you haven't "fixed" the random number generator for the second model. You'd need to start it again from the same state (seed=42), and you'd need to run exactly the same set of calls to the generator the second time. Furthermore, you don't know how Keras is getting its random numbers! It's likely, in fact, that it's not getting them from the random module. It might not even be getting them from numpy either, as the answer below assumes.
    • Miriam Farber
      Miriam Farber over 6 years
      You should specify the seed differently if you want to get consistent results. Depending on the keras backend (theano or tensorflow), there are two ways to specify the random seed. See here: stackoverflow.com/questions/45970112/…
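      Sketched out, the two variants look roughly like this (TF 1.x-era APIs, as used elsewhere in this thread):

      # Theano backend: fixing numpy's seed covers Keras's weight initialization
      import numpy as np
      np.random.seed(42)

      # TensorFlow backend: additionally seed TF's graph-level RNG
      from tensorflow import set_random_seed
      set_random_seed(42)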
    • NeuronQ
      NeuronQ over 6 years
      @senderle THIS. I didn't realize that of course the RNG state changes when it's used, so I don't only need to fix it at the beginning, but also to re-fix it before making model2 ...guess it's "Friday fried brain" :) This almost fixed my problem, in the sense that there is still randomness, but wrt the first 3 decimals it's reproducible (I imagine some library dependency has its own hidden randomish thingy). This is good enough so I can distinguish "actual variance" (different predictions on only slightly different training data) from "model randomness", and can start to work on fixing the first! Thx! A sketch of that fix follows below.
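      # ... after fitting and predicting with the first model ...
      random.seed(42)
      numpy.random.seed(42)
      set_random_seed(42)  # reset all RNGs to the state the first model started from

      model2 = model_from_json(model_json)
      model2.load_weights("model.h5")
      # ... rest as in the question ...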
    • SomePhysicsStudent
      SomePhysicsStudent almost 4 years
      Thanks for that! I found out that TensorFlow 2.2 ships with an environment variable, TF_DETERMINISTIC_OPS, which if set to '1' will ensure that only deterministic GPU ops are used. Setting it to '1' fixed most of my GPU non-determinism, except for a few TensorFlow ops that I ended up leaving on the CPU.
  • Miriam Farber
    Miriam Farber over 6 years
    This will work if Keras uses the Theano backend. For the TensorFlow backend you need to specify the seed with set_random_seed as well. See here: stackoverflow.com/questions/45970112/…
  • NeuronQ
    NeuronQ over 6 years
    ok, so first thanks for reminding me of numpy.random.seed, I'd only set random.seed, but... it doesn't change anything :| Yeah, reducing overall variance in the output is the actual goal, but right now I just want to understand why tf this is happening?! Like, I fixed everything, I should get the same results on the same data! After that yeah, there's a lot to optimize/tweak... but first things first: sanity and reproducibility are what I want.
  • Miriam Farber
    Miriam Farber over 6 years
    @NeuronQ Is your backend Theano? If yes, it seems to me that this should be related somehow to the way you are saving and loading your weights, or the way you are serializing. I tried to run your model after adding np.random.seed(42), using the data x_train=np.reshape(np.arange(10),(10,1,1)), x_test=x_train[:], y_train=2*np.arange(10), and got exactly the same results each time I ran it. This means that the model itself is reproducible, when ignoring all the "loading the data / weights" part.
  • NeuronQ
    NeuronQ over 6 years
    @MiriamFarber tried that too, still random. I'll probably get on with trying to tweak the optimizer and other parameters to reduce variability... but I'm still incredibly annoyed that somewhere I have an extra source of randomness that I don't know of! ...I understand NNs are stochastic in nature, but I want to know what each source of randomness is, and to be able to fix it at least on one machine when running experiments.
  • NeuronQ
    NeuronQ over 6 years
    @MiriamFarber backend is TF
  • Miriam Farber
    Miriam Farber over 6 years
    @NeuronQ In such a case, add the following 4 lines to the top of your code: from numpy.random import seed; seed(1); from tensorflow import set_random_seed; set_random_seed(2)
  • NeuronQ
    NeuronQ over 6 years
    @MiriamFarber did that, also edited the question to mention I did it. Didn't work. Might be a bug that it ignores the seed setting...
  • NeuronQ
    NeuronQ over 6 years
    @Djib2011 sorry to bother you, but after coming back to this problem, after solving the initial randomness issues, I came across a weird finding: lowering the learning rate helps a lot in reducing variance... but not with SGD! (other optimizers like rmsprop work great). Do you happen to know in what cases SGD performs exceptionally badly for an RNN? Thanks!
  • Djib2011
    Djib2011 over 6 years
    I'm not sure. If I were to guess, it has something to do with the mini-batches used for calculating the derivative. With SGD you don't move in the best direction for your entire input data, but just for a batch. It is supposed to take a lot more epochs to converge than normal GD, but each epoch is calculated a lot faster, so overall SGD is faster than normal GD. I guess that this adds more variance to your system... I'd recommend this if you're interested in reading more on optimization algorithms: ruder.io/optimizing-gradient-descent
  • Brendano257
    Brendano257 almost 3 years
    I'm curious if you can expand on why setting intra_op_parallelism IS necessary; I've not seen this elsewhere in recommendations for removing as much randomization as possible. Are there operations in TF that have been shown to produce random results based on thread-dependent behavior that can't be predicted, or is this just to make absolutely sure? Others have recommended removing GPU support (not an option for a lot of people, I think), but it sounds like from TF 2.2 forward the environment variable "TF_DETERMINISTIC_OPS" = '1' should handle that.
  • keramat
    keramat almost 3 years
    AttributeError: module 'tensorflow' has no attribute 'set_random_seed'