How to add report_tensor_allocations_upon_oom to RunOptions in Keras
Solution 1
TF1 solution:
It's not as hard as it seems. According to the documentation, the **kwargs passed to model.compile are forwarded to session.run, so you can do something like:
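import tensorflow as tf
# TF1 only: ask TensorFlow to list the allocated tensors when an OOM occurs
run_opts = tf.RunOptions(report_tensor_allocations_upon_oom=True)
# Extra keyword arguments to compile() are forwarded to session.run
model.compile(loss="...", optimizer="...", metrics=["..."], options=run_opts)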
It should then be passed directly each time session.run is called.
TF2:
The solution above works only for TF1. For TF2, unfortunately, there appears to be no easy solution yet.
Solution 2
Currently, it is not possible to add the options to model.compile. See: https://github.com/tensorflow/tensorflow/issues/19911
Solution 3
OOM means out of memory: the model may simply need more memory than is available at that point. Try decreasing batch_size significantly. I set it to 16, and then it worked fine.
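If you don't know in advance which batch size will fit, one option is to retry with progressively smaller batches. Below is a minimal sketch of that idea; fit_with_backoff and train_fn are hypothetical names made up for illustration, not part of Keras or TensorFlow. It uses Python's built-in MemoryError by default so that it runs anywhere; with TensorFlow you would presumably pass tf.errors.ResourceExhaustedError as oom_error instead.

```python
def fit_with_backoff(train_fn, batch_size, oom_error=MemoryError, min_batch_size=1):
    """Call train_fn(batch_size), halving batch_size each time oom_error is raised.

    train_fn is a hypothetical callable that runs training for a given batch
    size, e.g. lambda bs: model.fit(x, y, batch_size=bs).
    Returns (batch_size_that_worked, result_of_train_fn).
    """
    while True:
        try:
            return batch_size, train_fn(batch_size)
        except oom_error:
            # Give up once halving would drop below the allowed minimum.
            if batch_size // 2 < min_batch_size:
                raise
            batch_size //= 2


if __name__ == "__main__":
    # Stand-in for real training: pretend anything above 16 runs out of memory.
    def fake_train(bs):
        if bs > 16:
            raise MemoryError("simulated OOM")
        return "trained"

    print(fit_with_backoff(fake_train, 128))
```

This only helps when memory use actually scales with batch size, and whether a framework recovers cleanly after a real GPU OOM is something you would need to check for your own setup.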
dspeyer
Updated on February 03, 2022
Comments
-
dspeyer over 2 years
I'm trying to train a neural net on a GPU using Keras and am getting a "Resource exhausted: OOM when allocating tensor" error. The specific tensor it's trying to allocate isn't very big, so I assume some previous tensor consumed almost all the VRAM. The error message comes with a hint that suggests this:
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
That sounds good, but how do I do it? RunOptions appears to be a Tensorflow thing, and what little documentation I can find for it associates it with a "session". I'm using Keras, so Tensorflow is hidden under a layer of abstraction and its sessions under another layer below that.
How do I dig underneath everything to set this option in such a way that it will take effect?
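As a sanity check on whether the tensor named in the error really is small, you can estimate its size from the shape and dtype printed in the OOM message. This is a rough sketch only: tensor_bytes and the DTYPE_BYTES table are hypothetical helpers written for this illustration, the example shape is made up, and the estimate ignores any allocator overhead or padding.

```python
import math

# Bytes per element for a few common dtypes (illustrative, not exhaustive).
DTYPE_BYTES = {"float16": 2, "float32": 4, "float64": 8, "int32": 4, "int64": 8}


def tensor_bytes(shape, dtype="float32"):
    """Return the raw size in bytes of a dense tensor with this shape and dtype."""
    return math.prod(shape) * DTYPE_BYTES[dtype]


if __name__ == "__main__":
    # Made-up example: a batch of 32 feature maps of 224x224 with 64 channels.
    size = tensor_bytes((32, 224, 224, 64), "float32")
    print(f"{size / 2**20:.1f} MiB")
```

If the result is small compared with the card's VRAM, that supports the guess that earlier allocations used up most of the memory.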
-
dspeyer about 6 years
I used options=run_opts, since it's a kwargs thing, and that worked
-
Amila almost 6 years
@Matias Valdenegro I get
ValueError: ('Some keys in session_kwargs are not supported at this time: %s', dict_keys(['options']))
Any idea what I'm doing wrong?
-
Enea Dume over 5 years
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes.
-
zwep over 5 years
I had this exact same issue, using Keras version 2.2.4. Is there any solution?
-
zwep over 5 years
Also, I received this error: 'Protocol message RunOptions has no "report_tensor_allocations_upon_oom" field.'
-
Dan Grahn about 5 years
@zwep this was resolved in 2.2.4. Are you sure you've updated?
-
Zaccharie Ramzi almost 5 years
This caused a segmentation fault for me, for some reason:
[1] 3957 segmentation fault python oom_net.py
-
Zaccharie Ramzi almost 5 years
Apparently I was not the only one with a segfault: github.com/keras-team/keras/issues/11322
-
Adam Azarchs about 3 years
Whether that will work, and what batch size is appropriate, will depend entirely on the model in question, as well as the dataset. If one is attempting to debug a memory issue that doesn't depend on batch size, this doesn't help at all.