Function call stack: keras_scratch_graph Error
Solution 1
In my case, the TensorFlow sample code worked fine in Google Colab but failed on my machine with the keras_scratch_graph error.
After I added this Python code at the beginning of the script, it worked:
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Restrict TensorFlow to only use the first GPU
        tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process.
In some cases it is desirable for the process to only allocate a subset of the available memory, or to only grow the memory usage as is needed by the process.
For example, you may want to train multiple small models on one GPU at the same time.
One way to do this is to call tf.config.experimental.set_memory_growth, which attempts to allocate only as much GPU memory as is needed for the runtime allocations: it starts out allocating very little memory, and as the program runs and more GPU memory is needed, the GPU memory region allocated to the TensorFlow process is extended.
Hope it helps!
Solution 2
I was getting a similar error. I reduced the batch size and the error disappeared. I don't know why, but it worked for me. I am guessing it is something related to over-stacking.
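If the cause really is memory pressure, the batch-size reduction can be automated. What follows is only a minimal sketch: `fit_with_backoff`, `train_fn` and `fake_train` are hypothetical names, not TensorFlow APIs, and `train_fn(batch_size)` is assumed to wrap your own `model.fit` call. In a real TensorFlow run you would probably pass `tf.errors.ResourceExhaustedError` as `retry_on`; `MemoryError` is used here only so the sketch runs without a GPU.

```python
def fit_with_backoff(train_fn, batch_size, min_batch_size=1, retry_on=(MemoryError,)):
    """Call train_fn(batch_size), halving batch_size after each failure in retry_on.

    train_fn is a hypothetical callable wrapping your own training call.
    Returns a tuple (result, batch_size_that_worked).
    """
    while True:
        try:
            return train_fn(batch_size), batch_size
        except retry_on:
            if batch_size <= min_batch_size:
                raise  # cannot shrink further; surface the original error
            batch_size = max(min_batch_size, batch_size // 2)


# Stand-in for a real training function: pretends batches over 16 do not fit.
def fake_train(batch_size):
    if batch_size > 16:
        raise MemoryError("batch of %d does not fit" % batch_size)
    return "trained"


result, used_batch_size = fit_with_backoff(fake_train, 64)
print(result, used_batch_size)
```

Halving is an arbitrary choice; any shrinking schedule works as long as it stops at `min_batch_size`.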
Solution 3
I think it has to do with the GPU. Look at the traceback:
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 572, in __call__
    return self._call_flat(args)
TF is using eager execution, which means that the GPU will be used if the GPU version is available. I had the same issue when I was testing a dense network:
# Imports assumed; the original post did not show them
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(100,))
x = Dense(32, activation='relu')(inputs)
x = Dense(32, activation='relu')(x)
x = Dense(32, activation='relu')(x)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
t = tf.zeros([1, 100])
model.predict(t, steps=1, batch_size=1)
... and it gave a similar traceback, also linking to eager execution. Then when I disabled gpu using the following line:
tf.config.experimental.set_visible_devices([], 'GPU')
... the code ran just fine. See if this would help solve the issue. Btw, does colab even support gpu? I didn't even know.
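Another way to keep TensorFlow off the GPU, sketched here on the assumption that the GPU backend is CUDA, is to set the CUDA_VISIBLE_DEVICES environment variable before TensorFlow is imported. `hide_gpus` is a hypothetical helper name used only for illustration.

```python
import os


def hide_gpus():
    """Hide all CUDA devices from libraries that are imported after this call."""
    # "-1" is not a valid device index, so CUDA reports no usable GPUs.
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


hide_gpus()
# Import TensorFlow only after this point, for example:
# import tensorflow as tf
```

The order matters: once TensorFlow has initialized its devices, changing the variable has no effect on the running process.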
Solution 4
In my case, I had to update Keras and TensorFlow:
pip install -U tensorflow keras
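Before and after upgrading, it can help to confirm which versions are actually installed in the active environment. A small sketch using the standard library's importlib.metadata (Python 3.8 or later); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("tensorflow", "keras"):
    print(name, installed_version(name))
```

A result of None for either package means pip installed into a different environment than the one running your script.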
Solution 5
If you use Tensorflow-GPU, then add:
physical_devices = tf.config.experimental.list_physical_devices('GPU')
print("physical_devices-------------", len(physical_devices))
tf.config.experimental.set_memory_growth(physical_devices[0], True)
In addition, you can reduce your batch_size, or run your code on another computer or on a cloud service such as Google Colab or Amazon cloud, because I think this error is caused by a memory limitation.
user8882401
Updated on August 06, 2020
Comments
-
user8882401 over 3 years
I am reimplementing a text2speech project. I am facing a "Function call stack: keras_scratch_graph" error in the decoder part. The network architecture is from the Deep Voice 3 paper.
I am using keras from TF 2.0 on Google Colab. Below is the code for Decoder Keras Model.
y1 = tf.ones(shape = (16, 203, 320))

def Decoder(name = "decoder"):
    # Decoder Prenet
    din = tf.concat((tf.zeros_like(y1[:, :1, -hp.mel:]), y1[:, :-1, -hp.mel:]), 1)
    keys = K.Input(shape = (180, 256), batch_size = 16, name = "keys")
    vals = K.Input(shape = (180, 256), batch_size = 16, name = "vals")
    prev_max_attentions_li = tf.ones(shape=(hp.dlayer, hp.batch_size), dtype=tf.int32)
    #prev_max_attentions_li = K.Input(tensor = prev_max_attentions_li)
    for i in range(hp.dlayer):
        dpout = K.layers.Dropout(rate = 0 if i == 0 else hp.dropout)(din)
        fc_out = K.layers.Dense(hp.char_embed, activation = 'relu')(dpout)
        print("=======================================================================================================")
        print("The FC value is ", fc_out)
        print("=======================================================================================================")
    query_pe = K.layers.Embedding(hp.Ty, hp.char_embed)(tf.tile(tf.expand_dims(tf.range(hp.Ty // hp.r), 0), [hp.batch_size, 1]))
    key_pe = K.layers.Embedding(hp.Tx, hp.char_embed)(tf.tile(tf.expand_dims(tf.range(hp.Tx), 0), [hp.batch_size, 1]))
    alignments_li, max_attentions_li = [], []
    for i in range(hp.dlayer):
        dpout = K.layers.Dropout(rate = 0)(fc_out)
        queries = K.layers.Conv1D(hp.datten_size, hp.dfilter, padding = 'causal', dilation_rate = 2**i)(dpout)
        fc_out = (queries + fc_out) * tf.math.sqrt(0.5)
        print("=======================================================================================================")
        print("The FC value is ", fc_out)
        print("=======================================================================================================")
        queries = fc_out + query_pe
        keys += key_pe
        tensor, alignments, max_attentions = Attention(name = "attention")(queries, keys, vals, prev_max_attentions_li[i])
        fc_out = (tensor + queries) * tf.math.sqrt(0.5)
        alignments_li.append(alignments)
        max_attentions_li.append(max_attentions)
    decoder_output = fc_out
    dpout = K.layers.Dropout(rate = 0)(decoder_output)
    mel_logits = K.layers.Dense(hp.mel * hp.r)(dpout)
    dpout = K.layers.Dropout(rate = 0)(fc_out)
    done_output = K.layers.Dense(2)(dpout)
    return K.Model(inputs = [keys, vals], outputs = [mel_logits, done_output, decoder_output, alignments_li, max_attentions_li], name = name)

decode = Decoder()
kin = tf.ones(shape = (16, 180, 256))
vin = tf.ones(shape = (16, 180, 256))
print(decode(kin, vin))
tf.keras.utils.plot_model(decode, to_file = "decoder.png", show_shapes = True)
When I test with some data, it shows the error messages below. The problem seems to be with "fc_out", but I don't know how to pass the "fc_out" output from the first for loop to the second for loop. Any answer would be appreciated.
  File "Decoder.py", line 60, in <module>
    decode = Decoder()
  File "Decoder.py", line 33, in Decoder
    dpout = K.layers.Dropout(rate = 0)(fc_out)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 596, in __call__
    base_layer_utils.create_keras_history(inputs)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 199, in create_keras_history
    _, created_layers = _create_keras_history_helper(tensors, set(), [])
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 245, in _create_keras_history_helper
    layer_inputs, processed_ops, created_layers)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 245, in _create_keras_history_helper
    layer_inputs, processed_ops, created_layers)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 245, in _create_keras_history_helper
    layer_inputs, processed_ops, created_layers)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 243, in _create_keras_history_helper
    constants[i] = backend.function([], op_input)([])
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/keras/backend.py", line 3510, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 572, in __call__
    return self._call_flat(args)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 671, in _call_flat
    outputs = self._inference_function.call(ctx, args)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 445, in call
    ctx=ctx)
  File "/Users/ydc/dl-npm/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable _AnonymousVar19 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar19/N10tensorflow3VarE does not exist.
  [[node dense_7/BiasAdd/ReadVariableOp (defined at Decoder.py:33) ]] [Op:__inference_keras_scratch_graph_566]
Function call stack:
keras_scratch_graph
-
Leonard about 4 years
Thanks for the elaborate answer. I'm facing the same error when tf.config.experimental.list_physical_devices('GPU') yields an empty list, i.e. gpus is False. Can you conceive any reason for this?
-
Suraj Donthi about 4 years
@Rumo It's mostly an installation problem. If you're using TF 2.0+, an easy way to check whether you've installed TF with GPU support is to use either tf.test.is_built_with_cuda() or tf.test.is_built_with_gpu_support(). If this returns False, you'll have to reinstall TensorFlow as in the documentation.
-
Leonard about 4 years
@SurajDonthi Thanks, but I do not want to use GPU support. What I meant is that this error is not only attributed to GPU issues. In my case, the reason was that I used tf.metrics.iou, which is not supported in eager mode. It worked after switching to tf.keras.metrics.MeanIoU. I think this error can arise for very different reasons.
-
Ahmad Moussa almost 4 years
I have these versions of keras and tensorflow and still have that error
-
Hafizur Rahman almost 4 years
Did you try keras=2.3.2?
-
user3352632 almost 2 years
Where to add it? Got it anyway ...