Inputs to eager execution function cannot be Keras symbolic tensors

Solution 1

One alternative solution is to pass weights as additional output features rather than input features.

This keeps the model completely free of anything weight-related, and the weights appear only in the loss function and the .fit() call:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, losses, models

data_x = 2 * np.ones((7, 11, 15, 3), dtype=float)
data_y = 5 * np.ones((7, 9, 13, 5), dtype=float)

x = layers.Input(data_x.shape[1:])
y = layers.Conv2D(5, kernel_size=3)(x)
model = models.Model(inputs=x, outputs=y)


def loss(y_true, y_pred):
    (y_true, w) = tf.split(y_true, num_or_size_splits=[-1, 1], axis=-1)
    loss = tf.squeeze(w, axis=-1) * losses.mse(y_true, y_pred)

    tf.print(tf.math.reduce_mean(y_true), "== 5")
    tf.print(tf.math.reduce_mean(w), "== 3")

    return loss


model.compile(loss=loss)

data_w = 3 * np.ones((7, 9, 13, 1), dtype=float)
data_yw = np.concatenate((data_y, data_w), axis=-1)
model.fit(data_x, data_yw)

One remaining drawback is that you need to manipulate (potentially) large arrays when merging y and w with np.concatenate(), so anything more TensorFlow-like would be appreciated.
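If keeping that concatenation out of NumPy matters, a minimal sketch (assuming a tf.data input pipeline and the same model and loss as above) would merge y and w per sample inside the pipeline instead:

import numpy as np
import tensorflow as tf

data_x = 2 * np.ones((7, 11, 15, 3), dtype=float)
data_y = 5 * np.ones((7, 9, 13, 5), dtype=float)
data_w = 3 * np.ones((7, 9, 13, 1), dtype=float)

# Concatenate y and w per element inside the pipeline rather than
# building one large merged array up front in NumPy.
dataset = (
    tf.data.Dataset.from_tensor_slices((data_x, data_y, data_w))
    .map(lambda x, y, w: (x, tf.concat([y, w], axis=-1)))
    .batch(2)
)

# model.fit(dataset)  # with the same model and loss as above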

Solution 2

Another way:

from tensorflow.keras import layers, models, losses
import numpy as np

def loss_fcn(y_true, y_pred, w):
    loss = w * losses.mse(y_true, y_pred)
    return loss


data_x = np.random.rand(5, 4, 1)
data_w = np.random.rand(5, 4)
data_y = np.random.rand(5, 4, 1)

x = layers.Input([4, 1])
y_true = layers.Input([4, 1])
w = layers.Input([4])
y = layers.Activation('tanh')(x)


model = models.Model(inputs=[x, y_true, w], outputs=y)
model.add_loss(loss_fcn(y, y_true, w))


model.compile()
model.fit((data_x, data_y, data_w))

I think this is the most elegant solution.
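Note that because the training model takes y_true and w as inputs, predicting with it would require feeding dummy values for them. A small follow-up sketch, assuming the same x and y tensors from the snippet above, builds a second Model over the shared layers for inference only:

# The layers are shared, so this second model reuses the trained weights
# but only needs x at prediction time.
pred_model = models.Model(inputs=x, outputs=y)
predictions = pred_model.predict(data_x)
print(predictions.shape)  # (5, 4, 1)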

Solution 3

Your code works just fine with the latest TensorFlow (2.3) if you replace your fit row with

model.fit((data_x, data_y, data_w))

So:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, losses, models


# HERE
def loss_fcn(y_true, y_pred):
    w = y_pred[:, :, -1]  # HERE
    y_pred = y_pred[:, :, :-1]  # HERE
    loss = w * losses.mse(y_true, y_pred)
    return loss


data_x = np.random.rand(5, 4, 1)
data_w = np.random.rand(5, 4, 1)  # HERE
data_y = np.random.rand(5, 4, 1)

x = layers.Input([4, 1])
w = layers.Input([4, 1])  # HERE
y = layers.Activation('tanh')(x)
output = layers.Concatenate()([y, w])  # HERE
model = models.Model(inputs=[x, w], outputs=output)  # HERE
loss = loss_fcn  # HERE

model.compile(loss=loss)
model.fit((data_x, data_y, data_w))

print('Done.')
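With this variant the model's output still carries the w channel at prediction time. A small sketch, assuming you predict with the same two inputs used for training, is to slice it off afterwards:

# The concatenated output has shape (batch, 4, 2); drop the weight channel.
pred = model.predict([data_x, data_w])
y_pred = pred[:, :, :-1]
print(y_pred.shape)  # (5, 4, 1)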

Further, I found that tf.reduce_mean, K.mean, tf.square, tf.exp, etc. used inside a loss function cause the same error.

Comments

  • bers, almost 2 years ago:

    I am trying to implement sample- and pixel-dependent loss weighting in tf.Keras (TensorFlow 2.0.0rc0) for a 3-D U-Net with sparse annotation data (Cicek 2016, arxiv:1606.06650).

    This is my code:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers, losses, models
    
    # disabling eager execution makes this example work:
    # tf.python.framework_ops.disable_eager_execution()
    
    
    def get_loss_fcn(w):
        def loss_fcn(y_true, y_pred):
            loss = w * losses.mse(y_true, y_pred)
            return loss
        return loss_fcn
    
    
    data_x = np.random.rand(5, 4, 1)
    data_w = np.random.rand(5, 4)
    data_y = np.random.rand(5, 4, 1)
    
    x = layers.Input([4, 1])
    w = layers.Input([4])
    y = layers.Activation('tanh')(x)
    model = models.Model(inputs=[x, w], outputs=y)
    loss = get_loss_fcn(model.input[1])
    
    # using another loss makes it work, too:
    # loss = 'mse'
    
    model.compile(loss=loss)
    model.fit((data_x, data_w), data_y)
    
    print('Done.')
    

    This runs fine when disabling eager execution, but one of the points of TensorFlow 2 is to have eager execution by default. What stands between me and that goal is the custom loss function, as you can see (using 'mse' as a loss removes that error, too):

      File "MWE.py", line 30, in <module>
        model.fit((data_x, data_w), data_y)
    [...]
    tensorflow.python.eager.core._SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'input_2:0' shape=(None, 4) dtype=float32>]
    

    What can I do to make this kind of structure work with eager execution?

    One idea that I had was to concatenate w to the output y and separate y_pred into the original y_pred and w in the loss function, but this is a hack I'd like to avoid. It works, though, with changes marked by # HERE:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers, losses, models
    
    
    # HERE
    def loss_fcn(y_true, y_pred):
        w = y_pred[:, :, -1]  # HERE
        y_pred = y_pred[:, :, :-1]  # HERE
        loss = w * losses.mse(y_true, y_pred)
        return loss
    
    
    data_x = np.random.rand(5, 4, 1)
    data_w = np.random.rand(5, 4, 1)  # HERE
    data_y = np.random.rand(5, 4, 1)
    
    x = layers.Input([4, 1])
    w = layers.Input([4, 1])  # HERE
    y = layers.Activation('tanh')(x)
    output = layers.Concatenate()([y, w])  # HERE
    model = models.Model(inputs=[x, w], outputs=output)  # HERE
    loss = loss_fcn  # HERE
    
    model.compile(loss=loss)
    model.fit((data_x, data_w), data_y)
    
    print('Done.')
    

    Any other ideas?

  • vgoklani, over 4 years ago:
    Have you tried training directly without using .fit() like this example: tensorflow.org/beta/guide/keras/…
  • bers, over 4 years ago:
    @vgoklani no, not yet. Thanks for the hint!
  • Luke, over 4 years ago:
    The problem with that is that you lose a lot of the convenience of Keras (e.g. callbacks).
  • bers, over 4 years ago:
    Could you please explain why this works? I have an idea, but I am not sure I understand the concept behind it. Are you suggesting that y_true was the tensor that my MWE had a problem with, and not w? Because I don't see any change in w.
  • feature_engineer, over 4 years ago:
    @bers I think the problem was with your function returning an inner function that uses the symbolic tensor. Using a function which returns a symbolic tensor directly, instead of one that returns a function, works, since that returned symbolic tensor is lazily evaluated.
  • Stefan Falk, almost 4 years ago:
    @feature_engineer This makes sense, but if I do this I am getting ValueError: No gradients provided for any variable: [...] - any idea why this might happen?