Custom weighted loss function in Keras for weighing each element


In model.fit the batch size is 32 by default; that's where this number is coming from. Here's what's happening:

  • In custom_loss_1 the tensor K.abs(y_true-y_pred) has shape (batch_size=32, 5), while the numpy array weights has shape (100, 5). This is an invalid multiplication, since the dimensions don't agree and broadcasting can't be applied.

  • In custom_loss_2 this problem doesn't exist because you're multiplying 2 tensors with the same shape (batch_size=32, 5).

  • In custom_loss_3 the problem is the same as in custom_loss_1, because converting weights into a Keras variable doesn't change their shape.
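
The shape mismatch described in the three points above can be reproduced with plain NumPy, which applies the same broadcasting rules to an element-wise product. This is only a minimal sketch: `batch_error` is an illustrative stand-in for K.abs(y_true-y_pred) on one batch, and the random data is made up for the demonstration.

```python
import numpy as np

batch_size, n_samples, n_features = 32, 100, 5

# Stand-in for K.abs(y_true - y_pred) on a single batch: shape (32, 5)
batch_error = np.abs(np.random.randn(batch_size, n_features))
# The full weights array from the question: shape (100, 5)
weights = np.random.randn(n_samples, n_features)

try:
    # (32, 5) * (100, 5): the leading dimensions differ and neither is 1,
    # so broadcasting cannot reconcile them
    batch_error * weights
    broadcast_ok = True
except ValueError as err:
    broadcast_ok = False
    print("Broadcasting failed:", err)

# Same-shape multiplication, as in custom_loss_2, works without issue
same_shape = batch_error * np.ones_like(batch_error)
print(same_shape.shape)  # (32, 5)
```

The failing product is the NumPy analogue of the "Incompatible shapes: [32,5] vs. [100,5]" error in the stack trace.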

UPDATE: It seems you want to give a different weight to each element in each training sample, so the weights array should indeed have shape (100, 5). In this case, I would feed the weights array into the model as a second input and then use that tensor within the loss function:

import numpy as np
from keras.layers import Dense, Input
from keras import Model
import keras.backend as K
from functools import partial

def custom_loss_4(y_true, y_pred, weights):
    return K.mean(K.abs(y_true - y_pred) * weights)

train_X = np.random.randn(100, 5)
train_Y = np.random.randn(100, 5) * 0.01 + train_X
weights = np.random.randn(*train_X.shape)

input_layer = Input(shape=(5,))
weights_tensor = Input(shape=(5,))
out = Dense(5)(input_layer)
cl4 = partial(custom_loss_4, weights=weights_tensor)
model = Model([input_layer, weights_tensor], out)
model.compile('adam', cl4)
model.fit(x=[train_X, weights], y=train_Y, epochs=10)
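
To see why passing the weights as a second input avoids the mismatch, here is a rough NumPy sketch of what happens batch by batch: inputs, targets and weights are sliced with the same row indices, so the loss always multiplies arrays of matching shape. The helper `weighted_mae` and the placeholder predictions are assumptions made for illustration, not part of Keras, and shuffling is left out for simplicity.

```python
import numpy as np

def weighted_mae(y_true, y_pred, weights):
    """NumPy analogue of custom_loss_4: mean of element-wise weighted absolute errors."""
    return np.mean(np.abs(y_true - y_pred) * weights)

rng = np.random.default_rng(0)
train_X = rng.standard_normal((100, 5))
train_Y = rng.standard_normal((100, 5)) * 0.01 + train_X
weights = rng.standard_normal(train_X.shape)

batch_size = 32  # default batch size used by model.fit
batch_shapes = []
for start in range(0, len(train_X), batch_size):
    rows = slice(start, start + batch_size)
    y_pred = train_X[rows]   # placeholder prediction; a real model would compute this
    w_batch = weights[rows]  # weights for exactly the samples in this batch
    loss = weighted_mae(train_Y[rows], y_pred, w_batch)
    batch_shapes.append(w_batch.shape)

print(batch_shapes)  # [(32, 5), (32, 5), (32, 5), (4, 5)]
```

Because the weights travel with their samples, the last, smaller batch of 4 rows also lines up correctly.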

Nipun Batra

Updated on February 02, 2020


  • Nipun Batra, over 4 years ago

    I'm trying to create a simple weighted loss function.

    Say, I have input dimensions 100 * 5, and output dimensions also 100 * 5. I also have a weight matrix of the same dimension.

    Something like the following:

    import numpy as np
    train_X = np.random.randn(100, 5)
    train_Y = np.random.randn(100, 5)*0.01 + train_X
    weights = np.random.randn(*train_X.shape)

    Defining the custom loss function

    def custom_loss_1(y_true, y_pred):
        return K.mean(K.abs(y_true-y_pred)*weights)

    Defining the model

    from keras.layers import Dense, Input
    from keras import Model
    import keras.backend as K
    input_layer = Input(shape=(5,))
    out = Dense(5)(input_layer)
    model = Model(input_layer, out)

    Testing with existing metrics works fine

    model.compile('adam', 'mean_absolute_error')
    model.fit(train_X, train_Y, epochs=1)

    Testing with our custom loss function doesn't work

    model.compile('adam', custom_loss_1)
    model.fit(train_X, train_Y, epochs=10)

    It gives the following stack trace:

    InvalidArgumentError (see above for traceback): Incompatible shapes: [32,5] vs. [100,5]
     [[Node: loss_9/dense_8_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_9/dense_8_loss/Abs, loss_9/dense_8_loss/mul/y)]]

    Where is the number 32 coming from?

    Testing a loss function with weights as Keras tensors

    def custom_loss_2(y_true, y_pred):
        return K.mean(K.abs(y_true-y_pred)*K.ones_like(y_true))

    This function seems to work, which suggests that a Keras tensor used as the weight matrix would work too. So I created another version of the loss function.

    Loss function try 3

    from functools import partial
    def custom_loss_3(y_true, y_pred, weights):
        return K.mean(K.abs(y_true-y_pred)*K.variable(weights, dtype=y_true.dtype))
    cl3 = partial(custom_loss_3, weights=weights)  

    Fitting data using cl3 gives the same error as above.

    InvalidArgumentError (see above for traceback): Incompatible shapes: [32,5] vs. [100,5]
         [[Node: loss_11/dense_8_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_11/dense_8_loss/Abs, loss_11/dense_8_loss/Variable/read)]]

    I wonder what I'm missing! I could have used the notion of sample_weight in Keras, but then I'd have to reshape my inputs into a 3D array.

    I thought that this custom loss function should really have been trivial.