Custom weighted loss function in Keras for weighing each element

python tensorflow keras loss-function

22,008

In model.fit the batch size is 32 by default, that's where this number is coming from. Here's what's happening:

In custom_loss_1 the tensor K.abs(y_true-y_pred) has shape (batch_size=32, 5), while the numpy array weights has shape (100, 5). This is an invalid multiplication, since the dimensions don't agree and broadcasting can't be applied.
In custom_loss_2 this problem doesn't exist because you're multiplying 2 tensors with the same shape (batch_size=32, 5).
In custom_loss_3 the problem is the same as in custom_loss_1, because converting weights into a Keras variable doesn't change their shape.

UPDATE: It seems you want to give a different weight to each element in each training sample, so the weights array should have shape (100, 5) indeed. In this case, I would input your weights' array into your model and then use this tensor within the loss function:

import numpy as np
from keras.layers import Dense, Input
from keras import Model
import keras.backend as K
from functools import partial


def custom_loss_4(y_true, y_pred, weights):
    return K.mean(K.abs(y_true - y_pred) * weights)


train_X = np.random.randn(100, 5)
train_Y = np.random.randn(100, 5) * 0.01 + train_X
weights = np.random.randn(*train_X.shape)

input_layer = Input(shape=(5,))
weights_tensor = Input(shape=(5,))
out = Dense(5)(input_layer)
cl4 = partial(custom_loss_4, weights=weights_tensor)
model = Model([input_layer, weights_tensor], out)
model.compile('adam', cl4)
model.fit(x=[train_X, weights], y=train_Y, epochs=10)

22,008

Author by

Nipun Batra

Nipun Batra is an Assistant Professor in Computer Science at IIT Gandhinagar. He previously completed his postdoc from University of Virginia. He completed his PhD. from IIIT Delhi where he was a TCS PhD fellow. His group broadly works on machine learning/AI/sensors/IoT for computational sustainability problems like smart buildings and air quality. His work has been awarded several awards, including, the best PhD presentation at ACM Sensys, best demo at ACM Buildsys, and a best video nominee at ACM KDD.

Updated on February 02, 2020

Comments

Nipun Batra over 4 years

I'm trying to create a simple weighted loss function.

Say, I have input dimensions 100 * 5, and output dimensions also 100 * 5. I also have a weight matrix of the same dimension.

Something like the following:

import numpy as np
train_X = np.random.randn(100, 5)
train_Y = np.random.randn(100, 5)*0.01 + train_X

weights = np.random.randn(*train_X.shape)

Defining the custom loss function

def custom_loss_1(y_true, y_pred):
    return K.mean(K.abs(y_true-y_pred)*weights)

Defining the model

from keras.layers import Dense, Input
from keras import Model
import keras.backend as K

input_layer = Input(shape=(5,))
out = Dense(5)(input_layer)
model = Model(input_layer, out)

Testing with existing metrics works fine

model.compile('adam','mean_absolute_error')
model.fit(train_X, train_Y, epochs=1)

Testing with our custom loss function doesn't work

model.compile('adam',custom_loss_1)
model.fit(train_X, train_Y, epochs=10)

It gives the following stack trace:

InvalidArgumentError (see above for traceback): Incompatible shapes: [32,5] vs. [100,5]
 [[Node: loss_9/dense_8_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_9/dense_8_loss/Abs, loss_9/dense_8_loss/mul/y)]]

Where is the number 32 coming from?

Testing a loss function with weights as Keras tensors

def custom_loss_2(y_true, y_pred):
    return K.mean(K.abs(y_true-y_pred)*K.ones_like(y_true))

This function seems to do the work. So, probably suggests that a Keras tensor as a weight matrix would work. So, I created another version of the loss function.

Loss function try 3

from functools import partial

def custom_loss_3(y_true, y_pred, weights):
    return K.mean(K.abs(y_true-y_pred)*K.variable(weights, dtype=y_true.dtype))

cl3 = partial(custom_loss_3, weights=weights)

Fitting data using cl3 gives the same error as above.

InvalidArgumentError (see above for traceback): Incompatible shapes: [32,5] vs. [100,5]
     [[Node: loss_11/dense_8_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_11/dense_8_loss/Abs, loss_11/dense_8_loss/Variable/read)]]

I wonder what I'm missing! I could have used the notion of sample_weight in Keras; but then I'd have to reshape my inputs to a 3d vector.

I thought that this custom loss function should really have been trivial.