InvalidArgumentError: required broadcastable shapes at loc(unknown)


Solution 1

I found several issues here. The model was intended for semantic segmentation with several classes (which is why I had changed the output layer activation to "softmax" and set the loss to "sparse_categorical_crossentropy"). For that use case, class_mode in the ImageDataGenerators has to be set to None and the classes argument must not be provided; instead, the manually classified masks have to be supplied as y. Beginner mistakes, I guess.
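
As a rough illustration of that setup, a common pattern is to run two generators with class_mode = None and a shared seed, one for the images and one for the masks, and zip them so that fit() receives (image, mask) pairs. This is only a minimal sketch; the directory names "trn" and "trn_masks", the matching filenames, and the augmentation settings are assumptions, not the asker's actual setup:

import tensorflow.keras as ks

bs = 15
seed = 42

# identical geometric augmentation for images and masks; the shared seed keeps them in sync
aug = dict(horizontal_flip = True, vertical_flip = True, fill_mode = "nearest")

# images only; class_mode = None means the generator yields no labels
img_gen = ks.preprocessing.image.ImageDataGenerator(rescale = 1 / 255., **aug) \
    .flow_from_directory("trn", class_mode = None, batch_size = bs, seed = seed)

# masks from a parallel folder with matching filenames
msk_gen = ks.preprocessing.image.ImageDataGenerator(**aug) \
    .flow_from_directory("trn_masks", class_mode = None, color_mode = "grayscale",
                         batch_size = bs, seed = seed)

# model is the compiled segmentation model; steps_per_epoch is required because zip has no length
model.fit(zip(img_gen, msk_gen), steps_per_epoch = len(img_gen), epochs = 5)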

Solution 2

I faced this problem when the number of Class Labels did not match the Output Layer's output shape.

For example, if there are 10 Class Labels and we have defined the Output Layer as:

output = tf.keras.layers.Conv2D(5, (1, 1), activation = "softmax")(c9)

Since the number of Class Labels (10) is not equal to the output shape (5), we will get this error.

Ensure that the number of Class Labels matches the Output Layer's output shape.
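
In the example above, the fix is simply to make the filter count of the output layer equal to the number of class labels. A minimal sketch, reusing the hypothetical tensor c9 from the snippet above:

import tensorflow as tf

n_classes = 10  # must equal the number of distinct class labels
output = tf.keras.layers.Conv2D(n_classes, (1, 1), activation = "softmax")(c9)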

Solution 3

I got the same issue because the n_classes I used in the model (for the output layer) differed from the actual number of classes in the labels/masks array. You have a similar issue here: you have 13 classes, but your output layer is given only 1. The best way is to avoid hard-coding the number of classes: pass a variable (like n_classes) to the model and declare it before building the model, for instance n_classes = y_Train.shape[-1] (one-hot encoded masks) or n_classes = len(np.unique(y_Train)) (integer-encoded masks).
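
A rough sketch of that idea (assuming y_Train is a NumPy array of masks; the names are only illustrative and c9 refers to the last convolutional block as in the question):

import numpy as np
import tensorflow.keras as ks

# integer-encoded masks: each pixel holds a class id
n_classes = len(np.unique(y_Train))

# one-hot encoded masks: the last axis indexes the classes
# n_classes = y_Train.shape[-1]

# use the variable instead of a hard-coded number in the output layer
out = ks.layers.Conv2D(n_classes, (1, 1), activation = "softmax")(c9)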

Solution 4

Check whether the inputs to the ks.layers.concatenate layers have matching dimensions. For example, in ks.layers.concatenate([u7, c3]), the tensors u7 and c3 must have the same shape in every axis except the one given by the axis argument (the default is axis = -1, i.e. the last dimension). To illustrate: with ks.layers.concatenate([u7, c3], axis = 0), all axes of u7 and c3 except the first must match exactly, e.g. u7.shape = [3, 4, 5] and c3.shape = [6, 4, 5].
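
A quick standalone check (a sketch with made-up shapes, not the tensors from the question):

import tensorflow as tf
import tensorflow.keras as ks

# same batch, height and width, different channel counts: fine for the default axis = -1
u7 = tf.zeros((1, 32, 32, 64))
c3 = tf.zeros((1, 32, 32, 16))
print(ks.layers.concatenate([u7, c3]).shape)   # (1, 32, 32, 80)

# mismatched spatial dimensions would raise a shape error instead
# ks.layers.concatenate([tf.zeros((1, 32, 32, 64)), tf.zeros((1, 16, 16, 64))])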

Updated on June 17, 2022

Comments

  • Manuel Popp, almost 2 years ago

    Background

    I am totally new to Python and to machine learning. I tried to set up a UNet from code I found on the internet and wanted to adapt it, bit by bit, to the case I am working on. When trying to .fit the UNet to the training data, I received the following error:

    InvalidArgumentError:  required broadcastable shapes at loc(unknown)
         [[node Equal (defined at <ipython-input-68-f1422c6f17bb>:1) ]] [Op:__inference_train_function_3847]
    

    I get a lot of results when I search for it, but most of them concern different errors.

    What does this mean? And, more importantly, how can I fix it?

    The code that caused the error

    The context of this error is as follows: I want to segment images and label the different classes. I set up directories "trn", "tst" and "val" for training, test, and validation data. The dir_dat() function applies os.path.join() to get the full path to the respective data set. Each of the three folders has subdirectories for each class, labeled with integers. In each folder, there are some .tif images for the respective class.
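
    Roughly, dir_dat() just wraps os.path.join(); a sketch of it is below (dir_p is a placeholder for the project root, not the actual path):

    import os

    dir_p = "path/to/project"  # placeholder for the actual project root

    def dir_dat(set_name):
        # build the full path to the "trn", "tst" or "val" folder
        return os.path.join(dir_p, set_name)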

    I defined the following image data generators (training data is sparse, hence the augmentation):

    classes = np.array([ 0,  2,  4,  6,  8, 11, 16, 21, 29, 30, 38, 39, 51])
    bs = 15 # batch size
    
    augGen = ks.preprocessing.image.ImageDataGenerator(rotation_range = 365,
                                                       width_shift_range = 0.05,
                                                       height_shift_range = 0.05,
                                                       horizontal_flip = True,
                                                       vertical_flip = True,
                                                       fill_mode = "nearest") \
        .flow_from_directory(directory = dir_dat("trn"),
                             classes = [str(x) for x in classes.tolist()],
                             class_mode = "categorical",
                             batch_size = bs, seed = 42)
        
    tst_batches = ks.preprocessing.image.ImageDataGenerator() \
        .flow_from_directory(directory = dir_dat("tst"),
                             classes = [str(x) for x in classes.tolist()],
                             class_mode = "categorical",
                             batch_size = bs, shuffle = False)
    
    val_batches = ks.preprocessing.image.ImageDataGenerator() \
        .flow_from_directory(directory = dir_dat("val"),
                             classes = [str(x) for x in classes.tolist()],
                             class_mode = "categorical",
                             batch_size = bs)
    

    Then I set up the UNet based on this example. Here, I altered a few parameters to adapt the UNet to the situation (multiple classes), namely activation in the last layer and the loss function:

    layer_in = ks.layers.Input(shape = (imgr, imgc, imgdim))
    # convert pixel integer values to float
    inVals = ks.layers.Lambda(lambda x: x / 255)(layer_in)
    
    # Contraction path
    c1 = ks.layers.Conv2D(16, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(inVals)
    c1 = ks.layers.Dropout(0.1)(c1)
    c1 = ks.layers.Conv2D(16, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c1)
    p1 = ks.layers.MaxPooling2D((2, 2))(c1)
    
    c2 = ks.layers.Conv2D(32, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(p1)
    c2 = ks.layers.Dropout(0.1)(c2)
    c2 = ks.layers.Conv2D(32, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c2)
    p2 = ks.layers.MaxPooling2D((2, 2))(c2)
     
    c3 = ks.layers.Conv2D(64, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(p2)
    c3 = ks.layers.Dropout(0.2)(c3)
    c3 = ks.layers.Conv2D(64, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c3)
    p3 = ks.layers.MaxPooling2D((2, 2))(c3)
     
    c4 = ks.layers.Conv2D(128, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(p3)
    c4 = ks.layers.Dropout(0.2)(c4)
    c4 = ks.layers.Conv2D(128, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c4)
    p4 = ks.layers.MaxPooling2D(pool_size = (2, 2))(c4)
     
    c5 = ks.layers.Conv2D(256, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(p4)
    c5 = ks.layers.Dropout(0.3)(c5)
    c5 = ks.layers.Conv2D(256, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c5)
    
    # Expansive path 
    u6 = ks.layers.Conv2DTranspose(128, (2, 2), strides = (2, 2), padding = "same")(c5)
    u6 = ks.layers.concatenate([u6, c4])
    c6 = ks.layers.Conv2D(128, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(u6)
    c6 = ks.layers.Dropout(0.2)(c6)
    c6 = ks.layers.Conv2D(128, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c6)
     
    u7 = ks.layers.Conv2DTranspose(64, (2, 2), strides = (2, 2), padding = "same")(c6)
    u7 = ks.layers.concatenate([u7, c3])
    c7 = ks.layers.Conv2D(64, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(u7)
    c7 = ks.layers.Dropout(0.2)(c7)
    c7 = ks.layers.Conv2D(64, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c7)
     
    u8 = ks.layers.Conv2DTranspose(32, (2, 2), strides = (2, 2), padding = "same")(c7)
    u8 = ks.layers.concatenate([u8, c2])
    c8 = ks.layers.Conv2D(32, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(u8)
    c8 = ks.layers.Dropout(0.1)(c8)
    c8 = ks.layers.Conv2D(32, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c8)
     
    u9 = ks.layers.Conv2DTranspose(16, (2, 2), strides = (2, 2), padding = "same")(c8)
    u9 = ks.layers.concatenate([u9, c1], axis = 3)
    c9 = ks.layers.Conv2D(16, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(u9)
    c9 = ks.layers.Dropout(0.1)(c9)
    c9 = ks.layers.Conv2D(16, (3, 3), activation = "relu",
                                kernel_initializer = "he_normal", padding = "same")(c9)
     
    out = ks.layers.Conv2D(1, (1, 1), activation = "softmax")(c9)
     
    model = ks.Model(inputs = layer_in, outputs = out)
    model.compile(optimizer = "adam", loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
    model.summary()
    

    Finally, I defined callbacks and ran the training, which produced the error:

    cllbs = [
        ks.callbacks.EarlyStopping(patience = 4),
        ks.callbacks.ModelCheckpoint(dir_out("Checkpoint.h5"), save_best_only = True),
        ks.callbacks.TensorBoard(log_dir = './logs'),# log events for TensorBoard
        ]
    
    model.fit(augGen, epochs = 5, validation_data = val_batches, callbacks = cllbs)
    

    Full console output

    This is the full output when running the last line (in case it helps solving the issue):

    trained = model.fit(augGen, epochs = 5, validation_data = val_batches, callbacks = cllbs)
    Epoch 1/5
    Traceback (most recent call last):
    
      File "<ipython-input-68-f1422c6f17bb>", line 1, in <module>
        trained = model.fit(augGen, epochs = 5, validation_data = val_batches, callbacks = cllbs)
    
      File "c:\users\manuel\python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1183, in fit
        tmp_logs = self.train_function(iterator)
    
      File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\def_function.py", line 889, in __call__
        result = self._call(*args, **kwds)
    
      File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\def_function.py", line 950, in _call
        return self._stateless_fn(*args, **kwds)
    
      File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\function.py", line 3023, in __call__
        return graph_function._call_flat(
    
      File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\function.py", line 1960, in _call_flat
        return self._build_call_outputs(self._inference_function.call(
    
      File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\function.py", line 591, in call
        outputs = execute.execute(
    
      File "c:\users\manuel\python\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
        tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
    
    InvalidArgumentError:  required broadcastable shapes at loc(unknown)
         [[node Equal (defined at <ipython-input-68-f1422c6f17bb>:1) ]] [Op:__inference_train_function_3847]
    
    Function call stack:
    train_function