Unable to train my Keras model: "Data cardinality is ambiguous"


I had the same issue. The number of samples in the inputs and outputs has to match: this error is raised by one of the Keras data adapters when x.shape[0] != y.shape[0]. In this case,

x = [INPUT_IDS,INPUT_MASKS,INPUT_SEGS]
y = list(train.SECTION)
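
Roughly, the data adapter just compares the first dimension of every array it is given. Here is a sketch of that check with made-up shapes (6102 is only the sample count from the traceback in the question below):

import numpy as np

def check_cardinality(x_list, y):
    # rough sketch of the consistency check Keras performs: every input
    # array and the label array must share the same number of rows
    sizes = {np.asarray(a).shape[0] for a in list(x_list) + [y]}
    if len(sizes) > 1:
        raise ValueError("Data cardinality is ambiguous: %s" % sizes)

# made-up arrays with 6102 samples each, purely for illustration
INPUT_IDS = np.zeros((6102, 128), dtype=np.int32)
INPUT_MASKS = np.ones((6102, 128), dtype=np.int32)
INPUT_SEGS = np.zeros((6102, 128), dtype=np.int32)
labels = np.zeros(6102, dtype=np.int32)

check_cardinality([INPUT_IDS, INPUT_MASKS, INPUT_SEGS], labels)        # passes
# check_cardinality([INPUT_IDS, INPUT_MASKS, INPUT_SEGS], labels[:10]) # would raise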

so instead of

model.fit([INPUT_IDS,INPUT_MASKS,INPUT_SEGS], list(train.SECTION))

try passing the inputs and outputs as dictionaries keyed by the layer names (check the model summary; suitable names can also be given explicitly, as sketched after the snippet below). This worked for me:

model.fit(
    {
        "input_word_ids": INPUT_IDS,
        "input_mask": INPUT_MASKS,
        "segment_ids": INPUT_SEGS,
    },
    {"dense_1": list(train.SECTION)},
)
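
The dictionary keys must match the layer names exactly. If you are not sure what they are, run model.summary(), or give the input and output layers explicit names when building the model. A minimal sketch of that idea (a tiny dense stack stands in for the actual bert-for-tf2 encoder, and the sequence length and class count are assumptions for illustration):

import tensorflow as tf

MAX_SEQ_LEN = 128   # assumption for illustration
NUM_CLASSES = 6     # assumption for illustration

input_word_ids = tf.keras.layers.Input(shape=(MAX_SEQ_LEN,), dtype=tf.int32, name="input_word_ids")
input_mask = tf.keras.layers.Input(shape=(MAX_SEQ_LEN,), dtype=tf.int32, name="input_mask")
segment_ids = tf.keras.layers.Input(shape=(MAX_SEQ_LEN,), dtype=tf.int32, name="segment_ids")

# the real model would feed these into the BERT layer from bert-for-tf2;
# a small dense stack keeps the sketch self-contained
x = tf.keras.layers.Concatenate()([input_word_ids, input_mask, segment_ids])
x = tf.keras.layers.Lambda(lambda t: tf.cast(t, tf.float32))(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
output = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax", name="dense_1")(x)

model = tf.keras.Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=output)
model.summary()   # the printed layer names are the keys to use in model.fit()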

Please also make sure that the inputs and outputs are NumPy arrays (for example via np.asarray()); the data adapter looks for a .shape attribute.
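
For example (a sketch assuming INPUT_IDS, INPUT_MASKS, INPUT_SEGS, and train already exist as in the question):

import numpy as np

# convert everything Keras sees to NumPy arrays so each has a proper .shape
INPUT_IDS = np.asarray(INPUT_IDS, dtype=np.int32)
INPUT_MASKS = np.asarray(INPUT_MASKS, dtype=np.int32)
INPUT_SEGS = np.asarray(INPUT_SEGS, dtype=np.int32)
LABELS = np.asarray(list(train.SECTION))

# all four first dimensions should now report the same sample count
print(INPUT_IDS.shape, INPUT_MASKS.shape, INPUT_SEGS.shape, LABELS.shape)

You can then pass LABELS (rather than the raw Python list) as the value for "dense_1" in the fit() call above.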


Comments

  • Amal Vijayan, almost 2 years ago:

    I am using the bert-for-tf2 library for a multi-class classification problem. I built the model, but training throws the following error:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-25-d9f382cba5d4> in <module>()
    ----> 1 model.fit([INPUT_IDS,INPUT_MASKS,INPUT_SEGS], list(train.SECTION))
    
    5 frames
    /tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/data_adapter.py in 
    __init__(self, x, y, sample_weights, batch_size, epochs, steps, shuffle, **kwargs)
    243             label, ", ".join([str(i.shape[0]) for i in nest.flatten(data)]))
    244       msg += "Please provide data which shares the same first dimension."
    --> 245       raise ValueError(msg)
    246     num_samples = num_samples.pop()
    247 
    
    ValueError: Data cardinality is ambiguous:
    x sizes: 3
    y sizes: 6102
    Please provide data which shares the same first dimension.
    

    I am following the Medium article called Simple BERT using TensorFlow 2.0. The Git repo for the bert-for-tf2 library can be found here.

    Please find the entire code here.

    Here is a link to my Colab notebook.

    Really appreciate your help!