AttributeError: 'str' object has no attribute 'shape'

Solution 1

Two possible problems you may have:

  1. As Pietro Marsella suggested, you defined x_train_pad as a string, or you redefine it after one epoch somewhere in your code.

  2. After the word embedding of the input words, you should have a NumPy array with shape=(N, K); check whether your word embedding is valid for every word (a quick check is sketched below).
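
A quick way to check both points before calling fit is to print the types and shapes involved. This is only a sketch, using the variable names from the question (x_train_pad, y_train, x_train_tokens, num_words):

    # 1. Inputs and targets should be arrays, not strings or lists of strings
    print(type(x_train_pad), getattr(x_train_pad, 'shape', None))  # expect <class 'numpy.ndarray'> and (N, max_tokens)
    print(type(y_train), type(y_train[0]))                         # labels should be numeric, not str

    # 2. Every token index should fit inside the Embedding layer's input_dim
    max_index = max((max(seq) for seq in x_train_tokens if seq), default=0)
    print(max_index < num_words)  # should print True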

Solution 2

There is an error in this line: model.fit(x_train_pad, y_train, epochs=5, batch_size=256). Basically, x_train_pad is a str (string) and it should be a NumPy array.
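
The traceback in the question also shows exception_prefix='target', so it is worth converting both the padded inputs and the labels to NumPy arrays before calling fit. A minimal sketch, assuming the variable names from the question and that y_train holds string labels from the 'duygu' column (the label-to-integer mapping below is only illustrative):

    import numpy as np

    x_train_pad = np.asarray(x_train_pad)

    # If the labels are strings, map them to 0/1 first; a sigmoid output expects numeric targets
    unique_labels = sorted(set(y_train))                       # e.g. two sentiment classes
    label_to_int = {label: i for i, label in enumerate(unique_labels)}
    y_train = np.asarray([label_to_int[label] for label in y_train], dtype='float32')

    model.fit(x_train_pad, y_train, epochs=5, batch_size=256)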


Comments

  • Admin almost 2 years

    I'm a beginner in Python. I am trying to conduct sentiment analysis with an RNN, but I get AttributeError: 'str' object has no attribute 'shape'. I reviewed all the posted solutions about this problem, but I couldn't solve it. I tried the same code with another data file and it works, but not with my original data file.

    This is my code:

    import numpy as np
    import pandas as pd
    from tensorflow.python.keras.models import Sequential
    from tensorflow.python.keras.layers import Dense, GRU, Embedding, CuDNNGRU
    from tensorflow.python.keras.optimizers import Adam
    from tensorflow.python.keras.preprocessing.text import Tokenizer
    from tensorflow.python.keras.preprocessing.sequence import pad_sequences
    
    
    dataset = pd.read_csv(r'C:\Users\Administrator\Desktop\tümveri8.csv', encoding='latin1')
    
    target = dataset['duygu'].values.tolist()
    data = dataset['yorum'].values.tolist()
    
    cutoff = int(len(data) * 0.80)
    x_train, x_test = data[:cutoff], data[cutoff:]
    y_train, y_test = target[:cutoff], target[cutoff:]
    
    num_words = 10000
    tokenizer = Tokenizer(num_words=num_words)
    tokenizer.fit_on_texts(data)
    
    x_train_tokens = tokenizer.texts_to_sequences(x_train)
    
    x_test_tokens = tokenizer.texts_to_sequences(x_test)
    
    num_tokens = [len(tokens) for tokens in x_train_tokens + x_test_tokens]
    num_tokens = np.array(num_tokens)
    max_tokens = np.mean(num_tokens) + 2 * np.std(num_tokens)
    max_tokens = int(max_tokens)
    max_tokens
    
    np.sum(num_tokens < max_tokens) / len(num_tokens)
    x_train_pad = pad_sequences(x_train_tokens, maxlen=max_tokens)
    x_test_pad = pad_sequences(x_test_tokens, maxlen=max_tokens)
    
    idx = tokenizer.word_index
    inverse_map = dict(zip(idx.values(), idx.keys()))
    def tokens_to_string(tokens):
        words = [inverse_map[token] for token in tokens if token != 0]
        text = ' '.join(words)
        return text
    
    model = Sequential()
    embedding_size = 50
    model.add(Embedding(input_dim=num_words,
                        output_dim=embedding_size,
                        input_length=max_tokens,
                        name='embedding_layer'))
    
    model.add(GRU(units=16, return_sequences=True))
    model.add(GRU(units=8, return_sequences=True))
    model.add(GRU(units=4))
    model.add(Dense(1, activation='sigmoid'))
    
    optimizer = Adam(lr=1e-3)
    
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])
    model.summary()
    

    This is the code that raises the error, followed by the traceback:

    model.fit(x_train_pad, y_train, epochs=5, batch_size=256)
    
    AttributeError                            Traceback (most recent call 
    last)
    <ipython-input-79-631bbf0ac3a7> in <module>
    ----> 1 model.fit(x_train_pad, y_train, epochs=5, batch_size=256)
    
    ~\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py 
    in fit(self, x, y, batch_size, epochs, verbose, callbacks, 
    validation_split, validation_data, shuffle, class_weight, sample_weight, 
    initial_epoch, steps_per_epoch, validation_steps, validation_freq, 
    max_queue_size, workers, use_multiprocessing, **kwargs)
    707         steps=steps_per_epoch,
    708         validation_split=validation_split,
    --> 709         shuffle=shuffle)
    710 
    711     # Prepare validation data.
    
    ~\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py 
    in _standardize_user_data(self, x, y, sample_weight, class_weight, 
    batch_size, check_steps, steps_name, steps, validation_split, shuffle, 
    extract_tensors_from_dataset)
    2671           shapes=None,
    2672           check_batch_axis=False,  # Don't enforce the batch size.
    -> 2673           exception_prefix='target')
    2674 
    2675       # Generate sample-wise weight values given the `sample_weight` 
    and
    
    ~\Anaconda3\lib\site- 
    packages\tensorflow\python\keras\engine\training_utils.py in 
    standardize_input_data(data, names, shapes, check_batch_axis, 
    exception_prefix)
    335     ]
    336   else:
    --> 337     data = [standardize_single_array(x) for x in data]
    338 
    339   if len(data) != len(names):
    
    ~\Anaconda3\lib\site- 
     packages\tensorflow\python\keras\engine\training_utils.py in <listcomp> (.0)
    335     ]
    336   else:
    --> 337     data = [standardize_single_array(x) for x in data]
    338 
    339   if len(data) != len(names):
    
    ~\Anaconda3\lib\site- 
    packages\tensorflow\python\keras\engine\training_utils.py in 
    standardize_single_array(x, expected_shape)
    263     return None
    264 
    --> 265     if (x.shape is not None and len(x.shape) == 1 and
    266        (expected_shape is None or len(expected_shape) != 1)):
    267     if tensor_util.is_tensor(x):
    
    AttributeError: 'str' object has no attribute 'shape'
    
  • Admin over 4 years
    Thank you for your answer, Pietro. I tried Num at the beginning of the line, but in this case it doesn't work. I would appreciate any idea of how I can fix it. Thank you.
  • Admin over 4 years
    I can only run 1 epoch; the other 4 are not working.
  • Admin over 4 years
    Thank you for your answer. I'm sorry for asking again, but I'm really a beginner in Python. I don't know how to check whether my word embedding is valid for every word. I would appreciate it if you could tell me whether there is a way to do it. Thank you.
  • monkeyking9528 over 4 years
    One method would be to print x_train_pad's shape to see whether it is a NumPy array of shape (sample_size, max_word_length).