AttributeError: 'str' object has no attribute 'ndim'

17,696

You are feeding a list of strings to the model, which it does not expect: the model needs numeric arrays. You can use the keras.preprocessing.text module to convert the text into integer sequences. More specifically, you can prepare the data like this:

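from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

# texts: list of raw strings; maxlen: the fixed sequence length you choose
tk = Tokenizer()
tk.fit_on_texts(texts)                              # build the word -> index vocabulary
index_list = tk.texts_to_sequences(texts)           # each text becomes a list of ints
x_train = pad_sequences(index_list, maxlen=maxlen)  # pad/truncate every list to maxlen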

Now x_train (an ndarray of shape (n_samples, maxlen), with dtype int32 by default) is a legitimate input for the model.
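To make it clearer what those two calls produce, here is a rough pure-Python sketch that mimics their default behaviour as I understand it: words are indexed by descending frequency starting at 1, unseen words are dropped, and sequences are padded with 0 and truncated at the front. This is not Keras's implementation; the helper names (tokenize, fit_word_index, to_sequences, pad_pre) and the sample sentences are made up for illustration, and the tokenizer regex is a simplification of Keras's default filters. Note that the vocabulary is built from the training texts only and then reused for the test texts.

```python
import re
from collections import Counter


def tokenize(text):
    # Simplified stand-in for Keras's default filtering: lowercase, then
    # keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def fit_word_index(texts):
    # Rank words by descending frequency; indices start at 1 because
    # 0 is reserved for padding.
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def to_sequences(texts, word_index):
    # Words that were not seen while fitting are silently dropped.
    return [[word_index[w] for w in tokenize(t) if w in word_index] for t in texts]


def pad_pre(sequences, maxlen):
    # Keep the last maxlen ids and left-pad shorter sequences with 0.
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]
        padded.append([0] * (maxlen - len(seq)) + seq)
    return padded


train_texts = ["A great, great movie", "Not a great movie", "Terrible"]
test_texts = ["Great plot"]

word_index = fit_word_index(train_texts)  # fit on training data only
x_train = pad_pre(to_sequences(train_texts, word_index), maxlen=3)
x_test = pad_pre(to_sequences(test_texts, word_index), maxlen=3)

print(x_train)
print(x_test)
```

Applied to the question's code, the same conversion has to be done for the test texts before they are passed as validation_data or to evaluate. The test list there is also built as pos + negT, which looks like it was meant to be posT + negT so that it lines up with yTest.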

Author by

Amy

Updated on September 27, 2022

Comments

  • Amy
    Amy over 1 year

    I'm using Keras to implement sentiment analysis. My training data is as follows:

    • pos.txt: text file of all positive reviews, one per line
    • neg.txt: text file of all negative reviews, one per line

    I built my code in a similar fashion to the example here.

    The only difference is that their data is imported from a Keras dataset, while mine comes from text files.

    This is my code:

    # CNN for the IMDB problem
    
    top_words = 5000
    
    pos_file=open('pos.txt', 'r')
    neg_file=open('neg.txt', 'r')
    # Load data from files
    pos = list(pos_file.readlines())
    neg = list(neg_file.readlines())
    x = pos + neg
    total = numpy.array(x)
    # Generate labels
    positive_labels = [1 for _ in pos]
    negative_labels = [0 for _ in neg]
    y = numpy.concatenate([positive_labels, negative_labels], 0)

    # Testing
    pos_test=open('posTest.txt', 'r')
    posT = list(pos_test.readlines())
    print("pos length is",len(posT))

    neg_test=open('negTest.txt', 'r')
    negT = list(neg_test.readlines())
    xTest = pos + negT
    total2 = numpy.array(xTest)
    
    # Generate labels
    positive_labels2 = [1 for _ in posT]
    negative_labels2 = [0 for _ in negT]
    yTest = numpy.concatenate([positive_labels2, negative_labels2], 0)
    
    #Create model
    max_words = 1
    model = Sequential()
    model.add(Embedding(top_words, 32, input_length=max_words))
    
    model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
    model.add(MaxPooling1D(pool_size=1))
    model.add(Flatten())
    model.add(Dense(250, activation='relu'))
    
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    print(model.summary())
    
    #Fit the model
    
    model.fit(total, y, validation_data=(xTest, yTest), epochs=2, batch_size=128, verbose=2)
    
    # Final evaluation of the model
    scores = model.evaluate(total2, yTest, verbose=0)
    print("Accuracy: %.2f%%" % (scores[1]*100))
    

    When I run my code, I get this error:

    File "C:\Users\\Anaconda3\lib\site-packages\keras\engine\training.py", line 70, in <listcomp>
    data = [np.expand_dims(x, 1) if x is not None and x.ndim == 1 else x for x in data]
    
    AttributeError: 'str' object has no attribute 'ndim'
    
  • Amy
    Amy about 6 years
    Yes, it worked!