Unable to import Tokenizer from Keras

It appears the import is working correctly, but the Tokenizer object has no attribute word_index.

According to the documentation, that attribute is only set once you call the method fit_on_texts on the Tokenizer object.

The following code runs successfully:

 from keras.preprocessing.text import Tokenizer

 samples = ['The cat sat on the mat.', 'The dog ate my homework.']

 tokenizer = Tokenizer(num_words=1000)
 tokenizer.fit_on_texts(samples)

 one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')

 word_index = tokenizer.word_index
 print('Found %s unique tokens.' % len(word_index))
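
If you don't have Keras installed to verify this, the gist of what fit_on_texts builds can be sketched in plain Python. This is an illustrative approximation of the Tokenizer's behavior (lowercasing, stripping punctuation, ranking words by frequency), not the actual Keras implementation:

```python
from collections import Counter
import re

def build_word_index(texts):
    """Mimic Tokenizer.fit_on_texts: map each word to an integer rank,
    with the most frequent word getting index 1 (index 0 is reserved)."""
    counts = Counter()
    for text in texts:
        # Keras lowercases and strips punctuation by default; do the same here.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(words)
    # Rank words by frequency; ties keep first-seen order.
    return {word: rank
            for rank, (word, _) in enumerate(counts.most_common(), start=1)}

samples = ['The cat sat on the mat.', 'The dog ate my homework.']
word_index = build_word_index(samples)
print('Found %s unique tokens.' % len(word_index))  # → Found 9 unique tokens.
```

Note that "The" appears three times across the two sentences, so it gets index 1, and the count of unique tokens is 9 rather than 11.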
Author: rmahesh

Updated on June 04, 2022

Comments

  • rmahesh, almost 2 years ago

    Currently working through a Deep Learning example and they are using a Tokenizer package. I am getting the following error:

    AttributeError: 'Tokenizer' object has no attribute 'word_index'

    Here is my code:

    from keras.preprocessing.text import Tokenizer
    
    samples = ['The cat sat on the mat.', 'The dog ate my homework.']
    
    tokenizer = Tokenizer(num_words=1000)
    tokenizer.fit_on_sequences(samples)
    
    sequences = tokenizer.texts_to_sequences(samples)
    
    one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')
    
    word_index = tokenizer.word_index
    print('Found %s unique tokens.' % len(word_index))
    

    Could anyone help me catch my mistake?