Unable to import Tokenizer from Keras

It appears the import is working correctly, but the Tokenizer object has no attribute word_index.

According to the documentation, that attribute is only set once you call the method fit_on_texts on the Tokenizer object.

The following code runs successfully:

 from keras.preprocessing.text import Tokenizer

 samples = ['The cat sat on the mat.', 'The dog ate my homework.']

 tokenizer = Tokenizer(num_words=1000)
 tokenizer.fit_on_texts(samples)

 one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')

 word_index = tokenizer.word_index
 print('Found %s unique tokens.' % len(word_index))
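
If you don't have Keras installed to verify this, the gist of what fit_on_texts builds can be sketched in plain Python. This is an illustrative approximation of the Tokenizer's behavior (lowercasing, stripping punctuation, ranking words by frequency), not the actual Keras implementation:

```python
from collections import Counter
import re

def build_word_index(texts):
    """Mimic Tokenizer.fit_on_texts: map each word to an integer rank,
    with the most frequent word getting index 1 (index 0 is reserved)."""
    counts = Counter()
    for text in texts:
        # Keras lowercases and strips punctuation by default; do the same here.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(words)
    # Rank words by frequency; ties keep first-seen order.
    return {word: rank
            for rank, (word, _) in enumerate(counts.most_common(), start=1)}

samples = ['The cat sat on the mat.', 'The dog ate my homework.']
word_index = build_word_index(samples)
print('Found %s unique tokens.' % len(word_index))  # → Found 9 unique tokens.
```

Note that "The" appears three times across the two sentences, so it gets index 1, and the count of unique tokens is 9 rather than 11.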
Author: rmahesh

Updated on June 04, 2022

Comments

  • rmahesh, almost 2 years ago

    Currently working through a Deep Learning example and they are using a Tokenizer package. I am getting the following error:

    AttributeError: 'Tokenizer' object has no attribute 'word_index'

    Here is my code:

    from keras.preprocessing.text import Tokenizer
    
    samples = ['The cat sat on the mat.', 'The dog ate my homework.']
    
    tokenizer = Tokenizer(num_words=1000)
    tokenizer.fit_on_sequences(samples)
    
    sequences = tokenizer.texts_to_sequences(samples)
    
    one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')
    
    word_index = tokenizer.word_index
    print('Found %s unique tokens.' % len(word_index))
    

    Could anyone help me catch my mistake?