AttributeError: getfeature_names not found; using scikit-learn

Solution 1

I see two problems with your code. First, you are calling get_feature_names() on the matrix returned by the transform rather than on the vectorizer; the method belongs to the vectorizer. Second, you are breaking this into more steps than necessary. TfidfVectorizer.fit_transform() does the counting and the TF-IDF weighting in a single call. Try this:

from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()
transformed = vectorizer.fit_transform(word_data)
print "Num words:", len(vectorizer.get_feature_names())
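
On newer scikit-learn releases the method is named get_feature_names_out(); get_feature_names() was deprecated in 1.0 and removed in 1.2. A minimal Python 3 sketch of the same idea follows. The two sample documents are made up for illustration and stand in for the asker's word_data, which is assumed to be a list of strings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for word_data (assumed to be a list of document strings).
word_data = ["the cat sat on the mat", "the dog chased the cat"]

vectorizer = TfidfVectorizer()
# fit_transform returns a sparse matrix with one row per document.
transformed = vectorizer.fit_transform(word_data)

# The vocabulary lives on the vectorizer, not on the matrix.
# Newer releases expose get_feature_names_out(); older ones only get_feature_names().
if hasattr(vectorizer, "get_feature_names_out"):
    feature_names = list(vectorizer.get_feature_names_out())
else:
    feature_names = vectorizer.get_feature_names()

print("Num words:", len(feature_names))
```

The hasattr check is only there so the sketch runs on both old and new installs; on a known version you can call the appropriate method directly.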

Solution 2

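from sklearn.feature_extraction.text import TfidfVectorizer

# Drop common English stop words while building the vocabulary.
TfIdfer = TfidfVectorizer(stop_words = 'english')
# Fit on the documents and keep the dense TF-IDF array.
tfidf_array = TfIdfer.fit_transform(word_data).toarray()
# Feature names come from the vectorizer, not from the matrix.
names = TfIdfer.get_feature_names()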

Solution 3

Shouldn't it be get_feature_names(), i.e. with an underscore after 'get'?

Also, I am not sure what you are trying to do, but get_feature_names is a method valid only for the *Vectorizer classes, not for TfidfTransformer. Maybe you want TfidfVectorizer instead?
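
If you would rather keep the two-step CountVectorizer plus TfidfTransformer approach, the feature names can be read from the CountVectorizer, since it is the object that holds the vocabulary. The following is a rough Python 3 sketch under that assumption; the sample documents are invented for illustration, and the fallback between get_feature_names_out() and get_feature_names() is only there to cover both newer and older scikit-learn releases.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Hypothetical stand-in for word_data (assumed to be a list of document strings).
word_data = ["the cat sat on the mat", "the dog chased the cat"]

# Step 1: raw term counts.
count_vectorizer = CountVectorizer()
freq_term_mat = count_vectorizer.fit_transform(word_data)

# Step 2: re-weight the counts with TF-IDF and L2-normalise each row.
tfidf = TfidfTransformer(norm="l2")
tf_idf_matrix = tfidf.fit_transform(freq_term_mat)

# The sparse matrix has no feature-name method, so ask the vectorizer.
# Prefer get_feature_names_out() when it exists, else fall back to get_feature_names().
getter = getattr(count_vectorizer, "get_feature_names_out", None) or count_vectorizer.get_feature_names
voc_words = list(getter())

print("The num of words =", len(voc_words))
```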

Author: Farheen Nilofer

Updated on June 11, 2022

Comments

  • Farheen Nilofer almost 2 years
    from sklearn.feature_extraction.text import CountVectorizer
    
    vectorizer = CountVectorizer()
    vectorizer = vectorizer.fit(word_data)
    freq_term_mat = vectorizer.transform(word_data)
    
    from sklearn.feature_extraction.text import TfidfTransformer
    
    tfidf = TfidfTransformer(norm="l2")
    tfidf = tfidf.fit(freq_term_mat)
    Ttf_idf_matrix = tfidf.transform(freq_term_mat)
    
    voc_words = Ttf_idf_matrix.getfeature_names()
    print "The num of words = ",len(voc_words)
    

    When I run the program containing this piece of code, I get the following error:

    Traceback (most recent call last): File "vectorize_text.py", line 87, in
    voc_words = Ttf_idf_matrix.getfeature_names()
    File "/home/farheen/anaconda/lib/python2.7/site-packages/scipy/sparse/base.py", line 499, in __getattr__
    raise AttributeError(attr + " not found")
    AttributeError: get_feature_names not found

    Please suggest a solution.