WordListCorpusReader is not iterable
When you did
from nltk.corpus import stopwords
the name stopwords became a variable pointing to the CorpusReader object in nltk, not to a list of words.
The actual stop words (i.e. a list of stop words) you're looking for are created when you do:
stop_words = set(stopwords.words("english"))
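As a rough illustration of why the membership test fails on the reader but works on the set, here is a minimal sketch. FakeCorpusReader is a hypothetical stand-in written for this example, not NLTK's actual class; it only assumes that the reader exposes a words() method and does not itself support the in operator.

```python
class FakeCorpusReader:
    """Hypothetical stand-in for a corpus reader (not NLTK's real class).

    It stores words but defines no __contains__, __iter__ or __getitem__,
    so Python cannot evaluate `x in reader`.
    """

    def __init__(self, wordlist):
        self._wordlist = list(wordlist)

    def words(self):
        # Plays the role of stopwords.words("english"): returns a plain list.
        return list(self._wordlist)


reader = FakeCorpusReader(["the", "a", "is"])

try:
    "the" in reader  # membership test on the reader object itself
    reader_supports_in = True
except TypeError:
    # Same kind of failure as the "is not iterable" error in the title.
    reader_supports_in = False

# Asking the reader for its words gives a real collection that supports `in`.
stop_words = set(reader.words())

print(reader_supports_in)
print("the" in stop_words)
```

With the real library, the set built from stopwords.words("english") is the object to test against, as in the line above.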
So when checking whether a word in your list of tokens is a stop word, you should do:
from nltk.corpus import stopwords
stop_words = set(stopwords.words("english"))
for w in tokenized_sent:
    if w not in stop_words:
        pass # Do something.
To avoid confusion, I usually name the actual list of stopwords as stoplist:
from nltk.corpus import stopwords
stoplist = set(stopwords.words("english"))
for w in tokenized_sent:
    if w not in stoplist:
        pass # Do something.
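One caveat when the tokens are read back with csv.reader: each item it yields is a list of cells, not a single string, and testing a list against a set raises TypeError: unhashable type: 'list'. The sketch below loops over the cells of each row instead. The tiny stoplist and the in-memory file contents are assumptions made so the example runs without NLTK data; in practice stoplist would be set(stopwords.words("english")) and the rows would come from the real file.

```python
import csv
import io

# Tiny stand-in stop list so the sketch runs without NLTK data (assumption);
# in practice this would be set(stopwords.words("english")).
stoplist = {"the", "is", "a"}

# Hypothetical file contents: one token per line.
fake_file = io.StringIO("the\nbattery\nis\ngreat\n")

kept = []
for row in csv.reader(fake_file):
    # row is a list such as ['the']; `row not in stoplist` would raise
    # TypeError: unhashable type: 'list', so test each cell instead.
    for token in row:
        if token.lower() not in stoplist:
            kept.append(token)

print(kept)  # ['battery', 'great']
```

Here kept plays the role of tokenized_sent after filtering: any list of token strings works in the loops shown above.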
Author: Aarushi Aiyyar
Updated on October 30, 2020
Comments
-
Aarushi Aiyyar over 3 years
So, I am new to using Python and NLTK. I have a file called reviews.csv which consists of comments extracted from Amazon. I have tokenized the contents of this CSV file and written them to a file called csvfile.csv. Here's the code:
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.stem import PorterStemmer
import csv #CommaSpaceVariable
from nltk.corpus import stopwords
ps = PorterStemmer()
stop_words = set(stopwords.words("english"))
with open ('reviews.csv') as csvfile:
    readCSV = csv.reader(csvfile,delimiter='.')
    for lines in readCSV:
        word1 = word_tokenize(str(lines))
        print(word1)
        with open('csvfile.csv','a') as file:
            for word in word1:
                file.write(word)
                file.write('\n')
with open ('csvfile.csv') as csvfile:
    readCSV1 = csv.reader(csvfile)
    for w in readCSV1:
        if w not in stopwords:
            print(w)
I am trying to perform stemming on csvfile.csv. But I get this error:
Traceback (most recent call last):
  File "/home/aarushi/test.py", line 25, in <module>
    if w not in stopwords:
TypeError: argument of type 'WordListCorpusReader' is not iterable
-
Aarushi Aiyyar over 6 years
Thank you! I modified stopwords to stop_words but I get an error saying: if w not in stop_words: TypeError: unhashable type: 'list'
-
Mohammad Heydari almost 5 years
@alvas, I get this error: NameError: name 'tokenized_sent' is not defined