Python word_tokenize

python nltk tokenize

14,282

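Your issue is that you're running nltk.download() in a script, and the downloader GUI is opening hidden somewhere behind your other windows. The "showing info" line is the downloader starting up, and the script waits there until that window is closed.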

Generally, nltk.download() is run in the Python interpreter. It lets you download various datasets and corpora for use with nltk. You usually only have to do this once, running it again only if you want to update your corpora. You don't have to run it every single time you run a script.

Assuming you've run nltk.download() in the Python interpreter, you will either get a GUI or, if you don't have access to one (for example, if you're SSH'd in without X forwarding), a command-line interface. Either can be used to download the data. I'd recommend just downloading it all, unless you're short on space.
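If you'd rather not open the downloader by hand at all, something like the sketch below can fetch a single package from code, and only when it is missing. The helper name ensure_nltk_resource is made up for illustration. The "punkt" package name is an assumption that fits NLTK versions from around the time of this answer; newer releases may look for "punkt_tab" instead, so check the LookupError message you get.

```python
import nltk


def ensure_nltk_resource(resource_path, package_name):
    """Download package_name only if resource_path is not installed yet.

    Hypothetical helper, not part of nltk. Returns True if a download
    was triggered and False if the resource was already present.
    """
    try:
        # nltk.data.find raises LookupError when the resource is missing.
        nltk.data.find(resource_path)
        return False
    except LookupError:
        # Passing a package id skips the GUI and fetches just that package.
        nltk.download(package_name, quiet=True)
        return True
```

Calling ensure_nltk_resource("tokenizers/punkt", "punkt") once near the top of a script should be enough; on later runs the data is already on disk, so nothing is downloaded and no window appears.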

Once you've run nltk.download() and downloaded everything you think you'll need, the code below should work.

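import nltk

# Read the whole file; the path is relative to the current working directory.
with open(r"ecelebi\1.txt", "r") as text_file:
    p = text_file.read()

# Split the text into word tokens (needs the tokenizer data downloaded earlier).
words = nltk.tokenize.word_tokenize(p)

# Count how often each token occurs.
fdist = nltk.FreqDist(words)
print(fdist)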

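Note that the call is nltk.FreqDist, not FreqDist. Your script only does import nltk, so the bare name FreqDist is undefined and would raise a NameError once the script got that far. Alternatively, add from nltk import FreqDist and keep the short name.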
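Since print(fdist) only shows a short summary, here is a small sketch of how you might actually look inside the distribution. The token list is made-up sample data standing in for the output of word_tokenize, so this part runs without any downloaded corpora.

```python
import nltk

# Hypothetical sample tokens; in the real script these come from word_tokenize.
tokens = ["the", "cat", "sat", "on", "the", "cat", "the", "mat"]

fdist = nltk.FreqDist(tokens)

print(fdist["the"])          # count for a single token
print(fdist.most_common(2))  # the two most frequent tokens with their counts
print(fdist.N())             # total number of tokens counted
```

FreqDist behaves like a collections.Counter, so the usual dictionary-style lookups work on it too.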

Author by

Admin

Updated on July 12, 2022

Comments

Admin almost 2 years

I'm quite new to Python. I'm trying to find the frequency distribution of the words in my text. Here is the code:

import nltk
nltk.download()
import os
os.getcwd()
text_file=open(r"ecelebi\1.txt","r")
p = text_file.read()
words = nltk.tokenize.word_tokenize(p)
fdist= FreqDist(words)
print(fdist)

The problem is that the program gives neither an error nor a result. It just prints this:

>>> ================================ RESTART ================================
>>> 
showing info http://nltk.github.com/nltk_data/

I think the problem is with word_tokenize(). I would appreciate any help. Thank you.
