NLTK: why does nltk not recognize the CLASSPATH variable for stanford-ner?


Solution 1

Rename the .jar file from stanford-ner-3.5.2.jar to stanford-ner.jar, and update the CLASSPATH environment variable to point to the renamed file.

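Apparently NLTK has a name_pattern parameter in nltk/internals.py, and it only accepts a CLASSPATH entry if the file name matches stanford-ner.jar. A versioned name such as stanford-ner-3.5.2.jar does not match, so the jar is reported as not found even though the variable is set.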
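
To make the name check concrete, here is a minimal sketch of that kind of lookup. It is not NLTK's actual code: the helper find_jar_on_classpath is hypothetical, and it assumes the check is a plain comparison of each CLASSPATH entry's file name against the expected jar name.

```python
import os

def find_jar_on_classpath(expected_name, classpath):
    """Return the first CLASSPATH entry whose file name equals expected_name.

    Hypothetical helper for illustration only; it does not touch the disk
    and is a simplification of whatever NLTK does internally.
    """
    for entry in classpath.split(os.pathsep):
        if os.path.basename(entry) == expected_name:
            return entry
    return None

# Paths taken from the question; the second one is the renamed jar.
versioned = "/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner-3.5.2.jar"
renamed = "/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner.jar"

print(find_jar_on_classpath("stanford-ner.jar", versioned))  # no match
print(find_jar_on_classpath("stanford-ner.jar", renamed))    # match
```

Under that assumption, the versioned file name never equals stanford-ner.jar, which would explain why the variable is visible in os.environ and still ignored.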

Solution 2

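Instead of relying on CLASSPATH, pass the full paths to the classifier and the jar directly (use raw strings so the backslashes in Windows paths are not treated as escapes):

from nltk.tag import StanfordNERTagger
st = StanfordNERTagger(r'C:\Python26\Lib\stanford-ner\classifiers\english.all.3class.distsim.crf.ser.gz', r'C:\Python26\Lib\stanford-ner\stanford-ner.jar')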

Make sure both paths point to files that actually exist. If you still get a Java environment variable error, your Java environment is not configured. On Windows, set it under 'My Computer -> Properties -> Advanced Settings'; there are videos that walk through what these settings do. Once the environment is set up properly, running the Python file opens a black command window for around ten seconds while it processes your file, and the result should come back without an error.
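
Before constructing the tagger, it can help to confirm these preconditions in Python. The sketch below is only an illustration: check_stanford_setup is a hypothetical helper, and treating "Java is configured" as "a java executable is on PATH" is an assumption, since a setup may rely on other variables instead.

```python
import os
import shutil

def check_stanford_setup(model_path, jar_path):
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper, not part of NLTK.
    """
    problems = []
    if not os.path.isfile(model_path):
        problems.append("model file not found: " + model_path)
    if not os.path.isfile(jar_path):
        problems.append("jar file not found: " + jar_path)
    # Assumption: Java counts as configured if `java` can be found on PATH.
    if shutil.which("java") is None:
        problems.append("java executable not found on PATH")
    return problems

problems = check_stanford_setup(
    r'C:\Python26\Lib\stanford-ner\classifiers\english.all.3class.distsim.crf.ser.gz',
    r'C:\Python26\Lib\stanford-ner\stanford-ner.jar',
)
for problem in problems:
    print(problem)
```

If the list comes back empty, the paths and Java lookup are at least plausible, and any remaining error is more likely to come from the tagger call itself.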

Author by

Admin

Updated on June 15, 2022

Comments

  • Admin
    Admin almost 2 years

    This is my code

    from nltk.tag import StanfordNERTagger
    st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz')
    

    And I get

    NLTK was unable to find stanford-ner.jar! Set the CLASSPATH
      environment variable.
    

    This is what my .bashrc looks like on Ubuntu

    export CLASSPATH=/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner-3.5.2.jar
    export STANFORD_MODELS=/home/wolfgang/Downloads/stanford-ner-2015-04-20/classifiers
    

    I also tried printing the environment variable in Python this way

    import os
    os.environ.get('CLASSPATH')
    

    And I receive

    '/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner-3.5.2.jar'
    

    So the variables are being set!

    What is wrong then?

    Why doesn't NLTK recognize my environment variables?