Error in creating the StanfordCoreNLP object


Solution 1

The exception is thrown because the POS tagger model is missing. The CoreNLP downloads come in versions with and without the model files.

Either add stanford-postagger-full-3.3.1.jar, which can be found on the following page (stanford-postagger-full-2014-01-04.zip): http://nlp.stanford.edu/software/tagger.shtml .

Or do the same with the whole CoreNLP package (stanford-corenlp-full....jar): http://nlp.stanford.edu/software/corenlp.shtml (You can then drop all the postagger dependencies too, since they are included in CoreNLP.)

If you only want to add the model files, look on Maven Central and download "stanford-corenlp-3.3.1-models.jar".
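To confirm that the model files are actually visible before building the pipeline, a quick classpath lookup can help. The following is only a minimal sketch: the class name ModelClasspathCheck and the method isOnClasspath are made up for illustration and are not part of CoreNLP, and the resource path is the one named in the "Unable to resolve" message of the exception. It assumes a plain "java -cp ..." launch, where the system class loader sees the same jars that CoreNLP does; in containers or other custom class loader setups the result may differ.

```java
import java.net.URL;

// Hypothetical helper for illustration only; not part of CoreNLP.
class ModelClasspathCheck {

    // Resource path copied from the "Unable to resolve ..." exception message.
    static final String POS_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    /** Returns true if the named resource is visible to the system class loader. */
    static boolean isOnClasspath(String resourcePath) {
        URL url = ClassLoader.getSystemResource(resourcePath);
        return url != null;
    }

    public static void main(String[] args) {
        if (isOnClasspath(POS_MODEL)) {
            // Printing the URL shows which jar the model was loaded from.
            System.out.println("POS model found: " + ClassLoader.getSystemResource(POS_MODEL));
        } else {
            System.out.println("POS model is not on the classpath; add the models jar.");
        }
    }
}
```

Run it with the same classpath as your application. If it reports that the model is missing, the models jar has not been picked up, whichever of the solutions here you used to add it.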

Solution 2

An easier way to add the model files is to add the following dependencies to your pom.xml and let Maven manage them for you:

<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.6.0</version>
</dependency>
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.6.0</version>
  <classifier>models</classifier> <!--  will get the dependent model jars -->
</dependency>

Solution 3

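If anyone is looking for Gradle dependencies, add the following under dependencies. (On Gradle 7 and later, where the compile configuration has been removed, use implementation in its place.)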

 compile group: 'edu.stanford.nlp', name: 'stanford-corenlp', version: '3.9.1'
 compile group: 'edu.stanford.nlp', name: 'stanford-corenlp', version: '3.9.1', classifier: 'models'
Author: Lohath Unique

Updated on September 15, 2022

Comments

  • Lohath Unique
    Lohath Unique over 1 year

    I have downloaded and installed the required jar files from http://nlp.stanford.edu/software/corenlp.shtml#Download.

    I have included these five jar files:

    Stanford-postagger.jar

    Stanford-postagger-3.3.1.jar

    Stanford-postagger-3.3.1.jar-javadoc.jar

    Stanford-postagger-3.3.1.jar-src.jar

    stanford-corenlp-3.3.1.jar

    and the code is:

    import java.util.Properties;

    import edu.stanford.nlp.pipeline.StanfordCoreNLP;

    public class lemmafirst {

        protected StanfordCoreNLP pipeline;

        public lemmafirst() {
            // Create StanfordCoreNLP object properties, with POS tagging
            // (required for lemmatization), and lemmatization
            Properties props = new Properties();
            props.setProperty("annotators", "tokenize, ssplit, pos, lemma");

            /*
             * This is a pipeline that takes in a string and returns various analyzed linguistic forms.
             * The String is tokenized via a tokenizer (such as PTBTokenizerAnnotator),
             * and then other sequence model style annotation can be used to add things like lemmas,
             * POS tags, and named entities. These are returned as a list of CoreLabels.
             * Other analysis components build and store parse trees, dependency graphs, etc.
             *
             * This class is designed to apply multiple Annotators to an Annotation.
             * The idea is that you first build up the pipeline by adding Annotators,
             * and then you take the objects you wish to annotate and pass them in and
             * get in return a fully annotated object.
             *
             * StanfordCoreNLP loads a lot of models, so you probably
             * only want to do this once per execution
             */
            this.pipeline = new StanfordCoreNLP(props); // the exception is thrown here
        }

        // remainder of the class (including main, which appears in the stack trace) is not shown
    }
    

    My problem is in creating the pipeline.

    The error that I got is:

    Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
        at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:563)
        at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:262)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
        at lemmafirst.<init>(lemmafirst.java:39)
        at lemmafirst.main(lemmafirst.java:83)
    Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:758)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
        at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:88)
        at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:76)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:561)
        ... 6 more
    Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
        at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:434)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:753)
        ... 11 more
    

    Can anyone please correct the errors? Thank you

  • Christopher Schröder
    Christopher Schröder about 10 years
    Short answer: Download the full CoreNLP package, which includes the model files, from here: nlp.stanford.edu/software/corenlp.shtml
  • Lohath Unique
    Lohath Unique about 10 years
    Thanks a lot. I have added stanford-corenlp-3.3.1-models.jar and it worked for me. Thanks, Christopher.
  • Ishan Kumar
    Ishan Kumar over 6 years
    Thanks @Sruthi Poddutur for your comment. It helped resolve my issue.
  • alextsil
    alextsil over 6 years
    I tried this and it didn't work for some reason. I then used jarsplice to splice my jar and the ..-models.jar together, and it worked.
  • Khan
    Khan almost 6 years
    For gradle, what should I write in build.gradle?