How to use Stanford parser

11,269

Solution 1

I am using Stanford parser to extract entities like name ,location,organization.

Here is my code:

public class stanfrdIntro {

    public static void main(String[] args) throws IOException, SAXException,
  {

        String serializedClassifier = "classifiers/english.all.3class.distsim.crf.ser.gz";


        AbstractSequenceClassifier<CoreLabel> classifier = CRFClassifier
                .getClassifierNoExceptions(serializedClassifier);

       String s1 = "Good afternoon Rahul Kulhari, how are you today?";

       s1 = s1.replaceAll("\\s+", " ");
       String  t=classifier.classifyWithInlineXML(s1);
    System.out.println(Arrays.toString(getTagValues(t).toArray()));

    }
       private static final Pattern TAG_REGEX = Pattern.compile("<PERSON>(.+?)</PERSON>");

private static Set<String> getTagValues(final String str) {
    final Set<String> tagValues = new HashSet<String>();
    //final Set<String> tagValues = new TreeSet();
    final Matcher matcher = TAG_REGEX.matcher(str);
    while (matcher.find()) {
        tagValues.add(matcher.group(1));
    }

    return tagValues;
}

This might help you but i am extracting only entities.

Solution 2

All grammars are located in the included models jar. Is the "stanford-parser-2.0.5-models.jar" in the execution directory or classpath?

Share:
11,269
SahelSoft
Author by

SahelSoft

Updated on June 24, 2022

Comments

  • SahelSoft
    SahelSoft almost 2 years

    I downloaded the Stanford parser 2.0.5 and use Demo2.java source code that is in the package, but After I compile and run the program it has many errors. A part of my program is:

    public class testStanfordParser {
    /** Usage: ParserDemo2 [[grammar] textFile] */
      public static void main(String[] args) throws IOException {
        String grammar = args.length > 0 ? args[0] : "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";
        String[] options = { "-maxLength", "80", "-retainTmpSubcategories" };
        LexicalizedParser lp = LexicalizedParser.loadModel(grammar, options);
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
     ...
    

    the errors are:

    Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz java.io.IOException: Unable to resolve edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" as either class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:408)
    at edu.stanford.nlp.io.IOUtils.readStreamFromString(IOUtils.java:356)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromSerializedFile(LexicalizedParser.java:594)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:389)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:157)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:143)
    at testStanfordParser.main(testStanfordParser.java:19).                                             Loading parser from text file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz Exception in thread "main" java.lang.NoSuchMethodError: edu.stanford.nlp.io.IOUtils.readerFromString(Ljava/lang/String;)Ljava/io/BufferedReader;
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromTextFile(LexicalizedParser.java:528)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:391)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:157)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:143)
    at testStanfordParser.main(testStanfordParser.java:19)
    

    please help me to solve it. Thanks

  • SahelSoft
    SahelSoft almost 11 years
    Thanks for your answer. The model is located in the execution directory. I change the code with << String grammar = args.length > 0 ? args[0] : "C:/project/models/englishPCFG.ser.gz"; >> but It has errors.
  • Ludwig Wensauer
    Ludwig Wensauer almost 11 years
    check your stanford-parser.jar. It seems that you are using a older one. At least the edu.stanford.nlp.io.IOUtils.readerFromString is not there.
  • Ludwig Wensauer
    Ludwig Wensauer almost 11 years
    At least in version 1.5 was not such a method in IOUtils