no viable alternative at input

30,491

I see 3 issues:

1

Your atom rule matches epsilon (nothing):

atom   
 : ID 
 | | '(' expression ')' -> expression
 ;

(note the "nothingness" inside | |)

causing your grammar to be ambiguous. I guess it should be:

atom   
 : ID 
 | '(' expression ')' -> expression
 ;
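To see why an empty alternative is troublesome, here is a minimal sketch of a hand-written recognizer for a simplified atom rule. It is not what ANTLR generates: the class EpsilonAtomSketch, the method atom and the allowEmpty flag are hypothetical names, the parenthesized alternative is left out, and ID is approximated by a regular expression. The point is only that a rule with an empty alternative "succeeds" without consuming anything, so the offending token is reported later and in a confusing place.

```java
import java.util.List;

public class EpsilonAtomSketch {

    // Returns how many tokens the simplified atom rule consumes at pos,
    // or -1 when no alternative matches.
    // Assumption: ID is approximated as letters and digits only.
    public static int atom(List<String> tokens, int pos, boolean allowEmpty) {
        if (pos < tokens.size() && tokens.get(pos).matches("[A-Za-z0-9]+")) {
            return 1; // the ID alternative
        }
        // The empty alternative (the "| |" in the grammar) matches nothing
        // and therefore always succeeds.
        return allowEmpty ? 0 : -1;
    }

    public static void main(String[] args) {
        List<String> id = List.of("ICOM");
        List<String> literal = List.of("'%bridge%'");

        System.out.println(atom(id, 0, true));       // ID alternative is taken
        System.out.println(atom(literal, 0, true));  // matched, but consumed nothing
        System.out.println(atom(literal, 0, false)); // rejected right away
    }
}
```

Without the empty alternative, the recognizer rejects the quoted literal at the point where the atom is expected, which is much easier to diagnose.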

2

Your fragment CHARACTER matches a single quote, but a single quote also marks the end of the fragment QUOTED_STRING.

I guess CHARACTER should be this instead:

fragment CHARACTER : ('a'..'z' | 'A'..'Z' | '.' | '%'); 
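The overlap can be illustrated with plain Java regular expressions. This is only an analogy, not how the ANTLR lexer decides between rules: the class TokenOverlapSketch and its methods are hypothetical, and the patterns are hand-written approximations of the ID and QUOTED_STRING rules. With the quote inside CHARACTER, the text '%bridge%' fits both rules; without it, only QUOTED_STRING fits.

```java
import java.util.regex.Pattern;

public class TokenOverlapSketch {

    // Regex approximations of the CHARACTER fragment (assumptions for illustration).
    public static final String ORIGINAL_CHARACTER = "[a-zA-Z.'%]"; // includes the quote
    public static final String FIXED_CHARACTER = "[a-zA-Z.%]";     // quote removed

    // Approximates ID : (CHARACTER | DIGIT)+ ;
    public static boolean isId(String text, String character) {
        return Pattern.matches("(?:" + character + "|[0-9])+", text);
    }

    // Approximates QUOTED_STRING : '\'' (CHARACTER)+ '\'' ;
    public static boolean isQuotedString(String text, String character) {
        return Pattern.matches("'" + character + "+'", text);
    }

    public static void main(String[] args) {
        String literal = "'%bridge%'";

        // Original CHARACTER: the literal looks like an ID and like a QUOTED_STRING.
        System.out.println(isId(literal, ORIGINAL_CHARACTER));
        System.out.println(isQuotedString(literal, ORIGINAL_CHARACTER));

        // Fixed CHARACTER: the literal can only be a QUOTED_STRING.
        System.out.println(isId(literal, FIXED_CHARACTER));
        System.out.println(isQuotedString(literal, FIXED_CHARACTER));
    }
}
```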

3

Nowhere in your parser rules do you match the token CONSTANT_EXPRESSION, so the AST you posted could never have been created by a parser generated from the grammar you posted. I presume you'd want to match it in the atom rule like this:

atom   
 : ID 
 | CONSTANT_EXPRESSION
 | '(' expression ')' -> expression
 ;

With the changes outlined above, I get the following AST without any errors being printed to the console:

(image: the resulting AST)

jlengrand
Author by

jlengrand

Developer Advocate @Adyen. I create 'islands' where engineers are the heroes. Podcast @JuliensTech. #jvm #fp #elm #java https://lengrand.fr/

Updated on May 26, 2020

Comments

  • jlengrand
    jlengrand almost 4 years

    I have a small question concerning my grammar. I want to parse strings like the following:

     "(ICOM LIKE '%bridge%' or ICOM LIKE '%Munich%')"
    

    I ended up with the following grammar (a bit more complex than needed I know) :

    // Aiming at parsing a complete BQS formed Query

    grammar Logic;
    
    options {
        output=AST;
    }
    
    tokens {
      NOT_LIKE;
    }
    
    /*------------------------------------------------------------------
     * PARSER RULES
     *------------------------------------------------------------------*/
     // precedence order is (low to high): or, and, not, [comp_op, geo_op, rel_geo_op, like, not like, exists], ()
     parse  
        : expression EOF -> expression
        ; // omit the EOF token
    
     expression
        : query
        ;       
    
     query  
        : term (OR^ term)*    // make `or` the root
        ;
    
     term   
        : factor (AND^ factor)*
        ;
    
     factor
      :  (notexp -> notexp) ( NOT LIKE e=notexp  -> ^(NOT_LIKE $factor $e))?
      ;
    
     notexp
      :  NOT^ like
      |  like
      ;
    
     like // this one has to be completed (a lot)
        : atom (LIKE^ atom)*
        ;
    
    
     atom   
        : ID 
        | | '(' expression ')' -> expression
        ;
    
    /*------------------------------------------------------------------
     * LEXER RULES
     *------------------------------------------------------------------*/
    // GENERAL OPERATORS: 
    //NOTLIKE   :   'notlike' | 'NOTLIKE'; // whitespaces have been removed
    LIKE    :   'like' | 'LIKE';
    
    OR          :   'or' | 'OR';
    AND         :   'and' | 'AND';
    NOT         :   'not' | 'NOT';
    
    //ELEMENTS 
    CONSTANT_EXPRESSION : DATE | NUMBER | QUOTED_STRING;    
    ID          :   (CHARACTER|DIGIT)+; 
    
    WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+    { $channel = HIDDEN; } ;
    
    fragment DATE       :   '\'' YEAR '/' MONTH '/' DAY (' ' HOUR ':' MINUTE ':' SECOND)? '\'';
    
    fragment QUOTED_STRING :    '\'' (CHARACTER)+ '\'' ; 
    
    //UNITS
    fragment CHARACTER :    ('a'..'z' | 'A'..'Z'|'.'|'\''|'%'); // FIXME: Careful, should be all ASCII
    fragment DIGIT  :   '0'..'9' ;
    fragment DIGIT_SEQ  :(DIGIT)+;
    fragment DEL    :   SPACE ',' SPACE ; //Delimiter + may be space behind
    fragment NUMBER :   (SIGN)? DIGIT_SEQ ('.' (DIGIT_SEQ)?)?; // should be given in decimal degrees, North is 0 and direction is clockwise, range is 0 to 360
    fragment SIGN   :   '+' | '-';
    fragment YEAR   :   DIGIT DIGIT DIGIT DIGIT;
    fragment MONTH  :   DIGIT DIGIT;
    fragment DAY    :   DIGIT DIGIT;
    fragment HOUR   :   DIGIT DIGIT;
    fragment MINUTE :   DIGIT DIGIT;
    fragment SECOND :   DIGIT (DIGIT)? ('.' (DIGIT)+)?;
    
    fragment SPACE : (' ')?;// used to increase compatibility
    

    The thing is, I get these messages when creating the AST:

    line 1:11 no viable alternative at input ''%bridge%''
    line 1:35 no viable alternative at input ''%Munich%''
    

    The generated tree is nevertheless correct (as far as I can tell, at least):

    (image: antlr viable ast tree)

    So, could anyone give me a hint about what's wrong here? I think CHARACTER contains all the extra characters needed to correctly parse this expression.

    Thanks !

    As usual, some Java code to quickly test the grammar :

    import org.antlr.runtime.*;
    import org.antlr.runtime.tree.*;
    import org.antlr.stringtemplate.*;
    
    public class Main {
      public static void main(String[] args) throws Exception {
    
        // the expression
        String src = "(ICOM LIKE '%bridge%' or ICOM LIKE '%Munich%')";
    
        // create a lexer & parser
        //LogicLexer lexer = new LogicLexer(new ANTLRStringStream(src));
        //LogicParser parser = new LogicParser(new CommonTokenStream(lexer));
    
        LogicLexer lexer = new LogicLexer(new ANTLRStringStream(src));
        LogicParser parser = new LogicParser(new CommonTokenStream(lexer));
    
        // invoke the entry point of the parser (the parse() method) and get the AST
        CommonTree tree = (CommonTree)parser.parse().getTree();
    
        // print the DOT representation of the AST 
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
      }
    }