ANTLR4 mutually left-recursive error when parsing
Antlr is a LL(*) parser, which is in many ways "better" than a LL(k) parser, but still has many of its disavantages. One of these being the fact it can't deal with left-recursion (in fact, the version 4 can deal with left-recursion within the same rule). What the error is saying is that you have a left-recursion of a grammar, a bane for the LL parsers.
This is caused by this construction in your grammar:
constantFixedExpression: term ...;
term: factor ...;
factor: ('+' | '-')* (constantFixedExpression | ...) ...;
Since the *
operator means 0 or more, I can instantiate it with 0, so the parser will do this: "try constantFixedExpression
, so it needs to try term
, so it needs to try factor
, so it needs to try constantFixedEXpression
, so it [...]" and you've got yourself an infinite loop.
Fortunately, context-free formal grammars have an equivalent transformation for removing left-recursion! It can be expressed generically by:
A -> Aa | b
-- becomes --
A -> bR
R -> aR | ε
Or in Antlr notation:
A: Aa | b;
// becomes
A: bR;
R: (aR)?;
More information about this process can be found in automaton/grammars books or in the Wikipedia.
I'll leave correcting your grammar with the refactoration to remove left-recursion as your work. However, I want to touch in another point: Antlr 4 can do left-recursion! As I mentioned, the version 4 can deal with left-recursion within the same rule. There are ways to indicate precedence and associativity of operators other than directly in parsing, as you're doing, in Antlr4. Let's see how it works:
expr: NUMBER
|<assoc=right> expr '^' expr
| expr '*' expr
| expr '/' expr
| expr '+' expr
| expr '-' expr;
This is an example of a basic calculator grammar. The operators at the top are those with highest precedence, and those at the bottom are of lower precedence. This means 2+2*3
will be parsed as 2+(2*3)
rather than (2+2)*3
. The <assoc=right>
construction means the operator in right-associative, so 1^2^3
will be parsed as 1^(2^3)
rather than (1^2)^3
.
As you can see, it is much easier to specify operators with left-recursion, so Antlr 4 is of big help in these moments! I recommend re-writing your grammar to make use of this feature.
Related videos on Youtube
Alican
Updated on June 19, 2022Comments
-
Alican almost 2 years
I have this ANTLR 4 grammar:
constantFixedExpresion : term (('+'|'-') term)+; term : factor (('*'|'//'|'REM')factor)+; factor : ('+'|'-')* ( wholeNumberConstant | constantFixedExpresion | 'TOFIXED' (stringConstant | bitCodeConstant) | identifier) ('FIT'constantFixedExpresion)*;
I get the following error:
error(119): LanguageA.g4::: The following sets of rules are mutually left-recursive [constantFixedExpresion, factor, term]
I tried so many ways but can't fix it. What is the problem and how can I solve it?
-
Bond over 6 yearsI needed expr as optional, so something like expr? '*' expr? . But this is giving me same error.
-
Mephy over 6 years@Bond do you really need it? Is the string
*******
a valid expression? -
Fox over 5 yearsI think you meant:
R -> aR | ε
The way you wrote it the sequence "bab" is valid in the right-recursion version but not in the left-recursion one. -
WiredWiz about 4 yearsI swear, for a moment I thought this was a @JonSkeet reply because it was so thorough and nicely broken down.