Why are there LR(0) parsers but not LL(0) parsers?

parsing compiler-construction lr ll

17,863

The difference has to do with what the k means in LR(k) versus LL(k).

In LL(k), the parser maintains information about a top-down, left-to-right parse that traces out a leftmost derivation. The parser works by repeatedly looking at the current nonterminal symbol, then inspecting the next k tokens of the input stream to determine which production should be used. As a result, if you have an LL(0) parser, the parser would have to predict which production to use based purely on the current nonterminal. This is only possible if each nonterminal has just one production associated with it, which means that the grammar either produces exactly one string or no strings at all (by going into a loop). Therefore, while it is mathematically well-defined, LL(0) parsing is never used in practice.

In LR(k), the parser works bottom-up. It maintains a stack of symbols, along with a current "state," and then continuously decides whether to perform a shift (pushing another symbol on top of the stack) or a reduce (popping some number of symbols from the stack and applying a production in reverse). One key difference between an LL(k) and LR(k) parser is how the LR(k) parser decides which action to perform. In the LR(k) parser, the decision of what to do next s based on both the next k tokens of lookahead and on the current state of the parser. As a result, an LR(0) parser can still make somewhat intelligent decisions about which action to perform even if it can't see any lookahead tokens, because the parser's current state can encode a huge amount of information about where in a production the parser is and what it can realistically expect to see as the next tokens of input (even if the parser can't directly look at those tokens).

In short, LL(0) is extremely weak because the parser has to purely base its decision on the current nonterminal, meaning that it can't take one of many different actions based on which production might be used. An LR(0) parser is decently powerful because the parser bases its choice on its internal state, which is usually sufficiently detailed to let the parser make informed decisions about what to do next.

There is one other reason LL(0) is weak while LR(0) is reasonably powerful. In an LL(0) parser, the parser has to immediately decide which productions ought to be performed, which means that the parser has to guess productions completely blindly. In an LR(0) parser, the parser can shift multiple symbols before deciding that it is time to reduce. Consequently, the parser, without any lookahead, can defer making its decision about which reduction to use until after it has seen enough tokens of input to get a sense for the structure of the string. This is a special case of the more general fact that any LL(k) grammar is automatically LR(k).

Hope this helps!

17,863

Shmoopy

Updated on July 12, 2022

Comments

Shmoopy almost 2 years

I've been reading on both in Wikipedia, and noticed that although LR(0) parsers exist, there's no such thing as LL(0) parser.

From what I read, I understand that the k in LL(k)/LR(k) means how many characters the parser can see beyond the current character that it's currently working on.

So my question is, why is there no such thing as LL(0) parser even though LR(0) exists?
Shmoopy over 11 years

Thanks! Just one question, LL(0) parser can still read the input right? So isn't it's decision based on the current token AND the current nonterminal that's on top of it's stack?
templatetypedef over 11 years

@Shmoopy- An LL(0) parser only looks at input tokens when the first token of the string derived so far is a terminal. If there is a nonterminal at the front, the parser does not look at the input at all.
Eli Bendersky over 10 years

Your explanation seems interesting, but not entirely clear to me. By definition, LL(0) can look at 0 input tokens to decide which production to choose. So do you mean that it only sees input in retrospect (as encoded in its state) and hence can only "handle" grammars that produce can only derive a single terminal?
templatetypedef over 10 years

@EliBendersky- In an LL(0) grammar, the parser decides whether to do a predict or a match step based on the next symbol left in the sentential form. If it's a terminal, it does a match step. If it's a nonterminal, it does a predict step based on what that nonterminal is. When doing the predict step, the parser looks at the next 0 tokens of the input stream to determine which production to use. Because of this, the parser has to be able to uniquely determine which production to use based on the nonterminal in the sentential form, without looking at the input at all. Does that make sense?
Michael Ho Chum over 8 years

Wouldn't LL(0) parser be like recursive-descent parser?
templatetypedef over 8 years

@MichaelHoChum Not really. LL parsers always commit to a fixed production when they make a prediction, so an LL(0) parser would have to entirely guess the productions to use correctly with no feedback on what to do based on the input. LL(1) is closer to recursive descent, but has no backtracking.
RS1 almost 5 years

What can be said about SLR(0) and LALR(0)? I can find only SLR(1) and LALR(1) parsers discussed in dragon book. Also they are not discussed online. I know for LR(0), reduce moves appear in full row. What happens in case of SLR(0) and LALR(0)?
templatetypedef almost 5 years

@anir Interesting question - I’ve never seen SLR(0) before. I suspect that it’s completely equivalent to LR(0), but I’d need to consult a formal definition of SLR(k) parsers to know for sure.
RS1 almost 5 years

Asked the same question here. Got some good comments.
Jamāl about 3 years

@templatetypedef suppose the grammar is S->aA | bB, A-> something, B-> something. Suppose the parser is LL(0) and the input is ab. At the start we would have S in the stack. Now an LL(0) parser doesn't need lookahead symbols to decide which production to choose since it can decide that based on the current symbol itself which is a. So it would choose the production S->aA. So I don't see why you said that the language would have only one string or the parser just keeps looping.
templatetypedef about 3 years

The 1 in LL(1) refers to the number of tokens the parser can look at at each step when making a decision about which production to pick. In an LL(0) parser, that means the parser can’t see any tokens when determining which production to pick, so if the input is ab the parser can’t see the a when deciding which of the S productions to pick.
rsonx about 3 years

Is there an answer that explains why LL(1) grammar and LR(0) grammar are incomparable.