antlr

Class LLkAnalyzer

Implemented Interfaces:
GrammarAnalyzer, LLkGrammarAnalyzer

public class LLkAnalyzer
extends Object
implements LLkGrammarAnalyzer

A linear-approximate LL(k) grammar analzyer. All lookahead elements are sets of token types.
Author:
Terence Parr, John Lilley
See Also:
Grammar, Lookahead

Field Summary

boolean
DEBUG_ANALYZER
(package private) CharFormatter
charFormatter
protected Grammar
grammar
protected boolean
lexicalAnalysis
protected Tool
tool

Fields inherited from interface antlr.GrammarAnalyzer

LOOKAHEAD_DEPTH_INIT, NONDETERMINISTIC

Constructor Summary

LLkAnalyzer(Tool tool_)
Create an LLk analyzer

Method Summary

Lookahead
FOLLOW(int k, RuleEndElement end)
Compute the lookahead set of whatever follows references to the rule associated witht the FOLLOW block.
protected boolean
altUsesWildcardDefault(Alternative alt)
Return true if someone used the '.' wildcard default idiom.
boolean
deterministic(AlternativeBlock blk)
Is this block of alternatives LL(k)? Fill in alternative cache for this block.
boolean
deterministic(OneOrMoreBlock blk)
Is (...)+ block LL(1)? Fill in alternative cache for this block.
boolean
deterministic(ZeroOrMoreBlock blk)
Is (...)* block LL(1)? Fill in alternative cache for this block.
boolean
deterministicImpliedPath(BlockWithImpliedExitPath blk)
Is this (...)* or (...)+ block LL(k)?
Lookahead
look(int k, String rule)
Compute the combined lookahead for all productions of a rule.
Lookahead
look(int k, ActionElement action)
Actions are ignored
Lookahead
look(int k, AlternativeBlock blk)
Combine the lookahead computed for each alternative
Lookahead
look(int k, BlockEndElement end)
Compute what follows this place-holder node and possibly what begins the associated loop unless the node is locked.
Lookahead
look(int k, CharLiteralElement atom)
Return this char as the lookahead if k=1.
Lookahead
look(int k, CharRangeElement r)
Lookahead
look(int k, GrammarAtom atom)
Lookahead
look(int k, OneOrMoreBlock blk)
The lookahead of a (...)+ block is the combined lookahead of all alternatives and, if an empty path is found, the lookahead of what follows the block.
Lookahead
look(int k, RuleBlock blk)
Combine the lookahead computed for each alternative.
Lookahead
look(int k, RuleEndElement end)
If not locked or noFOLLOW set, compute FOLLOW of a rule.
Lookahead
look(int k, RuleRefElement rr)
Compute the lookahead contributed by a rule reference.
Lookahead
look(int k, StringLiteralElement atom)
Lookahead
look(int k, SynPredBlock blk)
The lookahead of a (...)=> block is the lookahead of what follows the block.
Lookahead
look(int k, TokenRangeElement r)
Lookahead
look(int k, TreeElement t)
Lookahead
look(int k, WildcardElement wc)
Lookahead
look(int k, ZeroOrMoreBlock blk)
The (...)* element is the combined lookahead of the alternatives and what can follow the loop.
static boolean
lookaheadEquivForApproxAndFullAnalysis(Lookahead[] bset, int k)
If the first k-1 sets are singleton sets, the appoximate lookahead analysis is equivalent to full lookahead analysis.
void
setGrammar(Grammar g)
Set the grammar for the analyzer
boolean
subruleCanBeInverted(AlternativeBlock blk, boolean forLexer)

Field Details

DEBUG_ANALYZER

public boolean DEBUG_ANALYZER

charFormatter

(package private)  CharFormatter charFormatter

grammar

protected Grammar grammar

lexicalAnalysis

protected boolean lexicalAnalysis

tool

protected Tool tool

Constructor Details

LLkAnalyzer

public LLkAnalyzer(Tool tool_)
Create an LLk analyzer

Method Details

FOLLOW

public Lookahead FOLLOW(int k,
                        RuleEndElement end)
Compute the lookahead set of whatever follows references to the rule associated witht the FOLLOW block.
Specified by:
FOLLOW in interface LLkGrammarAnalyzer

altUsesWildcardDefault

protected boolean altUsesWildcardDefault(Alternative alt)
Return true if someone used the '.' wildcard default idiom. Either #(. children) or '.' as an alt by itself.

deterministic

public boolean deterministic(AlternativeBlock blk)
Is this block of alternatives LL(k)? Fill in alternative cache for this block.
Specified by:
deterministic in interface LLkGrammarAnalyzer
Returns:
true if the block is deterministic

deterministic

public boolean deterministic(OneOrMoreBlock blk)
Is (...)+ block LL(1)? Fill in alternative cache for this block.
Specified by:
deterministic in interface LLkGrammarAnalyzer
Returns:
true if the block is deterministic

deterministic

public boolean deterministic(ZeroOrMoreBlock blk)
Is (...)* block LL(1)? Fill in alternative cache for this block.
Specified by:
deterministic in interface LLkGrammarAnalyzer
Returns:
true if the block is deterministic

deterministicImpliedPath

public boolean deterministicImpliedPath(BlockWithImpliedExitPath blk)
Is this (...)* or (...)+ block LL(k)?
Returns:
true if the block is deterministic

look

public Lookahead look(int k,
                      String rule)
Compute the combined lookahead for all productions of a rule. If the lookahead returns with epsilon, at least one epsilon path exists (one that consumes no tokens). The noFOLLOW flag being set for this endruleblk, indicates that the a rule ref invoked this rule. Currently only look(RuleRef) calls this. There is no need for the code generator to call this.
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      ActionElement action)
Actions are ignored
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      AlternativeBlock blk)
Combine the lookahead computed for each alternative
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      BlockEndElement end)
Compute what follows this place-holder node and possibly what begins the associated loop unless the node is locked.

if we hit the end of a loop, we have to include what tokens can begin the loop as well. If the start node is locked, then we simply found an empty path through this subrule while analyzing it. If the start node is not locked, then this node was hit during a FOLLOW operation and the FIRST of this block must be included in that lookahead computation.

Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      CharLiteralElement atom)
Return this char as the lookahead if k=1.

### Doesn't work for ( 'a' 'b' | 'a' ~'b' ) yet!!!

If the atom has the not flag on, then create the set complement of the tokenType which is the set of all characters referenced in the grammar with this char turned off. Also remove characters from the set that are currently allocated for predicting previous alternatives. This avoids ambiguity messages and is more properly what is meant. ( 'a' | ~'a' ) implies that the ~'a' is the "else" clause.

NOTE: we do NOT include exit path in the exclusion set. E.g., ( 'a' | ~'a' )* 'b' should exit upon seeing a 'b' during the loop.

Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      CharRangeElement r)
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      GrammarAtom atom)
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      OneOrMoreBlock blk)
The lookahead of a (...)+ block is the combined lookahead of all alternatives and, if an empty path is found, the lookahead of what follows the block.
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      RuleBlock blk)
Combine the lookahead computed for each alternative. Lock the node so that no other computation may come back on itself--infinite loop. This also implies infinite left-recursion in the grammar (or an error in this algorithm ;)).
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      RuleEndElement end)
If not locked or noFOLLOW set, compute FOLLOW of a rule.

TJP says 8/12/99: not true anymore: Lexical rules never compute follow. They set epsilon and the code generator gens code to check for any character. The code generator must remove the tokens used to predict any previous alts in the same block.

When the last node of a rule is reached and noFOLLOW, it implies that a "local" FOLLOW will be computed after this call. I.e.,

		a : b A;
		b : B | ;
		c : b C;
 
Here, when computing the look of rule b from rule a, we want only {B,EPSILON_TYPE} so that look(b A) will be {B,A} not {B,A,C}.

if the end block is not locked and the FOLLOW is wanted, the algorithm must compute the lookahead of what follows references to this rule. If end block is locked, FOLLOW will return an empty set with a cycle to the rule associated with this end block.

Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      RuleRefElement rr)
Compute the lookahead contributed by a rule reference.

When computing ruleref lookahead, we don't want the FOLLOW computation done if an empty path exists for the rule. The FOLLOW is too loose of a set...we want only to include the "local" FOLLOW or what can follow this particular ref to the node. In other words, we use context information to reduce the complexity of the analysis and strengthen the parser. The noFOLLOW flag is used as a means of restricting the FOLLOW to a "local" FOLLOW. This variable is orthogonal to the lock variable that prevents infinite recursion. noFOLLOW does not care about what k is.

Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      StringLiteralElement atom)
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      SynPredBlock blk)
The lookahead of a (...)=> block is the lookahead of what follows the block. By definition, the syntactic predicate block defies static analysis (you want to try it out at run-time). The LOOK of (a)=>A B is A for LL(1) ### is this even called?
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      TokenRangeElement r)
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      TreeElement t)
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      WildcardElement wc)
Specified by:
look in interface LLkGrammarAnalyzer

look

public Lookahead look(int k,
                      ZeroOrMoreBlock blk)
The (...)* element is the combined lookahead of the alternatives and what can follow the loop.
Specified by:
look in interface LLkGrammarAnalyzer

lookaheadEquivForApproxAndFullAnalysis

public static boolean lookaheadEquivForApproxAndFullAnalysis(Lookahead[] bset,
                                                             int k)
If the first k-1 sets are singleton sets, the appoximate lookahead analysis is equivalent to full lookahead analysis.

setGrammar

public void setGrammar(Grammar g)
Set the grammar for the analyzer
Specified by:
setGrammar in interface LLkGrammarAnalyzer

subruleCanBeInverted

public boolean subruleCanBeInverted(AlternativeBlock blk,
                                    boolean forLexer)
Specified by:
subruleCanBeInverted in interface LLkGrammarAnalyzer