antlr
public abstract class CodeGenerator extends Object
A CodeGenerator knows about a Grammar data structure and a grammar analyzer. The Grammar is walked to generate the appropriate code for both a parser and lexer (if present). This interface may change slightly so that the lexer itself lives inside a Grammar object (in which case, this class generates only one recognizer). The main method to call is gen(), which initiates all code generation.
The interaction of the code generator with the analyzer is simple: each subrule block calls deterministic() before generating code for the block. Method deterministic() sets lookahead caches in each Alternative object. Technically, a code generator doesn't need the grammar analyzer if all lookahead analysis is done at runtime, but this would result in a slower parser.
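The ordering described above (ask the analyzer first, then emit code from the cached lookahead) can be illustrated with a toy model. This is only a sketch: `Alt`, `Analyzer`, `FirstCharAnalyzer`, and `genBlock` are hypothetical stand-ins invented for illustration and are not ANTLR classes. The toy analyzer uses the first character of each alternative as its lookahead and assumes every alternative is non-empty.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for illustration; none of these are ANTLR's real classes. */
class AnalyzerInteractionSketch {
    /** One alternative of a subrule, with a lookahead cache filled in by the analyzer. */
    static class Alt {
        final String text;
        String lookaheadCache; // null until the analyzer has run
        Alt(String text) { this.text = text; }
    }

    /** Stand-in for the grammar analyzer. */
    interface Analyzer {
        /** Fills each alternative's cache; returns true if the block is deterministic. */
        boolean deterministic(List<Alt> block);
    }

    /** Toy analyzer: the first character of each (non-empty) alternative is its lookahead. */
    static class FirstCharAnalyzer implements Analyzer {
        public boolean deterministic(List<Alt> block) {
            List<String> seen = new ArrayList<>();
            boolean ok = true;
            for (Alt a : block) {
                a.lookaheadCache = a.text.substring(0, 1);
                if (seen.contains(a.lookaheadCache)) ok = false; // two alts share lookahead
                seen.add(a.lookaheadCache);
            }
            return ok;
        }
    }

    /** Toy generator: consults the analyzer first, then emits one test per alternative. */
    static String genBlock(List<Alt> block, Analyzer analyzer) {
        boolean det = analyzer.deterministic(block); // must run before any code is emitted
        StringBuilder out = new StringBuilder();
        if (!det) out.append("// warning: nondeterministic block\n");
        for (Alt a : block) {
            out.append("if (LA(1)=='").append(a.lookaheadCache)
               .append("') { match(\"").append(a.text).append("\"); }\n");
        }
        return out.toString();
    }
}
```

The point of the sketch is only the call order: the generator never reads a lookahead cache that the analyzer has not yet populated.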
This class provides a set of support utilities to handle argument list parsing and so on.
Version: 2.00a
See Also: JavaCodeGenerator DiagnosticCodeGenerator LLkAnalyzer Grammar AlternativeElement Lookahead
Field Summary | |
---|---|
protected LLkGrammarAnalyzer | analyzer The LLk analyzer |
protected Tool | antlrTool |
protected DefineGrammarSymbols | behavior The grammar behavior |
protected Vector | bitsetsUsed List of all bitsets that must be dumped. |
protected int | bitsetTestThreshold This is a hint for the language-specific code generator. |
protected static int | BITSET_OPTIMIZE_INIT_THRESHOLD If there are more than 8 long words to init in a bitset, try to optimize it; e.g., detect runs of -1L and 0L. |
protected CharFormatter | charFormatter Object used to format characters in the target language. Subclasses must initialize this to the language-specific formatter. |
protected PrintWriter | currentOutput Current output Stream |
protected boolean | DEBUG_CODE_GENERATOR Use option "codeGenDebug" to generate debugging output |
protected static int | DEFAULT_BITSET_TEST_THRESHOLD |
protected static int | DEFAULT_MAKE_SWITCH_THRESHOLD Default values for code-generation thresholds |
protected Grammar | grammar The grammar for which we generate code |
protected int | makeSwitchThreshold This is a hint for the language-specific code generator. |
protected int | tabs Current tab indentation for code output |
static String | TokenTypesFileExt |
static String | TokenTypesFileSuffix |
Constructor Summary | |
---|---|
CodeGenerator() Construct code generator base class |
Method Summary | |
---|---|
static String | decodeLexerRuleName(String id) |
static boolean | elementsAreRange(int[] elems) Test if a set element array represents a contiguous range. |
static String | encodeLexerRuleName(String id) |
protected String | extractIdOfAction(Token t) Get the identifier portion of an argument-action token. |
protected String | extractIdOfAction(String s, int line, int column) Get the identifier portion of an argument-action. |
protected String | extractTypeOfAction(Token t) Get the type string out of an argument-action token. |
protected String | extractTypeOfAction(String s, int line, int column) Get the type portion of an argument-action. |
abstract void | gen() Generate the code for all grammars |
abstract void | gen(ActionElement action) Generate code for the given grammar element. |
abstract void | gen(AlternativeBlock blk) Generate code for the given grammar element. |
abstract void | gen(BlockEndElement end) Generate code for the given grammar element. |
abstract void | gen(CharLiteralElement atom) Generate code for the given grammar element. |
abstract void | gen(CharRangeElement r) Generate code for the given grammar element. |
abstract void | gen(LexerGrammar g) Generate the code for a lexer |
abstract void | gen(OneOrMoreBlock blk) Generate code for the given grammar element. |
abstract void | gen(ParserGrammar g) Generate the code for a parser |
abstract void | gen(RuleRefElement rr) Generate code for the given grammar element. |
abstract void | gen(StringLiteralElement atom) Generate code for the given grammar element. |
abstract void | gen(TokenRangeElement r) Generate code for the given grammar element. |
abstract void | gen(TokenRefElement atom) Generate code for the given grammar element. |
abstract void | gen(TreeElement t) Generate code for the given grammar element. |
abstract void | gen(TreeWalkerGrammar g) Generate the code for a tree walker |
abstract void | gen(WildcardElement wc) Generate code for the given grammar element. |
abstract void | gen(ZeroOrMoreBlock blk) Generate code for the given grammar element. |
protected void | genTokenInterchange(TokenManager tm) Generate the token types as a text file for persistence across shared lexer/parser |
abstract String | getASTCreateString(Vector v) Get a string for an expression to generate creation of an AST subtree. |
abstract String | getASTCreateString(GrammarAtom atom, String str) Get a string for an expression to generate creation of an AST node. |
protected String | getBitsetName(int index) Given the index of a bitset in the bitset list, generate a unique name. |
String | getFIRSTBitSet(String ruleName, int k) |
String | getFOLLOWBitSet(String ruleName, int k) |
abstract String | mapTreeId(String id, ActionTransInfo tInfo) Map an identifier to its corresponding tree-node variable. |
protected int | markBitsetForGen(BitSet p) Add a bitset to the list of bitsets to be generated. If the bitset is already in the list, ignore the request. |
protected void | print(String s) Output tab indent followed by a String, to the currentOutput stream. |
protected void | printAction(String s) Print an action with leading tabs, attempting to preserve the current indentation level for multi-line actions. Ignored if string is null. |
protected void | println(String s) Output tab indent followed by a String followed by newline, to the currentOutput stream. |
protected void | printTabs() Output the current tab indentation. |
protected abstract String | processActionForSpecialSymbols(String actionStr, int line, RuleBlock currentRule, ActionTransInfo tInfo) Lexically process $ and # references within the action. |
String | processStringForASTConstructor(String str) Process a string for a simple expression for use in xx/action.g. It is used to cast simple tokens/references to the right type for the generated language. |
protected String | removeAssignmentFromDeclaration(String d) Remove the assignment portion of a declaration, if any. |
static String | reverseLexerRuleName(String id) |
void | setAnalyzer(LLkGrammarAnalyzer analyzer_) |
void | setBehavior(DefineGrammarSymbols behavior_) |
protected void | setGrammar(Grammar g) Set a grammar for the code generator to use |
void | setTool(Tool tool) |
protected void | _print(String s) Output a String to the currentOutput stream. |
protected void | _printAction(String s) Print an action without leading tabs, attempting to preserve the current indentation level for multi-line actions. Ignored if string is null. |
protected void | _println(String s) Output a String followed by newline, to the currentOutput stream. |
Parameters: elems The array of elements representing the set, usually from BitSet.toArray().
Returns: true if the elements are a contiguous range (with two or more).
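A contiguous-range test of this kind can be sketched in a few lines. The version below is written from the description above (contiguous, with two or more elements) rather than taken from ANTLR's source, and it assumes the array is sorted in ascending order, as a bitset-to-array conversion would normally produce. The class name `RangeCheckSketch` is hypothetical.

```java
/** Illustrative re-implementation from the description; not ANTLR's actual code. */
class RangeCheckSketch {
    /**
     * Returns true if elems holds two or more values that each exceed the
     * previous one by exactly 1. Assumes ascending order.
     */
    static boolean elementsAreRange(int[] elems) {
        if (elems == null || elems.length < 2) {
            return false; // too few elements to form a range
        }
        for (int i = 1; i < elems.length; i++) {
            if (elems[i] != elems[i - 1] + 1) {
                return false; // found a hole (or an out-of-order element)
            }
        }
        return true;
    }
}
```

A language-specific generator could use such a test to emit a single `lo..hi` comparison instead of a full bitset membership check.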
Parameters: t The action token
Returns: A string containing the text of the identifier
Parameters: s The action text. line Line used for error reporting. column Column used for error reporting.
Returns: A string containing the text of the identifier
Parameters: t The action token
Returns: A string containing the text of the type
Parameters: s The action text. line Line used for error reporting. column Column used for error reporting.
Returns: A string containing the text of the type
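To make the identifier/type split concrete, here is a minimal sketch that treats an argument-action such as `int x` as "type text followed by a trailing identifier". It is an assumption about the general approach, not ANTLR's implementation: the names `ArgActionSketch`, `extractId`, and `extractType` are hypothetical, and the error reporting that the real methods perform with `line` and `column` is omitted.

```java
/** Illustrative only; not ANTLR's actual code. */
class ArgActionSketch {
    private static boolean isIdChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    /** Index where the trailing identifier of the trimmed string t begins. */
    private static int idStart(String t) {
        int i = t.length();
        while (i > 0 && isIdChar(t.charAt(i - 1))) {
            i--; // walk backwards over identifier characters
        }
        return i;
    }

    /** The trailing identifier, e.g. "x" for "int x". */
    static String extractId(String s) {
        String t = s.trim();
        return t.substring(idStart(t));
    }

    /** Everything before the trailing identifier, e.g. "int" for "int x". */
    static String extractType(String s) {
        String t = s.trim();
        return t.substring(0, idStart(t)).trim();
    }
}
```

Scanning from the end keeps the sketch indifferent to what the type looks like, so array or generic types such as `String[] args` are split the same way.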
Parameters: action The {...} action to generate
Parameters: blk The "x|y|z|..." block to generate
Parameters: end The block-end element to generate. Block-end elements are synthesized by the grammar parser to represent the end of a block.
Parameters: atom The character literal reference to generate
Parameters: r The character-range reference to generate
Parameters: blk The (...)+ block to generate
Parameters: rr The rule-reference to generate
Parameters: atom The string-literal reference to generate
Parameters: r The token-range reference to generate
Parameters: atom The token-reference to generate
Parameters: t The tree to generate code for.
Parameters: wc The wildcard element to generate
Parameters: blk The (...)* block to generate
Parameters: v A Vector of String, where each element is an expression in the target language yielding an AST node.
Parameters: str The text of the arguments to the AST construction
Parameters: index The index of the bitset in the bitset list.
Parameters: id The identifier name to map. forInput true if the input tree node variable is to be returned, otherwise the output variable is returned.
Returns: The mapped id (which may be the same as the input), or null if the mapping is invalid due to duplicates
Parameters: p Bit set to mark for code generation. forParser true if the bitset is used for the parser, false for the lexer.
Returns: The position of the bitset in the list.
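The bookkeeping described for markBitsetForGen and getBitsetName (register each distinct bitset once, return its position, derive a unique name from that position) might look like the sketch below. It uses `java.util.BitSet` as a stand-in for ANTLR's own BitSet class, and both the class name `BitsetRegistrySketch` and the `_tokenSet_` name prefix are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Illustrative only; uses java.util.BitSet rather than ANTLR's BitSet class. */
class BitsetRegistrySketch {
    /** Bitsets that must be dumped into the generated code, in registration order. */
    private final List<BitSet> bitsetsUsed = new ArrayList<>();

    /** Adds p unless an equal bitset is already registered; returns its position. */
    int markBitsetForGen(BitSet p) {
        int existing = bitsetsUsed.indexOf(p); // BitSet.equals compares the set bits
        if (existing >= 0) {
            return existing; // already in the list: ignore the request
        }
        bitsetsUsed.add((BitSet) p.clone()); // copy so later mutation cannot alias
        return bitsetsUsed.size() - 1;
    }

    /** Unique name derived from the list position (prefix is an assumption). */
    String getBitsetName(int index) {
        return "_tokenSet_" + index;
    }
}
```

Because the name depends only on the list position, two alternatives with identical lookahead sets end up sharing one generated bitset.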
Parameters: s The string to output.
Parameters: s The action string to output
Parameters: s The string to output
Parameters: str A String.
Parameters: d the declaration
Returns: the declaration without any assignment portion
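As an illustration of the behavior described, the sketch below cuts a declaration at its first `=` and trims the remainder. It is written from the description rather than from ANTLR's source, and the class name `DeclarationSketch` is hypothetical.

```java
/** Illustrative only; not ANTLR's actual code. */
class DeclarationSketch {
    /** Returns d without its initializer, or d unchanged if it has none. */
    static String removeAssignmentFromDeclaration(String d) {
        int eq = d.indexOf('=');
        if (eq < 0) {
            return d; // no assignment portion
        }
        return d.substring(0, eq).trim();
    }
}
```

For example, the declaration `int x = 5` would come back as `int x`, while `Token t` would be returned unchanged.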
Parameters: s The string to output
Parameters: s The action string to output
Parameters: s The string to output
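The relationship between the indenting methods (print, println, printTabs) and their underscore-prefixed counterparts can be sketched as follows. The field names `tabs` and `currentOutput` mirror the field summary, but the class `IndentedOutputSketch`, the in-memory buffer, and the `result()` helper are assumptions added so the example can run on its own. A literal `\n` is used instead of the platform line separator to keep the output predictable.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Illustrative only; not ANTLR's actual code. */
class IndentedOutputSketch {
    private final StringWriter buffer = new StringWriter();
    private final PrintWriter currentOutput = new PrintWriter(buffer);
    int tabs = 0; // current indentation depth

    /** Output the current tab indentation. */
    void printTabs() {
        for (int i = 0; i < tabs; i++) {
            currentOutput.print("\t");
        }
    }

    /** Output s with no indentation. */
    void _print(String s) { currentOutput.print(s); }

    /** Output s followed by a newline, with no indentation. */
    void _println(String s) { currentOutput.print(s + "\n"); }

    /** Output tab indent followed by s. */
    void print(String s) { printTabs(); _print(s); }

    /** Output tab indent followed by s and a newline. */
    void println(String s) { printTabs(); _println(s); }

    /** Hypothetical helper: returns everything written so far. */
    String result() {
        currentOutput.flush();
        return buffer.toString();
    }
}
```

A typical pattern is to start a line with `print`, continue it with `_print` or `_println`, and adjust `tabs` when entering or leaving a generated block.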