I was reading about parsers and parser generators and found this statement in wikipedia\'s LR parsing -page:
Many programming languages can be parsed
The "typedef" problem in C++ can be parsed with an LALR(1) parser that builds a symbol table while parsing (not a pure LALR parser). The "template" problem probably cannot be solved with this method. The advantage of this kind of LALR(1) parser is that the grammar (shown below) is an LALR(1) grammar (no ambiguity).
/* C Typedef Solution. */
/* Terminal Declarations. */
=> lookup(); /* Symbol table lookup. */
/* Rules. */
Goal -> [Declaration]... +> goal_
Declaration -> Type... VarList ';' +> decl_
-> typedef Type... TypeVarList ';' +> typedecl_
VarList -> Var /','...
TypeVarList -> TypeVar /','...
Var -> [Ptr]... Identifier
TypeVar -> [Ptr]... TypeIdentifier
Identifier -> +> identifier_(1)
TypeIdentifier -> =+> typedefidentifier_(1,{typedef})
// The above line will assign {typedef} to the ,
// because {typedef} is the second argument of the action typeidentifier_().
// This handles the context-sensitive feature of the C++ language.
Ptr -> '*' +> ptr_
Type -> char +> type_(1)
-> int +> type_(1)
-> short +> type_(1)
-> unsigned +> type_(1)
-> {typedef} +> type_(1)
/* End Of Grammar. */
The following input can be parsed without a problem:
typedef int x;
x * y;
typedef unsigned int uint, *uintptr;
uint a, b, c;
uintptr p, q, r;
The LRSTAR parser generator reads the above grammar notation and generates a parser that handles the "typedef" problem without ambiguity in the parse tree or AST. (Disclosure: I am the guy who created LRSTAR.)