What's the difference between parse tree and AST?

我在风中等你 2020-11-30 18:38

Are they generated by different phases of a compiling process? Or are they just different names for the same thing?

5 Answers
  • 2020-11-30 19:06

    This is based on the Expression Evaluator grammar by Terence Parr.

    The grammar for this example:

    grammar Expr002;
    
    options 
    {
        output=AST;
        ASTLabelType=CommonTree; // type of $stat.tree ref etc...
    }
    
    prog    :   ( stat )+ ;
    
    stat    :   expr NEWLINE        -> expr
            |   ID '=' expr NEWLINE -> ^('=' ID expr)
            |   NEWLINE             ->
            ;
    
    expr    :   multExpr (( '+'^ | '-'^ ) multExpr)*
            ; 
    
    multExpr
            :   atom ('*'^ atom)*
            ; 
    
    atom    :   INT 
            |   ID
            |   '('! expr ')'!
            ;
    
    ID      : ('a'..'z' | 'A'..'Z' )+ ;
    INT     : '0'..'9'+ ;
    NEWLINE : '\r'? '\n' ;
    WS      : ( ' ' | '\t' )+ { skip(); } ;
    

    Input

    x=1
    y=2
    3*(x+y)
    

    Parse Tree

    The parse tree is a concrete representation of the input: it retains all of the information in the input. In the figure, the empty boxes represent whitespace, i.e. the end-of-line tokens.

    [Figure: Parse Tree]

    AST

    The AST is an abstract representation of the input. Notice that the parentheses are not present in the AST, because the grouping they express is derivable from the tree structure.

    [Figure: AST]

    For a more thorough explanation, see Compilers and Compiler Generators, pg. 23,
    or Abstract Syntax Trees on pg. 21 of Syntax and Semantics of Programming Languages.
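
To make the contrast concrete without the pictures, here is a minimal Python sketch of both trees for the last input line, 3*(x+y). The trees are written out by hand, not produced by ANTLR; the tuple layout and the helper names leaves and evaluate are assumptions made for this illustration only. The parse tree follows the rule names of the grammar above (expr, multExpr, atom) and keeps every token, while the AST keeps only operators and operands.

```python
# Parse tree for "3*(x+y)", shaped after the grammar rules expr / multExpr / atom.
# Interior nodes are (rule_name, [children]); leaves are token strings.
PARSE_TREE = (
    "expr", [
        ("multExpr", [
            ("atom", ["3"]),
            "*",
            ("atom", [
                "(",
                ("expr", [
                    ("multExpr", [("atom", ["x"])]),
                    "+",
                    ("multExpr", [("atom", ["y"])]),
                ]),
                ")",
            ]),
        ]),
    ],
)

# AST for the same input: operators are interior nodes, operands are leaves.
# The parentheses are gone; the nesting of the tuples carries the grouping.
AST = ("*", 3, ("+", "x", "y"))


def leaves(node):
    """Return the tokens at the leaves of a parse tree, left to right."""
    if isinstance(node, str):
        return [node]
    _rule, children = node
    return [tok for child in children for tok in leaves(child)]


def evaluate(node, env):
    """Evaluate an AST node given the variable bindings in env."""
    if isinstance(node, int):
        return node
    if isinstance(node, str):
        return env[node]
    op, left, right = node
    a, b = evaluate(left, env), evaluate(right, env)
    return {"+": a + b, "-": a - b, "*": a * b}[op]


print("".join(leaves(PARSE_TREE)))       # 3*(x+y)
print(evaluate(AST, {"x": 1, "y": 2}))   # 9
```

Reading the leaves of the parse tree gives back the original text, parentheses included, whereas the AST is already in a shape that can be evaluated directly with the bindings x=1 and y=2 from the input.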

  • 2020-11-30 19:10

    In a parse tree, interior nodes are nonterminals and leaves are terminals. In a syntax tree, interior nodes are operators and leaves are operands.
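
One way to see the second half of this claim is with Python's standard ast module, used here only as a convenient, real AST producer (it is not the tool discussed elsewhere in this thread). The helper name interior_and_leaves is an assumption for this sketch, and it handles only binary operators, constants and names. Note that in Python's AST the operator is stored in the op field of a BinOp node.

```python
import ast

# Parse the expression and take the root expression node of the AST.
tree = ast.parse("3*(x+y)", mode="eval").body


def interior_and_leaves(node):
    """Split an expression AST into operator names (interior) and operands (leaves)."""
    if isinstance(node, ast.BinOp):
        left_ops, left_leaves = interior_and_leaves(node.left)
        right_ops, right_leaves = interior_and_leaves(node.right)
        return ([type(node.op).__name__] + left_ops + right_ops,
                left_leaves + right_leaves)
    if isinstance(node, ast.Constant):
        return [], [node.value]
    if isinstance(node, ast.Name):
        return [], [node.id]
    raise ValueError(f"unsupported node: {type(node).__name__}")


operators, operands = interior_and_leaves(tree)
print(operators)  # ['Mult', 'Add']
print(operands)   # [3, 'x', 'y']
```

No nonterminal names and no parentheses appear anywhere in the result: the interior nodes are the two operators, and the leaves are the three operands.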

  • 2020-11-30 19:14

    Take the Pascal assignment Age := 42;

    The syntax tree would look just like the source code. With brackets around each node: [Age][:=][42][;]

    An abstract tree would look like this: [=][Age][42]

    The assignment becomes a node with 2 elements, Age and 42. The idea is that you can execute the assignment.

    Also note that the Pascal syntax disappears. Thus it is possible for more than one language to generate the same AST. This is useful for cross-language script engines.
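
A minimal Python sketch of that idea follows. The Assign class, the tokenizer and the two toy parsers are hypothetical names invented for this illustration; they accept only a single "name, assignment operator, integer, semicolon" statement and are not a real Pascal or C front end.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Assign:
    """AST node for an assignment: a target name and an integer value."""
    target: str
    value: int


def tokenize(source):
    """Split source into tokens; every symbol, including ';', is kept."""
    return re.findall(r"[A-Za-z_]\w*|\d+|:=|[=;]", source)


def parse_pascal(source):
    """Parse 'name := int ;' (Pascal-style) into an Assign node."""
    name, op, number, end = tokenize(source)
    if op != ":=" or end != ";":
        raise SyntaxError(source)
    return Assign(name, int(number))


def parse_c_like(source):
    """Parse 'name = int ;' (C-style) into an Assign node."""
    name, op, number, end = tokenize(source)
    if op != "=" or end != ";":
        raise SyntaxError(source)
    return Assign(name, int(number))


print(tokenize("Age:= 42;"))      # ['Age', ':=', '42', ';']
print(parse_pascal("Age:= 42;"))  # Assign(target='Age', value=42)
print(parse_pascal("Age:= 42;") == parse_c_like("Age = 42;"))  # True
```

The token list still carries the Pascal-specific := and ; symbols, while the Assign node does not, which is why both front ends can hand the same tree to a shared back end.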

  • 2020-11-30 19:17

    From what I understand, the AST focuses more on the abstract relationships between the components of source code, while the parse tree focuses on the actual implementation of the grammar utilized by the language, including the nitpicky details. They are definitely not the same, since another term for "parse tree" is "concrete syntax tree".

    I found this page which attempts to resolve this exact question.

  • 2020-11-30 19:20

    The DSL book by Martin Fowler explains this nicely. The AST contains only the 'useful' elements that will be used for further processing, while the parse tree contains all the artifacts (spaces, brackets, ...) from the original document you parse.
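
As a rough illustration of dropping the artifacts, the Python sketch below reduces a hand-written parse tree for y=2*(x+1) to an AST. The tuple layout, the ARTIFACTS set and the function name to_ast are assumptions made for this example, not Fowler's code; the rule names stat, expr, multExpr and atom are borrowed from the ANTLR grammar in the first answer, and leaf tokens are kept as strings.

```python
# Tokens that carry no meaning once the tree structure exists.
ARTIFACTS = {"(", ")", "\n"}

# Parse tree for "y=2*(x+1)\n": (rule_name, [children]), leaves are tokens.
PARSE_TREE = (
    "stat", [
        "y",
        "=",
        ("expr", [
            ("multExpr", [
                ("atom", ["2"]),
                "*",
                ("atom", [
                    "(",
                    ("expr", [
                        ("multExpr", [("atom", ["x"])]),
                        "+",
                        ("multExpr", [("atom", ["1"])]),
                    ]),
                    ")",
                ]),
            ]),
        ]),
        "\n",
    ],
)


def to_ast(node):
    """Reduce a parse tree to an AST of nested (operator, left, right) tuples."""
    if isinstance(node, str):
        return node                      # a leaf token is kept as-is
    _rule, children = node
    kept = [to_ast(child) for child in children
            if not (isinstance(child, str) and child in ARTIFACTS)]
    if not kept:
        return None                      # e.g. a statement that is only a newline
    if len(kept) == 1:
        return kept[0]                   # collapse chains such as expr -> multExpr -> atom
    # kept alternates operand, operator, operand, ...; fold it left to right.
    result = kept[0]
    for i in range(1, len(kept), 2):
        result = (kept[i], result, kept[i + 1])
    return result


print(to_ast(PARSE_TREE))  # ('=', 'y', ('*', '2', ('+', 'x', '1')))
```

The rule names, the parentheses and the trailing newline all disappear; what remains is just the assignment, the two operators and their operands.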
