Relation between grammar and operator associativity

倖福魔咒の 提交于 2019-12-05 06:26:57

I think the issue here is that a language has an abstract syntax which is just like:

E ::= E + E | E - E | E * E | E / E | Int | (E)

but this is actually implemented via a concrete syntax which is used to specify associativity and precedence. So, if you're writing a recursive decent parse, you're implicitly writing the concrete syntax into it as you go along and that's fine, though it may be good to specify it exactly as a phrase-structured grammar as well!

There are a couple of issues with your grammar if it is to be a fully-fledged concrete grammar. First of all, you need to add productions to just 'go to the next level down', so relaxing your syntax a bit:

Expr ::= Term + Term | Term - Term | Term
Term ::= Factor * Factor | Factor / Factor | Factor
Factor ::= INTEGER | (Expr)

Otherwise there's no way to derive valid sentences starting from the start symbol (in this case Expr). For example, how would you derive '1 * 2' without those extra productions?

Expr -> Term
     -> Factor * Factor
     -> 1 * Factor
     -> 1 * 2

We can see the other grammar handles this in a slightly different way:

Expr -> Term Expr'
     -> Factor Term' Expr'
     -> 1 Term' Expr'
     -> 1 * Factor Term' Expr'
     -> 1 * 2 Term' Expr'
     -> 1 * 2 ε Expr'
     -> 1 * 2 ε ε
      = 1 * 2

but this achieves the same effect.

Your parser is actually non-associative. To see this ask how E + E + E would be parsed and find that it couldn't. Whichever + is consumed first, we get E on one side and E + E on the other, but then we're trying to parse E + E as a Term which is not possible. Equivalently, think about deriving that expression from the start symbol, again not possible.

Expr -> Term + Term
     -> ? (can't get another + in here)

The other grammar is left-associative ebcase an arbitrarily long sting of E + E + ... + E can be derived.

So anyway, to sum up, you're right that when writing the RDP, you can implement whatever concrete version of the abstract syntax you like and you probably know a lot more about that than me. But there are these issues when trying to produce the grammar which describes your RDP precisely. Hope that helps!

To get associative trees, you really need to have the trees formed with the operator as the subtree root node, with children having similar roots.

Your implementation grammar:

Expr  ::= Term Expr'
Expr' ::= ( "+" | "-" ) Term Expr' | ε
Term  ::= Factor Term'
Term' ::= ( "*" | "/" ) Factor Term' | ε
Factor ::= INTEGER | "(" Expr ")"

must make that awkward; if you implement recursive descent on this, the Expr' routine has no access to the "left child" and so can't build the tree. You can always patch this up by passing around pieces (in this case, passing tree parts up the recursion) but that just seems awkward. You could have chosen this instead as a grammar:

Expr  ::= Term  ( ("+"|"-") Term )*;
Term  ::= Factor ( ( "*" | "/" ) Factor )* ;
Factor ::= INTEGER | "(" Expr ")"

which is just as easy (easier?) to code recursive descent-wise, but now you can form the trees you need without trouble.

This doesn't really get you associativity; it just shapes the trees so that it could be allowed. Associativity means that the tree ( + (+ a b) c) means the same thing as (+ a (+ b c)); its actually a semantic property (sure doesn't work for "-" but the grammar as posed can't distinguish).

We have a tool (the DMS Software Reengineering Toolkit) that includes parsers and term-rewriting (using source-to-source transformations) in which the associativity is explicitly expressed. We'd write your grammar:

Expr  ::= Term ;
[Associative Commutative] Expr  ::= Expr "+" Term ;
Expr  ::= Expr "-" Term ;
Term  ::= Factor ;
[Associative Commutative] Term  ::= Term "*" Factor ;
Term  ::= Term "/" Factor ;
Factor ::= INTEGER ;
Factor ::= "(" Expr ")" ;

The grammar seems longer and clumsier this way, but it in fact allows us to break out the special cases and mark them as needed. In particular, we can now distinguish operators that are associative from those that are not, and mark them accordingly. With that semantic marking, our tree-rewrite engine automatically accounts for associativity and commutativity. You can see a full example of such DMS rules being used to symbolically simplify high-school algebra using explicit rewrite rules over a typical expression grammar that don't have to account for such semantic properties. That is built into the rewrite engine.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!