E -> E+T | E-T | T
T -> T*F | T/F | F
F -> i | (E)
How can I modify this grammar to allow an exponentiation operation ^
so th
In EBNF (Extended Backus-Naur Form), this could look as follows:
expr -> term [ ('+' | '-') term ]*
term -> factor [ ('*' | '/') factor ]*
factor -> base [ '^' exponent ]*
base -> '(' expr ')' | identifier | number
exponent -> '(' expr ')' | identifier | number
(taken from here)
Translated to your notation (no distinction between numbers and identifiers):
E -> E+T | E-T | T
T -> T*F | T/F | F
F -> F^X | B
B -> i | (E)
X -> i | (E)
One could merge "B" and "X" for the sake of clarity.
Both answers provided here are wrong. In these answers ^ associates to the left, when in fact it should associate to the right. The correct modified grammar should be:
E -> E+T | E-T | T
T -> T*X | T/X | X
X -> F^X | F //Note: it's F^X not X^F
F -> i | (E)
This way, the grammar handles an expression such as the following as expected:
a+b^c^d^e*f
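To check how this grammar groups that expression, here is a minimal recursive-descent sketch in Python. It is only an illustration: the function names (tokenize, parse, show) and the tuple-based tree are assumptions of this sketch, not part of the grammar. The left-recursive rules for E and T become loops, while X -> F^X | F stays a right-recursive call, which is what makes ^ group to the right.

```python
import re

def tokenize(text):
    # Identifiers plus single-character operators and parentheses.
    # Whitespace (and, in this simple sketch, any other character) is skipped.
    return re.findall(r"[A-Za-z_]\w*|[-+*/^()]", text)

def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_e():
        # E -> E+T | E-T | T  (left recursion written as a loop, folds left)
        node = parse_t()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, parse_t())
        return node

    def parse_t():
        # T -> T*X | T/X | X  (same pattern, one precedence level higher)
        node = parse_x()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, parse_x())
        return node

    def parse_x():
        # X -> F^X | F  (right recursion, so ^ groups to the right)
        left = parse_f()
        if peek() == "^":
            advance()
            return ("^", left, parse_x())
        return left

    def parse_f():
        # F -> i | (E)
        tok = peek()
        if tok == "(":
            advance()
            node = parse_e()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok is None or not re.match(r"[A-Za-z_]", tok):
            raise SyntaxError(f"unexpected token {tok!r}")
        return advance()

    tree = parse_e()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

def show(node):
    # Render the tree fully parenthesized so the grouping is visible.
    if isinstance(node, str):
        return node
    op, left, right = node
    return f"({show(left)}{op}{show(right)})"

print(show(parse("a+b^c^d^e*f")))  # (a+((b^(c^(d^e)))*f))
```

The chain b^c^d^e nests to the right, while + and * still group in the usual way.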
Thanks!
Following the same pattern, add one more level below * and / for ^:
E -> E + T | E - T | T
T -> T * X | T / X | X
X -> X ^ Y | Y
Y -> i | (E)
Explanation:
==========
Notice that this grammar is unambiguous. To generate an expression containing the operators ^, *, /, + and -, a derivation must first introduce the lower-precedence operators + and -, then the higher-precedence ones * and /, and only at the end ^. Hence in the parse tree the operator ^ appears at the bottom (towards the leaves). If we evaluate the expression according to the tree, ^ is performed first.
Note: According to the grammar rules, once a sentential form contains X ^ Y we cannot go back and add + or - inside it. But as long as the sentential form still contains one of E + T, E - T or T, further operators can be added. So with this form of grammar we control the order in which any valid expression (string) of the language is generated. (This is how operator precedence is controlled in an unambiguous grammar.)
For example, to produce the expression i + i ^ i - i, we cannot proceed as E --> T --> X --> X ^ Y, because once we have X ^ Y we cannot add + or - (except inside parentheses, via (E)).
One possible derivation of the expression i + i ^ i - i is as follows:
E --> E - T --> E + T - T --> E + X - T --> E + X ^ Y - T --*--> i + i ^ i - i
`--*-->` means more than one step
Notice that the operator ^ is added at the last step, so it appears at the bottom of the tree. The tree will be something like:
                E
              / | \
             /  |  \
            E   -   T        <-- - evaluates 3rd
          / | \     '
         /  |  \    '
        E   +   T   i        <-- + evaluates 2nd
        '       |
        '       |
        i       X
              / | \
             /  |  \
            X   ^   Y        <-- ^ evaluates first
            '       '
            '       '
            i       i
NOTE:
In the tree, ' stands for more than one derivation step.
^ has the highest precedence.
Because of left-to-right associativity, + is evaluated before -.
When you start evaluating this tree, ^ will be evaluated first.
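The evaluation order can be checked with a small post-order walk over the tree drawn above. This is only a sketch: representing the tree as nested (operator, left, right) tuples and the name evaluation_order are assumptions made for illustration.

```python
def evaluation_order(node, order=None):
    """Post-order walk: an operator is applied only after both operands are ready."""
    if order is None:
        order = []
    if isinstance(node, tuple):
        op, left, right = node
        evaluation_order(left, order)   # evaluate the left subtree first
        evaluation_order(right, order)  # then the right subtree
        order.append(op)                # only now can this operator be applied
    return order

# The tree from the diagram: root '-', with '+' below it and '^' at the bottom.
tree = ("-", ("+", "i", ("^", "i", "i")), "i")

print(evaluation_order(tree))  # ['^', '+', '-']
```

The operator closest to the leaves comes out first, which matches the "evaluates first / 2nd / 3rd" labels in the diagram.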
Remember that higher-precedence operators are always added at the bottom of the tree; hence the grammar should be written so that these operators are introduced later in the derivation.
(You should understand why, in your grammar, + and - are generated directly from E, while * and / are added via T, and why the other unambiguous version, E -> E*T | E/T | T, T -> T+F | T-F | X, is not correct even though the language it generates is the same as that of your grammar. The reason is that your grammar produces the correct tree from the point of view of evaluation.)
Additionally, if you are writing a parser with the YACC tool, you can use an ambiguous version of this grammar with fewer production rules and specify operator precedence outside the grammar rules (this tells the parser which operator to evaluate first, and hence how to build the tree). That is preferable to this unambiguous form, because fewer production rules build a tree of smaller height (hence a more efficient compiler that takes less time to parse).
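The idea of keeping precedence outside the grammar can be sketched in Python with precedence climbing. This is not YACC itself: the OPERATORS table is a hypothetical stand-in for yacc's %left / %right declarations, and the function names are assumptions of this sketch. Making ^ right-associative here follows the usual mathematical convention; changing its entry to "left" would mirror the grammar above instead.

```python
import re

# Hypothetical table playing the role of %left / %right declarations:
# operator -> (precedence level, associativity)
OPERATORS = {
    "+": (1, "left"), "-": (1, "left"),
    "*": (2, "left"), "/": (2, "left"),
    "^": (3, "right"),
}

def parse_expression(text):
    tokens = re.findall(r"[A-Za-z_]\w*|[-+*/^()]", text)
    pos = 0

    def primary():
        # An identifier or a parenthesized expression.
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = climb(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        if tok in OPERATORS or tok == ")":
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    def climb(min_prec):
        # Single rule "expr -> expr op expr | primary", disambiguated by the table.
        nonlocal pos
        left = primary()
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            op = tokens[pos]
            prec, assoc = OPERATORS[op]
            if prec < min_prec:
                break
            pos += 1
            # Left-associative: the right operand must bind tighter than op.
            # Right-associative: the right operand may contain op again.
            next_min = prec + 1 if assoc == "left" else prec
            left = (op, left, climb(next_min))
        return left

    tree = climb(1)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse_expression("a+b^c^d*e"))
# ('+', 'a', ('*', ('^', 'b', ('^', 'c', 'd')), 'e'))
```

Adding a new operator then means adding one table entry rather than a new nonterminal and production.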