yacc - Precedence of a rule with no operator?

鱼传尺愫 2021-01-16 07:12

Thinking about parsing regular expressions using yacc (I'm actually using PLY), some of the rules would look like the following:

expr : expr expr
expr : expr          


        
1 Answer
  •  梦毁少年i
    2021-01-16 07:45

    You are under no obligation to use precedence to disambiguate; you can simply write an unambiguous grammar:

    term : CHAR | '(' expr ')'
    rept : term | term '*' | term '+' | term '?'
    conc : rept | conc rept
    expr : conc | expr '|' conc
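
    As a rough illustration of how this layered grammar encodes precedence without any declarations, here is a minimal hand-written recursive-descent sketch in Python. It is not PLY code: the function name parse_regex and the tuple-shaped tree are assumptions made for this example, and the left-recursive rules are rewritten as loops, which keeps concatenation and alternation left-associative.

```python
def parse_regex(text):
    """Parse `text` with the layered grammar; return a nested-tuple tree.

    Hypothetical helper for illustration only (not part of PLY or yacc).
    """
    pos = 0

    def peek():
        # Current character, or None at end of input.
        return text[pos] if pos < len(text) else None

    def term():
        # term : CHAR | '(' expr ')'
        nonlocal pos
        ch = peek()
        if ch == '(':
            pos += 1
            node = expr()
            if peek() != ')':
                raise SyntaxError(f"expected ')' at position {pos}")
            pos += 1
            return node
        if ch is None or ch in ')|*+?':
            raise SyntaxError(f"unexpected {ch!r} at position {pos}")
        pos += 1
        return ('char', ch)

    def rept():
        # rept : term | term '*' | term '+' | term '?'
        nonlocal pos
        node = term()
        ch = peek()
        if ch is not None and ch in '*+?':
            pos += 1
            node = (ch, node)
        return node

    def conc():
        # conc : rept | conc rept   (left recursion written as a loop)
        node = rept()
        # Keep going while the next character can start a rept,
        # i.e. it is a CHAR or '('.
        while peek() is not None and peek() not in ')|*+?':
            node = ('concat', node, rept())
        return node

    def expr():
        # expr : conc | expr '|' conc   (left recursion written as a loop)
        nonlocal pos
        node = conc()
        while peek() == '|':
            pos += 1
            node = ('alt', node, conc())
        return node

    tree = expr()
    if pos != len(text):
        raise SyntaxError(f"unexpected {text[pos]!r} at position {pos}")
    return tree
```

    With this sketch, parse_regex('ab|c*') groups the input as (ab)|(c*): the repetition binds tightest, then concatenation, then alternation, purely because of how the rules are nested.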
    

    If you really want to use precedence, you can use a "fictitious" token with a %prec annotation. See the PLY manual for details. (This feature comes from yacc, so you could read about it in any yacc/bison documentation as well.)

    Bear in mind that precedence is always a comparison between a production (at the top of the parser stack) and the lookahead token. Normally, the precedence of productions is taken from the precedence of the last terminal in the production (and normally there is only one terminal in each applicable production), so it appears to be a comparison between terminals. But in order to get precedence to work with "invisible" operators, you need to separately consider both the production precedence and the lookahead token precedence.

    The precedence of the production can be set with a "fictitious" token, as described above. But there is no lookahead token corresponding to an invisible operator; the lookahead token will be the first token of the following operand. In other words, it could be any token in the FIRST set of expr, which in this case is {NORMAL, PLEFT}; these tokens must be added to the precedence declaration as though they were concatenation operators:

    precedence = (
      ('left', 'BAR'),
      ('left', 'CONCAT', 'NORMAL', 'PLEFT'),
      ('left', 'ASTERISK'),
    )
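
    To make the production-versus-lookahead comparison concrete, here is a small pure-Python model of the usual yacc rule for resolving a shift/reduce conflict. It is only a sketch: build_levels and resolve are hypothetical helpers written for this example, not PLY functions, and the first argument of resolve stands for the token that gives the production its precedence (CONCAT for the concatenation rule via %prec, BAR for the alternation rule).

```python
# Same table as above, lowest precedence first.
precedence = (
    ('left', 'BAR'),
    ('left', 'CONCAT', 'NORMAL', 'PLEFT'),
    ('left', 'ASTERISK'),
)


def build_levels(table):
    """Map token name -> (level, associativity); a higher level binds tighter."""
    levels = {}
    for level, (assoc, *names) in enumerate(table, start=1):
        for name in names:
            levels[name] = (level, assoc)
    return levels


def resolve(production_token, lookahead, levels):
    """Model the yacc precedence rule for one shift/reduce conflict.

    Hypothetical helper: compares the production's precedence with the
    lookahead token's precedence and returns the chosen action.
    """
    if production_token not in levels or lookahead not in levels:
        # Precedence cannot decide; the generator falls back to its
        # default conflict handling.
        return 'unresolved'
    prod_level, assoc = levels[production_token]
    look_level, _ = levels[lookahead]
    if prod_level > look_level:
        return 'reduce'   # the production binds tighter than the lookahead
    if prod_level < look_level:
        return 'shift'    # the lookahead binds tighter than the production
    # Same level: associativity decides.
    if assoc == 'left':
        return 'reduce'
    if assoc == 'right':
        return 'shift'
    return 'error'        # nonassoc


levels = build_levels(precedence)
```

    With the parser sitting after "expr expr" and NORMAL as lookahead, resolve('CONCAT', 'NORMAL', levels) yields 'reduce', so concatenation is left-associative; with ASTERISK as lookahead it yields 'shift', so the star binds to the nearer operand. If NORMAL and PLEFT are left out of the table, the same call returns 'unresolved', which is why they have to be listed.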
    

    Once you do that, you could economize on the fictitious CONCAT token, since you could use any of the FIRST(expr) tokens as a proxy, but that might be considered less readable.
