Bison shift/reduce conflict / reduce/reduce conflict warnings

名媛妹妹 2021-01-07 11:07

When I run this bison code on Ubuntu Linux, I get these warnings:

- shift/reduce conflict [-Wconflicts-sr]
- reduce/reduce conflicts [-Wconflicts-rr]


        
1 answer
  • 2021-01-07 11:42

    A. Redundant non-terminals

    The reduce/reduce conflicts are because you have two non-terminals which exist only to gather together different types:

    typos_dedomenwn: T_int
        | T_bool
        | T_string;
    
    typos_synartisis: T_int
        | T_bool
        | T_string;
    

    Where these non-terminals are used, it is impossible for the parser to know which one applies; it cannot tell until further along in the declaration. However, it doesn't matter. You could just define a single typos non-terminal, and use it throughout:

    typos: T_int
        | T_bool
        | T_string;
    
    orismos_metavlitwn: typos lista_metavlitwn T_semic;
    kefalida_synartisis: typos T_id T_openpar lista_typikwn_parametrwn T_closepar
        | typos T_id T_openpar T_closepar;
    typikes_parametroi: typos T_ampersand T_id;
    

    B. Dangling else

    The shift/reduce conflict is the classic problem with "C" style if statements. These statements are difficult to describe in a way which is not ambiguous. Consider:

    if (expr1) if (expr2) statement1; else statement2;
    

    We know that the else must match the second if, so the above is equivalent to:

    if (expr1) { if (expr2) statement1; else statement2; }
    

    But the grammar also matches the other possible parse, equivalent to:

    if (expr1) { if (expr2) statement1; } else statement2;
    

    There are three possible solutions to this problem:

    1. Do nothing. Bison does the right thing here, by design: it always prefers "shift" over "reduce". What that means is that if an else could match an open if statement, bison will always do that, rather than holding onto the else to match some outer if statement. There is a pretty good description of this in the Dragon book, amongst other places.

      The problem with this solution is that you still end up with a warning about shift/reduce conflicts, and it is hard to distinguish between "OK" conflicts, and newly-created "not OK" conflicts. Bison provides the %expect declaration so you can tell it how many conflicts you expect, which will suppress the warning if the right number are found, but that is still pretty fragile.
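
      A minimal sketch of the %expect declaration, assuming the dangling else is the only shift/reduce conflict left once the reduce/reduce conflicts are fixed. The count of 1 is an assumption; use whatever number bison actually reports for your grammar.

      ```yacc
      /* Declarations section. The count 1 is assumed, not taken from the original grammar. */
      %expect 1   /* the one known, harmless dangling-else conflict */
      %%
      /* ... grammar rules ... */
      ```

      If a later edit to the grammar changes the number of shift/reduce conflicts, bison complains again, which is the point of stating the count.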

    2. Use precedence declarations. These are described in the bison manual, and their use in solving the dangling else problem is a running example in that chapter. In your case, it would look something like this:

      %precedence T_then  /* Fake terminal, needed for %prec */
      %precedence T_else
       /* ... */
      %%
       /* ... */
      
      entoli_if: T_if T_openpar geniki_ekfrasi T_closepar entoli T_else entoli
         | T_if T_openpar geniki_ekfrasi T_closepar entoli %prec T_then;
      

      Here, I have eliminated the unnecessary non-terminal else_clause because it hides the else token. If you wanted to keep it, for whatever reason, you would need to add a %prec T_else to the end of the entoli_if production which uses it.

      The %precedence declaration is only available from bison 3.0 onwards. If you have an earlier version of bison, you can use the %nonassoc declaration instead, but this may hide some other errors.
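
      For a pre-3.0 bison, the fallback would look roughly like this. It is only a sketch of the declarations; the entoli_if rules, including the %prec T_then, stay as they are.

      ```yacc
      /* Later declarations bind tighter, so T_else outranks T_then and the
         parser shifts the else instead of reducing the shorter if rule. */
      %nonassoc T_then  /* fake terminal, used only by %prec */
      %nonassoc T_else
      ```

      The difference from %precedence is that %nonassoc also declares associativity, so a genuine conflict between two tokens at the same level could be resolved silently instead of being reported.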

    3. Fix the grammar. It is actually possible to make an unambiguous grammar, but it is a bit of work.

      The important point is that in:

      if (expr) statement1 else statement2
      

      statement1 cannot be an unmatched if statement. If statement1 is an if statement, it must include an else clause; otherwise, the else in the outer if would match the inner if. And that applies recursively to any trailing statements in statement1, such as

      if (e2) statement2; 
        else if (e3) statement3
        else /* must be present */ statement;
      

      We can express this by dividing statements into "matching" statements (where all if are matched by else) and "non-matching" statements: (I haven't tried to preserve the greek non-terminal names here; sorry. You'll have to adapt the idea to your grammar).

      statement: matching_statement | non_matching_statement ;
      matching_statement: call_statement | assignment_statement | ...
          | matching_if_statement
      non_matching_statement: non_matching_if_statement
          /* might be others, see below */
      
      if_condition: "if" '(' expression ')' ;
      
      matching_if_statement:
            if_condition matching_statement "else" matching_statement ;
      non_matching_if_statement:
            if_condition statement
          | if_condition matching_statement "else" non_matching_statement
          ; 
      

      In C, there are other compound statements which can end with a statement (while, for). Each of these will also have a "matching" and "non-matching" version, depending on whether the final statement is matching or non-matching:

      while_condition: "while" '(' expression ')' ;
      matching_while_statement: while_condition matching_statement ;
      non_matching_while_statement: while_condition non_matching_statement ;
      

      As far as I can see, this does not apply to your language, but you might want to extend it in the future to include such statements.

    C. Some notes about bison style

    1. Bison allows you to use single character tokens as themselves, surrounded by single quotes. So instead of declaring T_openpar and then writing verbose rules which use it, you can just write '('; you don't even need to declare it. (In your flex -- or other -- scanner, you would just return '('; instead of return T_openpar, which is why you don't need to declare the token.) This usually makes grammars more readable.
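
      As an illustration, the function header rule from section A might then read as follows. This sketch assumes the scanner has been changed to return '(' and ')' as character codes, and that the single typos non-terminal is in use.

      ```yacc
      /* '(' and ')' need no %token declaration; the scanner returns the characters themselves */
      kefalida_synartisis: typos T_id '(' lista_typikwn_parametrwn ')'
          | typos T_id '(' ')';
      ```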

    2. Bison also lets you specify a human-readable name for a token, which can also make grammars more readable. (This feature is not in all yacc derivatives, but it is pretty common.) For example, you can give names to the if and else tokens as follows:

      %token T_if "if"
      %token T_else "else"
      

      and then you could use the quoted strings in your grammar rules. (I did that in my last example for the dangling-else problem.) In the flex scanner, you still need to use the token symbols T_if and T_else.
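
      Combined with quoted single-character tokens and the precedence fix from section B, your if statement could then be written along these lines. This is a sketch, not a drop-in replacement: it assumes the scanner returns '(' and ')' as characters and still returns T_if and T_else for the keywords.

      ```yacc
      %token T_if "if"
      %token T_else "else"
      %precedence T_then   /* fake terminal, needed for %prec */
      %precedence T_else
      %%
      /* The string aliases stand for T_if and T_else inside the rules */
      entoli_if: "if" '(' geniki_ekfrasi ')' entoli "else" entoli
          | "if" '(' geniki_ekfrasi ')' entoli %prec T_then;
      ```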

    3. If you have a two-symbol token like &&, it is usually better if the scanner recognizes it and returns a single token, instead of the parser recognizing two consecutive & tokens. In the latter case, the parser will recognize:

      boolean_expr1 &  & boolean_expr2
      

      as though it had been written

      boolean_expr1 && boolean_expr2
      

      although the first one was most likely an error which should be reported.

    4. Bison is a bottom-up LALR(1) parser generator. It is not necessary to remove left-recursion. Bottom-up parsers prefer left-recursion, and left-recursive grammars are usually more accurate and easier to read. For example, it is better all round to declare:

      apli_ekfrasi: aplos_oros
          | apli_ekfrasi '+' aplos_oros
          | apli_ekfrasi '-' aplos_oros;
      

      than to use LL-style repeated suffixes (loop7 in your grammar). The left-recursive grammar can be parsed without extending the parser stack, and more accurately represents the syntactic structure of the expression, making parser actions easier to write.

      There are a number of other places in your grammar which you might want to revisit.

      (This advice comes straight from the bison manual: "you should always use left recursion, because it can parse a sequence of any number of elements with bounded stack space.")
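
      To see why the left-recursive form makes actions easier, here is the same rule with hypothetical semantic actions attached. The sketch assumes numeric semantic values (the default int YYSTYPE); your real actions might build syntax tree nodes instead.

      ```yacc
      /* Each reduction combines the value accumulated so far ($1)
         with the next term ($3), working left to right. */
      apli_ekfrasi: aplos_oros               { $$ = $1; }
          | apli_ekfrasi '+' aplos_oros      { $$ = $1 + $3; }
          | apli_ekfrasi '-' aplos_oros      { $$ = $1 - $3; }
          ;
      ```

      With these rules, an input such as 10 - 4 - 3 groups as (10 - 4) - 3, which is the usual left-associative meaning of subtraction. No extra bookkeeping is needed to get that grouping.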
