Question
I'm having trouble fixing a shift/reduce conflict in my grammar. I added the -v option to get a report of the conflict; it points me to State 0 and says that INT and FLOAT are reduced to variable_definitions by rule 9. I cannot see the conflict, and I'm having trouble finding a solution.
%{
#include <stdio.h>
#include <stdlib.h>
%}
%token INT FLOAT
%token ADDOP MULOP INCOP
%token WHILE IF ELSE RETURN
%token NUM ID
%token INCLUDE
%token STREAMIN ENDL STREAMOUT
%token CIN COUT
%token NOT
%token FLT_LITERAL INT_LITERAL STR_LITERAL
%right ASSIGNOP
%left AND OR
%left RELOP
%%
program: variable_definitions
| function_definitions
;
function_definitions: function_head block
| function_definitions function_head block
;
identifier_list: ID
| ID '[' INT_LITERAL ']'
| identifier_list ',' ID
| identifier_list ',' ID '[' INT_LITERAL ']'
;
variable_definitions:
| variable_definitions type identifier_list ';'
;
type: INT
| FLOAT
;
function_head: type ID arguments
;
arguments: '('parameter_list')'
;
parameter_list:
|parameters
;
parameters: type ID
| type ID '['']'
| parameters ',' type ID
| parameters ',' type ID '['']'
;
block: '{'variable_definitions statements'}'
;
statements:
| statements statement
;
statement: expression ';'
| compound_statement
| RETURN expression ';'
| IF '('bool_expression')' statement ELSE statement
| WHILE '('bool_expression')' statement
| input_statement ';'
| output_statement ';'
;
input_statement: CIN
| input_statement STREAMIN variable
;
output_statement: COUT
| output_statement STREAMOUT expression
| output_statement STREAMOUT STR_LITERAL
| output_statement STREAMOUT ENDL
;
compound_statement: '{'statements'}'
;
variable: ID
| ID '['expression']'
;
expression_list:
| expressions
;
expressions: expression
| expressions ',' expression
;
expression: variable ASSIGNOP expression
| variable INCOP expression
| simple_expression
;
simple_expression: term
| ADDOP term
| simple_expression ADDOP term
;
term: factor
| term MULOP factor
;
factor: ID
| ID '('expression_list')'
| literal
| '('expression')'
| ID '['expression']'
;
literal: INT_LITERAL
| FLT_LITERAL
;
bool_expression: bool_term
| bool_expression OR bool_term
;
bool_term: bool_factor
| bool_term AND bool_factor
;
bool_factor: NOT bool_factor
| '('bool_expression')'
| simple_expression RELOP simple_expression
;
%%
Answer 1:
Your definition of a program is that it is either a list of variable definitions or a list of function definitions (program: variable_definitions | function_definitions;). That seems a bit odd to me. What if I want to define both a function and a variable? Do I have to write two programs and somehow link them together?
This is not the cause of your problem, but fixing it would probably fix the problem as well. The immediate cause is that function_definitions is one or more function definitions while variable_definitions is zero or more variable definitions. In other words, the base case of the function_definitions recursion is a function definition, while the base case of variable_definitions is the empty sequence. So a list of variable definitions starts with an empty sequence.
But both function definitions and variable definitions start with a type. So if the first token of a program is int, it could be the start of a function definition with return type int or a variable definition of type int. In the former case, the parser should shift the int in order to produce the function_definitions base case; in the latter case, it should immediately reduce an empty variable_definitions base case. With only one token of lookahead, the parser cannot tell which of the two it is looking at, hence the shift/reduce conflict.
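The conflict can be reproduced in a much smaller grammar. The following is a minimal sketch, not the asker's grammar: it keeps only the INT type, and it assumes a drastically simplified function definition (no parameters, no body) just to keep the example short. Running Bison on it should report a shift/reduce conflict in the initial state, for the same reason as in the full grammar.

```yacc
/* Minimal sketch of the conflict; the function rule is simplified for illustration. */
%token INT ID
%%
program: variable_definitions
       | function_definitions
       ;

/* Zero or more: the base case is the empty sequence. */
variable_definitions: %empty
       | variable_definitions INT ID ';'
       ;

/* One or more: the base case starts with INT. */
function_definitions: INT ID '(' ')'
       | function_definitions INT ID '(' ')'
       ;
%%
```

With INT as the first token, the parser can either shift it (starting function_definitions) or reduce the empty variable_definitions first, which is exactly the choice described above.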
If you really wanted a program to be either function definitions or variable definitions, but not both, you would need to make variable_definitions have the same form as function_definitions, by changing the base case from empty to type identifier_list ';'. Then you could add an empty production to program so that the parser could recognize empty inputs.
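As a rough sketch of that either/or variant, here is a small self-contained grammar. It is an illustration under simplifying assumptions rather than a drop-in replacement: only the INT type is kept, and the function definition is reduced to a head with no parameters and no body.

```yacc
/* Sketch of the "either variables or functions" variant; rules are simplified. */
%token INT ID
%%
/* The empty input is handled here instead of inside variable_definitions. */
program: %empty
       | variable_definitions
       | function_definitions
       ;

/* Now one or more, with the same shape as function_definitions. */
variable_definitions: INT ID ';'
       | variable_definitions INT ID ';'
       ;

function_definitions: INT ID '(' ')'
       | function_definitions INT ID '(' ')'
       ;
%%
```

Here the parser can always shift INT and ID, and the choice between a variable and a function is postponed until it sees ';' or '('. Note that in the full grammar, block also uses variable_definitions, so after this change a block would require at least one variable definition unless block is given a separate optional form.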
But as I said at the beginning, you probably want a program to be a sequence of definitions, each of which could either be a variable or a function:
program: %empty
| program type identifier_list ';'
| program function_head block
;
By the way, you are misreading the output file produced by -v. It shows the following actions for State 0:
INT shift, and go to state 1
FLOAT shift, and go to state 2
INT [reduce using rule 9 (variable_definitions)]
FLOAT [reduce using rule 9 (variable_definitions)]
Here, INT and FLOAT are possible lookaheads. So the interpretation of the line INT [reduce using rule 9 (variable_definitions)] is "if the lookahead is INT, immediately reduce using production 9". Production 9 produces the empty sequence, so the reduction reduces zero tokens at the top of the parser stack into a variable_definitions. Reductions do not use the lookahead token, so after the reduction, the lookahead token is still INT.
However, the parser doesn't actually do that because it has a different action for INT, which is to shift it and go to state 1, as indicated by the first line, which starts with INT. The brackets [...] indicate that this action is not taken because it is a conflict and the resolution of the conflict was some other action. So the more accurate interpretation of that line is "if it weren't for the preceding action on INT, the lookahead INT would cause a reduction using rule 9."
Source: https://stackoverflow.com/questions/43579367/shift-reduce-conflict