I am working on a SystemVerilog parser and I am running into many PLY conflicts (both shift/reduce and reduce/reduce).
I currently have 170+ conflicts, and my problem is that I don't really understand the parser.out file generated by PLY. Without understanding it there is little I can do, so my goal is to understand what PLY is reporting. The PLY documentation is brief and not very explanatory.
Here is one of my states, apparently the first one where a conflict is found:
state 24
(134) attribute_instance_optional_list -> attribute_instance_list .
(136) attribute_instance_list -> attribute_instance_list . attribute_instance
(138) attribute_instance -> . LPAREN ASTERISK attr_spec_list ASTERISK RPAREN
! shift/reduce conflict for LPAREN resolved as shift
PLUS reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
MINUS reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
EXCLAMATION reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
NEG reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
AMPERSAND reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
NEGAMPERSAND reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
PIPE reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
NEGPIPE reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
CARET reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
NEGCARET reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
UNBASED_UNSIZED_LITERAL reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
STRING_LITERAL reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
REAL_FLOATINGP_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
REAL_FIXEDP_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
INT_HEX_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
INT_BINARY_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
INT_OCTAL_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
INT_DECIMAL_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
UNSIGNED_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
DOUBLEPLUS reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
DOUBLEMINUS reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
AT reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
TAGGED reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
INOUT reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
INPUT reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
OUTPUT reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
REF reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
ID reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
ESCAPED_ID reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
MODULE reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
MACROMODULE reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
LPAREN shift and go to state 21
! LPAREN [ reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .) ]
attribute_instance shift and go to state 49
As far as I understand PLY, grammar rules are processed and states are built. Each of those states makes decisions based on the incoming tokens. So in the state I posted (state 24), for example, if a PLUS token were waiting to be shifted onto the stack, PLY would go ahead and "reduce using rule 134". One thing I don't understand is what PLY does then. Does it stay in the same state (24)? Is it only when an "attribute_instance" is waiting to be shifted that PLY actually changes state and goes to state 49?
Another question: what do the parsing "snapshots" listed at the beginning of the state mean?
(134) attribute_instance_optional_list -> attribute_instance_list .
(136) attribute_instance_list -> attribute_instance_list . attribute_instance
(138) attribute_instance -> . LPAREN ASTERISK attr_spec_list ASTERISK RPAREN
Does PLY compute all the possible stack states under which state 24 could be reached? Is that even possible?
In case it is of any use, here are my grammar's rules:
Grammar
Rule 0 S' -> source_text
Rule 1 source_text -> timeunits_declaration description_list
Rule 2 timeunits_declaration -> timeunit_and_precision
Rule 3 timeunits_declaration -> timeunit
Rule 4 timeunits_declaration -> timeprecision
Rule 5 timeunits_declaration -> timeunit timeprecision
Rule 6 timeunits_declaration -> timeprecision timeunit
Rule 7 timeunits_declaration -> empty
Rule 8 timeunit_and_precision -> TIMEUNIT time_literal SLASH time_literal SEMICOLON
Rule 9 timeunit -> TIMEUNIT time_literal SEMICOLON
Rule 10 timeprecision -> TIMEPRECISION time_literal SEMICOLON
Rule 11 time_literal -> UNSIGNED_NUMBER time_unit
Rule 12 time_literal -> REAL_FIXEDP_NUMBER time_unit
Rule 13 time_unit -> S
Rule 14 time_unit -> MS
Rule 15 time_unit -> US
Rule 16 time_unit -> NS
Rule 17 time_unit -> PS
Rule 18 time_unit -> FS
Rule 19 description_list -> description_list description
Rule 20 description_list -> description
Rule 21 description -> module_declaration
Rule 22 module_declaration -> module_nonansi_header timeunits_declaration module_item_list module_footer
Rule 23 module_declaration -> module_ansi_header timeunits_declaration non_port_module_item_list module_footer
Rule 24 module_declaration -> module_implicit_header timeunits_declaration module_item module_footer
Rule 25 module_declaration -> EXTERN module_nonansi_header
Rule 26 module_declaration -> EXTERN module_ansi_header
Rule 27 module_nonansi_header -> attribute_instance_optional_list module_keyword lifetime module_identifier package_import_declaration_list parameter_port_list list_of_ports SEMICOLON
Rule 28 module_ansi_header -> attribute_instance_optional_list module_keyword lifetime module_identifier package_import_declaration_list parameter_port_list list_of_port_declarations_list SEMICOLON
Rule 29 module_implicit_header -> attribute_instance_optional_list module_keyword lifetime module_identifier LPAREN DOT ASTERISK RPAREN SEMICOLON
Rule 30 module_keyword -> MODULE
Rule 31 module_keyword -> MACROMODULE
Rule 32 module_footer -> ENDMODULE COLON module_identifier
Rule 33 module_footer -> ENDMODULE
Rule 34 module_item -> port_declaration SEMICOLON
Rule 35 module_item -> non_port_module_item
Rule 36 port_declaration -> attribute_instance_optional_list inout_declaration
Rule 37 port_declaration -> attribute_instance_optional_list input_declaration
Rule 38 port_declaration -> attribute_instance_optional_list output_declaration
Rule 39 port_declaration -> attribute_instance_optional_list ref_declaration
Rule 40 port_declaration -> attribute_instance_optional_list interface_port_declaration
Rule 41 inout_declaration -> INOUT net_port_type list_of_port_identifiers
Rule 42 input_declaration -> INPUT net_port_type list_of_port_identifiers
Rule 43 input_declaration -> INPUT variable_port_type list_of_variable_identifiers
Rule 44 output_declaration -> OUTPUT net_port_type list_of_port_identifiers
Rule 45 interface_port_declaration -> interface_identifier list_of_interface_identifiers
Rule 46 interface_port_declaration -> interface_identifier DOT modport_identifier list_of_interface_identifiers
Rule 47 ref_declaration -> REF variable_port_type list_of_variable_identifiers
Rule 48 casting_type -> simple_type
Rule 49 casting_type -> constant_primary
Rule 50 casting_type -> signing
Rule 51 casting_type -> STRING
Rule 52 casting_type -> CONST
Rule 53 data_type -> integer_vector_type optional_signing optional_packed_dimension
Rule 54 data_type -> integer_atom_type optional_signing
Rule 55 data_type -> non_integer_type
Rule 56 data_type -> struct_union LBRACE struct_union_member_list RBRACE optional_packed_dimension_list
Rule 57 data_type -> ENUM LBRACE optional_enum_name_declaration_list RBRACE optional_packed_dimension_list
Rule 58 data_type -> ENUM enum_base_type LBRACE optional_enum_name_declaration_list RBRACE optional_packed_dimension_list
Rule 59 data_type -> STRING
Rule 60 data_type -> CHANDLE
Rule 61 data_type -> VIRTUAL interface_identifier optional_parameter_value_assignment optional_modport_identifier
Rule 62 data_type -> VIRTUAL INTERFACE interface_identifier optional_parameter_value_assignment optional_modport_identifier
Rule 63 data_type -> type_identifier optional_packed_dimension_list
Rule 64 data_type -> class_scope type_identifier optional_packed_dimension_list
Rule 65 data_type -> package_scope type_identifier optional_packed_dimension_list
Rule 66 data_type -> class_type
Rule 67 data_type -> EVENT
Rule 68 data_type -> ps_covergroup_identifier
Rule 69 data_type -> type_reference
Rule 70 data_type_or_implicit -> data_type
Rule 71 data_type_or_implicit -> implicit_data_type
Rule 72 implicit_data_type -> optional_signing optional_packed_dimension_list
Rule 73 enum_base_type -> integer_atom_type optional_signing
Rule 74 enum_base_type -> integer_vector_type optional_signing optional_packed_dimension
Rule 75 enum_base_type -> type_identifier optional_packed_dimension
Rule 76 enum_name_declaration -> enum_identifier optional_enum_identifier_pointer
Rule 77 enum_name_declaration -> enum_identifier optional_enum_identifier_pointer EQUALS constant_expression
Rule 78 optional_enum_identifier_pointer -> LBRACKET integral_number RBRACKET
Rule 79 optional_enum_identifier_pointer -> LBRACKET integral_number COLON integral_number RBRACKET
Rule 80 optional_enum_identifier_pointer -> empty
Rule 81 class_scope -> class_type DOUBLECOLON
Rule 82 class_type -> ps_class_identifier optional_parameter_value_assignment
Rule 83 class_type -> ps_class_identifier optional_parameter_value_assignment parametrized_class_list
Rule 84 parametrized_class_list -> parametrized_class_list DOUBLECOLON class_identifier optional_parameter_value_assignment
Rule 85 parametrized_class_list -> DOUBLECOLON class_identifier optional_parameter_value_assignment
Rule 86 integer_type -> integer_vector_type
Rule 87 integer_type -> integer_atom_type
Rule 88 integer_atom_type -> BYTE
Rule 89 integer_atom_type -> SHORTINT
Rule 90 integer_atom_type -> INT
Rule 91 integer_atom_type -> LONGINT
Rule 92 integer_atom_type -> INTEGER
Rule 93 integer_atom_type -> TIME
Rule 94 integer_vector_type -> BIT
Rule 95 integer_vector_type -> LOGIC
Rule 96 integer_vector_type -> REG
Rule 97 non_integer_type -> SHORTREAL
Rule 98 non_integer_type -> REAL
Rule 99 non_integer_type -> REALTIME
Rule 100 net_type -> SUPPLY0
Rule 101 net_type -> SUPPLY1
Rule 102 net_type -> TRI
Rule 103 net_type -> TRIAND
Rule 104 net_type -> TRIOR
Rule 105 net_type -> TRIREG
Rule 106 net_type -> TRI0
Rule 107 net_type -> TRI1
Rule 108 net_type -> UWIRE
Rule 109 net_type -> WIRE
Rule 110 net_type -> WAND
Rule 111 net_type -> WOR
Rule 112 net_port_type -> data_type_or_implicit
Rule 113 net_port_type -> net_type data_type_or_implicit
Rule 114 net_port_type -> net_type_identifier
Rule 115 net_port_type -> INTERCONNECT implicit_data_type
Rule 116 variable_port_type -> var_data_type
Rule 117 var_data_type -> data_type
Rule 118 var_data_type -> VAR data_type_or_implicit
Rule 119 signing -> SIGNED
Rule 120 signing -> UNSIGNED
Rule 121 simple_type -> integer_type
Rule 122 simple_type -> non_integer_type
Rule 123 simple_type -> ps_type_identifier
Rule 124 simple_type -> ps_parameter_identifier
Rule 125 struct_union_member -> attribute_instance_optional_list data_type_or_void list_of_variable_decl_assignments
Rule 126 struct_union_member -> attribute_instance_optional_list random_qualifier data_type_or_void list_of_variable_decl_assignments
Rule 127 data_type_or_void -> data_type
Rule 128 data_type_or_void -> VOID
Rule 129 struct_union -> STRUCT
Rule 130 struct_union -> UNION
Rule 131 struct_union -> UNION TAGGED
Rule 132 type_reference -> TYPE LPAREN expression RPAREN
Rule 133 type_reference -> TYPE LPAREN data_type RPAREN
Rule 134 attribute_instance_optional_list -> attribute_instance_list
Rule 135 attribute_instance_optional_list -> empty
Rule 136 attribute_instance_list -> attribute_instance_list attribute_instance
Rule 137 attribute_instance_list -> attribute_instance
Rule 138 attribute_instance -> LPAREN ASTERISK attr_spec_list ASTERISK RPAREN
Rule 139 attr_spec_list -> attr_spec_list COMMA attr_spec
Rule 140 attr_spec_list -> attr_spec
Rule 141 attr_spec -> attr_name
Rule 142 attr_spec -> attr_name EQUALS constant_expression
Rule 143 attr_name -> identifier
Rule 144 inc_or_dec_expression -> inc_or_dec_operator attribute_instance_optional_list variable_lvalue
Rule 145 inc_or_dec_expression -> variable_lvalue attribute_instance_optional_list inc_or_dec_operator
Rule 146 conditional_expression -> cond_predicate INTERROGATION attribute_instance_optional_list expression COLON expression
Rule 147 constant_expression -> constant_primary
Rule 148 constant_expression -> unary_operator attribute_instance_optional_list constant_primary
Rule 149 constant_expression -> constant_expression binary_operator attribute_instance_optional_list constant_expression
Rule 150 constant_expression -> constant_expression INTERROGATION attribute_instance_optional_list constant_expression COLON constant_expression
Rule 151 constant_mintypmax_expression -> constant_expression
Rule 152 constant_mintypmax_expression -> constant_expression COLON constant_expression COLON constant_expression
Rule 153 constant_param_expression -> constant_mintypmax_expression
Rule 154 constant_param_expression -> data_type
Rule 155 constant_param_expression -> DOLLAR
Rule 156 param_expression -> mintypmax_expression
Rule 157 param_expression -> data_type
Rule 158 param_expression -> DOLLAR
Rule 159 constant_range_expression -> constant_expression
Rule 160 constant_range_expression -> constant_part_select_range
Rule 161 constant_part_select_range -> constant_range
Rule 162 constant_part_select_range -> constant_indexed_range
Rule 163 constant_range -> constant_expression COLON constant_expression
Rule 164 constant_indexed_range -> constant_expression PLUSCOLON constant_expression
Rule 165 constant_indexed_range -> constant_expression MINUSCOLON constant_expression
Rule 166 expression -> primary
Rule 167 expression -> unary_operator attribute_instance_optional_list primary
Rule 168 expression -> inc_or_dec_expression
Rule 169 expression -> LPAREN operator_assignment RPAREN
Rule 170 expression -> expression binary_operator attribute_instance_optional_list expression
Rule 171 expression -> conditional_expression
Rule 172 expression -> inside_expression
Rule 173 expression -> tagged_union_expression
Rule 174 tagged_union_expression -> TAGGED member_identifier
Rule 175 tagged_union_expression -> TAGGED member_identifier expression
Rule 176 inside_expression -> expression INSIDE LBRACE open_range_list RBRACE
Rule 177 value_range -> expression
Rule 178 value_range -> LBRACKET expression COLON expression RBRACKET
Rule 179 mintypmax_expression -> expression
Rule 180 mintypmax_expression -> expression COLON expression COLON expression
Rule 181 module_path_conditional_expression -> module_path_expression INTERROGATION attribute_instance_optional_list module_path_expression COLON module_path_expression
Rule 182 module_path_expression -> module_path_primary
Rule 183 module_path_expression -> unary_module_path_operator attribute_instance_optional_list module_path_primary
Rule 184 module_path_expression -> module_path_expression binary_module_path_operator attribute_instance_optional_list module_path_expression
Rule 185 module_path_expression -> module_path_conditional_expression
Rule 186 module_path_mintypmax_expression -> module_path_expression
Rule 187 module_path_mintypmax_expression -> module_path_expression COLON module_path_expression COLON module_path_expression
Rule 188 part_select_range -> constant_range
Rule 189 part_select_range -> indexed_range
Rule 190 indexed_range -> expression PLUSCOLON constant_expression
Rule 191 indexed_range -> expression MINUSCOLON constant_expression
Rule 192 genvar_expression -> constant_expression
Rule 193 constant_primary -> primary_literal
Rule 194 primary_literal -> number
Rule 195 primary_literal -> time_literal
Rule 196 primary_literal -> UNBASED_UNSIZED_LITERAL
Rule 197 primary_literal -> STRING_LITERAL
Rule 198 number -> REAL_FLOATINGP_NUMBER
Rule 199 number -> REAL_FIXEDP_NUMBER
Rule 200 number -> INT_HEX_NUMBER
Rule 201 number -> INT_BINARY_NUMBER
Rule 202 number -> INT_OCTAL_NUMBER
Rule 203 number -> INT_DECIMAL_NUMBER
Rule 204 number -> UNSIGNED_NUMBER
Rule 205 unary_operator -> PLUS
Rule 206 unary_operator -> MINUS
Rule 207 unary_operator -> EXCLAMATION
Rule 208 unary_operator -> NEG
Rule 209 unary_operator -> AMPERSAND
Rule 210 unary_operator -> NEGAMPERSAND
Rule 211 unary_operator -> PIPE
Rule 212 unary_operator -> NEGPIPE
Rule 213 unary_operator -> CARET
Rule 214 unary_operator -> NEGCARET
Rule 215 binary_operator -> PLUS
Rule 216 binary_operator -> MINUS
Rule 217 binary_operator -> ASTERISK
Rule 218 binary_operator -> SLASH
Rule 219 binary_operator -> PERCENT
Rule 220 binary_operator -> ISEQUAL
Rule 221 binary_operator -> NISEQUAL
Rule 222 binary_operator -> CISEQUAL
Rule 223 binary_operator -> NCISEQUAL
Rule 224 binary_operator -> WISEQUAL
Rule 225 binary_operator -> NWISEQUAL
Rule 226 binary_operator -> DOUBLEAMPERSAND
Rule 227 binary_operator -> DOUBLEPIPE
Rule 228 binary_operator -> DOUBLEASTERISK
Rule 229 binary_operator -> LT
Rule 230 binary_operator -> LE
Rule 231 binary_operator -> GT
Rule 232 binary_operator -> GE
Rule 233 binary_operator -> AMPERSAND
Rule 234 binary_operator -> PIPE
Rule 235 binary_operator -> CARET
Rule 236 binary_operator -> NEGCARET
Rule 237 binary_operator -> RSHIFT
Rule 238 binary_operator -> LSHIFT
Rule 239 binary_operator -> ARSHIFT
Rule 240 binary_operator -> ALSHIFT
Rule 241 binary_operator -> IMPLICATION
Rule 242 binary_operator -> EQUIVALENCE
Rule 243 inc_or_dec_operator -> DOUBLEPLUS
Rule 244 inc_or_dec_operator -> DOUBLEMINUS
Rule 245 unary_module_path_operator -> EXCLAMATION
Rule 246 unary_module_path_operator -> NEG
Rule 247 unary_module_path_operator -> AMPERSAND
Rule 248 unary_module_path_operator -> NEGAMPERSAND
Rule 249 unary_module_path_operator -> PIPE
Rule 250 unary_module_path_operator -> NEGPIPE
Rule 251 unary_module_path_operator -> CARET
Rule 252 unary_module_path_operator -> NEGCARET
Rule 253 binary_module_path_operator -> ISEQUAL
Rule 254 binary_module_path_operator -> NISEQUAL
Rule 255 binary_module_path_operator -> DOUBLEAMPERSAND
Rule 256 binary_module_path_operator -> DOUBLEPIPE
Rule 257 binary_module_path_operator -> AMPERSAND
Rule 258 binary_module_path_operator -> PIPE
Rule 259 binary_module_path_operator -> CARET
Rule 260 binary_module_path_operator -> NEGCARET
Rule 261 array_identifier -> identifier
Rule 262 block_identifier -> identifier
Rule 263 bin_identifier -> identifier
Rule 264 c_identifier -> C_ID
Rule 265 cell_identifier -> identifier
Rule 266 checker_identifier -> identifier
Rule 267 class_identifier -> identifier
Rule 268 class_variable_identifier -> variable_identifier
Rule 269 clocking_identifier -> identifier
Rule 270 config_identifier -> identifier
Rule 271 const_identifier -> identifier
Rule 272 constraint_identifier -> identifier
Rule 273 covergroup_identifier -> identifier
Rule 274 covergroup_variable_identifier -> variable_identifier
Rule 275 cover_point_identifier -> identifier
Rule 276 cross_identifier -> identifier
Rule 277 dynamic_array_variable_identifier -> variable_identifier
Rule 278 enum_identifier -> identifier
Rule 279 escaped_identifier -> ESCAPED_ID
Rule 280 formal_identifier -> identifier
Rule 281 formal_port_identifier -> identifier
Rule 282 function_identifier -> identifier
Rule 283 generate_block_identifier -> identifier
Rule 284 genvar_identifier -> identifier
Rule 285 hierarchical_array_identifier -> hierarchical_identifier
Rule 286 hierarchical_block_identifier -> hierarchical_identifier
Rule 287 hierarchical_event_identifier -> hierarchical_identifier
Rule 288 hierarchical_identifier -> optional_identifier_constant_bit_select_list identifier
Rule 289 hierarchical_identifier -> DOLLAR ROOT DOT optional_identifier_constant_bit_select_list identifier
Rule 290 hierarchical_net_identifier -> hierarchical_identifier
Rule 291 hierarchical_parameter_identifier -> hierarchical_identifier
Rule 292 hierarchical_property_identifier -> hierarchical_identifier
Rule 293 hierarchical_sequence_identifier -> hierarchical_identifier
Rule 294 hierarchical_task_identifier -> hierarchical_identifier
Rule 295 hierarchical_tf_identifier -> hierarchical_identifier
Rule 296 hierarchical_variable_identifier -> hierarchical_identifier
Rule 297 identifier -> simple_identifier
Rule 298 identifier -> escaped_identifier
Rule 299 index_variable_identifier -> identifier
Rule 300 interface_identifier -> identifier
Rule 301 interface_instance_identifier -> identifier
Rule 302 inout_port_identifier -> identifier
Rule 303 input_port_identifier -> identifier
Rule 304 instance_identifier -> identifier
Rule 305 library_identifier -> identifier
Rule 306 member_identifier -> identifier
Rule 307 method_identifier -> identifier
Rule 308 modport_identifier -> identifier
Rule 309 module_identifier -> identifier
Rule 310 net_identifier -> identifier
Rule 311 net_type_identifier -> identifier
Rule 312 output_port_identifier -> identifier
Rule 313 package_identifier -> identifier
Rule 314 package_scope -> package_identifier DOUBLECOLON
Rule 315 package_scope -> DOLLAR UNIT DOUBLECOLON
Rule 316 optional_package_scope -> package_scope
Rule 317 optional_package_scope -> empty
Rule 318 parameter_identifier -> identifier
Rule 319 port_identifier -> identifier
Rule 320 production_identifier -> identifier
Rule 321 program_identifier -> identifier
Rule 322 property_identifier -> identifier
Rule 323 ps_class_identifier -> optional_package_scope class_identifier
Rule 324 ps_covergroup_identifier -> optional_package_scope covergroup_identifier
Rule 325 ps_checker_identifier -> optional_package_scope checker_identifier
Rule 326 ps_identifier -> optional_package_scope identifier
Rule 327 ps_or_hierarchical_array_identifier -> optional_package_scope hierarchical_array_identifier
Rule 328 ps_or_hierarchical_array_identifier -> implicit_class_handle DOT hierarchical_array_identifier
Rule 329 ps_or_hierarchical_array_identifier -> class_scope hierarchical_array_identifier
Rule 330 ps_or_hierarchical_net_identifier -> optional_package_scope net_identifier
Rule 331 ps_or_hierarchical_net_identifier -> hierarchical_net_identifier
Rule 332 ps_or_hierarchical_property_identifier -> optionnal_package_scope property_identifier
Rule 333 ps_or_hierarchical_property_identifier -> hierarchical_property_identifier
Rule 334 ps_or_hierarchical_sequence_identifier -> optional_package_scope sequence_identifier
Rule 335 ps_or_hierarchical_sequence_identifier -> hierarchical_sequence_identifier
Rule 336 ps_or_hierarchical_tf_identifier -> optional_package_scope tf_identifier
Rule 337 ps_or_hierarchical_tf_identifier -> hierarchical_tf_identifier
Rule 338 ps_parameter_identifier -> optional_package_scope parameter_identifier
Rule 339 ps_parameter_identifier -> class_scope parameter_identifier
Rule 340 ps_parameter_identifier -> ps_parameter_identifier_generate_list parameter_identifier
Rule 341 ps_parameter_identifier_generate_list -> ps_parameter_identifier_generate_list DOT ps_parameter_identifier_generate
Rule 342 ps_parameter_identifier_generate_list -> ps_parameter_identifier_generate
Rule 343 ps_parameter_identifier_generate -> generate_block_identifier LBRACKET constant_expression RBRACKET
Rule 344 ps_parameter_identifier_generate -> generate_block_identifier
Rule 345 ps_type_identifier -> type_identifier
Rule 346 ps_type_identifier -> LOCAL DOUBLECOLON type_identifier
Rule 347 ps_type_identifier -> package_scope type_identifier
Rule 348 sequence_identifier -> identifier
Rule 349 signal_identifier -> identifier
Rule 350 simple_identifier -> ID
Rule 351 specparam_identifier -> identifier
Rule 352 system_tf_identifier -> DOLLAR ID
Rule 353 task_identifier -> identifier
Rule 354 tf_identifier -> identifier
Rule 355 terminal_identifier -> identifier
Rule 356 topmodule_identifier -> identifier
Rule 357 type_identifier -> identifier
Rule 358 udp_identifier -> identifier
Rule 359 variable_identifier -> identifier
Rule 360 cond_predicate -> AT
Rule 361 implicit_class_handle -> AT
Rule 362 integral_number -> AT
Rule 363 lifetime -> AT
Rule 364 list_of_interface_identifiers -> AT
Rule 365 list_of_port_declarations_list -> AT
Rule 366 list_of_port_identifiers -> AT
Rule 367 list_of_ports -> AT
Rule 368 list_of_variable_decl_assignments -> AT
Rule 369 list_of_variable_identifiers -> AT
Rule 370 module_item_list -> AT
Rule 371 module_path_primary -> AT
Rule 372 non_port_module_item -> AT
Rule 373 non_port_module_item_list -> AT
Rule 374 open_range_list -> AT
Rule 375 operator_assignment -> AT
Rule 376 optional_enum_name_declaration_list -> AT
Rule 377 optional_identifier_constant_bit_select_list -> AT
Rule 378 optional_modport_identifier -> AT
Rule 379 optional_packed_dimension -> AT
Rule 380 optional_packed_dimension_list -> AT
Rule 381 optional_parameter_value_assignment -> AT
Rule 382 optional_signing -> AT
Rule 383 optionnal_package_scope -> AT
Rule 384 package_import_declaration_list -> AT
Rule 385 parameter_port_list -> AT
Rule 386 primary -> AT
Rule 387 random_qualifier -> AT
Rule 388 struct_union_member_list -> AT
Rule 389 variable_lvalue -> AT
Rule 390 empty -> <empty>
In LR parsing, we often talk about "items": an item is a production with a progress marker, usually written with a • but sometimes with a simple period (.). A state is just a collection of items; in effect, the state tells you the set of productions the parse might be inside.
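If it helps to see that concretely, here is a minimal Python sketch of an item as "a production plus a dot position". The Item class and its field names are my own invention for illustration, not anything PLY exposes; the three items are copied from state 24 in the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical model of an LR item: a production plus a dot position."""
    number: int   # rule number as printed in parser.out
    lhs: str      # left-hand side non-terminal
    rhs: tuple    # right-hand side symbols
    dot: int      # how many right-hand-side symbols are already recognized

    @property
    def next_symbol(self):
        """Symbol right after the dot, or None if the dot is at the end."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def __str__(self):
        symbols = list(self.rhs)
        symbols.insert(self.dot, ".")
        return f"({self.number}) {self.lhs} -> {' '.join(symbols)}"


# State 24 from the question is just this collection of items.
state_24 = [
    Item(134, "attribute_instance_optional_list", ("attribute_instance_list",), 1),
    Item(136, "attribute_instance_list",
         ("attribute_instance_list", "attribute_instance"), 1),
    Item(138, "attribute_instance",
         ("LPAREN", "ASTERISK", "attr_spec_list", "ASTERISK", "RPAREN"), 0),
]

for item in state_24:
    print(item)
```

Printing the items gives back the three lines at the top of the state in parser.out, which is all those "snapshots" are.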
There is one particularly special type of item: the item with a dot at the end:
(134) attribute_instance_optional_list -> attribute_instance_list .
This represents a production which could be finished, since the progress marker is at the end. If that is the correct production, the parser must then substitute the right-hand side for the left-hand side: this is the action referred to as "reducing" (since it is the opposite of "producing", which is what a "production" does).
However, the mere fact that you are in a state with a completed item does not mean that the reduction should be performed. It is also necessary that the next token be consistent with the result of the reduction. If the next token could not follow the reduced non-terminal (in the context of the parser's state), then the reduction cannot be performed, so the parser will attempt a shift if one is possible.
Shifts are really simple. A shift is possible if one or more items in the state have the dot immediately before the current lookahead symbol. Here, there is no question about additional lookahead because PLY (like many LALR parser generators) only creates LALR(1) parsers, which have a single token of lookahead in any state. So the only thing we have to go on is the symbol we are currently looking at, and it is reasonably obvious that we can only process it if some available item has that symbol in the next position.
If a given state with a given lookahead symbol can both shift and reduce, then you have a shift-reduce conflict; the parser doesn't know what to do. (If it has neither a shift nor a reduce available, that indicates that the input has a syntax error. That's how LR parsers identify syntax errors.)
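The rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not how PLY builds its tables: the tuple layout and the function name possible_actions are assumptions of mine, and the lookahead set for rule 134 is an abbreviated subset of the tokens listed in the question's parser.out.

```python
def possible_actions(items, token):
    """Return every action the state allows for the given lookahead token.

    Each item is (rule_number, rhs, dot, lookaheads); the lookahead set only
    matters for items whose dot is at the end (possible reductions).
    """
    actions = []
    for number, rhs, dot, lookaheads in items:
        if dot < len(rhs):
            if rhs[dot] == token:          # dot sits just before the token
                actions.append(("shift", number))
        elif token in lookaheads:          # completed item, token may follow
            actions.append(("reduce", number))
    return actions


# State 24, with an abbreviated lookahead set for rule 134.
lookaheads_134 = {"PLUS", "MINUS", "ID", "MODULE", "MACROMODULE", "LPAREN"}
state_24 = [
    (134, ("attribute_instance_list",), 1, lookaheads_134),
    (136, ("attribute_instance_list", "attribute_instance"), 1, set()),
    (138, ("LPAREN", "ASTERISK", "attr_spec_list", "ASTERISK", "RPAREN"), 0, set()),
]

for token in ("PLUS", "LPAREN", "RPAREN"):
    found = possible_actions(state_24, token)
    if not found:
        verdict = "syntax error"
    elif len(found) > 1:
        verdict = "conflict"
    else:
        verdict = "ok"
    print(token, found, verdict)
```

With PLUS there is exactly one action (reduce using rule 134), with LPAREN there are two (the shift/reduce conflict that parser.out reports), and with RPAREN there are none, which is a syntax error.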
The one important aspect of LR parsing is that a reduction must be performed immediately if it is going to be performed at all. That is, if we are in a state with a possible reduction, and the item's lookahead set indicates that the lookahead token is feasible, we must perform the reduction. We can't wait and see if it would be possible later, because there is no later for a reduction. In other words, anything to the left of the • in an item has already been reduced as much as it could be. (This is the R in LR parsing, which indicates that every reduction is "rightmost". If the use of "rightmost" doesn't make sense, don't worry about it; I only mentioned this fact in case you were wondering.)
Another thing which I might as well mention is that in LALR parsing ("Lookahead LR parsing"), a state is precisely defined by the set of items. Each item has an applicable lookahead set, but the lookahead sets don't form part of the state's identity. If the parser generator ends up producing two states with the same items but different lookahead sets, it must merge them into a single state, forming the union of each lookahead set. For full LR parsing, this limitation doesn't exist; you can (and do) have more than one state for a given set of items, and the result is that the parsing table is much larger and slightly more powerful.
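As a rough sketch of that merging step (under my own simplified representation, where a state is a dict from an item string to its lookahead set, and with made-up items A -> x . and B -> x .), states with identical items collapse into one and their lookahead sets are unioned:

```python
def merge_by_core(lr1_states):
    """Merge states that contain the same items, unioning the lookahead sets.

    Each state is a dict mapping an item (here just a string) to a set of
    lookahead tokens. The input states are left unmodified.
    """
    merged = {}
    for state in lr1_states:
        core = frozenset(state)   # the items alone, ignoring lookaheads
        target = merged.setdefault(core, {item: set() for item in state})
        for item, lookaheads in state.items():
            target[item] |= lookaheads
    return list(merged.values())


# Two hypothetical full-LR states with the same items, different lookaheads.
s1 = {"A -> x .": {"a"}, "B -> x .": {"b"}}
s2 = {"A -> x .": {"b"}, "B -> x .": {"a"}}

lalr_states = merge_by_core([s1, s2])
print(len(lalr_states))
print({item: sorted(tokens) for item, tokens in lalr_states[0].items()})
```

In this made-up example each original state can tell the two reductions apart by lookahead, but the merged state cannot: both items now accept both a and b. That is the kind of case where full LR is slightly more powerful than LALR.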
Now, if a shift action is possible, you can mechanically figure out which state will be active after the shift. For example, from
(134) attribute_instance_optional_list -> attribute_instance_list .
(136) attribute_instance_list -> attribute_instance_list . attribute_instance
(138) attribute_instance -> . LPAREN ASTERISK attr_spec_list ASTERISK RPAREN
after shifting an LPAREN, the next state will have just one item:
(138) attribute_instance -> LPAREN . ASTERISK attr_spec_list ASTERISK RPAREN
(Note how the dot has moved.)
That was a simple case, since the next symbol is a terminal, ASTERISK. Most of the time, the next symbol after a shift will be a non-terminal, and in that case we need to add all of the productions for that non-terminal, with the dot at the beginning. (That's how states end up with more than one item.) So, for example, given the new state with one item and an input of ASTERISK (anything else will be an error, since this state has no reduction possibilities), then we will shift into a state which has the shifted item:
(138) attribute_instance -> LPAREN ASTERISK . attr_spec_list ASTERISK RPAREN
plus all the productions for attr_spec_list:
(139) attr_spec_list -> . attr_spec_list COMMA attr_spec
(140) attr_spec_list -> . attr_spec
plus all the productions for attr_spec (since we just added an item with the dot before attr_spec):
(141) attr_spec -> . attr_name
(142) attr_spec -> . attr_name EQUALS constant_expression
plus the production for attr_name:
(143) attr_name -> . identifier
and so on until we stop seeing new non-terminals:
(297) identifier -> . simple_identifier
(298) identifier -> . escaped_identifier
(350) simple_identifier -> . ID
(279) escaped_identifier -> . ESCAPED_ID
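The "keep adding productions until nothing new shows up" procedure is usually called the closure of a state, and moving the dot over a symbol is usually called goto. Here is a minimal sketch of both in Python, using only the handful of rules from your grammar that are needed for this walk-through. The GRAMMAR dict, the tuple layout and the function names are my own; PLY's internals are organized differently.

```python
from collections import deque

# Only the rules needed for this example: lhs -> [(rule_number, rhs), ...]
GRAMMAR = {
    "attribute_instance": [
        (138, ("LPAREN", "ASTERISK", "attr_spec_list", "ASTERISK", "RPAREN"))],
    "attr_spec_list": [(139, ("attr_spec_list", "COMMA", "attr_spec")),
                       (140, ("attr_spec",))],
    "attr_spec": [(141, ("attr_name",)),
                  (142, ("attr_name", "EQUALS", "constant_expression"))],
    "attr_name": [(143, ("identifier",))],
    "identifier": [(297, ("simple_identifier",)),
                   (298, ("escaped_identifier",))],
    "simple_identifier": [(350, ("ID",))],
    "escaped_identifier": [(279, ("ESCAPED_ID",))],
}


def closure(kernel):
    """Add, for every non-terminal right after a dot, all of its productions
    with the dot at the beginning, until no new items appear.
    An item is (rule_number, lhs, rhs, dot)."""
    items = list(kernel)
    seen = set(items)
    queue = deque(items)
    while queue:
        _, _, rhs, dot = queue.popleft()
        if dot == len(rhs):
            continue                        # dot at the end: nothing follows
        symbol = rhs[dot]
        for number, rule_rhs in GRAMMAR.get(symbol, []):  # terminals: no rules
            new_item = (number, symbol, rule_rhs, 0)
            if new_item not in seen:
                seen.add(new_item)
                items.append(new_item)
                queue.append(new_item)
    return items


def goto(items, symbol):
    """State reached by moving the dot over `symbol`."""
    kernel = [(n, lhs, rhs, dot + 1) for n, lhs, rhs, dot in items
              if dot < len(rhs) and rhs[dot] == symbol]
    return closure(kernel)


start = closure([(138, "attribute_instance", GRAMMAR["attribute_instance"][0][1], 0)])
after_lparen = goto(start, "LPAREN")
after_asterisk = goto(after_lparen, "ASTERISK")
print([number for number, _, _, _ in after_asterisk])
```

The last line lists the rule numbers of the items in the state reached after LPAREN ASTERISK, which are the same ten items enumerated above.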
OK, now the next token will have to be ID or ESCAPED_ID. Suppose it is ID. Now what? Well, we will shift into a state
(350) simple_identifier -> ID .
with a possible reduction; assuming the lookahead symbol matches the lookahead set (I haven't and don't intend to explain how lookahead sets are computed for each state; there's an algorithm but its details aren't relevant here), then the ID will be reduced to simple_identifier. Then where does the parser go? Logically, it goes back to the state which generated the simple_identifier production, and shifts the simple_identifier. As it happens, the state is the one we just created:
(138) attribute_instance -> LPAREN ASTERISK . attr_spec_list ASTERISK RPAREN
(139) attr_spec_list -> . attr_spec_list COMMA attr_spec
(140) attr_spec_list -> . attr_spec
(141) attr_spec -> . attr_name
(142) attr_spec -> . attr_name EQUALS constant_expression
(143) attr_name -> . identifier
(297) identifier -> . simple_identifier
(298) identifier -> . escaped_identifier
(350) simple_identifier -> . ID
(279) escaped_identifier -> . ESCAPED_ID
and after we shift the simple_identifier, we end up with
(297) identifier -> simple_identifier .
which is a state that requires a reduction to identifier, so once again we go back to the same state, after which we find ourselves in
(143) attr_name -> identifier .
and then
(141) attr_spec -> attr_name .
(142) attr_spec -> attr_name . EQUALS constant_expression
But how did the parser know which state to go back to on each of those reductions? The answer is that the parser pushes the current state onto the parsing stack with every symbol. When it does a reduction, it pops the symbols from the right-hand side, discarding each associated state number, until it gets to the beginning of the right-hand-side, at which point the stack indicates which state that right-hand side came from. It then takes a look at that state, shifts the reduced non-terminal, and pushes the new shifted state onto the parse stack.
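To make the stack mechanics concrete, here is a rough Python sketch of a shift-reduce driver. It is not how PLY is implemented, and it makes several simplifying assumptions: the grammar is a tiny made-up fragment (the start rule and the NUMBER token are hypothetical), states are built on the fly instead of being looked up in a precomputed table, the stack holds only states rather than state/symbol pairs, and a state with a completed item always reduces without consulting any lookahead. That last shortcut only works because this toy grammar has no conflicts.

```python
# Hypothetical toy grammar; "start" and NUMBER are not from the real grammar.
GRAMMAR = {
    "start": [("attr_spec",)],
    "attr_spec": [("attr_name", "EQUALS", "NUMBER")],
    "attr_name": [("identifier",)],
    "identifier": [("simple_identifier",)],
    "simple_identifier": [("ID",)],
}


def closure(kernel):
    """Items are (lhs, rhs, dot); add productions for symbols after a dot."""
    items = list(kernel)
    for lhs, rhs, dot in items:
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                if (rhs[dot], production, 0) not in items:
                    items.append((rhs[dot], production, 0))
    return frozenset(items)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it."""
    moved = [(l, r, d + 1) for l, r, d in state if d < len(r) and r[d] == symbol]
    return closure(moved) if moved else None


def parse(tokens):
    stack = [closure([("start", GRAMMAR["start"][0], 0)])]  # stack of states
    trace, pos = [], 0
    while True:
        completed = [(l, r) for l, r, d in stack[-1] if d == len(r)]
        if completed:
            lhs, rhs = completed[0]
            if lhs == "start":
                return trace  # accept
            # Pop one state per right-hand-side symbol; the state now on top
            # is the one the right-hand side "came from".
            del stack[len(stack) - len(rhs):]
            trace.append("reduce " + lhs)
            stack.append(goto(stack[-1], lhs))  # shift the non-terminal
        else:
            nxt = goto(stack[-1], tokens[pos]) if pos < len(tokens) else None
            if nxt is None:
                raise SyntaxError("unexpected input at position %d" % pos)
            trace.append("shift " + tokens[pos])
            stack.append(nxt)
            pos += 1


trace = parse(["ID", "EQUALS", "NUMBER"])
print("\n".join(trace))
```

The first three reductions each pop a single state and land back in the initial state, which is the "once again back to the same state" behaviour described above; the final reduction pops three states at once.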
So I think that answers the questions "What do the lines at the beginning of the state description mean?" and "What state does the parser go to after a reduction?" The other two questions are easy to answer: "No, it doesn't compute all the possible predecessor states", and "Yes, it could (although it might end up adding predecessors which are actually not possible with any input), but it isn't useful for the parse." Since they're not horribly relevant to solving the shift-reduce conflict, I won't explain those answers further.
Going back to the actual shift-reduce conflict, the situation is that we are in the state
(134) attribute_instance_optional_list -> attribute_instance_list .
(136) attribute_instance_list -> attribute_instance_list . attribute_instance
(138) attribute_instance -> . LPAREN ASTERISK attr_spec_list ASTERISK RPAREN
which has a possible reduction, and we are considering the case where we see an LPAREN, for which there is a possible shift, and it turns out that the lookahead set for the first item also includes LPAREN. Although the lookahead set is not shown in the PLY output, we can dig around in the grammar to see where it might have come from. The immediate source is attribute_instance_optional_list, of course, and we can find that in the grammar, although there are quite a few possibilities:
(27) module_nonansi_header -> attribute_instance_optional_list module_keyword lifetime module_identifier package_import_declaration_list parameter_port_list list_of_ports SEMICOLON
(28) module_ansi_header -> attribute_instance_optional_list module_keyword lifetime module_identifier package_import_declaration_list parameter_port_list list_of_port_declarations_list SEMICOLON
(29) module_implicit_header -> attribute_instance_optional_list module_keyword lifetime module_identifier LPAREN DOT ASTERISK RPAREN SEMICOLON
(36) port_declaration -> attribute_instance_optional_list inout_declaration
(37) port_declaration -> attribute_instance_optional_list input_declaration
(38) port_declaration -> attribute_instance_optional_list output_declaration
(39) port_declaration -> attribute_instance_optional_list ref_declaration
(40) port_declaration -> attribute_instance_optional_list interface_port_declaration
(125) struct_union_member -> attribute_instance_optional_list data_type_or_void list_of_variable_decl_assignments
(126) struct_union_member -> attribute_instance_optional_list random_qualifier data_type_or_void list_of_variable_decl_assignments
(144) inc_or_dec_expression -> inc_or_dec_operator attribute_instance_optional_list variable_lvalue
(145) inc_or_dec_expression -> variable_lvalue attribute_instance_optional_list inc_or_dec_operator
(146) conditional_expression -> cond_predicate INTERROGATION attribute_instance_optional_list expression COLON expression
(148) constant_expression -> unary_operator attribute_instance_optional_list constant_primary
(149) constant_expression -> constant_expression binary_operator attribute_instance_optional_list constant_expression
(150) constant_expression -> constant_expression INTERROGATION attribute_instance_optional_list constant_expression COLON constant_expression
(167) expression -> unary_operator attribute_instance_optional_list primary
(170) expression -> expression binary_operator attribute_instance_optional_list expression
(181) module_path_conditional_expression -> module_path_expression INTERROGATION attribute_instance_optional_list module_path_expression COLON module_path_expression
(183) module_path_expression -> unary_module_path_operator attribute_instance_optional_list module_path_primary
(184) module_path_expression -> module_path_expression binary_module_path_operator attribute_instance_optional_list module_path_expression
As far as I can see, attribute_instance_optional_list does not appear at the end of any of those productions, which simplifies working out where the LPAREN conflict comes from. In all those cases, it is followed by a non-terminal, the possibilities being:
module_keyword
inout_declaration
input_declaration
output_declaration
ref_declaration
interface_port_declaration
data_type_or_void
random_qualifier
variable_lvalue
inc_or_dec_operator
constant_primary
constant_expression
primary
expression
module_path_primary
module_path_expression
Now, if any of those non-terminals could start with an LPAREN, we have a possible shift-reduce conflict. And a couple of culprits spring out of the list: expression and similar.
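Checking "could this non-terminal start with an LPAREN?" by eye gets tedious with a grammar this size, so it can help to compute FIRST sets mechanically. The following Python sketch shows the standard fixed-point computation. The productions in TOY are assumptions made up for illustration (the question does not show the rules for expression, primary or module_keyword), so substitute the real ones from your grammar.

```python
def first_sets(grammar):
    """Compute FIRST for every non-terminal; empty right-hand sides allowed."""
    first = {nt: set() for nt in grammar}
    nullable = set()
    changed = True
    while changed:  # repeat until nothing new is learned
        changed = False
        for lhs, productions in grammar.items():
            for rhs in productions:
                all_nullable = True
                for sym in rhs:
                    # A terminal contributes itself; a non-terminal
                    # contributes whatever it can start with.
                    start = first[sym] if sym in grammar else {sym}
                    if not start <= first[lhs]:
                        first[lhs] |= start
                        changed = True
                    if sym not in nullable:
                        all_nullable = False
                        break
                if all_nullable and lhs not in nullable:
                    nullable.add(lhs)
                    changed = True
    return first


# Hypothetical productions, NOT taken from the question's grammar.
TOY = {
    "module_keyword": [("MODULE",), ("MACROMODULE",)],
    "unary_operator": [("PLUS",), ("MINUS",)],
    "expression": [("primary",), ("unary_operator", "primary")],
    "primary": [("ID",), ("LPAREN", "expression", "RPAREN")],
}

first = first_sets(TOY)
followers = ["module_keyword", "expression", "primary"]
culprits = [nt for nt in followers if "LPAREN" in first[nt]]
print(culprits)
```

With these assumed productions, expression and primary are flagged and module_keyword is not. Running the same check over the sixteen followers listed above, using the real productions, would tell you exactly which contexts feed LPAREN into the lookahead set.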
So, there is the problem, in summary: an attribute_instance can start with a parenthesis, but an attribute_instance_list can also be followed by a parenthesis. So when you're in the middle of an attribute_instance_list and you see a (, you have no way of knowing whether to shift or reduce.
Source: https://stackoverflow.com/questions/41775631/how-to-understand-and-fix-conflicts-in-ply