Python/YACC: Resolving a shift/reduce conflict

前端未结


I'm using PLY. Here is one of my states from parser.out:

state 3

    (5) course_data -> course .
    (6) course_data -> course . course_


                      


      
      
        
1 answer

        
                         				            
            
           
            
                              
                
              
              
                
                  执笔经年        
                
              
                            
                2021-01-28 05:52
              
            
            
                                                                       
Your basic problem is that you need two tokens of lookahead to do what you want. When the input seen so far is a course and the lookahead is an OR_CONJ, the parser cannot tell whether to reduce the course to a course_data or to shift without also seeing the token after the OR_CONJ. There are a number of ways to deal with this:


Use an LR(2), LR(k), or GLR parser generator; any of these can handle the conflict.
Use a lexer hack to do the lookahead: have the lexer return two different OR_CONJ tokens depending on whether the following token is a COURSE_NUMBER.
Factor the grammar to get rid of the conflict. This may produce a grammar that accepts something slightly different from what you want (so you need extra post-parse checks to reject the invalid constructs) and will generally make the grammar much harder to understand.
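
The lexer hack in the second option can be sketched as a filter over the token stream. This is only an illustration under assumptions: tokens are modelled as plain `(type, value)` tuples rather than PLY token objects, and the names `split_or_conj` and `OR_NUM` are hypothetical, not part of PLY or of the original grammar. Wiring such a filter between a PLY lexer and parser is not shown.

```python
from typing import Iterable, Iterator, Tuple

Token = Tuple[str, str]  # (token type, matched text); an assumed representation


def split_or_conj(tokens: Iterable[Token]) -> Iterator[Token]:
    """Retag an OR_CONJ as OR_NUM when the next token is a COURSE_NUMBER.

    The filter does the second token of lookahead itself, so the parser
    only ever needs one token to choose between shift and reduce.
    """
    toks = list(tokens)
    for i, (kind, value) in enumerate(toks):
        if kind == "OR_CONJ":
            next_kind = toks[i + 1][0] if i + 1 < len(toks) else None
            if next_kind == "COURSE_NUMBER":
                kind = "OR_NUM"  # hypothetical second token type
        yield (kind, value)


# Hand-written token stream for an input such as "CS 101 or 102 or MATH 200".
stream = [
    ("DEPT_CODE", "CS"),
    ("COURSE_NUMBER", "101"),
    ("OR_CONJ", "or"),
    ("COURSE_NUMBER", "102"),
    ("OR_CONJ", "or"),
    ("DEPT_CODE", "MATH"),
    ("COURSE_NUMBER", "200"),
]
retagged = [kind for kind, _ in split_or_conj(stream)]
```

With this in place, the rule that ends in a course number would use OR_NUM, and the statement-level rule would keep OR_CONJ, so the two uses of "or" no longer collide.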


Note that your grammar as given is also ambiguous about how three or more courses connected in a single statement associate. This is easily fixed by rewriting the grammar in a clearer left-recursive form:

Rule 1    statement -> course
Rule 2    statement -> statement OR_CONJ course
Rule 3    course -> DEPT_CODE course_list
Rule 4    course -> DEPT_CODE course_list OR_CONJ COURSE_NUMBER
Rule 5    course_list -> COURSE_NUMBER
Rule 6    course_list -> course_list , COURSE_NUMBER


This could also be rewritten as right-recursive for an LL parser generator, but it would still have the 2-token lookahead problem. One way of refactoring it to make that go away would be to make COURSE_NUMBER by itself a valid course and recombine it with the previous course in a post-pass (or give an error if it's the first course in a statement). Then Rule 4 becomes:

Rule 4    course -> COURSE_NUMBER


and you have no conflicts. 
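
The post-pass mentioned above might look roughly like the following. It is a minimal sketch under assumptions: the parser is taken to produce each course as a `(dept, numbers)` pair, with `dept` set to `None` for a course that came from the bare `COURSE_NUMBER` rule. The name `recombine` and this data shape are hypothetical, and for simplicity the sketch appends the bare number to the previous course's list without recording that it was joined by "or" rather than a comma.

```python
from typing import List, Optional, Tuple

# Assumed parse result: (department code or None, list of course numbers).
Course = Tuple[Optional[str], List[str]]


def recombine(courses: List[Course]) -> List[Course]:
    """Fold bare course numbers into the course that precedes them."""
    merged: List[Course] = []
    for dept, numbers in courses:
        if dept is not None:
            # Ordinary course: copy the list so the input is not mutated.
            merged.append((dept, list(numbers)))
        elif not merged:
            # A bare number cannot be the first course in a statement.
            raise ValueError("statement starts with a bare course number")
        else:
            merged[-1][1].extend(numbers)
    return merged


parsed = [("CS", ["101", "102"]), (None, ["103"]), ("MATH", ["200"])]
result = recombine(parsed)
```

The error branch is the extra post-parse check: the relaxed grammar accepts a statement that begins with a bare number, so the post-pass has to reject it.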
                                                                        
                                                        
            
            
              
                