How to define a CRF++ template file

This is my issue, but it doesn't say HOW to define the template file correctly.

My training file looks like this:

上   B-NR
海   L-NR
浦   B-NR
东   L-NR
开


                      
2 Answers

我在风中等你, 2021-01-28 12:14

This issue seems to come from a misunderstanding of how CRF++ processes the training file. Your features must not include the values in the last column: those are the labels. If you included them in your features, your model would be trivially perfect. Because your training file has only two columns, your template file can only contain rules of the form %x[n,0], where n is the row offset relative to the current token and 0 is the column index. It is hardcoded into CRF++ (though not clearly documented, as far as I can tell) that -4 <= n <= 4.
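
To make the rule form concrete, here is a minimal Python sketch that writes out a template using only column 0. The helper name build_template and the window parameter are made up for illustration, and the upper bound of 4 simply follows the limit described above rather than anything I have checked in the CRF++ source.

```python
def build_template(window=2, column=0):
    """Return CRF++ template text with one unigram rule per row offset.

    Hypothetical helper: `window` is the number of tokens to look at on
    each side of the current token. The bound of 4 follows the
    -4 <= n <= 4 limit reported in this answer (an assumption here).
    """
    if not 0 <= window <= 4:
        raise ValueError("window must be between 0 and 4")
    lines = ["# Unigram"]
    for i, offset in enumerate(range(-window, window + 1)):
        # %x[row,col]: row is relative to the current token, col is absolute.
        lines.append(f"U{i:02d}:%x[{offset},{column}]")
    # A bare "B" line asks for label-bigram features.
    lines += ["", "# Bigram", "B"]
    return "\n".join(lines) + "\n"


template = build_template(window=2)
print(template)
```

With window=2 this yields five unigram rules, from %x[-2,0] to %x[2,0], and none of them touches the label column.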
                                                                        
                                                        
            
            
              
                
醉话见心, 2021-01-28 12:16

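CRF++ is extremely easy to use. The instructions on the website explain the template format clearly.
http://crfpp.googlecode.com/svn/trunk/doc/index.html
Suppose we are extracting features for the line
东   L-NR
Unigram

# column 0 of the current line
U02:%x[0,0]
# column 0 of the next line
U03:%x[1,0]

For this line, U02 expands to 东 and U03 expands to 开, so the underlying feature for U03 is "column 0 of the next line = 开".
Bigram templates (lines starting with B) are written the same way; the difference is that they combine the current output label with the previous one.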
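
If it helps, the expansion can be imitated in a few lines of Python. This is only a sketch of the idea, not CRF++'s actual code: the function name expand is made up, only column 0 of the question's data is used, and the boundary placeholders (_B-1, _B+1, ...) are modeled on CRF++'s sentence padding, so treat their exact spelling as an assumption.

```python
import re

# Column 0 of the training rows from the question (labels left out on purpose).
rows = [["上"], ["海"], ["浦"], ["东"], ["开"]]

# Matches macros such as %x[0,0] or %x[-1,0].
MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")


def expand(rule, rows, pos):
    """Expand every %x[row,col] macro in `rule` for the token at index `pos`."""
    def lookup(match):
        row = pos + int(match.group(1))  # row offset is relative to pos
        col = int(match.group(2))        # column index is absolute
        if row < 0:
            return f"_B{row}"            # before the sequence start, e.g. _B-1
        if row >= len(rows):
            return f"_B+{row - len(rows) + 1}"  # past the sequence end
        return rows[row][col]
    return MACRO.sub(lookup, rule)


pos = 3  # the line "东"
print(expand("U02:%x[0,0]", rows, pos))  # U02:东
print(expand("U03:%x[1,0]", rows, pos))  # U03:开
```

The rule prefix (U02, U03) stays in the expanded string, which is what keeps the same character at different offsets from collapsing into one feature.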
                                                                        
                                                        
            
            
              
                