Multi-term named entities in Stanford Named Entity Recognizer

前端未结

关注

 8  1352

春和景丽 2021-01-31 19:33

I\'m using the Stanford Named Entity Recognizer http://nlp.stanford.edu/software/CRF-NER.shtml and it\'s working fine. This is

    List&


      
      
        
          8条回答        

        
                    
            
            
                         
                
              
              
                
                   鱼传尺愫
                                             
                
                
                (楼主)
            
              
              
                2021-01-31 20:00
              

            
            
                        
Another approach to deal with multi words entities.
This code combines multiple tokens together if they have the same annotation and go in a row.

Restriction:

If the same token has two different annotations, the last one will be saved.

private Document getEntities(String fullText) {

    Document entitiesList = new Document();
    NERClassifierCombiner nerCombClassifier = loadNERClassifiers();

    if (nerCombClassifier != null) {

        List> results = nerCombClassifier.classify(fullText);

        for (List coreLabels : results) {

            String prevLabel = null;
            String prevToken = null;

            for (CoreLabel coreLabel : coreLabels) {

                String word = coreLabel.word();
                String annotation = coreLabel.get(CoreAnnotations.AnswerAnnotation.class);

                if (!"O".equals(annotation)) {

                    if (prevLabel == null) {
                        prevLabel = annotation;
                        prevToken = word;
                    } else {

                        if (prevLabel.equals(annotation)) {
                            prevToken += " " + word;
                        } else {
                            prevLabel = annotation;
                            prevToken = word;
                        }
                    }
                } else {

                    if (prevLabel != null) {
                        entitiesList.put(prevToken, prevLabel);
                        prevLabel = null;
                    }
                }
            }
        }
    }

    return entitiesList;
}


Imports:

Document: org.bson.Document;
NERClassifierCombiner: edu.stanford.nlp.ie.NERClassifierCombiner;

    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它8个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复