Multi-term named entities in Stanford Named Entity Recognizer

春和景丽 2021-01-31 19:33

I'm using the Stanford Named Entity Recognizer http://nlp.stanford.edu/software/CRF-NER.shtml and it's working fine. This is

    List&         


        
8 Answers
  •  鱼传尺愫
    2021-01-31 20:00

    Another approach to dealing with multi-word entities. This code merges consecutive tokens that carry the same annotation into a single entity.

    Restriction:
    The result is keyed by entity text, so if the same text is found with two different annotations, only the last one is kept.

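    private Document getEntities(String fullText) {

        Document entitiesList = new Document();
        // loadNERClassifiers() is the author's own helper (not shown here).
        NERClassifierCombiner nerCombClassifier = loadNERClassifiers();

        if (nerCombClassifier != null) {

            // classify() returns one inner list of tokens per sentence.
            List<List<CoreLabel>> results = nerCombClassifier.classify(fullText);

            for (List<CoreLabel> coreLabels : results) {

                String prevLabel = null;
                String prevToken = null;

                for (CoreLabel coreLabel : coreLabels) {

                    String word = coreLabel.word();
                    String annotation = coreLabel.get(CoreAnnotations.AnswerAnnotation.class);

                    if (prevLabel != null && prevLabel.equals(annotation)) {
                        // Same label as the previous token: extend the current entity.
                        prevToken += " " + word;
                        continue;
                    }

                    // The label changed: store the entity collected so far, if any.
                    if (prevLabel != null) {
                        entitiesList.put(prevToken, prevLabel);
                    }

                    // Start a new entity, or reset on an "O" (outside) or missing label.
                    boolean isEntity = annotation != null && !"O".equals(annotation);
                    prevLabel = isEntity ? annotation : null;
                    prevToken = isEntity ? word : null;
                }

                // Store an entity that runs to the end of the sentence.
                if (prevLabel != null) {
                    entitiesList.put(prevToken, prevLabel);
                }
            }
        }

        return entitiesList;
    }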
    

    Imports:

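    Document: org.bson.Document;
    NERClassifierCombiner: edu.stanford.nlp.ie.NERClassifierCombiner;
    CoreLabel: edu.stanford.nlp.ling.CoreLabel;
    CoreAnnotations: edu.stanford.nlp.ling.CoreAnnotations;
    List: java.util.List;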
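The merging idea can also be tried without the Stanford libraries. The following is a minimal sketch, not part of the answer above: the class `EntityMerger`, its nested `Entity` type, and the `merge` method are hypothetical names chosen for illustration, and the tokens and labels are passed in as plain parallel arrays rather than taken from a classifier. It returns a list instead of a map, so the same text appearing with two different labels is kept twice rather than overwritten.

```java
import java.util.ArrayList;
import java.util.List;

class EntityMerger {

    /** A merged entity: the joined token text and its label (hypothetical helper type). */
    static final class Entity {
        final String text;
        final String label;

        Entity(String text, String label) {
            this.text = text;
            this.label = label;
        }

        @Override
        public String toString() {
            return text + "/" + label;
        }
    }

    /**
     * Merges consecutive tokens that carry the same non-"O" label.
     * tokens and labels are parallel arrays describing one sentence.
     */
    static List<Entity> merge(String[] tokens, String[] labels) {
        List<Entity> entities = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentLabel = null;

        for (int i = 0; i < tokens.length; i++) {
            String label = labels[i];

            // Same label as the previous token: extend the current entity.
            if (currentLabel != null && currentLabel.equals(label)) {
                current.append(' ').append(tokens[i]);
                continue;
            }

            // The label changed: store the entity collected so far, if any.
            if (currentLabel != null) {
                entities.add(new Entity(current.toString(), currentLabel));
            }

            // Start a new entity, or reset on an "O" or missing label.
            current.setLength(0);
            if (label != null && !"O".equals(label)) {
                current.append(tokens[i]);
                currentLabel = label;
            } else {
                currentLabel = null;
            }
        }

        // Store an entity that runs to the end of the sentence.
        if (currentLabel != null) {
            entities.add(new Entity(current.toString(), currentLabel));
        }
        return entities;
    }

    public static void main(String[] args) {
        String[] tokens = {"Barack", "Obama", "met", "Angela", "Merkel", "in", "New", "York"};
        String[] labels = {"PERSON", "PERSON", "O", "PERSON", "PERSON", "O", "LOCATION", "LOCATION"};
        System.out.println(merge(tokens, labels));
    }
}
```

Because the grouping relies only on label equality, two distinct entities of the same type that stand directly next to each other would still be merged into one.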
    
