Multi-term named entities in Stanford Named Entity Recognizer

春和景丽 2021-01-31 19:33

I'm using the Stanford Named Entity Recognizer http://nlp.stanford.edu/software/CRF-NER.shtml and it's working fine. This is

    List&         


        
8 answers
  •  轻奢々 (original poster)
    2021-01-31 20:26

    The drawback of the classifyToCharacterOffsets method is that (AFAIK) you can't access the labels of the entities.

    As proposed by Christopher, here is an example of a loop which assembles "adjacent non-O things". This example also counts the number of occurrences.

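    import java.util.HashMap;
    import java.util.List;

    import edu.stanford.nlp.ling.CoreAnnotations;
    import edu.stanford.nlp.ling.CoreLabel;

    // `classifier` is the already loaded NER classifier from the question
    // (assumed here to be a field of the enclosing class).
    public HashMap<String, HashMap<String, Integer>> extractEntities(String text) {

        HashMap<String, HashMap<String, Integer>> entities = new HashMap<>();

        // classify() returns one List<CoreLabel> per sentence
        for (List<CoreLabel> lcl : classifier.classify(text)) {

            String label = "O";                         // label of the current run
            StringBuilder value = new StringBuilder();  // tokens of the current run

            for (CoreLabel cl : lcl) {
                String answer =
                        cl.getString(CoreAnnotations.AnswerAnnotation.class);

                // The label changed, so the current run (if any) is complete
                if (!answer.equals(label)) {
                    countEntity(entities, label, value.toString());
                    label = answer;
                    value.setLength(0);
                }

                // Append this token to the run unless it is outside any entity
                if (!answer.equals("O")) {
                    if (value.length() > 0)
                        value.append(' ');
                    value.append(
                            cl.getString(CoreAnnotations.ValueAnnotation.class));
                }
            }

            // Record an entity that runs up to the end of the sentence
            countEntity(entities, label, value.toString());
        }

        return entities;
    }

    // Increments the count of `value` under `label`; "O" runs are ignored
    private static void countEntity(
            HashMap<String, HashMap<String, Integer>> entities,
            String label, String value) {

        if (label.equals("O") || value.isEmpty())
            return;

        if (!entities.containsKey(label))
            entities.put(label, new HashMap<String, Integer>());

        Integer count = entities.get(label).get(value);
        entities.get(label).put(value, count == null ? 1 : count + 1);
    }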
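
    To try the grouping logic without loading a classifier, here is a minimal stand-alone sketch. It assumes the tokens have already been labelled and are passed in as plain {word, label} pairs; the class EntityRuns and its methods are hypothetical names made up for this illustration and are not part of the Stanford NER API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Stanford NER.
class EntityRuns {

    /**
     * Joins adjacent tokens that share the same non-"O" label and counts
     * how often each assembled entity occurs.
     *
     * @param tokens one sentence as {word, label} pairs
     * @return label -> (entity text -> number of occurrences)
     */
    static Map<String, Map<String, Integer>> count(List<String[]> tokens) {
        Map<String, Map<String, Integer>> entities = new TreeMap<>();
        String label = "O";                     // label of the current run
        List<String> words = new ArrayList<>(); // words of the current run

        for (String[] token : tokens) {
            String word = token[0];
            String answer = token[1];

            // A different label closes the current run
            if (!answer.equals(label)) {
                flush(entities, label, words);
                label = answer;
            }
            if (!answer.equals("O")) {
                words.add(word);
            }
        }
        // Close a run that reaches the end of the sentence
        flush(entities, label, words);
        return entities;
    }

    // Stores the finished run (if it is an entity) and resets the buffer
    private static void flush(Map<String, Map<String, Integer>> entities,
                              String label, List<String> words) {
        if (!label.equals("O") && !words.isEmpty()) {
            entities.computeIfAbsent(label, k -> new TreeMap<>())
                    .merge(String.join(" ", words), 1, Integer::sum);
        }
        words.clear();
    }

    public static void main(String[] args) {
        List<String[]> sentence = Arrays.asList(
                new String[] {"John", "PERSON"},
                new String[] {"Smith", "PERSON"},
                new String[] {"visited", "O"},
                new String[] {"New", "LOCATION"},
                new String[] {"York", "LOCATION"},
                new String[] {"and", "O"},
                new String[] {"John", "PERSON"},
                new String[] {"Smith", "PERSON"});

        // prints {LOCATION={New York=1}, PERSON={John Smith=2}}
        System.out.println(count(sentence));
    }
}
```

    Note that with plain labels like these, two different entities of the same type that sit directly next to each other are merged into one, since nothing marks where the first ends and the second begins.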
    
