Formatting NER output from Stanford CoreNLP

耶瑟儿~ 2021-01-15 16:22

I am working with Stanford CoreNLP and using it for NER. But when I extract organization names, I see that each word is tagged with the annotation separately. So, if the entity is "NE

4 Answers
  •  迷失自我
    2021-01-15 17:08

    If you just want the complete strings of each named entity found by Stanford NER, try this:

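    import java.util.List;
    import edu.stanford.nlp.ie.AbstractSequenceClassifier;
    import edu.stanford.nlp.ie.crf.CRFClassifier;
    import edu.stanford.nlp.ling.CoreLabel;
    import edu.stanford.nlp.util.Triple;

    String text = ""; // put the text to analyze here
    AbstractSequenceClassifier<CoreLabel> ner = CRFClassifier.getDefaultClassifier();
    // Each triple holds (entity class, start offset, end offset).
    List<Triple<String, Integer, Integer>> entities = ner.classifyToCharacterOffsets(text);
    for (Triple<String, Integer, Integer> entity : entities)
        System.out.println(text.substring(entity.second, entity.third));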

    In case you're wondering, the entity class is indicated by entity.first.
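    As a rough sketch of how entity.first might be used, the example below groups entity strings by class. So that it runs without CoreNLP on the classpath, it uses a hypothetical stand-in class, Span, that mirrors the first/second/third fields of Triple. The offsets in main are written by hand for illustration and are not real classifier output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EntitiesByClass {
    // Hypothetical stand-in for Triple<String, Integer, Integer>:
    // first = entity class, second = start offset, third = end offset (exclusive).
    public static final class Span {
        final String first;
        final int second;
        final int third;

        public Span(String first, int second, int third) {
            this.first = first;
            this.second = second;
            this.third = third;
        }
    }

    /** Maps each entity class to the entity strings found for it, in input order. */
    public static Map<String, List<String>> group(String text, List<Span> spans) {
        Map<String, List<String>> byClass = new LinkedHashMap<>();
        for (Span s : spans) {
            byClass.computeIfAbsent(s.first, k -> new ArrayList<>())
                   .add(text.substring(s.second, s.third));
        }
        return byClass;
    }

    public static void main(String[] args) {
        String text = "Bill Smith went to Paris.";
        // Hand-written offsets, assumed for this example only.
        List<Span> spans = Arrays.asList(
                new Span("PERSON", 0, 10),
                new Span("LOCATION", 19, 24));
        System.out.println(group(text, spans)); // {PERSON=[Bill Smith], LOCATION=[Paris]}
    }
}
```

    With the real library, the same loop should work on the triples returned by classifyToCharacterOffsets in place of Span.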

    Alternatively, you can use ner.classifyWithInlineXML(text) to get output that looks like <PERSON>Bill Smith</PERSON> went to <LOCATION>Paris</LOCATION>.
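
    If you go the inline-XML route, the tagged spans can be pulled back out with a regular expression. This is a minimal sketch using only the JDK. The class name InlineXmlEntities and the sample string are assumptions made for illustration, and the pattern assumes entity tags are upper-case names that are not nested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineXmlEntities {
    // Matches <TAG>text</TAG>, where the closing tag repeats the opening one.
    private static final Pattern ENTITY = Pattern.compile("<([A-Z_]+)>(.*?)</\\1>");

    /** Returns each tagged span as "TAG: text", in order of appearance. */
    public static List<String> extract(String tagged) {
        List<String> found = new ArrayList<>();
        Matcher m = ENTITY.matcher(tagged);
        while (m.find()) {
            found.add(m.group(1) + ": " + m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample string assumed to resemble classifyWithInlineXML output.
        String tagged = "<PERSON>Bill Smith</PERSON> went to <LOCATION>Paris</LOCATION>.";
        for (String entity : extract(tagged)) {
            System.out.println(entity);
        }
    }
}
```

    For most uses the character-offset approach above is probably simpler, since it avoids re-parsing the classifier's output.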
