Formatting NER output from Stanford CoreNLP

耶瑟儿～ 2021-01-15 16:22

I am working with Stanford CoreNLP and using it for NER. But when I extract organization names, I see that each word is tagged with the annotation. So, if the entity is "NE

4 answers

一向 (original poster)

2021-01-15 17:22

From Stanford CoreNLP 3.6 onwards, you can add the entitymentions annotator to the pipeline and get a list of all entities, with each multi-word entity returned as a single mention instead of one tag per word. Here is an example; it works.

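import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

// entitymentions must come after ner (and regexner, if used) in the annotator list
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, regexner, entitymentions");
// jg-regexner.txt is a custom RegexNER mapping file; drop regexner and this line if you have none
props.setProperty("regexner.mapping", "jg-regexner.txt");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

String inputText = "I have done Bachelor of Arts and Bachelor of Laws so that I can work at British Broadcasting Corporation";
Annotation annotation = new Annotation(inputText);
pipeline.annotate(annotation);

// Each CoreMap here is one complete entity mention, possibly spanning several tokens
List<CoreMap> multiWordsExp = annotation.get(CoreAnnotations.MentionsAnnotation.class);
for (CoreMap multiWord : multiWordsExp) {
    String custNERClass = multiWord.get(CoreAnnotations.NamedEntityTagAnnotation.class);
    System.out.println(multiWord + " : " + custNERClass);
}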
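
If you are stuck on a version before 3.6, or just want to see what the grouping amounts to, here is a rough plain-Java sketch of merging adjacent tokens that carry the same tag. It does not call CoreNLP at all: the class name NerChunker, the method mergeEntities, and the token/tag lists are made up for illustration, and I am assuming "O" is the tag for non-entity tokens. This is not how entitymentions is implemented internally, only the same idea in its simplest form.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Stanford CoreNLP.
public class NerChunker {

    /**
     * Merges runs of adjacent tokens that share the same non-"O" tag
     * into strings of the form "entity text : TAG".
     */
    public static List<String> mergeEntities(List<String> tokens, List<String> tags) {
        if (tokens.size() != tags.size()) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> entities = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentTag = "O"; // assumption: "O" marks tokens outside any entity

        for (int i = 0; i < tokens.size(); i++) {
            String tag = tags.get(i);
            if (!tag.equals(currentTag)) {
                // The tag changed, so the previous run (if it was an entity) is complete
                if (!currentTag.equals("O")) {
                    entities.add(current + " : " + currentTag);
                }
                current.setLength(0);
                currentTag = tag;
            }
            if (!tag.equals("O")) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(tokens.get(i));
            }
        }
        // Flush an entity that runs to the end of the input
        if (!currentTag.equals("O")) {
            entities.add(current + " : " + currentTag);
        }
        return entities;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "I", "work", "at", "British", "Broadcasting", "Corporation", "in", "London");
        List<String> tags = Arrays.asList(
                "O", "O", "O", "ORGANIZATION", "ORGANIZATION", "ORGANIZATION", "O", "LOCATION");
        for (String entity : mergeEntities(tokens, tags)) {
            System.out.println(entity);
        }
    }
}
```

One limitation to keep in mind: two different entities of the same type that sit right next to each other, with no "O" token between them, get merged into one string by this approach.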
