I am working with Stanford CoreNLP and using it for NER. But when I extract organization names, I see that each word is tagged with the annotation. So, if the entity is "NE
If you just want the complete strings of each named entity found by Stanford NER, try this:
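import java.util.List;
import edu.stanford.nlp.ie.AbstractSequenceClassifier;
import edu.stanford.nlp.ie.crf.CRFClassifier;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.util.Triple;
String text = "";  // your input text goes here
AbstractSequenceClassifier<CoreLabel> ner = CRFClassifier.getDefaultClassifier();
List<Triple<String, Integer, Integer>> entities = ner.classifyToCharacterOffsets(text);  // (class, start, end)
for (Triple<String, Integer, Integer> entity : entities)
    System.out.println(text.substring(entity.second, entity.third));  // the full entity string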
In case you're wondering, the entity class is indicated by entity.first.
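If you already have per-token tags and would rather merge them yourself, the idea is simply to join consecutive tokens that share the same non-"O" tag. Here is a rough sketch in plain Java with no CoreNLP dependency. The class name EntityMerger, the merge method, and the sample tokens and tags are all made up for illustration and are not part of the Stanford API.

```java
import java.util.ArrayList;
import java.util.List;

class EntityMerger {
    /**
     * Joins runs of consecutive tokens that share the same non-"O" tag.
     * Returns one {tag, phrase} pair per run. Hypothetical helper, not CoreNLP.
     */
    public static List<String[]> merge(String[] tokens, String[] tags) {
        List<String[]> entities = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentTag = "O";
        for (int i = 0; i < tokens.length; i++) {
            if (!tags[i].equals(currentTag)) {
                // The tag changed: close the run we were building, if any.
                if (!currentTag.equals("O")) {
                    entities.add(new String[] {currentTag, current.toString()});
                }
                current.setLength(0);
                currentTag = tags[i];
            }
            if (!tags[i].equals("O")) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(tokens[i]);
            }
        }
        // Close a run that extends to the last token.
        if (!currentTag.equals("O")) {
            entities.add(new String[] {currentTag, current.toString()});
        }
        return entities;
    }

    public static void main(String[] args) {
        // Sample data invented for this sketch.
        String[] tokens = {"He", "joined", "Stanford", "University", "in", "California"};
        String[] tags = {"O", "O", "ORGANIZATION", "ORGANIZATION", "O", "LOCATION"};
        for (String[] e : merge(tokens, tags)) {
            System.out.println(e[0] + ": " + e[1]);
        }
    }
}
```

One caveat with this approach: two distinct entities of the same class that sit right next to each other get merged into one, because plain per-token tags carry no boundary information.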
Alternatively, you can use ner.classifyWithInlineXML(text) to get output that looks like