Java OpenNLP extract all nouns from a sentence

Submitted by 冷暖自知 on 2019-12-13 07:22:33

Question


I am using Java 8 and OpenNLP. I am trying to extract all the nouns from a sentence.

I have tried this example, but it extracts whole noun phrases ("NP"). Does anyone know how I can extract just the individual noun words?

Thanks


Answer 1:


What have you tried so far? I haven't looked at the example you link to in much detail, but I'm pretty sure you could get where you want by modifying it. In any case, it's not very difficult: tokenize the sentence, run the POS tagger over the tokens, and keep the tokens whose tag marks a noun:

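import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.tokenize.SimpleTokenizer;

// try-with-resources closes the model stream once the model is loaded.
try (InputStream modelIn = new FileInputStream("<location to your tagger model here>")) {
    POSModel posModel = new POSModel(modelIn);
    POSTaggerME tagger = new POSTaggerME(posModel);
    // The SimpleTokenizer constructor is deprecated; use the shared instance.
    SimpleTokenizer tokenizer = SimpleTokenizer.INSTANCE;
    String[] tokens = tokenizer.tokenize("This is a sample sentence.");
    // tagged[i] is the part-of-speech tag for tokens[i].
    String[] tagged = tagger.tag(tokens);
    for (int i = 0; i < tagged.length; i++) {
        // Penn Treebank noun tags are NN, NNS, NNP and NNPS, so match the
        // "NN" prefix; testing for "NN" alone would miss plurals and proper nouns.
        if (tagged[i].startsWith("NN")) {
            System.out.println(tokens[i]);
        }
    }
} catch (IOException e) {
    // BadRequestException is not part of OpenNLP; it comes from the surrounding
    // web application. Substitute whatever exception suits your code.
    throw new BadRequestException(e.getMessage());
}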

You can download the tagger models here: http://opennlp.sourceforge.net/models-1.5/
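If you want the nouns as a list rather than printed, the filtering step can be pulled out into a small helper. This is only a sketch in plain Java: the class name NounFilter and the method extractNouns are made up for illustration, and the tags in main are written by hand in Penn Treebank style, not produced by a tagger. It assumes your model emits Penn Treebank tags, where every noun tag starts with "NN".

```java
import java.util.ArrayList;
import java.util.List;

public class NounFilter {

    // Hypothetical helper: keeps every token whose tag starts with "NN"
    // (NN, NNS, NNP, NNPS in the Penn Treebank tag set).
    public static List<String> extractNouns(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> nouns = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tags[i].startsWith("NN")) {
                nouns.add(tokens[i]);
            }
        }
        return nouns;
    }

    public static void main(String[] args) {
        // Hand-written tags for illustration; in real code pass tagger.tag(tokens).
        String[] tokens = {"The", "dogs", "chased", "Alice", "across", "the", "park", "."};
        String[] tags = {"DT", "NNS", "VBD", "NNP", "IN", "DT", "NN", "."};
        System.out.println(extractNouns(tokens, tags)); // prints [dogs, Alice, park]
    }
}
```

With the tagger code above you would call extractNouns(tokens, tagged) after tagging. If your model uses a different tag set, the prefix check needs to change accordingly.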

I should add that the SimpleTokenizer constructor is deprecated; SimpleTokenizer.INSTANCE is the replacement. You may want to look into a more sophisticated tokenizer, but in my experience the fancier ones from OpenNLP are also a lot slower (and in general unacceptably slow for tokenisation alone).
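To give a feel for why a simple tokenizer is cheap, here is a rough plain-Java sketch of tokenizing by character class, which is the general idea SimpleTokenizer is documented to use. It is not OpenNLP's code, the class name CharClassTokenizer is made up for illustration, and it does not try to reproduce OpenNLP's exact handling of punctuation runs. It simply starts a new token whenever the character class changes, in a single pass over the input.

```java
import java.util.ArrayList;
import java.util.List;

public class CharClassTokenizer {

    private enum CharType { WHITESPACE, ALPHABETIC, NUMERIC, OTHER }

    private static CharType classify(char c) {
        if (Character.isWhitespace(c)) return CharType.WHITESPACE;
        if (Character.isLetter(c)) return CharType.ALPHABETIC;
        if (Character.isDigit(c)) return CharType.NUMERIC;
        return CharType.OTHER;
    }

    // Hypothetical sketch: splits the text wherever the character class changes
    // and drops the whitespace runs.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1;
        CharType current = CharType.WHITESPACE;
        for (int i = 0; i < text.length(); i++) {
            CharType type = classify(text.charAt(i));
            if (type != current) {
                // Close the previous run unless it was whitespace.
                if (current != CharType.WHITESPACE) {
                    tokens.add(text.substring(start, i));
                }
                start = i;
                current = type;
            }
        }
        // Flush the final run, if any.
        if (current != CharType.WHITESPACE) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("This is a sample sentence."));
        // prints [This, is, a, sample, sentence, .]
    }
}
```

A model-based tokenizer has to evaluate a statistical model at candidate split points, which is where the extra cost comes from; whether that cost matters depends on your text and throughput needs.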



Source: https://stackoverflow.com/questions/40603865/java-opennlp-extract-all-nouns-from-a-sentence
