Getting word stems with JWI and Wordnet

大城市里の小女人 posted on 2019-12-04 11:32:11

You don't need an additional library, but you do need a dictionary. You can download one from Princeton: https://wordnet.princeton.edu/wordnet/download/current-version/

I recommend downloading only the dictionary, from the section "WordNet 3.1 DATABASE FILES ONLY". Extract the archive. Supposing that PATH/dict is the location of the extracted files, you can use this code:

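import java.io.File;
import java.util.List;
import edu.mit.jwi.Dictionary;
import edu.mit.jwi.item.POS;
import edu.mit.jwi.morph.WordnetStemmer;

// Point JWI at the extracted WordNet "dict" directory; open() can throw IOException.
Dictionary dict = new Dictionary(new File("PATH/dict"));
dict.open();
WordnetStemmer stemmer = new WordnetStemmer(dict);

// Look up the stems of "feet" as a noun and print each one.
List<String> test = stemmer.findStems("feet", POS.NOUN);
for (String stem : test) {
    System.out.println(stem);
}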

The output for this example is "foot".

This is meant as a comment on sakthi's answer: you actually have to specify which POS (part of speech: noun, adjective, verb, etc.) you're looking for when calling the findStems method (JWI v2.2.3): http://projects.csail.mit.edu/jwi/api/edu/mit/jwi/morph/IStemmer.html
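
To make it clearer why the POS argument changes the result, here is a small self-contained sketch. It is not JWI code and does not use real WordNet data: the ToyStemmer class, its Pos enum, and the tiny word lists are all made up for illustration. It only mimics the general idea behind WordNet-style stemming, where irregular forms come from per-POS exception lists and regular forms come from suffix rules whose results are checked against the known lemmas for that POS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy illustration only: hypothetical class and data, not JWI's implementation.
class ToyStemmer {
    enum Pos { NOUN, VERB }

    // Irregular forms per part of speech (made-up sample entries).
    private static final Map<Pos, Map<String, String>> EXCEPTIONS = new EnumMap<>(Pos.class);
    // Known lemmas per part of speech (made-up sample entries).
    private static final Map<Pos, Set<String>> LEMMAS = new EnumMap<>(Pos.class);
    // Suffixes that may be stripped for each part of speech (simplified).
    private static final Map<Pos, List<String>> SUFFIXES = new EnumMap<>(Pos.class);

    static {
        EXCEPTIONS.put(Pos.NOUN, Collections.singletonMap("feet", "foot"));
        EXCEPTIONS.put(Pos.VERB, Collections.singletonMap("saw", "see"));
        LEMMAS.put(Pos.NOUN, new HashSet<>(Arrays.asList("foot", "saw", "reading")));
        LEMMAS.put(Pos.VERB, new HashSet<>(Arrays.asList("see", "saw", "read")));
        SUFFIXES.put(Pos.NOUN, Arrays.asList("s"));
        SUFFIXES.put(Pos.VERB, Arrays.asList("s", "ing", "ed"));
    }

    // Returns candidate stems of `word` for the given part of speech, in discovery order.
    static List<String> findStems(String word, Pos pos) {
        Set<String> stems = new LinkedHashSet<>();

        // 1. Irregular forms listed for this part of speech.
        String irregular = EXCEPTIONS.get(pos).get(word);
        if (irregular != null) {
            stems.add(irregular);
        }

        // 2. The word may already be a lemma for this part of speech.
        if (LEMMAS.get(pos).contains(word)) {
            stems.add(word);
        }

        // 3. Strip each suffix and keep the result only if it is a known lemma.
        for (String suffix : SUFFIXES.get(pos)) {
            if (word.endsWith(suffix)) {
                String candidate = word.substring(0, word.length() - suffix.length());
                if (LEMMAS.get(pos).contains(candidate)) {
                    stems.add(candidate);
                }
            }
        }
        return new ArrayList<>(stems);
    }
}
```

With this toy data, ToyStemmer.findStems("saw", ToyStemmer.Pos.VERB) returns [see, saw], while the same word with ToyStemmer.Pos.NOUN returns only [saw]. The real stemmer works against the full WordNet files, so its results will differ from this sketch.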

The jar files used are edu.mit.jwi_2.1.4.jar and edu.sussex.nlp.jws.beta.11.jar:

// Open WordNet 2.1 through JWS and reuse its dictionary for the stemmer.
JWS ws = new JWS("C:/Program Files/WordNet", "2.1");
WordnetStemmer stem = new WordnetStemmer(ws.getDictionary());
System.out.println("test " + stem.findStems("reading"));