Stop words and stemmer in Java

广开言路 2021-02-06 16:02

I'm thinking of adding stop-word removal to my similarity program, followed by a stemmer (Porter 1 or 2, depending on which is easiest to implement).

I was wondering that si

3 Answers
  •  感情败类
    2021-02-06 16:20

    Yes, you can wrap any stemmer so that you can write something like

    String stemmedString = stemmer.stemAndRemoveStopwords(inputString, stopWordList);
    

    Internally, your stemAndRemoveStopwords would

    • place all stop words in a hash-based Set (or Map) for fast lookup
    • initialize an empty StringBuilder to hold the output string
    • iterate over all words in the input string, and for each word
      • look it up in that stop-word collection; if found, continue to the top of the loop
      • otherwise, stem it using your preferred stemmer, and append it to the output string
    • return the output string
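
    The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: the class name StopwordStemmer is made up, the stemmer is passed in as a plain UnaryOperator<String> so that a Porter implementation (or any other) can be plugged in, words are lower-cased before lookup, and tokens are split on whitespace only, so punctuation is not handled.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical wrapper: the class name and the UnaryOperator-based stemmer
// are assumptions for illustration, not part of any stemming library.
final class StopwordStemmer {
    // Plug in Porter or any other stemmer here.
    private final UnaryOperator<String> stemmer;

    StopwordStemmer(UnaryOperator<String> stemmer) {
        this.stemmer = stemmer;
    }

    String stemAndRemoveStopwords(String input, Collection<String> stopWordList) {
        // Copy the stop words into a HashSet for constant-time lookups.
        Set<String> stopWords = new HashSet<>();
        for (String w : stopWordList) {
            stopWords.add(w.toLowerCase());
        }

        StringBuilder out = new StringBuilder();
        // Split on runs of whitespace; real text would also need punctuation handling.
        for (String word : input.trim().split("\\s+")) {
            String lower = word.toLowerCase();
            if (lower.isEmpty() || stopWords.contains(lower)) {
                continue; // skip stop words (and the empty token produced by blank input)
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            // Stem the surviving word and append it to the output.
            out.append(stemmer.apply(lower));
        }
        return out.toString();
    }
}
```

    To try it without a real stemmer, you could pass a toy lambda such as w -> w.endsWith("ing") ? w.substring(0, w.length() - 3) : w and later swap in your actual Porter implementation. If you process many strings with the same stop-word list, it would probably be better to build the Set once (for example in the constructor) rather than on every call.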
