Simple in-memory full-text search solution

一向 2021-01-18 03:13

I have a small website running on Java with about a dozen Markdown files. I want to provide full-text search so that users can quickly find those files. Since i

3 answers
  • 2021-01-18 03:24

    As a side project, I have implemented a simple in-memory text search solution for Java.

    https://github.com/bradforj287/SimpleTextSearch

    Key Features:

    • Inverted Index
    • Cosine similarity algorithm with TF-IDF ranking
    • Multithreaded index creation and searching
    • Word Stemming (snowball stemmer)
    • Strips HTML tags automatically
    • Stop words
    • String tokenizer (Stanford NLP)

    Might want to take a look.
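To make the first two features in that list concrete, here is a minimal sketch of an inverted index with TF-IDF scoring, using only the Java standard library. It is an illustration, not code from the linked project: the class name `TinySearchIndex` and its methods are hypothetical, the score is a plain sum of tf × idf rather than cosine similarity, and stemming, stop words, and HTML stripping are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration class; not part of SimpleTextSearch or Lucene. */
final class TinySearchIndex {
    // Inverted index: term -> (document id -> how often the term occurs there).
    private final Map<String, Map<String, Integer>> postings = new HashMap<>();
    private int docCount = 0;

    /** Lowercases the text and splits it on anything that is not a letter or digit. */
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Indexes one document. Assumes each docId is added only once. */
    void add(String docId, String text) {
        docCount++;
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    /** Returns ids of documents matching any query term, best score first. */
    List<String> search(String query) {
        Map<String, Double> scores = new HashMap<>();
        for (String term : tokenize(query)) {
            Map<String, Integer> docs = postings.get(term);
            if (docs == null) {
                continue; // term does not occur in any document
            }
            // Smoothed inverse document frequency: rarer terms weigh more.
            double idf = Math.log(1.0 + (double) docCount / docs.size());
            for (Map.Entry<String, Integer> e : docs.entrySet()) {
                scores.merge(e.getKey(), e.getValue() * idf, Double::sum);
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        // Sort by descending score; break ties by id so the order is stable.
        ranked.sort((a, b) -> {
            int c = Double.compare(scores.get(b), scores.get(a));
            return c != 0 ? c : a.compareTo(b);
        });
        return ranked;
    }

    public static void main(String[] args) {
        TinySearchIndex index = new TinySearchIndex();
        index.add("install.md", "How to install the server. Install steps for Linux.");
        index.add("faq.md", "Frequently asked questions about the server.");
        System.out.println(index.search("install server"));
    }
}
```

For a dozen Markdown files, something this small may already be enough; the file names in `main` are made up for the example. A real library adds the stemming, stop-word handling, and better ranking that this sketch skips.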

  • 2021-01-18 03:25

    Drop in Apache Lucene, the more-or-less gold standard in full-text search. It is happy to operate in memory.

  • 2021-01-18 03:36

    Use one of the in-memory databases, either H2 or HSQLDB. Then, for the full-text search part, use Hibernate Search. It works with either of the two databases and keeps you from having to deal with Lucene directly: you annotate your entities and the indexing happens automatically. If you want to do things like boost fields, you can do that with a simple annotation.
