xapian

Document search on partial words

你说的曾经没有我的故事 posted on 2019-11-30 13:03:52
I am looking for a document search engine (like Xapian, Whoosh, Lucene, Solr, Sphinx or others) that is capable of searching on partial terms. For example, when searching for the term "brit", the search engine should return documents containing either "britney" or "britain", or in general any document containing a word matching the pattern *brit*. Tangentially, I noticed that most engines use TF-IDF (term frequency-inverse document frequency) or its derivatives, which are based on full terms and not partial terms. Are there any other techniques that have been successfully implemented besides TF-IDF for document
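To make the partial-term requirement concrete, here is a minimal sketch of one common approach: index the character n-grams (trigrams here) of every term, use them to narrow the candidate terms, then verify each candidate with a plain substring check. This is not how any of the engines named above implement it; the `NgramIndex` class, its method names, and the whitespace-only tokenization are assumptions made purely for illustration.

```python
from collections import defaultdict


def ngrams(term, n=3):
    """Return the set of character n-grams of a lowercased term."""
    term = term.lower()
    return {term[i:i + n] for i in range(len(term) - n + 1)}


class NgramIndex:
    """Hypothetical toy index: n-gram -> terms, and term -> document ids."""

    def __init__(self, n=3):
        self.n = n
        self.gram_to_terms = defaultdict(set)
        self.term_to_docs = defaultdict(set)

    def add(self, doc_id, text):
        # Assumption: terms are separated by whitespace only (no stemming,
        # no punctuation handling).
        for term in text.lower().split():
            self.term_to_docs[term].add(doc_id)
            for gram in ngrams(term, self.n):
                self.gram_to_terms[gram].add(term)

    def search(self, fragment):
        """Return ids of documents containing a term that includes `fragment`."""
        fragment = fragment.lower()
        grams = ngrams(fragment, self.n)
        if grams:
            # A matching term must contain every n-gram of the fragment.
            candidates = set.intersection(
                *(self.gram_to_terms.get(g, set()) for g in grams)
            )
        else:
            # Fragment shorter than n: fall back to scanning the vocabulary.
            candidates = set(self.term_to_docs)
        docs = set()
        for term in candidates:
            # Sharing n-grams is necessary but not sufficient, so verify.
            if fragment in term:
                docs |= self.term_to_docs[term]
        return docs


index = NgramIndex()
index.add(1, "britney spears")
index.add(2, "great britain")
index.add(3, "bright lights")
print(index.search("brit"))
```

With these three sample documents, the search for "brit" matches documents 1 and 2 but not 3, because "bright" shares the trigram "bri" but not "rit". Ranking is left out entirely; the sketch only answers which documents match.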

ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage? [closed]

狂风中的少年 posted on 2019-11-26 10:56:28
I'm currently looking at other search methods rather than having a huge SQL query. I saw elasticsearch recently and played with whoosh (a Python implementation of a search engine). Can you give reasons for your choice(s)? kimchy: As the creator of ElasticSearch, maybe I can give you some reasoning on why I went ahead and created it in the first place :). Using pure Lucene is challenging. There are many things that you need to take care of if you want it to really perform well, and also, it's a library, so there is no distributed support; it's just an embedded Java library that you need to maintain. In