How to search for text fragments in a database

旧时难觅i 2021-02-06 10:51

Are there any open source or commercial tools available that allow for text fragment indexing of database contents and can be queried from Java?

Background of the questi

10 answers
  •  一生所求
    2021-02-06 11:31

    Shingle search could do the trick.

    http://en.wikipedia.org/wiki/W-shingling

    For example, with 3-character shingles you split "Roisonic" into "roi", "son", "ic " and store all three values, associating them with the original entry. To search for "oison", you look up every overlapping 3-character substring of the query: "ois", "iso", "son". First you match candidate entries by shingle (here the entry is found through "son"), and then you refine the result by running exact string matching against those candidates.

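    Note that 3-character shingles require the query fragment to be at least 5 characters long, 4-character shingles require a 7-character query, and so on (2k - 1 characters for k-character shingles), because a shorter fragment may not contain any complete stored shingle.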
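
    The two-step lookup described in this answer could be sketched in Java roughly as follows. This is only an in-memory illustration: the class name ShingleIndex and its methods add and search are made up for the example, and in practice the shingle-to-entry mapping would more likely live in an indexed database table than in a HashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory shingle index; the names here are illustrative only. */
class ShingleIndex {
    private final int k;                                                  // shingle length
    private final Map<String, Set<Integer>> postings = new HashMap<>();   // shingle -> entry ids
    private final List<String> entries = new ArrayList<>();               // entry id -> original text

    ShingleIndex(int k) {
        if (k < 1) throw new IllegalArgumentException("k must be positive");
        this.k = k;
    }

    /** Stores the entry under its non-overlapping k-character chunks and returns its id. */
    int add(String text) {
        int id = entries.size();
        entries.add(text);
        String s = text.toLowerCase(Locale.ROOT);
        for (int i = 0; i < s.length(); i += k) {
            StringBuilder chunk = new StringBuilder(s.substring(i, Math.min(i + k, s.length())));
            while (chunk.length() < k) chunk.append(' ');   // pad the last chunk, e.g. "ic "
            postings.computeIfAbsent(chunk.toString(), x -> new LinkedHashSet<>()).add(id);
        }
        return id;
    }

    /** Returns the entries that contain the fragment, ignoring case. */
    List<String> search(String fragment) {
        String q = fragment.toLowerCase(Locale.ROOT);
        if (q.length() < 2 * k - 1) {
            // A shorter fragment is not guaranteed to contain a complete stored chunk.
            throw new IllegalArgumentException(
                "fragment needs at least " + (2 * k - 1) + " characters");
        }
        // Step 1: collect candidates sharing any overlapping k-character substring of the query.
        Set<Integer> candidates = new LinkedHashSet<>();
        for (int i = 0; i + k <= q.length(); i++) {
            Set<Integer> ids = postings.get(q.substring(i, i + k));
            if (ids != null) candidates.addAll(ids);
        }
        // Step 2: refine the candidates with exact substring matching.
        List<String> hits = new ArrayList<>();
        for (int id : candidates) {
            String entry = entries.get(id);
            if (entry.toLowerCase(Locale.ROOT).contains(q)) hits.add(entry);
        }
        return hits;
    }
}
```

    With k = 3, adding "Roisonic" and "Sonar" and then searching for "oison" would pick up both entries as candidates through the shared shingle "son", and the exact-match step would keep only "Roisonic". Rejecting fragments shorter than 2k - 1 is one possible policy; falling back to a full scan for short queries would be another.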
