Hibernate search to find partial matches of a phrase

Posted by 廉价感情 on 2019-12-12 01:57:07

Question


In my project we are using Hibernate Search 4.5 with lucene-analyzers and Solr. I provide a text field to my clients. When they type in a phrase, I would like to find all User entities whose names include the given phrase.

For example, suppose the database contains entries with the following names:

[ Alan Smith, John Cane, Juno Taylor, Tom Caner Junior ]

"jun" should return Juno Taylor and Tom Caner Junior

"an" should return Alan Smith, John Cane and Tom Caner Junior

    @AnalyzerDef(name = "customanalyzer",
        tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = SnowballPorterFilterFactory.class,
                params = { @Parameter(name = "language", value = "English") })
        })
    @Analyzer(definition = "customanalyzer")
    public class Student implements Serializable {

        @Column(name = "Fname")
        @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
        private String fname;

        @Column(name = "Lname")
        @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
        private String lname;

    }

I have tried a wildcard search, but the documentation says:

Wildcard queries do not apply the analyzer on the matching terms. Otherwise the risk of * or ? being mangled is too high.

Query luceneQuery = mythQB
    .keyword()
    .wildcard()
    .onFields("fname")
    .matching("ju*")
    .createQuery();

How can I achieve this?


Answer 1:


First, you didn't assign the analyzer to your field, so it isn't used currently. You should use @Field.analyzer.

Second, to answer your question, this kind of text is best analyzed with an EdgeNGramFilter. You should add this filter to your analyzer definition.
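
To make the effect of the filter concrete, here is a minimal plain-Java sketch of the tokens an edge n-gram step would put in the index. It is not the Lucene implementation: the class and method names are hypothetical, stemming is left out, and a minimum gram size of 1 is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; this does not call Lucene or Hibernate Search.
final class EdgeNGramSketch {

    // Prefixes of a single token, from minGram up to maxGram characters long.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Whitespace split + lowercase + edge n-grams (the stemming step is omitted).
    static List<String> indexTokens(String text, int minGram, int maxGram) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            tokens.addAll(edgeNGrams(word.toLowerCase(Locale.ROOT), minGram, maxGram));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // [j, ju, jun, juno, t, ta, tay, tayl, taylo, taylor]
        System.out.println(indexTokens("Juno Taylor", 1, 15));
        // false: "an" is not a prefix of "alan" or "smith"
        System.out.println(indexTokens("Alan Smith", 1, 15).contains("an"));
    }
}
```

Because only prefixes of each word are indexed, "jun" finds Juno Taylor and Tom Caner Junior, but "an" would not find Alan Smith. If matches in the middle of a word are required, a full n-gram filter (NGramFilterFactory) is the usual alternative, at the cost of a larger index.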

EDIT: Also, to prevent a query such as "sathya" from matching "sanchana", for instance, you should use a different analyzer when querying.
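
The reason can be sketched in plain Java. The code below is a simplified model, not Hibernate Search itself: the names are hypothetical, stemming and the maximum gram size are ignored, and it assumes that a keyword query matches as soon as any one of its terms is found in the index.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of index-time vs. query-time analysis.
final class QueryAnalyzerSketch {

    // Index-time tokens: every prefix of every lowercased word.
    static Set<String> indexTokens(String text) {
        Set<String> tokens = new HashSet<>();
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            for (int len = 1; len <= word.length(); len++) {
                tokens.add(word.substring(0, len));
            }
        }
        return tokens;
    }

    // Query-time tokens without n-grams: just the lowercased words.
    static Set<String> plainQueryTokens(String text) {
        Set<String> tokens = new HashSet<>();
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty()) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    // Assumed matching rule: any shared token counts as a hit.
    static boolean matches(Set<String> queryTokens, Set<String> indexed) {
        for (String token : queryTokens) {
            if (indexed.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> indexed = indexTokens("sanchana");
        // Same analyzer on both sides: the prefixes "s" and "sa" are shared.
        System.out.println(matches(indexTokens("sathya"), indexed));      // true
        // Query analyzer without n-grams: only the whole word is looked up.
        System.out.println(matches(plainQueryTokens("sathya"), indexed)); // false
        System.out.println(matches(plainQueryTokens("san"), indexed));    // true
    }
}
```

With n-grams applied on both sides, any two names sharing a first letter would match; dropping the filter at query time keeps the typed text as a single term that must equal one of the indexed prefixes.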

Below is a full example.

// Hibernate Search 4.5 predates repeatable annotations, so both definitions are wrapped in @AnalyzerDefs.
@AnalyzerDefs({
    // Index-time analyzer: lowercases, stems, then emits the prefixes (edge n-grams) of each token.
    @AnalyzerDef(name = "customanalyzer", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") }),
            @TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = { @Parameter(name = "maxGramSize", value = "15") })
    }),
    // Query-time analyzer: the same chain without the edge n-gram filter.
    @AnalyzerDef(name = "customanalyzer_query", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
    })
})
public class Student implements Serializable {

    @Column(name = "Fname")
    @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
    private String fname;

    @Column(name = "Lname")
    @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
    private String lname;

}

Then explicitly specify that you want to use the "query" analyzer when building your query:

QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Student.class)
    // Here come the assignments of "query" analyzers
    .overridesForField( "fname", "customanalyzer_query" )
    .overridesForField( "lname", "customanalyzer_query" )
    .get();
// Then it's business as usual
Query luceneQuery = queryBuilder.keyword().onFields("fname", "lname").matching("sathya").createQuery();
FullTextQuery query = fullTextEntityManager.createFullTextQuery(luceneQuery, Student.class);

See also: https://stackoverflow.com/a/43047342/6692043


By the way, if your data includes only first and last names, you shouldn't use stemming (SnowballPorterFilterFactory): it will only make the search less accurate for no good reason.




Answer 2:


Why not use a standard TypedQuery?

(where String term is your search-term)

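TypedQuery<Student> q = em.createQuery(
        "SELECT s " +
        "FROM Student s " +
        "WHERE s.fname LIKE :search " +
        "OR s.lname LIKE :search",
        Student.class); // the result type is required to get a TypedQuery
// Wrap the term in % so it matches anywhere in the name
q.setParameter("search", "%" + term + "%");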

Didn't test this one, but something like this should do the trick.
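
One caveat with this approach: if the search term itself contains % or _, LIKE treats those characters as wildcards. Below is a minimal sketch of a helper that escapes them before the term is bound. The helper name and the choice of '!' as escape character are assumptions, and the JPQL would also need ESCAPE '!' appended to each LIKE condition for the escaping to take effect.

```java
// Hypothetical helper; not part of JPA or Hibernate.
final class LikePatternSketch {

    // Escapes the LIKE wildcards (% and _) and the escape character itself,
    // then wraps the term in % so it matches anywhere in the column.
    static String containsPattern(String term, char escape) {
        StringBuilder sb = new StringBuilder("%");
        for (char c : term.toCharArray()) {
            if (c == '%' || c == '_' || c == escape) {
                sb.append(escape);
            }
            sb.append(c);
        }
        return sb.append('%').toString();
    }

    public static void main(String[] args) {
        System.out.println(containsPattern("jun", '!'));   // %jun%
        System.out.println(containsPattern("50%_a", '!')); // %50!%!_a%
    }
}
```

The result would be bound in place of the manual concatenation, for example q.setParameter("search", LikePatternSketch.containsPattern(term, '!')). Whether LIKE is case-sensitive depends on the database and its collation, so "jun" may or may not match "Juno" without further normalization.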



Source: https://stackoverflow.com/questions/44028095/hibernate-search-to-find-partial-matches-of-a-phrase
