Question
In my project we are using Hibernate Search 4.5 with lucene-analyzers and Solr.
I provide a text field to my clients. When they type in a phrase I would like to find all User
entities whose names include the given phrase.
For example, consider a list of entries in the database with the following titles:
[ Alan Smith, John Cane, Juno Taylor, Tom Caner Junior ]
jun should return Juno Taylor and Tom Caner Junior
an should return Alan Smith, John Cane and Tom Caner Junior
@AnalyzerDef(name = "customanalyzer", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
})
@Analyzer(definition = "customanalyzer")
public class Student implements Serializable {
@Column(name = "Fname")
@Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
private String fname;
@Column(name = "Lname")
@Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
private String lname;
}
I have tried a wildcard search, but wildcard queries do not apply the analyzer on the matching terms (otherwise the risk of * or ? being mangled is too high):
Query luceneQuery = mythQB
.keyword()
.wildcard()
.onFields("fname")
.matching("ju*")
.createQuery();
How can I achieve this?
Answer 1:
First, you didn't assign the analyzer to your field, so it isn't used currently. You should use @Field.analyzer.
Second, to answer your question, this kind of text is best analyzed with an EdgeNGramFilter. You should add this filter to your analyzer definition.
EDIT: Also, to prevent queries such as "sathya" from matching "sanchana", for instance, you should use a different analyzer when querying.
Below is a full example.
// @AnalyzerDef is not repeatable in Hibernate Search 4.5, so both definitions are wrapped in @AnalyzerDefs
@AnalyzerDefs({
@AnalyzerDef(name = "customanalyzer", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") }),
@TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = { @Parameter(name = "maxGramSize", value = "15") })
}),
@AnalyzerDef(name = "customanalyzer_query", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
})
})
public class Student implements Serializable {
@Column(name = "Fname")
@Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
private String fname;
@Column(name = "Lname")
@Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
private String lname;
}
And then specifically mention that you want to use this "query" analyzer when building your query:
QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Student.class)
// Here come the assignments of "query" analyzers
.overridesForField( "fname", "customanalyzer_query" )
.overridesForField( "lname", "customanalyzer_query" )
.get();
// Then it's business as usual
Query luceneQuery = queryBuilder.keyword().onFields("fname", "lname").matching("sathya").createQuery();
FullTextQuery query = fullTextEntityManager.createFullTextQuery(luceneQuery, Student.class);
See also: https://stackoverflow.com/a/43047342/6692043
By the way, if your data includes only first and last names, you shouldn't use stemming (SnowballPorterFilterFactory): it will only make the search less accurate for no good reason.
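To make the reasoning concrete, here is a rough plain-Java model of what an edge n-gram filter indexes. It is only a sketch, not Lucene's implementation: the class and method names are made up for illustration, stemming is ignored, and a minimum gram size of 1 is assumed. It shows why "jun" finds "Junior", why applying the n-gram analyzer to the query as well would make "sathya" match "sanchana", and that edge n-grams only cover word prefixes, so "an" would not find "Alan" (matching inside a word would presumably need a full n-gram filter such as NGramFilterFactory instead).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: a simplified model of edge n-gram indexing, not the Lucene API.
public class EdgeNGramSketch {

    // Returns the prefixes of a token with lengths from minGram to maxGram,
    // roughly what an edge n-gram filter emits for that token.
    public static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int limit = Math.min(maxGram, token.length());
        for (int len = minGram; len <= limit; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Query term is only lowercased (the "query" analyzer); the name is split on
    // whitespace, lowercased and expanded into edge n-grams (the "index" analyzer).
    public static boolean matches(String name, String queryTerm) {
        String term = queryTerm.toLowerCase();
        for (String word : name.toLowerCase().split("\\s+")) {
            if (edgeNGrams(word, 1, 15).contains(term)) {
                return true;
            }
        }
        return false;
    }

    // What happens if the n-gram analyzer is also applied to the query:
    // any shared prefix gram is enough for a match.
    public static boolean matchesWithNGrammedQuery(String name, String queryTerm) {
        for (String gram : edgeNGrams(queryTerm.toLowerCase(), 1, 15)) {
            if (matches(name, gram)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("juno", 1, 15));                      // [j, ju, jun, juno]
        System.out.println(matches("Tom Caner Junior", "jun"));             // true
        System.out.println(matches("sanchana", "sathya"));                  // false
        System.out.println(matchesWithNGrammedQuery("sanchana", "sathya")); // true
        System.out.println(matches("Alan Smith", "an"));                    // false
    }
}
```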
Answer 2:
Why not use a standard TypedQuery? (where String term is your search term)
TypedQuery<Student> q = em.createQuery(
"SELECT s " +
"FROM Student s " +
"WHERE s.fname like :search " +
"OR s.lname like :search", Student.class);
q.setParameter("search", "%" + term + "%");
Didn't test this one, but something like this should do the trick.
Source: https://stackoverflow.com/questions/44028095/hibernate-search-to-find-partial-matches-of-a-phrase