Question
I have two fields in an entity class:
- establishmentName
- contactType
contactType has values like PBX, GSM, TEL and FAX.
I want a scoring mechanism that returns the best-matching data first, and then orders the results by contact type: PBX, TEL, GSM and FAX.
Scoring:
- On establishmentName to get the most matching data first
- On contactType to get first PBX then TEL and so on
My final query is:
(+establishmentName:kamran~1^2.5 +(contactType:PBX^2.0 contactType:TEL^1.8 contactType:GSM^1.6 contactType:FAX^1.4))
But it does not return the expected result.
My question is: how can I boost a specific field based on its different values?
We can use the following query for two different fields:
Query query = qb.keyword()
    .onField(field_one).boostedTo(2.0f)
    .andField(field_two)
    .matching(searchTerm)
    .createQuery();
But I need to boost a field based on its values; in my case, that field is contactType.
My dataset:
(establishmentName : Concert Decoration, contactType : GSM),
(establishmentName : Elissa Concert, contactType : TEL),
(establishmentName : Yara Concert, contactType : FAX),
(establishmentName : E Concept, contactType : TEL),
(establishmentName : Infinity Concept, contactType : FAX),
(establishmentName : SD Concept, contactType : PBX),
(establishmentName : Broadcom Technical Concept, contactType : GSM),
(establishmentName : Concept Businessmen, contactType : PBX)
Searching for the term "concert" (fuzzy query on establishmentName) should return the list below:
(establishmentName : Elissa Concert, contactType : TEL)
[term=concert, exact matching so it will be on top by keeping the order as PBX, TEL, GSM and FAX]
(establishmentName : Concert Decoration, contactType : GSM)
[term=concert, exact matching and by keeping the order as PBX, TEL, GSM and FAX]
(establishmentName : Yara Concert, contactType : FAX)
[term=concert, exact matching and by keeping the order as PBX, TEL, GSM and FAX]
(establishmentName : Concept Businessmen, contactType : PBX)
[term=concert, partial matching and keeping the order as PBX, TEL, GSM and FAX]
(establishmentName : SD Concept, contactType : PBX)
[term=concert, partial matching and keeping the order as PBX, TEL, GSM and FAX]
(establishmentName : E Concept, contactType : TEL)
[term=concert, partial matching and keeping the order as PBX, TEL, GSM and FAX]
(establishmentName : Broadcom Technical Concept, contactType : GSM)
[term=concert, partial matching and keeping the order as PBX, TEL, GSM and FAX]
(establishmentName : Infinity Concept, contactType : FAX)
[term=concert, partial matching and keeping the order as PBX, TEL, GSM and FAX]
Answer 1:
From what I understand you basically want a two-phase sort:
- Put exact matches before other (fuzzy) matches.
- Sort by contact type.
The second sort is trivial, but the first one will require a bit of work. You can actually rely on scoring to implement it.
Essentially the idea would be to run a disjunction of multiple queries, and to assign a constant score to each query.
Instead of doing this:
Query query = qb.keyword()
    .fuzzy().withEditDistanceUpTo(1)
    .boostedTo(2.5f)
    .onField("establishmentName")
    .matching(searchTerm)
    .createQuery();
Do this:
Query query = qb.bool()
    .should(qb.keyword()
        .withConstantScore().boostedTo(100.0f) // Higher score, sort first
        .onField("establishmentName")
        .matching(searchTerm)
        .createQuery())
    .should(qb.keyword()
        .fuzzy().withEditDistanceUpTo(1)
        .withConstantScore().boostedTo(1.0f) // Lower score, sort last
        .onField("establishmentName")
        .matching(searchTerm)
        .createQuery())
    .createQuery();
The matched documents will be the same, but now the query will assign predictable scores: 1.0 for fuzzy-only matches, and 101.0 (1 from the fuzzy query and 100 from the exact query) for exact matches.
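To make the score arithmetic concrete, here is a minimal plain-Java sketch that imitates it outside of Hibernate Search. It is only an illustration under stated assumptions: the class `ConstantScoreSketch` and its methods are hypothetical, tokenization is a simple lowercase split on whitespace, and fuzziness is plain Levenshtein distance, which only approximates what Lucene does.

```java
import java.util.Locale;

// Hypothetical stand-in for the two constant-score clauses; not Hibernate Search API.
class ConstantScoreSketch {
    static final float EXACT_BOOST = 100.0f; // constant score of the exact clause
    static final float FUZZY_BOOST = 1.0f;   // constant score of the fuzzy clause

    // Plain Levenshtein distance (insertions, deletions, substitutions).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // Sum of the constant scores of every clause that matches at least one token.
    static float score(String establishmentName, String term) {
        boolean exact = false;
        boolean fuzzy = false;
        for (String token : establishmentName.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.equals(term)) {
                exact = true;
            }
            if (editDistance(token, term) <= 1) {
                fuzzy = true; // an exact match is also a fuzzy match
            }
        }
        return (exact ? EXACT_BOOST : 0f) + (fuzzy ? FUZZY_BOOST : 0f);
    }

    public static void main(String[] args) {
        System.out.println(score("Elissa Concert", "concert")); // exact and fuzzy clauses match
        System.out.println(score("SD Concept", "concert"));     // only the fuzzy clause matches
    }
}
```

Because an exact match also satisfies the fuzzy clause, its score is the sum of both constants, which is where the 101.0 comes from.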
This way, you can define the sort as follows:
fullTextQuery.setSort(qb.sort()
    .byScore()
    .andByField("contactType")
    .createSort());
This may not be the most elegant or optimized solution, but I think it will work.
To customize the relative order of contact types, I would suggest a different approach: use a custom bridge to index numbers instead of "PBX"/"TEL"/etc., assigning to each contact type the ordinal you expect. Essentially something like this:
public class Establishment {
    @Field(name = "contactType_sort", bridge = @FieldBridge(impl = ContactTypeOrdinalBridge.class))
    private ContactType contactType;
}

public class ContactTypeOrdinalBridge implements MetadataProvidingFieldBridge {
    @Override
    public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
        if ( value != null ) {
            // Index the custom ordinal both as a numeric field and as doc values (needed for sorting)
            int ordinal = getOrdinal((ContactType) value);
            luceneOptions.addNumericFieldToDocument(name, ordinal, document);
            luceneOptions.addNumericDocValuesFieldToDocument(name, ordinal, document);
        }
    }

    @Override
    public void configureFieldMetadata(String name, FieldMetadataBuilder builder) {
        builder.field(name, FieldType.INTEGER).sortable(true);
    }

    // Lower ordinals sort first: PBX, then TEL, GSM and FAX
    private int getOrdinal(ContactType value) {
        switch( value ) {
            case PBX: return 0;
            case TEL: return 1;
            case GSM: return 2;
            case FAX: return 3;
            default: return 4;
        }
    }
}
Then reindex, and sort like this:
fullTextQuery.setSort(qb.sort()
    .byScore()
    .andByField("contactType_sort")
    .createSort());
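To check the resulting order against the dataset from the question, the two-phase sort can be imitated in plain Java. This is a rough sketch, not Hibernate Search code: `TwoPhaseSortSketch`, `Hit` and `sampleHits` are hypothetical names, the `ContactType` enum is assumed, and the scores are hard-coded to the constants discussed above (101 for exact matches, 1 for fuzzy-only matches).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical simulation of "sort by score, then by contactType_sort".
class TwoPhaseSortSketch {
    enum ContactType { PBX, TEL, GSM, FAX } // assumed enum, not shown in the question

    static final class Hit {
        final String name;
        final ContactType type;
        final float score; // 101 = exact match, 1 = fuzzy-only match

        Hit(String name, ContactType type, float score) {
            this.name = name;
            this.type = type;
            this.score = score;
        }
    }

    // Same mapping as the bridge: lower ordinals sort first.
    static int sortOrdinal(ContactType type) {
        switch (type) {
            case PBX: return 0;
            case TEL: return 1;
            case GSM: return 2;
            case FAX: return 3;
            default: return 4;
        }
    }

    // Score descending first, then contact type ordinal ascending.
    static List<Hit> sort(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator
                .comparingDouble((Hit h) -> h.score).reversed()
                .thenComparingInt(h -> sortOrdinal(h.type)));
        return sorted;
    }

    // Dataset from the question, with scores for the search term "concert".
    static List<Hit> sampleHits() {
        return Arrays.asList(
                new Hit("Concert Decoration", ContactType.GSM, 101f),
                new Hit("Elissa Concert", ContactType.TEL, 101f),
                new Hit("Yara Concert", ContactType.FAX, 101f),
                new Hit("E Concept", ContactType.TEL, 1f),
                new Hit("Infinity Concept", ContactType.FAX, 1f),
                new Hit("SD Concept", ContactType.PBX, 1f),
                new Hit("Broadcom Technical Concept", ContactType.GSM, 1f),
                new Hit("Concept Businessmen", ContactType.PBX, 1f));
    }

    public static void main(String[] args) {
        for (Hit hit : sort(sampleHits())) {
            System.out.println(hit.name + " (" + hit.type + ")");
        }
    }
}
```

Exact matches come out first in TEL, GSM, FAX order, followed by the fuzzy matches in PBX, TEL, GSM, FAX order. The relative order of the two PBX fuzzy matches is a tie as far as this sort is concerned, so an additional sort criterion would be needed if that order matters.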
Source: https://stackoverflow.com/questions/59658935/how-to-boost-hibernate-search-query-with-field-values