How can I boost the field length norm in elasticsearch function score?

Anonymous (unverified), submitted 2019-12-03 08:41:19

Question:

I know that Elasticsearch takes into account the length of a field when computing the score of the documents retrieved by a query. The shorter the field, the higher the weight (see The field-length norm).

I like this behaviour: when I search for "iphone", I am much more interested in "iphone 6" than in "Crappy accessories for: iphone 5 iphone 5s iphone 6".
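To make the effect concrete, here is a minimal sketch of the length norm for the two titles above. It assumes the classic TF/IDF similarity, where the norm is 1 / sqrt(number of terms); whitespace splitting stands in for the real analyzer, and the lossy one-byte encoding that Lucene applies to stored norms is ignored. The function name is made up for illustration.

```python
import math


def field_length_norm(text: str) -> float:
    """Classic TF/IDF length norm: 1 / sqrt(number of terms in the field)."""
    # Assumption: splitting on whitespace approximates the analyzer's tokens.
    num_terms = len(text.split())
    return 1.0 / math.sqrt(num_terms)


short_title = "iphone 6"
long_title = "Crappy accessories for: iphone 5 iphone 5s iphone 6"

short_norm = field_length_norm(short_title)  # 2 terms
long_norm = field_length_norm(long_title)    # 9 terms
print(short_norm, long_norm)
```

With these inputs the short title gets roughly twice the norm of the long one, which is why it ranks higher for the same matching term.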

Now I would like to boost this effect: say, I want to double its importance.

I know that one can modify the score using the function score, and I guess that I can achieve what I want via script score.

I tried to add another field-length norm to the score like this:

    {
      "query": {
        "function_score": {
          "boost_mode": "replace",
          "query": {...},
          "script_score": {
            "script": "_score + norm(doc)"
          }
        }
      }
    }

But I failed badly, getting this error: [No parser for element [function_score]]

EDIT:

My first mistake was that I hadn't wrapped the function score in a "query". I have now edited the code above. The new error is:

GroovyScriptExecutionException[MissingMethodException [No signature of method: Script5.norm() is applicable for argument types: (org.elasticsearch.search.lookup.DocLookup) values:  [<org.elasticsearch.search.lookup.DocLookup@2c935f6f>] Possible solutions: notify(), wait(), run(), run(), dump(), any()]]

EDIT: I provided a first answer, but I'm hoping for a better one

Answer 1:

It looks like you could achieve that using a field of type token_count together with a field_value_factor function score.
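As a rough sketch of the arithmetic this approach ends up doing: field_value_factor with the "reciprocal" modifier yields 1 / (factor * value), and with boost_mode left at its default ("multiply") that value multiplies the query score. The helper names and the query score of 1.0 below are made up for illustration.

```python
def reciprocal_factor(token_count: int, factor: float = 1.0) -> float:
    """Value produced by field_value_factor with modifier "reciprocal"."""
    if token_count <= 0:
        raise ValueError("token_count must be positive for a reciprocal")
    return 1.0 / (factor * token_count)


def boosted_score(query_score: float, token_count: int) -> float:
    # Assumption: boost_mode is left at its default, "multiply".
    return query_score * reciprocal_factor(token_count)


# Same hypothetical query score for both documents, different field lengths.
short_doc = boosted_score(1.0, 2)  # e.g. a two-token title
long_doc = boosted_score(1.0, 9)   # e.g. a nine-token title
print(short_doc, long_doc)
```

Note that this penalises length more steeply than the built-in norm, which falls off with the square root of the term count.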

So, something like this in the field mapping:

    "name": {
      "type": "string",
      "fields": {
        "length": {
          "type": "token_count",
          "analyzer": "standard"
        }
      }
    }

This will use the number of tokens in the field. If you want to use the number of characters, you can change the analyzer from standard to a custom one that tokenizes each character.
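One possible way to build such a per-character analyzer, written as a Python dict holding the index settings, is an ngram tokenizer with min_gram and max_gram both set to 1. This is only a sketch: the names char_tokenizer and char_analyzer are made up, the exact spelling of the tokenizer type may depend on your Elasticsearch version, and token_chars is set here so that only letters and digits are counted.

```python
import json

# Hypothetical names: "char_tokenizer" and "char_analyzer" are illustrative.
analysis_settings = {
    "analysis": {
        "tokenizer": {
            "char_tokenizer": {
                "type": "ngram",
                "min_gram": 1,
                "max_gram": 1,
                # Only letters and digits become tokens; spaces are skipped.
                "token_chars": ["letter", "digit"],
            }
        },
        "analyzer": {
            "char_analyzer": {
                "type": "custom",
                "tokenizer": "char_tokenizer",
            }
        },
    }
}


def simulated_char_count(text: str) -> int:
    """Local approximation of the token count this analyzer would produce."""
    return sum(1 for ch in text if ch.isalnum())


print(json.dumps(analysis_settings, indent=2))
print(simulated_char_count("iphone 6"))
```

The token_count sub-field would then name char_analyzer instead of standard as its analyzer.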

Then in the query:

    "function_score": {
      ...,
      "field_value_factor": {
        "field": "name.length",
        "modifier": "reciprocal"
      }
    }


Answer 2:

I have something that kind of works. With the following, I deduct the length of a field of my interest from the score.

    {
      "query": {
        "function_score": {
          "boost_mode": "replace",
          "query": {...},
          "script_score": {
            "script": "_score - doc['<field_name>'].value.length()"
          }
        }
      }
    }

However, I cannot control the relative weight of the number I am subtracting compared to the original score. That's why I am not accepting my own answer: I'll wait a while for better ones. Ideally, I'd love to have a way to access the field-length norm function within the script_score, or to get an equivalent result.
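One way the relative weight might be controlled without a script is to put the length-based function in a functions array with a weight, and add it to the query score with boost_mode "sum". The sketch below builds such a request body in Python. It assumes an Elasticsearch version that supports weight on functions and the name.length token_count sub-field from the first answer; the match query on name is a hypothetical stand-in for the elided query, and build_query and combined_score are made-up helper names.

```python
import json


def build_query(inner_query: dict, length_field: str, weight: float) -> dict:
    """Request body: query score plus a weighted reciprocal-length term."""
    return {
        "query": {
            "function_score": {
                "query": inner_query,
                "functions": [
                    {
                        "field_value_factor": {
                            "field": length_field,
                            "modifier": "reciprocal",
                        },
                        # The function's result is multiplied by this weight.
                        "weight": weight,
                    }
                ],
                # "sum" adds the function result to the query score.
                "boost_mode": "sum",
            }
        }
    }


def combined_score(query_score: float, token_count: int, weight: float) -> float:
    """What the body above is expected to compute for a single document."""
    return query_score + weight * (1.0 / token_count)


# Hypothetical inner query; replace with your own.
body = build_query({"match": {"name": "iphone"}}, "name.length", 2.0)
print(json.dumps(body, indent=2))
print(combined_score(1.0, 2, 2.0))
```

Raising or lowering weight then changes how much the length term matters relative to the original score, which is the knob missing from the script above.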


