How to score based on number of substring occurrences within a field in ElasticSearch

野性不改 2021-01-28 00:32

I have an ElasticSearch document setup like so

{
  metadata: {
    content: "",
    other_fields: ...
  },
  other_fields: ...
}

and am query

1 Answer
  • 2021-01-28 01:13

    I could not replicate your query -- too many typos/syntax issues. Here's what I could reconstruct:

    Let's first create some sample docs:

    POST metaa/_doc
    {
      "metadata": {
        "content": "test"
      }
    }
    
    POST metaa/_doc
    {
      "metadata": {
        "content": "test test"
      }
    }
    
    POST metaa/_doc
    {
      "metadata": {
        "content": "test test test"
      }
    }
    

    Then query them using a script_score inside function_score, inspired by this cool answer:

    GET metaa/_search
    {
      "query": {
        "function_score": {
          "query": {
            "multi_match": {
              "query": "test",
              "fields": [
                "metadata.content",
                "other_fields"
              ]
            }
          },
          "functions": [
            {
              "script_score": {
                "script": {
                  "source": """
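                    // Exact, un-analyzed value (relies on the default .keyword sub-field)
                    def docval = doc['metadata.content.keyword'].value;
                    // Strip every 'test'; the length difference divided by the
                    // term length (4 for 'test') is the number of occurrences
                    String temp = docval.replace('test', '');
                    return (docval.length() - temp.length()) / 4;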
                  """
                }
              },
              "filter": {
                "match": {
                  "metadata.content": {
                    "query": "test"
                  }
                }
              },
              "weight": 3
            }
          ],
          "boost_mode": "replace",
          "score_mode": "sum"
        }
      }
    }
    

    yielding

    [
      {
        "_index":"metaa",
        "_type":"_doc",
        "_id":"suh763EBG_KW3EFnjwNq",
        "_score":9.0,
        "_source":{
          "metadata":{
            "content":"test test test"
          }
        }
      },
      {
        "_index":"metaa",
        "_type":"_doc",
        "_id":"seh763EBG_KW3EFnewP6",
        "_score":6.0,
        "_source":{
          "metadata":{
            "content":"test test"
          }
        }
      },
      {
        "_index":"metaa",
        "_type":"_doc",
        "_id":"s-h_63EBG_KW3EFnMQMU",
        "_score":3.0,
        "_source":{
          "metadata":{
            "content":"test"
          }
        }
      }
    ]
    

    _score = 9 -> 6 -> 3, i.e. the number of occurrences (3, 2, 1) multiplied by the weight of 3.
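
    To make the arithmetic behind those scores concrete, here is a minimal Python sketch that mirrors what the Painless script does. It is not Elasticsearch code: `count_occurrences` and `WEIGHT` are illustrative names, and it assumes the final score is simply the script result times the weight of 3, which is consistent with the output above.

```python
WEIGHT = 3  # the "weight" used in the function_score query above

def count_occurrences(value: str, term: str) -> int:
    """Count non-overlapping occurrences of term via the length difference."""
    stripped = value.replace(term, "")
    # Each removed occurrence shortens the string by len(term) characters.
    return (len(value) - len(stripped)) // len(term)

docs = ["test", "test test", "test test test"]
scores = [count_occurrences(d, "test") * WEIGHT for d in docs]
print(scores)  # [3, 6, 9]
```

    The division by the term length is why the script hard-codes 4: that is the length of 'test'. A different search term would need a different divisor.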

    Note: you may want to perform some validity checks within the script (could be a simple try/catch). But that's an exercise for the reader.
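
    As a rough idea of what such checks might cover, the sketch below mirrors the script logic in Python with two guards: a missing field value and an empty search term. The function name and the choice to return 0 in both cases are assumptions made for illustration. In Painless, the usual guard for a missing field is a check like `doc['metadata.content.keyword'].size() == 0` before reading `.value`.

```python
from typing import Optional

def safe_occurrence_count(value: Optional[str], term: str) -> int:
    """Occurrence count with guards; returns 0 instead of raising."""
    # Guard 1: the document has no (or an empty) value for the field.
    if not value:
        return 0
    # Guard 2: an empty term would cause a division by zero below.
    if not term:
        return 0
    stripped = value.replace(term, "")
    return (len(value) - len(stripped)) // len(term)

print(safe_occurrence_count(None, "test"))         # 0
print(safe_occurrence_count("test test", ""))      # 0
print(safe_occurrence_count("test test", "test"))  # 2
```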
