Elastic search query using match_phrase_prefix and fuzziness at the same time?

為{幸葍}努か 提交于 2019-12-23 11:59:12

问题


I am new to elastic search, so I am struggling a bit to find the optimal query for our data.

Imagine I want to match the following word "Handelsstandens Boldklub".

Currently, I'm using the following query:

{
    query: {
      bool: {
        should: [
          {
            match: {
              name: {
                query: query, slop: 5, type: "phrase_prefix"
              }
            }
          },
          {
            match: {
              name: {
                query: query,
                fuzziness: "AUTO",
                operator: "and"
              }
            }
          }
        ]
      }
    }
  }

It currently list the word if I am searching for "Hand", but if I search for "Handle" the word will no longer be listed as I did a typo. However if I reach to the end with "Handlesstandens" it will be listed again, as the fuzziness will catch the typo, but only when I have typed the whole word.

Is it somehow possible to do phrase_prefix and fuzziness at the same time? So in the above case, if I make a typo on the way, it will still list the word?

So in this case, if I search for "Handle", it will still match the word "Handelsstandens Boldklub".

Or what other workarounds are there to achieve the above experience? I like the phrase_prefix matching as its also supports sloppy matching (hence I can search for "Boldklub han" and it will list the result)

Or can the above be achieved by using the completion suggester?


回答1:


Okay, so after investigating elasticsearch even further, I came to the conclusion that I should use ngrams.

Here is a really good explaniation of what it does and how it works. https://qbox.io/blog/an-introduction-to-ngrams-in-elasticsearch

Here is the settings and mapping I used: (This is elasticsearch-rails syntax)

settings analysis: {
  filter: {
    ngram_filter: {
      type: "ngram",
      min_gram: "2",
      max_gram: "20"
    }
  },
  analyzer: {
    ngram_analyzer: {
      type: "custom",
      tokenizer: "standard",
      filter: ["lowercase", "ngram_filter"]
    }
  }
} do
  mappings do
    indexes :name, type: "string", analyzer: "ngram_analyzer"
    indexes :country_id, type: "integer"
  end
end

And the query: (This query actually search in two different indexes at the same time)

{
    query: {
      bool: {
        should: [
          {
            bool: {
              must: [
                { match: { "club.country_id": country.id } },
                { match: { name: query } }
              ]
            }
          },
          {
            bool: {
              must: [
                { match: { country_id: country.id } },
                { match: { name: query } }
              ]
            }
          }
        ],
        minimum_should_match: 1
      }
    }
  }

But basically you should just do a match or multi match query, depending on how many fields you want to search in.

I hope someone find it helpful, as I was personally thinking to much in terms of fuzziness instead of ngrams (Didn't know about before). This led me in the wrong direction.



来源:https://stackoverflow.com/questions/39119209/elastic-search-query-using-match-phrase-prefix-and-fuzziness-at-the-same-time

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!