Elastic Search wildcard search with spaces

Posted by 耗尽温柔 on 2019-12-23 08:33:37

Question


I have the following query. I'm trying to find documents whose name is 'hello world', but it returns zero results. When the value is 'hello*', however, I get the expected result. How can I change my query so that it returns the 'hello world' documents? I've also tried *hello world*, but it doesn't seem to match anything that contains spaces.

I think the spaces are the cause, because searching for "* *" also returns no results, even though I know many values contain spaces. Any ideas would help!

{
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "terms": {
              "variant": [
                "collection"
              ]
            }
          }
        ]
      },
      "query": {
        "wildcard": {
          "name": {
            "value": "hello world"
          }
        }
      }
    }
  }
}

Answer 1:


What mapping have you used for the name field? If you have not defined a mapping, or have only defined the type as string (without an analyzer), the field is analyzed with the standard analyzer. That produces the separate tokens "hello" and "world". A wildcard query is matched against each individual token, so patterns like *ell* or *wor* work, but a pattern containing a space never matches, because no token contains a space.

Change your mapping so that the field "name" is not_analyzed. The whole value is then indexed as a single term, and wildcard searches with spaces will work.

A word of caution: wildcard searches are expensive. If you want partial matching (the equivalent of SQL's LIKE '%...%'), you can use an ngram token filter in your analyzer and run a term query instead. That handles partial string matches and performs better too.
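
To make the difference concrete, here is a rough Python simulation; it is not Elasticsearch itself. It assumes the standard analyzer can be approximated by lowercasing and splitting on non-word characters, and that wildcard matching can be approximated with `fnmatch`. The helper names are hypothetical.

```python
import re
from fnmatch import fnmatchcase


def analyzed_terms(text):
    """Approximate the standard analyzer: lowercase, split on non-word characters."""
    return [t for t in re.split(r"\W+", text.lower()) if t]


def not_analyzed_terms(text):
    """A not_analyzed field indexes the whole value as one unchanged term."""
    return [text]


def wildcard_matches(pattern, terms):
    """A document matches if any single indexed term matches the pattern."""
    return any(fnmatchcase(term, pattern) for term in terms)


value = "hello world"
print(analyzed_terms(value))                                         # ['hello', 'world']
print(wildcard_matches("*hello world*", analyzed_terms(value)))      # False
print(wildcard_matches("*wor*", analyzed_terms(value)))              # True
print(wildcard_matches("*hello world*", not_analyzed_terms(value)))  # True
```

In this model the pattern with a space fails on the analyzed field only because it is compared against "hello" and "world" one at a time.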
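
As an illustration of the ngram idea, the following Python sketch builds a tiny in-memory index of character n-grams and answers a partial match with an exact lookup. It is a simplified model, not the Elasticsearch implementation: it assumes the grams are taken over the whole lowercased value with a fixed length of 3, and `ngrams` and `build_index` are hypothetical helper names.

```python
def ngrams(text, n=3):
    """Return all character n-grams of length n from the lowercased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(docs, n=3):
    """Map each n-gram to the set of document ids whose value contains it."""
    index = {}
    for doc_id, value in docs.items():
        for gram in ngrams(value, n):
            index.setdefault(gram, set()).add(doc_id)
    return index


docs = {1: "hello world", 2: "help wanted"}
index = build_index(docs)

# An exact lookup replaces the wildcard scan; "o w" spans the space.
print(sorted(index.get("o w", set())))  # [1]
print(sorted(index.get("hel", set())))  # [1, 2]
```

The work moves to index time: each value is expanded into its grams once, and a search becomes a plain term lookup.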




Answer 2:


The "string" type is legacy; with "index": "not_analyzed" it corresponds to the newer type "keyword", which is not split into tokens. I have had problems with queries that include spaces before, though, and solved them by splitting the query string at the spaces and building a combined query: one wildcard clause per substring, joined with "bool" and "must":

{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "name": "*hello*"
          }
        },
        {
          "wildcard": {
            "name": "*world*"
          }
        }
      ]
    }
  }
}
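
If the search text comes from user input, the request body above can be generated rather than written by hand. A minimal Python sketch, assuming the text is split on whitespace and each word is wrapped in `*...*`; the function name `wildcard_all_words` is hypothetical:

```python
import json


def wildcard_all_words(field, text):
    """Build a bool/must query with one *word* wildcard clause per word."""
    clauses = [{"wildcard": {field: f"*{word}*"}} for word in text.split()]
    return {"query": {"bool": {"must": clauses}}}


# Prints the same request body as the hand-written query above.
print(json.dumps(wildcard_all_words("name", "hello world"), indent=2))
```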

This method has the small drawback that neither word order nor word boundaries are enforced, so values such as "world hello" or "othello underworld" also end up in your results. You could reduce that by changing "wildcard" to "match" for all but the last substring.
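
A possible shape for that variant, sketched in Python: every word except the last becomes a "match" clause, and the last keeps its wildcard. It assumes the field is analyzed (a "match" clause against a not_analyzed field would have to equal the whole value), and the function name `match_then_wildcard` is hypothetical.

```python
import json


def match_then_wildcard(field, text):
    """Use match for every word except the last, which keeps its *word* wildcard."""
    words = text.split()
    if not words:
        raise ValueError("text must contain at least one word")
    *complete, last = words
    clauses = [{"match": {field: word}} for word in complete]
    clauses.append({"wildcard": {field: f"*{last}*"}})
    return {"query": {"bool": {"must": clauses}}}


print(json.dumps(match_then_wildcard("name", "hello world"), indent=2))
```

Treating only the last word as partial suits search-as-you-type input, where the earlier words are already complete.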

You should first try to solve it by changing the type of the field (your_field1 shows the "keyword" form, your_field2 the legacy "string" form):

PUT your_index
{
  "mappings": {
    "your_index": {
      "properties": {
        "your_field1": {
          "type": "keyword"
        },
        "your_field2": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}


Source: https://stackoverflow.com/questions/30113753/elastic-search-wildcard-search-with-spaces
