Elasticsearch Map case insensitive to not_analyzed documents


I have a type with the following mapping:

PUT /testindex
{
    "mappings" : {
        "products" : {
            "properties" : {
                "category

8 answers

执念已碎, 2020-12-08 15:39

I believe this Gist answers your question best:
 * https://gist.github.com/mtyaka/2006966

You can index a field more than once in the mapping, and we do this all the time: one version is not_analyzed and another is analyzed. We typically name the not_analyzed version .raw.

Like John P. wrote, you can set up the analyzer at runtime, or you can define it in the config at server start, as in the Gist linked above:

# Register the custom 'lowercase_keyword' analyzer. It doesn't do anything else
# other than changing everything to lower case.
index.analysis.analyzer.lowercase_keyword.type: custom
index.analysis.analyzer.lowercase_keyword.tokenizer: keyword
index.analysis.analyzer.lowercase_keyword.filter: [lowercase]
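
To make the effect of this analyzer concrete, here is a rough Python sketch of what a keyword tokenizer followed by a lowercase filter produces. It only emulates the behaviour and is not Elasticsearch code; the function names are made up for illustration, and Python's str.lower() may differ from the Lucene filter on some non-ASCII input.

```python
def lowercase_keyword(value):
    """Emulate keyword tokenizer + lowercase filter (illustrative only)."""
    # The keyword tokenizer emits the whole input as a single token;
    # the lowercase filter then folds that token to lower case.
    return [value.lower()]


def word_split_lowercase(value):
    """Rough stand-in for a word-splitting analyzer, shown for contrast."""
    return value.lower().split()


tag = "Event:Battle at the Boardwalk"
print(lowercase_keyword(tag))
print(word_split_lowercase(tag))
```

The point of the contrast is that the first form keeps the value as one exact, case-folded term, so a lowercased exact-match query can find it.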


Then you define your mapping for your field(s) with both the not_analyzed version and the analyzed one:

# Map the 'tags' property to two fields: one that isn't analyzed,
# and one that is analyzed with the 'lowercase_keyword' analyzer.
curl -XPUT 'http://localhost:9200/myindex/images/_mapping' -d '{
  "images": {
    "properties": {
      "tags": {
        "type": "multi_field",
        "fields": {
          "tags": {
            "index": "not_analyzed",
            "type": "string"
          },
          "lowercased": {
            "index": "analyzed",
            "analyzer": "lowercase_keyword",
            "type": "string"
          }
        }
      }
    }
  }
}'


And finally your query. A terms query is not analyzed, so lowercase the values yourself before building the query, otherwise they will not match the lowercased terms in the index:

# Issue queries against the index. The search query must be manually lowercased.
curl -XPOST 'http://localhost:9200/myindex/images/_search?pretty=true' -d '{
  "query": {
    "terms": {
      "tags.lowercased": [
        "event:battle at the boardwalk"
      ]
    }
  },
  "facets": {
    "tags": {
      "terms": {
        "field": "tags",
        "size": "500",
        "regex": "^team:.*"
      }
    }
  }
}'
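
A minimal Python sketch of the client-side step might look like the following. It only builds the request body; the helper name and the case_sensitive flag are assumptions for illustration, while the field names "tags" and "tags.lowercased" come from the mapping above.

```python
import json


def build_tag_query(values, case_sensitive=False):
    """Build a terms query body against the matching sub-field (hypothetical helper)."""
    if case_sensitive:
        # Exact match against the not_analyzed version of the field.
        return {"query": {"terms": {"tags": list(values)}}}
    # The terms query is not analyzed, so fold the case here to match
    # what the lowercase_keyword analyzer stored at index time.
    return {"query": {"terms": {"tags.lowercased": [v.lower() for v in values]}}}


body = build_tag_query(["Event:Battle at the Boardwalk"])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of the _search request shown above.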

                                                                        
                                                        
            
            
              
                
轮回少年, 2020-12-08 15:42

For this scenario, I suggest combining the keyword tokenizer and the lowercase filter in a custom analyzer, and lowercasing your search input as well.

1. Create the index with an analyzer that combines the keyword tokenizer and the lowercase filter

curl -XPUT localhost:9200/test/ -d '
{
  "settings":{
     "index":{
        "analysis":{
           "analyzer":{
              "your_custom_analyzer":{
                 "tokenizer":"keyword",
                 "filter": ["lowercase"]
              }
           }
        }
     }
  }
}'


2. Put the mapping and assign the analyzer to the field

curl -XPUT localhost:9200/test/_mappings/twitter -d '
{
    "twitter": {
        "properties": {
            "content": {"type": "string", "analyzer": "your_custom_analyzer" }
        }
    }
}'


3. Search with a wildcard query, lowercasing the pattern yourself.

curl -XPOST localhost:9200/test/twitter/_search -d '{
    "query": {
        "wildcard": {"content": "**the words you want to search**"}
    }
}'
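
As a sketch of the "lowercase your search input" step for this query, the following Python builds a wildcard query body from raw user input. The helper name is hypothetical, and wrapping the input in leading and trailing * for a substring match is an assumption about what you want; the field name "content" is taken from the mapping above.

```python
import json


def build_wildcard_query(user_input, field="content"):
    """Build a wildcard query body with a lowercased pattern (hypothetical helper)."""
    # Wildcard patterns are matched against the indexed terms as-is, so the
    # pattern is lowercased to line up with the lowercase filter used at index time.
    pattern = "*" + user_input.strip().lower() + "*"
    return {"query": {"wildcard": {field: pattern}}}


body = build_wildcard_query("  Battle at the BOARDWALK ")
print(json.dumps(body))
```

Note that patterns starting with * can be slow on large indexes, so this form is best kept for small fields or low query volumes.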


Another way to search one field in different ways is the multi_field type.

You can map the field as a multi_field:

curl -XPUT localhost:9200/test/_mapping/twitter -d '
{
    "properties": {
        "content": {
            "type": "multi_field",
            "fields": {
                "default": {"type": "string"},
                "search": {"type": "string", "analyzer": "your_custom_analyzer"}
            }
        }
    }
}'


You can then index data with the mapping above and search it in two ways: through the default sub-field or through the one that uses your_custom_analyzer.
                                                                        
                                                        
            
            
              
                