Why is an HTML tag searchable even though it was filtered in Elasticsearch?

花落未央 2021-01-21 14:45

I am new to Elasticsearch and was testing the html_strip character filter. Ideally, I should not be able to search on HTML tags. These are the steps I followed.

Index:

curl -XPOST          


        
1 answer
  •  夕颜 (original poster)
    2021-01-21 15:20

    You need to set the analyzer in the mapping before indexing. That ensures every indexed document passes through the analyzer, so all tags are stripped before the tokens are stored. In your case, you applied the analyzer at query time, which only affects your search phrase, not the data being searched.
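
The difference between stripping tags only at query time and stripping them at index time can be illustrated with a toy inverted index. This is a minimal sketch in plain Python, not Elasticsearch code: `plain_analyzer`, `html_analyzer`, `build_index` and `search` are hypothetical helpers, and the regular expression is only a rough stand-in for what html_strip does.

```python
import re
from collections import defaultdict

TAG_RE = re.compile(r"<[^>]+>")   # crude approximation of html_strip
TOKEN_RE = re.compile(r"\w+")


def plain_analyzer(text):
    """Lowercase and split on non-word characters; tag names survive as tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def html_analyzer(text):
    """Remove tags first, then tokenize the remaining text."""
    return plain_analyzer(TAG_RE.sub(" ", text))


def build_index(docs, analyzer):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyzer(text):
            index[token].add(doc_id)
    return index


def search(index, query, analyzer):
    """Return ids of documents that contain every analyzed query token."""
    tokens = analyzer(query)
    if not tokens:
        return set()
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return hits


docs = {1: "<title>Hello world</title>"}

# Tags were indexed as ordinary tokens; stripping happens only on the query.
query_time_only = build_index(docs, plain_analyzer)
# Tags are stripped before any token is stored.
index_time = build_index(docs, html_analyzer)

print(search(query_time_only, "title", html_analyzer))  # {1}
print(search(index_time, "title", html_analyzer))       # set()
print(search(index_time, "hello", html_analyzer))       # {1}
```

In the first index the token `title` is already stored, so cleaning up the query cannot hide it. Only the second index, built with the tag-stripping analyzer, stops the tag name from matching.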

    You can read more on creating a mapping here

    I don't believe there is a request format like this:

    http://localhost:9200/foo/test/_search?tokenizer=standard&char_filters=html_strip&q=title
    

    Instead, if you set the analyzer as follows, it should work fine:

    curl -XPUT "http://localhost:9200/foo" -d'
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "html_analyzer": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": [
                "standard",
                "lowercase",
                "stop",
                "asciifolding"
              ],
              "char_filter": [
                "html_strip"
              ]
            },
            "whitespace_analyzer": {
              "type": "custom",
              "tokenizer": "whitespace",
              "filter": [
                "standard",
                "lowercase",
                "stop",
                "asciifolding"
              ]
            }
          }
        }
      },
      "mappings": {
        "test": {
          "properties": {
            "content": {
              "type": "string",
              "analyzer": "html_analyzer"
            }
          }
        }
      }
    }'
    

    Here I made the analyzer common to both indexing and searching.
