Negative lookahead regexp doesn't work in ES DSL query

无人及你 2021-01-21 08:14

The mapping of my Elasticsearch index looks like this:

{
  "settings": {
    "index": {
      "number_of_shards": "5",
      "number_of_replicas": "1"

1 answer
  • 2021-01-21 09:08

    Elasticsearch's Lucene regex engine does not support any type of lookaround. The ES regex documentation is rather ambiguous: it says that matching everything with .* is very slow, and so is using lookaround regular expressions (which is not only ambiguous but also wrong, since lookarounds, when used wisely, may greatly speed up regex matching).

    Since you want to match any string that contains f04 and does not contain z, you may actually use

    [^z]*f04[^z]*
    

    Details

    • [^z]* - zero or more characters other than z
    • f04 - the f04 substring
    • [^z]* - zero or more characters other than z.
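
    The pattern above can be tried out locally before it is sent to Elasticsearch. The following is a minimal sketch, not the asker's actual setup: it uses Python's re.fullmatch to imitate the fact that Lucene patterns must match the whole value, and it builds a regexp query body as a plain dict. The field name my_field is hypothetical, since the mapping in the question is cut off.

```python
import re

# The character-class pattern from the answer: f04 with no z anywhere.
PATTERN = r"[^z]*f04[^z]*"


def matches_without_z(value: str) -> bool:
    """Return True if the whole value matches PATTERN.

    re.fullmatch is used because Lucene regular expressions are anchored:
    the pattern has to cover the entire string, not just a part of it.
    """
    return re.fullmatch(PATTERN, value) is not None


def build_regexp_query(field: str, pattern: str = PATTERN) -> dict:
    """Build an Elasticsearch regexp query body for the given field.

    The field name is supplied by the caller; "my_field" below is only
    a placeholder and does not come from the original mapping.
    """
    return {"query": {"regexp": {field: {"value": pattern}}}}


samples = ["abf04cd", "f04", "abf04zcd", "zf04", "abcd"]
results = {s: matches_without_z(s) for s in samples}
query_body = build_regexp_query("my_field")
```

    Python's re module is a different engine from Lucene's, so this only checks the logic of the pattern, not how Elasticsearch analyzes or stores the field.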

    If you need to "exclude" a multicharacter string (say, z4 rather than z), you can keep your approach and use the complement operator:

    .*f04.*&~(.*z4.*)
    

    This means almost the same but does not support line breaks:

    • .* - any chars other than newline, as many as possible
    • f04 - f04
    • .* - any chars other than newline, as many as possible
    • & - AND
    • ~(.*z4.*) - any string except one that contains z4
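
    Python's re module has no & or ~ operators, so the intersection and complement above can only be imitated there with two separate checks. The sketch below does that and also builds the corresponding regexp query body. The field name my_field is hypothetical, and flags is set to ALL explicitly on the assumption that the optional operators (intersection and complement) must be enabled for this pattern.

```python
import re

# The Lucene pattern from the answer; kept as a plain string because
# Python's re module cannot parse the & and ~ operators.
LUCENE_PATTERN = ".*f04.*&~(.*z4.*)"


def matches_f04_without_z4(value: str) -> bool:
    """Imitate `.*f04.*&~(.*z4.*)` with two whole-string checks."""
    has_f04 = re.fullmatch(r".*f04.*", value) is not None  # left side of &
    has_z4 = re.fullmatch(r".*z4.*", value) is not None    # inside ~( ... )
    return has_f04 and not has_z4


def build_complement_query(field: str) -> dict:
    """Build a regexp query body that uses the optional operators.

    The field name is a placeholder chosen by the caller.
    """
    return {
        "query": {
            "regexp": {
                field: {"value": LUCENE_PATTERN, "flags": "ALL"}
            }
        }
    }


checks = {
    s: matches_f04_without_z4(s)
    for s in ["abf04cd", "f04z", "f04z4", "z4f04", "abc"]
}
complement_query = build_complement_query("my_field")
```

    Note that f04z still matches here: only the two-character sequence z4 is excluded, which is the difference from the [^z]* version.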