Negative lookahead regex to ignore list of words

Anonymous (unverified), submitted 2019-12-03 01:20:02

Question:

I am trying to write a regular expression that will find any word that is followed by a space, as long as that word is not AND, OR, or NOT.

After searching for similar problems I tried a negative lookahead. This is my current regex: (?!AND|OR|NOT).*?\\s

If I try this with "AND " I get a match on "ND". If I try with "OR " I get "R" and if I try with "NOT " I get "OT".

Can anyone help?

Answer 1:

Try this pattern:

\\b(?!(?:AND|OR|NOT)\\b)[a-zA-Z]+\\s 

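I added word boundaries (\b) so that the lookahead is tested only at the start of a word and rejects only the complete words AND, OR, and NOT. I also replaced the lazy quantifier with the character class [a-zA-Z] (you can shorten it to [a-z] in case-insensitive mode).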
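
To see why the original attempt fails and what the word boundaries change, here is a minimal Python sketch. Python, the helper name `first_match`, and the sample strings are assumptions made for illustration, not part of the question. The doubled backslashes in the patterns on this page come from string-literal escaping, so the Python raw strings below use single backslashes.

```python
import re

# The asker's pattern. The lookahead is only checked where a match attempt
# starts, so after failing at "A" the engine simply retries at "N".
original = re.compile(r"(?!AND|OR|NOT).*?\s")

# The pattern from this answer, written as a raw string.
fixed = re.compile(r"\b(?!(?:AND|OR|NOT)\b)[a-zA-Z]+\s")

def first_match(pattern, text):
    """Return the text of the first match, or None if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else None

print(repr(first_match(original, "AND ")))        # 'ND '
print(repr(first_match(fixed, "AND ")))           # None
print(repr(first_match(fixed, "ANDROID ")))       # 'ANDROID '
print(repr(first_match(fixed, "cats AND dogs ")))  # 'cats '
```

The third call shows the effect of the \b inside the lookahead: a word that merely starts with AND is still accepted.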

Or, for better performance (in case-insensitive mode):

\\b(?>(?>[b-mp-z])|(?!(?>and|or|not)\\b)[aon])(?>[a-z]*)\\s 

If you want to match:

  • words between double quotes, without the double quotes or spaces:

(?<=(\"?)\\b)(?!(?:AND|OR|NOT)\\b)[a-zA-Z]+(?=\\1(?:\\s|$))

  • words between double quotes, with the double quotes:

(\"?)(?<=\\b)(?!(?:AND|OR|NOT)\\b)[a-zA-Z]+\\1(?=\\s|$)

  • words between parentheses, without the parentheses:

(?<=(\\()\\b)(?!(?:AND|OR|NOT)\\b)[a-zA-Z]+(?=(?(1)\\)|(?:\\s|$)))

  • words between parentheses and double quotes, without either:

(?<=(\\()?(\"?)\\b)(?!(?:AND|OR|NOT)\\b)[a-zA-Z]+(?=(?(1)\\)|\\2(?:\\s|$)))

  • words that are not AND, OR, or NOT, without any of the extra conditions above:

\\b(?!(?:AND|OR|NOT)\\b)[a-zA-Z]+\\b
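
As a usage sketch for this last pattern, the snippet below pulls the non-operator words out of a search-style query. Python, the function name `search_terms`, and the sample queries are assumptions made for illustration.

```python
import re

# Last pattern from the list above, as a Python raw string (single backslashes).
TERM = re.compile(r"\b(?!(?:AND|OR|NOT)\b)[a-zA-Z]+\b")

def search_terms(query):
    """Return every alphabetic word in query except the operators AND, OR, NOT."""
    return TERM.findall(query)

print(search_terms("cats AND dogs OR NOT android"))  # ['cats', 'dogs', 'android']
print(search_terms("NOTHING OR (ORANGE)"))           # ['NOTHING', 'ORANGE']
```

The match is case-sensitive as written, so lowercase "and" would be returned as an ordinary word. Passing `re.IGNORECASE` to `re.compile` would exclude it as well.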



Answer 2:

Hmm, I'm not 100% sure if I understood correctly, but could you try this and see if it's what you were looking for?

(?<=\bAND|\bOR|\bNOT)\s.* 

This will match XYZ in your comment (though together with the preceding whitespace character). I tested it after adding a word in between.

EDIT: If there are no more characters to the right and you need the last three characters, you could use either:

\w+$ 

or:

[^\s]+$ 
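
The two end-of-string patterns differ in what they treat as part of the last token: \w+$ stops at any non-word character, while [^\s]+$ (equivalent to \S+$) stops only at whitespace. A small Python sketch shows the difference. The helper name `last_token` and the sample strings are made up for illustration.

```python
import re

def last_token(pattern, text):
    """Return what the pattern matches at the end of text, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

print(last_token(r"\w+$", "abc AND XYZ"))     # XYZ
print(last_token(r"[^\s]+$", "abc AND XYZ"))  # XYZ
print(last_token(r"\w+$", "abc AND x-y"))     # y
print(last_token(r"[^\s]+$", "abc AND x-y"))  # x-y
print(last_token(r"\w+$", "abc AND XYZ "))    # None (trailing space)
```

Both patterns require the token to touch the end of the string, so trailing whitespace has to be stripped first if it can occur.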

